Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 2/4] arm64: arch_timer: Introduce a generic erratum handing mechanism for fsl-a008585
From: Marc Zyngier @ 2016-10-27 10:29 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1477553651-13428-2-git-send-email-dingtianhong@huawei.com>

On 27/10/16 08:34, Ding Tianhong wrote:
> The workaround for hisilicon,161601 will check the return value of the system counter
> by different way, in order to distinguish with the fsl-a008585 workaround, introduce
> a new generic erratum handing mechanism for fsl-a008585 and rename some functions. 
> 
> v2: Introducing a new generic erratum handling mechanism for fsl erratum a008585.
> 
> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
> ---
>  arch/arm64/include/asm/arch_timer.h  | 20 +++++++++-----
>  drivers/clocksource/arm_arch_timer.c | 51 +++++++++++++++++++++---------------
>  2 files changed, 43 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
> index eaa5bbe..118719d8 100644
> --- a/arch/arm64/include/asm/arch_timer.h
> +++ b/arch/arm64/include/asm/arch_timer.h
> @@ -31,15 +31,21 @@
>  
>  #if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585)
>  extern struct static_key_false arch_timer_read_ool_enabled;
> -#define needs_fsl_a008585_workaround() \
> +#define needs_timer_erratum_workaround() \
>  	static_branch_unlikely(&arch_timer_read_ool_enabled)

This is too generic a name. Please make it more specific.

>  #else
> -#define needs_fsl_a008585_workaround()  false
> +#define needs_timer_erratum_workaround()  false
>  #endif
>  
> -u32 __fsl_a008585_read_cntp_tval_el0(void);
> -u32 __fsl_a008585_read_cntv_tval_el0(void);
> -u64 __fsl_a008585_read_cntvct_el0(void);
> +
> +struct arch_timer_erratum_workaround {
> +	int erratum;

What is the use of this field?

> +	u32 (*read_cntp_tval_el0)(void);
> +	u32 (*read_cntv_tval_el0)(void);
> +	u64 (*read_cntvct_el0)(void);
> +};
> +
> +extern struct arch_timer_erratum_workaround *erratum_workaround;

This is a very generic name for something that has a global visibility.
Please choose a more specific variable name.

>  
>  /*
>   * The number of retries is an arbitrary value well beyond the highest number
> @@ -62,8 +68,8 @@ u64 __fsl_a008585_read_cntvct_el0(void);
>  #define arch_timer_reg_read_stable(reg) 		\
>  ({							\
>  	u64 _val;					\
> -	if (needs_fsl_a008585_workaround())		\
> -		_val = __fsl_a008585_read_##reg();	\
> +	if (needs_timer_erratum_workaround())		\
> +		_val = erratum_workaround->read_##reg();	\

As mentioned in my initial review, you've now broken modular access to
any of the registers. Please fix it.

>  	else						\
>  		_val = read_sysreg(reg);		\
>  	_val;						\
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index 73c487d..e4f7fa1 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -95,10 +95,32 @@ early_param("clocksource.arm_arch_timer.evtstrm", early_evtstrm_cfg);
>   */
>  
>  #ifdef CONFIG_FSL_ERRATUM_A008585
> +struct arch_timer_erratum_workaround *erratum_workaround = NULL;
> +
>  DEFINE_STATIC_KEY_FALSE(arch_timer_read_ool_enabled);
>  EXPORT_SYMBOL_GPL(arch_timer_read_ool_enabled);
>  
> -static int fsl_a008585_enable = -1;
> +
> +static u32 fsl_a008585_read_cntp_tval_el0(void)
> +{
> +	return __fsl_a008585_read_reg(cntp_tval_el0);
> +}
> +
> +static  u32 fsl_a008585_read_cntv_tval_el0(void)
> +{
> +	return __fsl_a008585_read_reg(cntv_tval_el0);
> +}
> +
> +static u64 fsl_a008585_read_cntvct_el0(void)
> +{
> +	return __fsl_a008585_read_reg(cntvct_el0);
> +}

So now that you've indirected all calls through a set of pointers, why
do you keep the __fsl_a008585_read_reg() macro inside the include file?
I don't think it has any purpose in being globally visible now.

> +
> +static struct arch_timer_erratum_workaround arch_timer_fsl_a008585 = {

And here's the proof that the erratum field is useless.

> +	.read_cntp_tval_el0 = fsl_a008585_read_cntp_tval_el0,
> +	.read_cntv_tval_el0 = fsl_a008585_read_cntv_tval_el0,
> +	.read_cntvct_el0 = fsl_a008585_read_cntvct_el0,
> +};
>  
>  static int __init early_fsl_a008585_cfg(char *buf)
>  {
> @@ -109,26 +131,12 @@ static int __init early_fsl_a008585_cfg(char *buf)
>  	if (ret)
>  		return ret;
>  
> -	fsl_a008585_enable = val;
> +	if (val)
> +		erratum_workaround = &arch_timer_fsl_a008585;
> +
>  	return 0;
>  }
>  early_param("clocksource.arm_arch_timer.fsl-a008585", early_fsl_a008585_cfg);
> -
> -u32 __fsl_a008585_read_cntp_tval_el0(void)
> -{
> -	return __fsl_a008585_read_reg(cntp_tval_el0);
> -}
> -
> -u32 __fsl_a008585_read_cntv_tval_el0(void)
> -{
> -	return __fsl_a008585_read_reg(cntv_tval_el0);
> -}
> -
> -u64 __fsl_a008585_read_cntvct_el0(void)
> -{
> -	return __fsl_a008585_read_reg(cntvct_el0);
> -}
> -EXPORT_SYMBOL(__fsl_a008585_read_cntvct_el0);
>  #endif /* CONFIG_FSL_ERRATUM_A008585 */
>  
>  static __always_inline
> @@ -891,9 +899,10 @@ static int __init arch_timer_of_init(struct device_node *np)
>  	arch_timer_c3stop = !of_property_read_bool(np, "always-on");
>  
>  #ifdef CONFIG_FSL_ERRATUM_A008585
> -	if (fsl_a008585_enable < 0)
> -		fsl_a008585_enable = of_property_read_bool(np, "fsl,erratum-a008585");
> -	if (fsl_a008585_enable) {
> +	if (!erratum_workaround && of_property_read_bool(np, "fsl,erratum-a008585"))
> +		erratum_workaround = &arch_timer_fsl_a008585;

It used to be possible to disable the erratum workaround on the command
line, and you've just removed that feature. Please restore it.

> +
> +	if (erratum_workaround) {
>  		static_branch_enable(&arch_timer_read_ool_enabled);
>  		pr_info("Enabling workaround for FSL erratum A-008585\n");
>  	}
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* [PATCH V5 0/3] ACPI,PCI,IRQ: revert penalty calculation for ISA and SCI interrupts
From: Rafael J. Wysocki @ 2016-10-27 10:31 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1477283492-26657-1-git-send-email-okaya@codeaurora.org>

On Mon, Oct 24, 2016 at 6:31 AM, Sinan Kaya <okaya@codeaurora.org> wrote:
>
> By the time ACPI gets initialized, this code tries to determine an
> IRQ number based on penalty values in this array. It will try to locate
> the IRQ with the least penalty assignment so that interrupt sharing is
> avoided if possible.
>
> A couple of notes about the external APIs:
> 1. These API can be called before the ACPI is started. Therefore, one
> cannot assume that the PCI link objects are initialized for calculating
> penalties.
> 2. The polarity and trigger information passed via the
> acpi_penalize_sci_irq from the BIOS may not match what the IRQ subsystem
> is reporting as the call might have been placed before the IRQ is
> registered by the interrupt subsystem.
>
> The reverted changes were in the direction to remove these external API and
> try to calculate the penalties at runtime for the ISA, SCI as well as PCI
> IRQS.
> This didn't work out well with the existing platforms.
>
> V5:
> * clean up the commit message for 1/3 and 2/3
> * drop 3/3
> * bring back ACPI, PCI IRQ: add PCI_USING penalty for ISA interrupts as 3/3
>
> V4:
> https://www.spinics.net/lists/arm-kernel/msg537158.html
> * Drop ACPI, PCI IRQ: add PCI_USING penalty for ISA interrupts
> * A new patch to isolate early boot ISA penalty calculations from dynamic
> penalty calculation by directly modifying the array members in
> ("ACPI, PCI, IRQ: assign ISA IRQ directly during early boot stages")
> * Now that we isolated both SCI and ISA interrupts, revert commit ("Revert
> "ACPI,PCI,IRQ: separate ISA penalty calculation"") and commit
> ("487cf917ed0d
> (Revert "ACPI, PCI, IRQ: remove redundant code in acpi_irq_penalty_init()")
> to share code between ISA and PCI penalties as originally intended.
>
> V3:
> http://www.spinics.net/lists/arm-kernel/msg536208.html
> * drop patch #1 as discussed with Bjorn
> * add patch #3 to track SCI irq and penalty separately
>
> V2: https://lkml.org/lkml/2016/10/1/106
> * Commit message updates
>
> V1:
> http://lists-archives.com/linux-kernel/28673954-revert-acpi-pci-irq-reduce-static-irq-array-size-to-16.html
> * initial implementation
>
> Sinan Kaya (3):
>   ACPI, PCI, IRQ: assign ISA IRQ directly during early boot stages
>   ACPI: pci_link: penalize SCI correctly
>   ACPI: pci_link: Include PIRQ_PENALTY_PCI_USING for ISA IRQs
>
>  arch/x86/kernel/acpi/boot.c |  1 +
>  drivers/acpi/pci_link.c     | 38 +++++++++++++++++++++-----------------
>  include/linux/acpi.h        |  1 +
>  3 files changed, 23 insertions(+), 17 deletions(-)

I've queued up this series for the next ACPI pull request with 4.9 fixes.

Thanks,
Rafael

^ permalink raw reply

* [PATCH] arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU
From: Marc Zyngier @ 2016-10-27 10:40 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161027100428.GA17829@cbox>

On 27/10/16 11:04, Christoffer Dall wrote:
> On Thu, Oct 27, 2016 at 10:49:00AM +0100, Marc Zyngier wrote:
>> Hi Christoffer,
>>
>> On 27/10/16 10:19, Christoffer Dall wrote:
>>> On Mon, Oct 24, 2016 at 04:31:28PM +0100, Marc Zyngier wrote:
>>>> Architecturally, TLBs are private to the (physical) CPU they're
>>>> associated with. But when multiple vcpus from the same VM are
>>>> being multiplexed on the same CPU, the TLBs are not private
>>>> to the vcpus (and are actually shared across the VMID).
>>>>
>>>> Let's consider the following scenario:
>>>>
>>>> - vcpu-0 maps PA to VA
>>>> - vcpu-1 maps PA' to VA
>>>>
>>>> If run on the same physical CPU, vcpu-1 can hit TLB entries generated
>>>> by vcpu-0 accesses, and access the wrong physical page.
>>>>
>>>> The solution to this is to keep a per-VM map of which vcpu ran last
>>>> on each given physical CPU, and invalidate local TLBs when switching
>>>> to a different vcpu from the same VM.
>>>
>>> Just making sure I understand this:  The reason you cannot rely on the
>>> guest doing the necessary distinction with ASIDs or invalidating the TLB
>>> is that a guest (which assumes it's running on hardware) can validly
>>> defer any neccessary invalidation until it starts running on other
>>> physical CPUs, but we do this transparently in KVM?
>>
>> The guest wouldn't have to do any invalidation at all on real HW,
>> because the TLBs are strictly private to a physical CPU (only the
>> invalidation can be broadcast to the Inner Shareable domain). But when
>> we multiplex two vcpus on the same physical CPU, we break the private
>> semantics, and a vcpu could hit in the TLB entries populated by the
>> another one.
> 
> Such a guest would be using a mapping of the same VA with the same ASID
> on two separate CPUs, each pointing to a separate PA.  If it ever were
> to, say, migrate a task, it would have to do invalidations then.  Right?

This doesn't have to be ASID tagged. Actually, it is more likely to
affect global mappings. Imagine for example that the kernel (which uses
global mappings for its own page tables) decides to create per-cpu
variable using this trick (all the CPUs have the same VA, but use
different PAs). No invalidation at all, everything looks perfectly fine,
until you start virtualizing it.

> Does Linux or other guests actually do this?

Linux may hit it with CPU hotplug, which uses global mappings (which a
vcpu using an ASID tagged mapping could then hit if the VAs overlap).

> 
> I would suspect Linux has to eventually invalidate those mappins if it
> wants the scheduler to be allowed to freely move things around.
> 
>>
>> As we cannot segregate the TLB entries per vcpu (but only per VMID), the
>> workaround is to nuke all the TLBs for this VMID (locally only - no
>> broadcast) each time we find that two vcpus are sharing the same
>> physical CPU.
>>
>> Is that clearer?
> 
> Yes, the fix is clear, just want to make sure I understand that it's a
> valid circumstance where this actually happens.  But in either case, we
> probably have to fix this to emulate the hardware correctly.
> 
> Another fix would be to allocate a VMID per VCPU I suppose, just to
> introduce a terrible TLB hit ratio :)

But that would break TLB invalidations that are broadcast in the Inner
Shareable domain. To do so, you'd have to trap every TBLI, and issue
corresponding invalidations for all the vcpus. I'm not sure I want to
see the performance number of that solution... ;-)

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* [PATCH 4/9] pinctrl: meson: allow gpio to request irq
From: Jerome Brunet @ 2016-10-27 10:42 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CACRpkdZR8T8W5WMmEJVhPR0sUQyfPzgHpjhDiRsyXvybKDYJWQ@mail.gmail.com>

On Wed, 2016-10-26 at 16:44 +0200, Linus Walleij wrote:
> On Wed, Oct 26, 2016 at 4:23 PM, Jerome Brunet <jbrunet@baylibre.com>
> wrote:
> > 
> > [Me]
> > > 
> > > We usually refer to the local numberspace on the GPIO controller
> > > as "offsets", so line offsets 0...31 on a gpiochip with 31 lines.
> > > 
> > > The ngpio in struct gpio_chip is the number of lines on that
> > > controller,
> > > and should nominally map 1:1 to hwirq sources.
> > 
> > Indeed it should be the the case, and for meson, it is pretty
> > close.
> > The irqchip controller provide a number of hwirq. Each hwirq maps
> > to
> > one, and only one, pin. But since not every pins are connected to
> > the
> > irqchip controller, the opposite is not true.
> > 
> > Taking an example with 16 gpios, here is what it could look like
> > with
> > the exception we have on meson :
> > 
> > gpio offset [ 0??1??2??3??4??5??6??7??8??9??10 11 12 13 14 15 ]
> > hwirq num???[ 0??1??2??3] NC NC[4??5??6??7??8??9??10]NC NC NC
> > 
> > Like gpio offset are used (internally) in the driver to find
> > appropriate gpio registers and bit, the hwirq has a meaning too.
> > It is the setting you put in the channel multiplexer of the
> > controller
> > to select the proper pin to spy on.
> > 
> > In the end, these gpio offset and hwirq number are different. I
> > would
> > prefer to have hwirq == gpio and go your way, it would make my life
> > easier, but I don't see how it would work.
> > 
> > The irqchip controller cares only about the hwirq number. You can
> > actually request an interrupt directly to the controller by asking
> > the
> > proper hwirq number (in DT for example), without involving the gpio
> > driver (tested).
> > 
> > The relation between the pins and the interrupt number is provided
> > by
> > the manufacturer in the Datasheet [1], in the section GPIO
> > Interrupt.
> 
> I think I kind of get it.
> 
> This reminds me of recent patches to the generic GPIOLIB_IRQCHIP
> where we made it possible to "mask" set of IRQ lines, saying
> "these are in the irqdomain, but you cannot map them".
> 
> See
> commit 79b804cb6af4f128b2c53f0887c02537a7eb5824
> "gpiolib: Make it possible to exclude GPIOs from IRQ domain"
> commit 96b2cca64fa3e1a31b573bb308b2944c802aacf3
> "gpio: stmpe: forbid unused lines to be mapped as IRQs"
> 
> So what we do in the generic case is put a linear map over all
> the lines/offsets, then punch holes in that map and say
> "this and this and this can not be mapped as IRQ".
> 
> As you can see in _gpiochip_irqchip_add() we pre-map all
> except those with irq_create_mapping().
> 
> Does this scheme fit with your usecase? I would guess so,
> just that your domain is hierarchic, not simple/linear.

Thanks for pointing this out, however I don't think this solve my
issue. I'll try to be as clear as possible but feel free to ask me
question if needed:

Ressource issue : When you create an irq mapping, in case of hierarchic
domain, it calls the "alloc" function of the domain, which will
eventually call the "alloc" function of the parent domain ... until you
reach the "root" domain (here the gic).

The particular HW at hand (meson gpio interrupt controller) is a set of
8 muxes (or channels). Each channel output its signal on 1 specific GIC
input. Its the HW wiring, not programmable.
The inputs are the all pad that can be seen by the controller (*almost*
all the SoC gpio, but not all, as explain earlier). When you call
"alloc", the driver find an available channel, set the mux input to
forward the appropriate signal to the GIC.

As you may understand, the driver can accept only 8 mappings at a time
before being out of GIC irqs, and returning -ENOSPC.

If we were trying the use the punch hole method, we would have to know
at boot time the only eight pin we want, and this for every platform.
Also there not might be 8 available to the gpio subsys, since someone
could request an irq directly to controller, w/o going through the gpio
subsys. This would be putting restriction on the gpio because of an
issue in the controller. This looks very complicated to setup, static
and platform specific. That's not really what we were aiming for.

We are looking to create mapping "on-demand" to make the best use of
the little number of interrupts we have. To catch request of drivers,
like gpio-keys, which use gpio_to_irq, it looks the only viable place
is the to_irq callback (at the moment).

Drivers using gpio_to_irq in their probe function expect that this will
give them the corresponding virq, so create the mapping if need be.

However, I now get why you don't want that, it seems we have 2 types of
platforms in the kernel right now:?

1. The one creating the mapping in the to_irq callback. It might be
because they just copy/paste from another driver doing it, or they may
have a good reason for it (like I think we do)

2. the one which call gpio_to_irq in interrupt handlers. Honestly I did
not know that one could that, but they are in the mainline too, and
probably have a good reason for doing it.

irq_find_mapping looks safe in interrupt handler, I does not seem to
sleep (please correct me if I'm wrong).
irq_create_mapping definitely isn't, because of the irq_domain mutex.

We probably got into this situation because it wasn't clear enough
whether to_irq was fast or slow (at least it took me a few mails to
understand this ...)
The two platform groups are most likely exclusive so nobody is sleeping
in irq, everybody is happy. As a maintainer, I understand that you
can't allow a dangerous situation like this to continue.

To fix the situation we could add a few things in the gpio subsys:
- Make it clear that to_irq is fast (like you just did)
- add a new callback (to_irq_slow ? provide_irq ? something else) which
would be allowed to sleep and create mappings.
- in gpio_to_irq function keeps calling to_irq like it does but also
call to_irq_slow only if we are not in an interrupt context and a
mapping could not be found. We could maybe use "in_interrupt()" for
this ?

This way, we could keep the existing drivers working as they are (even
if they are wrong) and slowly fix things up.

What do you think about this ? Do you have something else in mind ?
I'd be happy to help on this.

Sorry, it was kind of long explanation but I hope things are more clear
now.

> 
> Maybe the takeaway is that your map does not need to
> be "dense", i.e. every hwirq is in use. It can be sparse.
> It is stored in a radix tree anyways.
> 
> > 
> > Looking at other gpio drivers, it is not uncommon to have some
> > simple
> > calculation to get from gpio offset to the hwirq number. I don't
> > get
> > what is the specific problem here ?
> 
> It's OK to use the offset vs hwirq.
> 
> I just misunderstand it as the global GPIO number, that is
> not OK.

Ok. Just to be clear, you are ok with the function
"meson_gpio_to_hwirq" which just does this translation, right ?

> 
> Yours,
> Linus Walleij


Cheers
Jerome

^ permalink raw reply

* [PATCH] ARM: DT: stm32: move dma translation to board files
From: Bruno Herrera @ 2016-10-27 10:43 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <404315f0-fc9e-93e2-54df-f85e484d9389@st.com>

Hi Alex,

On Wed, Oct 26, 2016 at 7:09 AM, Alexandre Torgue
<alexandre.torgue@st.com> wrote:
> Hi Bruno,
>
> On 10/25/2016 11:06 PM, Bruno Herrera wrote:
>>
>> Hi Alexandre,
>>
>>>
>>> stm32f469-disco and stm32f429-eval boards use SDRAM start address
>>> remapping
>>> (to @0) to boost performances. A DMA translation through "dma-ranges"
>>> property was needed for other masters than the M4 CPU.
>>> stm32f429-disco doesn't use remapping so doesn't need this DMA
>>> translation.
>>> This patches moves this DMA translation definition from stm32f429 soc
>>> file
>>> to board files.
>>>
>>> Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
>>>
>>> diff --git a/arch/arm/boot/dts/stm32429i-eval.dts
>>> b/arch/arm/boot/dts/stm32429i-eval.dts
>>> index 13c7cd2..a763c15 100644
>>> --- a/arch/arm/boot/dts/stm32429i-eval.dts
>>> +++ b/arch/arm/boot/dts/stm32429i-eval.dts
>>> @@ -82,6 +82,10 @@
>>>                 };
>>>         };
>>>
>>> +       soc {
>>> +               dma-ranges = <0xc0000000 0x0 0x10000000>;
>>> +       };
>>> +
>>>         usbotg_hs_phy: usbphy {
>>>                 #phy-cells = <0>;
>>>                 compatible = "usb-nop-xceiv";
>>
>>
>> Shouldn't also the peripheral dma-ranges property move to board specific
>> too?
>> I  had this patch for while but I didn't had the time to submit:
>
>
> Well spot I forgot it. Actually, discussing with Arnd ysterday on IIRC,
> empty dma-ranges is not needed. Can you test on your side by removing
> dma-ranges in usb node please ?
Unfortunately will take a time for me to set up this environment on
the STM32F4-EVAL board.
And on the discovery boards we dont have this scenario. That was the
main reason I did not submit the patch right away.
My conclusion and I might be wrong but is based on the my tests with
SDIO device at STM32F469I-DISCO board.

I started this issue as discussion at ST Forum but Maxime gave me the hint.

https://my.st.com/public/STe2ecommunities/mcu/Lists/cortex_mx_stm32/Flat.aspx?RootFolder=https%3a%2f%2fmy%2est%2ecom%2fpublic%2fSTe2ecommunities%2fmcu%2fLists%2fcortex_mx_stm32%2fDMA2%20and%20SYSCFG_MEMRMP%20relationship&FolderCTID=0x01200200770978C69A1141439FE559EB459D7580009C4E14902C3CDE46A77F0FFD06506F5B&currentviews=44

> I will push a v2 by removing empty dma-ranges if tests are ok in your side.

>From my understating/conclusion is: when empty property(dma-ranges) is
the device node, the mapping will be taken in consideration when using
DMA otherwise the mapping is ignored.
And in the SDIO case it is needed for DEV->MEM(SDRAM) and
MEM(SDRAM)->DEV. If it is not the case for the devices in question so
I suppose it can work without the property.

>
> Thanks in advance
> Alex
>
>
>>
>> Author: Bruno Herrera <bruherrera@gmail.com>
>> Date:   Sun Oct 16 14:50:00 2016 -0200
>>
>>     ARM: DT: STM32: Use dma-ranges property per board not at dtsi file
>>
>> diff --git a/arch/arm/boot/dts/stm32429i-eval.dts
>> b/arch/arm/boot/dts/stm32429i-eval.dts
>> index 6bfc595..2a22a82 100644
>> --- a/arch/arm/boot/dts/stm32429i-eval.dts
>> +++ b/arch/arm/boot/dts/stm32429i-eval.dts
>> @@ -52,6 +52,10 @@
>>         model = "STMicroelectronics STM32429i-EVAL board";
>>         compatible = "st,stm32429i-eval", "st,stm32f429";
>>
>> +       soc {
>> +               dma-ranges = <0xC0000000 0x0 0x10000000>;
>> +       };
>> +
>>         chosen {
>>                 bootargs = "root=/dev/ram rdinit=/linuxrc";
>>                 stdout-path = "serial0:115200n8";
>> @@ -96,6 +100,7 @@
>>
>>  &ethernet0 {
>>         status = "okay";
>> +       dma-ranges;
>>         pinctrl-0       = <&ethernet0_mii>;
>>         pinctrl-names   = "default";
>>         phy-mode        = "mii-id";
>> @@ -116,6 +121,7 @@
>>  };
>>
>>  &usbotg_hs {
>> +       dma-ranges;
>>         dr_mode = "host";
>>         phys = <&usbotg_hs_phy>;
>>         phy-names = "usb2-phy";
>> diff --git a/arch/arm/boot/dts/stm32f429.dtsi
>> b/arch/arm/boot/dts/stm32f429.dtsi
>> index 7d624a2..697a133 100644
>> --- a/arch/arm/boot/dts/stm32f429.dtsi
>> +++ b/arch/arm/boot/dts/stm32f429.dtsi
>> @@ -59,7 +59,6 @@
>>         };
>>
>>         soc {
>> -               dma-ranges = <0xc0000000 0x0 0x10000000>;
>>
>>                 timer2: timer at 40000000 {
>>                         compatible = "st,stm32-timer";
>> @@ -472,13 +471,11 @@
>>                         st,syscon = <&syscfg 0x4>;
>>                         snps,pbl = <8>;
>>                         snps,mixed-burst;
>> -                       dma-ranges;
>>                         status = "disabled";
>>                 };
>>
>>                 usbotg_hs: usb at 40040000 {
>>                         compatible = "snps,dwc2";
>> -                       dma-ranges;
>>                         reg = <0x40040000 0x40000>;
>>                         interrupts = <77>;
>>                         clocks = <&rcc 0 29>;
>>
>>
>>> diff --git a/arch/arm/boot/dts/stm32f429.dtsi
>>> b/arch/arm/boot/dts/stm32f429.dtsi
>>> index 0596d60..3a1cfdd 100644
>>> --- a/arch/arm/boot/dts/stm32f429.dtsi
>>> +++ b/arch/arm/boot/dts/stm32f429.dtsi
>>> @@ -59,8 +59,6 @@
>>>         };
>>>
>>>         soc {
>>> -               dma-ranges = <0xc0000000 0x0 0x10000000>;
>>> -
>>>                 timer2: timer at 40000000 {
>>>                         compatible = "st,stm32-timer";
>>>                         reg = <0x40000000 0x400>;
>>> diff --git a/arch/arm/boot/dts/stm32f469-disco.dts
>>> b/arch/arm/boot/dts/stm32f469-disco.dts
>>> index 9e73656..c2213c0 100644
>>> --- a/arch/arm/boot/dts/stm32f469-disco.dts
>>> +++ b/arch/arm/boot/dts/stm32f469-disco.dts
>>> @@ -64,6 +64,10 @@
>>>         aliases {
>>>                 serial0 = &usart3;
>>>         };
>>> +
>>> +       soc {
>>> +               dma-ranges = <0xc0000000 0x0 0x10000000>;
>>> +       };
>>>  };
>>>
>>>  &clk_hse {
>>> --
>>
>>
>>
>> Br.,
>> Bruno
>>
>

^ permalink raw reply

* [PATCH V3 4/8] drivers: platform: Configure dma operations at probe time
From: Lorenzo Pieralisi @ 2016-10-27 10:49 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <003d01d22f9a$56477b80$02d67280$@codeaurora.org>

On Wed, Oct 26, 2016 at 08:34:59PM +0530, Sricharan wrote:
> Hi Robin,
> 
> >On 04/10/16 18:03, Sricharan R wrote:
> >> Configuring DMA ops at probe time will allow deferring device probe when
> >> the IOMMU isn't available yet. The dma_configure for the device is now called
> >> from the generic device_attach callback just before the bus/driver probe
> >> is called. This way, configuring the dma ops for the device would be called
> >> at the same place for all bus_types, hence the deferred probing mechanism
> >> should work for all buses as well.
> >>
> >> pci_bus_add_devices    (platform/amba)(_device_create/driver_register)
> >>        |                         |
> >> pci_bus_add_device     (device_add/driver_register)
> >>        |                         |
> >> device_attach           device_initial_probe
> >>        |                         |
> >> __device_attach_driver    __device_attach_driver
> >>        |
> >> driver_probe_device
> >>        |
> >> really_probe
> >>        |
> >> dma_configure
> >>
> >>  Similarly on the device/driver_unregister path __device_release_driver is
> >>  called which inturn calls dma_deconfigure.
> >>
> >> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> >> ---
> >>  drivers/base/dd.c           | 10 ++++++++++
> >>  drivers/base/dma-mapping.c  | 11 +++++++++++
> >>  drivers/of/platform.c       |  4 ----
> >>  drivers/pci/probe.c         |  5 +----
> >>  include/linux/dma-mapping.h |  3 +++
> >>  5 files changed, 25 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> >> index 16688f5..cfebd48 100644
> >> --- a/drivers/base/dd.c
> >> +++ b/drivers/base/dd.c
> >> @@ -19,6 +19,7 @@
> >>
> >>  #include <linux/device.h>
> >>  #include <linux/delay.h>
> >> +#include <linux/dma-mapping.h>
> >>  #include <linux/module.h>
> >>  #include <linux/kthread.h>
> >>  #include <linux/wait.h>
> >> @@ -353,6 +354,10 @@ static int really_probe(struct device *dev, struct device_driver *drv)
> >>  	if (ret)
> >>  		goto pinctrl_bind_failed;
> >>
> >> +	ret = dma_configure(dev);
> >> +	if (ret)
> >> +		goto dma_failed;
> >> +
> >>  	if (driver_sysfs_add(dev)) {
> >>  		printk(KERN_ERR "%s: driver_sysfs_add(%s) failed\n",
> >>  			__func__, dev_name(dev));
> >> @@ -395,6 +400,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
> >>  	goto done;
> >>
> >>  probe_failed:
> >> +	dma_deconfigure(dev);
> >> +dma_failed:
> >>  	if (dev->bus)
> >>  		blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
> >>  					     BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
> >> @@ -780,6 +787,9 @@ static void __device_release_driver(struct device *dev)
> >>  			dev->bus->remove(dev);
> >>  		else if (drv->remove)
> >>  			drv->remove(dev);
> >> +
> >> +		dma_deconfigure(dev);
> >> +
> >>  		devres_release_all(dev);
> >>  		dev->driver = NULL;
> >>  		dev_set_drvdata(dev, NULL);
> >> diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
> >> index d799662..54e87f5 100644
> >> --- a/drivers/base/dma-mapping.c
> >> +++ b/drivers/base/dma-mapping.c
> >> @@ -10,6 +10,7 @@
> >>  #include <linux/dma-mapping.h>
> >>  #include <linux/export.h>
> >>  #include <linux/gfp.h>
> >> +#include <linux/of_device.h>
> >>  #include <linux/slab.h>
> >>  #include <linux/vmalloc.h>
> >>
> >> @@ -166,6 +167,16 @@ void dmam_free_noncoherent(struct device *dev, size_t size, void *vaddr,
> >>  }
> >>  EXPORT_SYMBOL(dmam_free_noncoherent);
> >>
> >> +int dma_configure(struct device *dev)
> >> +{
> >> +	return of_dma_configure(dev, dev->of_node);
> >> +}
> >> +
> >> +void dma_deconfigure(struct device *dev)
> >> +{
> >> +	of_dma_deconfigure(dev);
> >> +}
> >> +
> >>  #ifdef CONFIG_HAVE_GENERIC_DMA_COHERENT
> >>
> >>  static void dmam_coherent_decl_release(struct device *dev, void *res)
> >> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> >> index 9cb7090..adbd77c 100644
> >> --- a/drivers/of/platform.c
> >> +++ b/drivers/of/platform.c
> >> @@ -181,11 +181,9 @@ static struct platform_device *of_platform_device_create_pdata(
> >>
> >>  	dev->dev.bus = &platform_bus_type;
> >>  	dev->dev.platform_data = platform_data;
> >> -	of_dma_configure(&dev->dev, dev->dev.of_node);
> >>  	of_msi_configure(&dev->dev, dev->dev.of_node);
> >>
> >>  	if (of_device_add(dev) != 0) {
> >> -		of_dma_deconfigure(&dev->dev);
> >>  		platform_device_put(dev);
> >>  		goto err_clear_flag;
> >>  	}
> >> @@ -242,7 +240,6 @@ static struct amba_device *of_amba_device_create(struct device_node *node,
> >>  		dev_set_name(&dev->dev, "%s", bus_id);
> >>  	else
> >>  		of_device_make_bus_id(&dev->dev);
> >> -	of_dma_configure(&dev->dev, dev->dev.of_node);
> >>
> >>  	/* Allow the HW Peripheral ID to be overridden */
> >>  	prop = of_get_property(node, "arm,primecell-periphid", NULL);
> >> @@ -536,7 +533,6 @@ static int of_platform_device_destroy(struct device *dev, void *data)
> >>  		amba_device_unregister(to_amba_device(dev));
> >>  #endif
> >>
> >> -	of_dma_deconfigure(dev);
> >>  	of_node_clear_flag(dev->of_node, OF_POPULATED);
> >>  	of_node_clear_flag(dev->of_node, OF_POPULATED_BUS);
> >>  	return 0;
> >> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> >> index 93f280d..85c9553 100644
> >> --- a/drivers/pci/probe.c
> >> +++ b/drivers/pci/probe.c
> >> @@ -1724,10 +1724,7 @@ static void pci_dma_configure(struct pci_dev *dev)
> >>  {
> >>  	struct device *bridge = pci_get_host_bridge_device(dev);
> >>
> >> -	if (IS_ENABLED(CONFIG_OF) &&
> >> -		bridge->parent && bridge->parent->of_node) {
> >> -			of_dma_configure(&dev->dev, bridge->parent->of_node);
> >> -	} else if (has_acpi_companion(bridge)) {
> >> +	if (has_acpi_companion(bridge)) {
> >>  		struct acpi_device *adev = to_acpi_device_node(bridge->fwnode);
> >>  		enum dev_dma_attr attr = acpi_get_dma_attr(adev);
> >
> >It seems a bit awkward leaving pci_dma_configure here, doing DMA
> >configuration at device add, after we've allegedly moved DMA
> >configuration to driver probe. Lorenzo, do you foresee any issues if
> >this probe-time of_dma_configure() path were to also multiplex
> >acpi_dma_configure() in future, such that everything would be back in
> >the same place eventually?
> >
> >Conversely, is there actually any issue with leaving pci_dma_configure()
> >unchanged, and simply moving the call from pci_device_add() into
> >dma_configure()?
> 
> Ya i removed only the CONFIG_OF part out of this, and was hoping that
> the acpi configure can also be called from the dma_configure during
> probe later. I did not have an ACPI based platform though.  As you
> said, if that looks right to the ACPI world, it can be moved out from
> here. I can try mergin series (after fixing the vfio errors part) with
> the ACPI IO port and see how it behaves.

I do not expect any issue from this change, as long as, as Robin said,
it is done consistently which means that I am not ok with this patch
as it stands (ie you should defer pci_dma_configure() in its entirety,
not just the DT bits, which obviously requires some ACPI testing too).

CC me in please in the next posting since this affects the ACPI probing
path too, we have to coordinate this series with my ACPI IORT SMMU
enablement patches:

https://lkml.org/lkml/2016/10/18/506

Thanks,
Lorenzo

^ permalink raw reply

* [PATCH] arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU
From: Mark Rutland @ 2016-10-27 10:51 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161027100428.GA17829@cbox>

Hi Christoffer,

On Thu, Oct 27, 2016 at 12:04:28PM +0200, Christoffer Dall wrote:
> On Thu, Oct 27, 2016 at 10:49:00AM +0100, Marc Zyngier wrote:
> > The guest wouldn't have to do any invalidation at all on real HW,
> > because the TLBs are strictly private to a physical CPU (only the
> > invalidation can be broadcast to the Inner Shareable domain). But when
> > we multiplex two vcpus on the same physical CPU, we break the private
> > semantics, and a vcpu could hit in the TLB entries populated by the
> > another one.
> 
> Such a guest would be using a mapping of the same VA with the same ASID
> on two separate CPUs, each pointing to a separate PA.  If it ever were
> to, say, migrate a task, it would have to do invalidations then.  Right?

An OS (not Linux) could use a different ASID space per-cpu.

e.g. with two single-threaded tasks A and B, you could have ASIDS:

	cpu0	cpu1
A	0	1
B	1	0

... and this would not be a problem, so long as when mappings changed
maintenance were performed appropriately (e.g. perhaps it uses IPIs to
trigger the relevant local TLB invlidation, rather than using broadcast
ops).

> Does Linux or other guests actually do this?

Linux currently doesn't use ASIDs that way, but does use global mappings
in a potentially-confliciting way in the cold-return paths (hotplug-on
and return from idle). With two vCPUs, you could have a sequence like:

	cpu0				cpu1
	Task with ASID x started
					hotplug on
					install global TTBR0 mapping
					global entry allocated into TLB
	Task hits cpu1's global entry

... which cannot happen bare-metal, and there's no point at which the
guest can perform suitable maintenance.

> Another fix would be to allocate a VMID per VCPU I suppose, just to
> introduce a terrible TLB hit ratio :)

That would break broadcast invalidation within the guest, no?

... unless you also trapped all TLB maintenance, and did the IPI-based
broadcast in SW.

Thanks,
Mark.

^ permalink raw reply

* [PATCH v2 3/4] arm64: arch_timer: Work around Erratum Hisilicon-161601
From: Marc Zyngier @ 2016-10-27 10:58 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1477553651-13428-3-git-send-email-dingtianhong@huawei.com>

On 27/10/16 08:34, Ding Tianhong wrote:
> Erratum Hisilicon-161601 says that the ARM generic timer counter "has the
> potential to contain an erroneous value when the timer value changes".
> Accesses to TVAL (both read and write) are also affected due to the implicit counter
> read.  Accesses to CVAL are not affected.
> 
> The workaround is to reread the system count registers until the value of the second
> read is larger than the first one by less than 32, the system counter can be guaranteed
> not to return wrong value twice by back-to-back read and the error value is always larger
> than the correct one by 32. Writes to TVAL are replaced with an equivalent write to CVAL.
> 
> The workaround is enabled if the hisilicon,erratum-161601 property is found in
> the timer node in the device tree.  This can be overridden with the
> clocksource.arm_arch_timer.hisilicon-161601 boot parameter, which allows KVM
> users to enable the workaround until a mechanism is implemented to
> automatically communicate this information.
> 
> Fix some description for fsl erratum a008585.
> 
> v2: Significant rework based on feedback, including seperate the fsl erratum a008585
>     to another patch, update the erratum name and remove unwanted code.
> 
> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
> ---
>  Documentation/arm64/silicon-errata.txt |  1 +
>  Documentation/kernel-parameters.txt    |  9 ++++
>  arch/arm64/include/asm/arch_timer.h    | 28 ++++++++++-
>  drivers/clocksource/Kconfig            | 14 +++++-
>  drivers/clocksource/arm_arch_timer.c   | 88 ++++++++++++++++++++++++++--------
>  5 files changed, 118 insertions(+), 22 deletions(-)
> 
> diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
> index 405da11..70c5d5e 100644
> --- a/Documentation/arm64/silicon-errata.txt
> +++ b/Documentation/arm64/silicon-errata.txt
> @@ -63,3 +63,4 @@ stable kernels.
>  | Cavium         | ThunderX SMMUv2 | #27704          | N/A		       |
>  |                |                 |                 |                         |
>  | Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585     |
> +| Hisilicon      | Hip05/Hip06/Hip07 | #161601       | HISILICON_ERRATUM_161601|

I've already commented on the alignment. Please read my initial review.

> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 6fa1d8a..735b4b6 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -707,6 +707,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  			erratum.  If unspecified, the workaround is
>  			enabled based on the device tree.
>  
> +	clocksource.arm_arch_timer.hisilicon-161601=
> +			[ARM64]
> +			Format: <bool>
> +			Enable/disable the workaround of Hisilicon
> +			erratum 161601.  This can be useful for KVM
> +			guests, if the guest device tree doesn't show the
> +			erratum.  If unspecified, the workaround is
> +			enabled based on the device tree.
> +
>  	clearcpuid=BITNUM [X86]
>  			Disable CPUID feature X for the kernel. See
>  			arch/x86/include/asm/cpufeatures.h for the valid bit
> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
> index 118719d8..49b3041 100644
> --- a/arch/arm64/include/asm/arch_timer.h
> +++ b/arch/arm64/include/asm/arch_timer.h
> @@ -29,7 +29,7 @@
>  
>  #include <clocksource/arm_arch_timer.h>
>  
> -#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585)
> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>  extern struct static_key_false arch_timer_read_ool_enabled;
>  #define needs_timer_erratum_workaround() \
>  	static_branch_unlikely(&arch_timer_read_ool_enabled)
> @@ -65,11 +65,35 @@ extern struct arch_timer_erratum_workaround *erratum_workaround;
>  	_new;						\
>  })
>  
> +
> +
> +/*
> + * The number of retries is an arbitrary value well beyond the highest number
> + * of iterations the loop has been observed to take.
> + * Verify whether the value of the second read is larger than the first by
> + * less than 32 is the only way to confirm the value is correct, the system
> + * counter can be guaranteed not to return wrong value twice by back-to-back read
> + * and the error value is always larger than the correct one by 32.
> + */
> +#define __hisi_161601_read_reg(reg) ({				\
> +	u64 _old, _new;						\
> +	int _retries = 200;					\

Please document how this value was found (either in the code or in the
commit message).

> +								\
> +	do {							\
> +		_old = read_sysreg(reg);			\
> +		_new = read_sysreg(reg);			\
> +		_retries--;					\
> +	} while (unlikely((_new - _old) >> 5) && _retries);	\
> +								\
> +	WARN_ON_ONCE(!_retries);				\
> +	_new;							\
> +})

Same remark as in the previous patch.

> +
>  #define arch_timer_reg_read_stable(reg) 		\
>  ({							\
>  	u64 _val;					\
>  	if (needs_timer_erratum_workaround())		\
> -		_val = erratum_workaround->read_##reg();	\
> +		_val = erratum_workaround->read_##reg();\
>  	else						\
>  		_val = read_sysreg(reg);		\
>  	_val;						\
> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
> index 8a753fd..4aafb6a 100644
> --- a/drivers/clocksource/Kconfig
> +++ b/drivers/clocksource/Kconfig
> @@ -312,8 +312,20 @@ config FSL_ERRATUM_A008585
>  	help
>  	  This option enables a workaround for Freescale/NXP Erratum
>  	  A-008585 ("ARM generic timer may contain an erroneous
> -	  value").  The workaround will only be active if the
> +	  value").  The workaround will be active if the
>  	  fsl,erratum-a008585 property is found in the timer node.
> +	  This can be overridden with the clocksource.arm_arch_timer.fsl-a008585
> +	  boot parameter.

Move this hunk to the previous patch.

> +
> +config HISILICON_ERRATUM_161601
> +	bool "Workaround for Hisilicon Erratum 161601"
> +	default y
> +	depends on ARM_ARCH_TIMER && ARM64
> +	help
> +	  This option enables a workaround for Hisilicon Erratum
> +	  161601. The workaround will be active if the hisilicon,erratum-161601
> +	  property is found in the timer node. This can be overridden with
> +	  the clocksource.arm_arch_timer.hisilicon-161601 boot parameter.
>  
>  config ARM_GLOBAL_TIMER
>  	bool "Support for the ARM global timer" if COMPILE_TEST
> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> index e4f7fa1..89f1895 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -94,13 +94,14 @@ early_param("clocksource.arm_arch_timer.evtstrm", early_evtstrm_cfg);
>   * Architected system timer support.
>   */
>  
> -#ifdef CONFIG_FSL_ERRATUM_A008585
> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>  struct arch_timer_erratum_workaround *erratum_workaround = NULL;
>  
>  DEFINE_STATIC_KEY_FALSE(arch_timer_read_ool_enabled);
>  EXPORT_SYMBOL_GPL(arch_timer_read_ool_enabled);
> +#endif
>  
> -
> +#ifdef CONFIG_FSL_ERRATUM_A008585
>  static u32 fsl_a008585_read_cntp_tval_el0(void)
>  {
>  	return __fsl_a008585_read_reg(cntp_tval_el0);
> @@ -139,6 +140,45 @@ static int __init early_fsl_a008585_cfg(char *buf)
>  early_param("clocksource.arm_arch_timer.fsl-a008585", early_fsl_a008585_cfg);
>  #endif /* CONFIG_FSL_ERRATUM_A008585 */
>  
> +#ifdef CONFIG_HISILICON_ERRATUM_161601
> +static u32 hisi_161601_read_cntp_tval_el0(void)
> +{
> +	return __hisi_161601_read_reg(cntp_tval_el0);
> +}
> +
> +static  u32 hisi_161601_read_cntv_tval_el0(void)
> +{
> +	return __hisi_161601_read_reg(cntv_tval_el0);
> +}
> +
> +static u64 hisi_161601_read_cntvct_el0(void)
> +{
> +	return __hisi_161601_read_reg(cntvct_el0);
> +}
> +
> +static struct arch_timer_erratum_workaround arch_timer_hisi_161601 = {
> +	.read_cntp_tval_el0 = hisi_161601_read_cntp_tval_el0,
> +	.read_cntv_tval_el0 = hisi_161601_read_cntv_tval_el0,
> +	.read_cntvct_el0 = hisi_161601_read_cntvct_el0,
> +};
> +
> +static int __init early_hisi_161601_cfg(char *buf)
> +{
> +	int ret;
> +	bool val;
> +
> +	ret = strtobool(buf, &val);
> +	if (ret)
> +		return ret;
> +
> +	if (val)
> +		erratum_workaround = &arch_timer_hisi_161601;
> +
> +	return 0;
> +}
> +early_param("clocksource.arm_arch_timer.hisilicon-161601", early_hisi_161601_cfg);
> +#endif /* CONFIG_HISILICON_ERRATUM_161601 */
> +
>  static __always_inline
>  void arch_timer_reg_write(int access, enum arch_timer_reg reg, u32 val,
>  			  struct clock_event_device *clk)
> @@ -288,8 +328,8 @@ static __always_inline void set_next_event(const int access, unsigned long evt,
>  	arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl, clk);
>  }
>  
> -#ifdef CONFIG_FSL_ERRATUM_A008585
> -static __always_inline void fsl_a008585_set_next_event(const int access,
> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
> +static __always_inline void erratum_set_next_event(const int access,
>  		unsigned long evt, struct clock_event_device *clk)
>  {
>  	unsigned long ctrl;
> @@ -307,20 +347,20 @@ static __always_inline void fsl_a008585_set_next_event(const int access,
>  	arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl, clk);
>  }
>  
> -static int fsl_a008585_set_next_event_virt(unsigned long evt,
> +static int erratum_set_next_event_virt(unsigned long evt,
>  					   struct clock_event_device *clk)
>  {
> -	fsl_a008585_set_next_event(ARCH_TIMER_VIRT_ACCESS, evt, clk);
> +	erratum_set_next_event(ARCH_TIMER_VIRT_ACCESS, evt, clk);
>  	return 0;
>  }
>  
> -static int fsl_a008585_set_next_event_phys(unsigned long evt,
> +static int erratum_set_next_event_phys(unsigned long evt,
>  					   struct clock_event_device *clk)
>  {
> -	fsl_a008585_set_next_event(ARCH_TIMER_PHYS_ACCESS, evt, clk);
> +	erratum_set_next_event(ARCH_TIMER_PHYS_ACCESS, evt, clk);
>  	return 0;
>  }
> -#endif /* CONFIG_FSL_ERRATUM_A008585 */
> +#endif
>  
>  static int arch_timer_set_next_event_virt(unsigned long evt,
>  					  struct clock_event_device *clk)
> @@ -350,16 +390,16 @@ static int arch_timer_set_next_event_phys_mem(unsigned long evt,
>  	return 0;
>  }
>  
> -static void fsl_a008585_set_sne(struct clock_event_device *clk)
> +static void erratum_set_sne(struct clock_event_device *clk)
>  {
> -#ifdef CONFIG_FSL_ERRATUM_A008585
> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>  	if (!static_branch_unlikely(&arch_timer_read_ool_enabled))
>  		return;
>  
>  	if (arch_timer_uses_ppi == VIRT_PPI)
> -		clk->set_next_event = fsl_a008585_set_next_event_virt;
> +		clk->set_next_event = erratum_set_next_event_virt;
>  	else
> -		clk->set_next_event = fsl_a008585_set_next_event_phys;
> +		clk->set_next_event = erratum_set_next_event_phys;
>  #endif

This should be in the previous patch as well, as it only messes with the
FSL erratum.

>  }
>  
> @@ -392,7 +432,7 @@ static void __arch_timer_setup(unsigned type,
>  			BUG();
>  		}
>  
> -		fsl_a008585_set_sne(clk);
> +		erratum_set_sne(clk);
>  	} else {
>  		clk->features |= CLOCK_EVT_FEAT_DYNIRQ;
>  		clk->name = "arch_mem_timer";
> @@ -612,7 +652,7 @@ static void __init arch_counter_register(unsigned type)
>  
>  		clocksource_counter.archdata.vdso_direct = true;
>  
> -#ifdef CONFIG_FSL_ERRATUM_A008585
> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>  		/*
>  		 * Don't use the vdso fastpath if errata require using
>  		 * the out-of-line counter accessor.
> @@ -899,12 +939,22 @@ static int __init arch_timer_of_init(struct device_node *np)
>  	arch_timer_c3stop = !of_property_read_bool(np, "always-on");
>  
>  #ifdef CONFIG_FSL_ERRATUM_A008585
> -	if (!erratum_workaround && of_property_read_bool(np, "fsl,erratum-a008585"))
> +	if (!erratum_workaround && of_property_read_bool(np, "fsl,erratum-a008585")) {
>  		erratum_workaround = &arch_timer_fsl_a008585;
> +		if (erratum_workaround) {

Brilliant!

> +			static_branch_enable(&arch_timer_read_ool_enabled);
> +			pr_info("Enabling workaround for FSL erratum A-008585\n");
> +		}
> +	}
> +#endif
>  
> -	if (erratum_workaround) {
> -		static_branch_enable(&arch_timer_read_ool_enabled);
> -		pr_info("Enabling workaround for FSL erratum A-008585\n");
> +#ifdef CONFIG_HISILICON_ERRATUM_161601
> +	if (!erratum_workaround && of_property_read_bool(np, "hisilicon,erratum-161601")) {
> +		erratum_workaround = &arch_timer_hisi_161601;
> +		if (erratum_workaround) {
> +			static_branch_enable(&arch_timer_read_ool_enabled);
> +			pr_info("Enabling workaround for HISILICON erratum 161601\n");
> +		}
>  	}
>  #endif

Do you see a pattern here? Surely you can factor a bit of that code (and
remove the nonsensical bits).

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* [PATCH v7] spi: sun4i: Allow transfers larger than FIFO size
From: Mark Brown @ 2016-10-27 11:14 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161026085528.vjjypbfmfbhvileb@lukather>

On Wed, Oct 26, 2016 at 10:55:28AM +0200, Maxime Ripard wrote:
> On Wed, Oct 26, 2016 at 12:00:30AM -0700, Alexandru Gagniuc wrote:

> > When DMA finally takes over, this fallback path is not mutually exclusive.

> I definitely agree, and we had this patch in the CHIP kernel for quite
> some time, working like a charm.

> I was planning to respin it in the next few days, glad to see you took
> care of it :)

> Mark, any comments on this? For the record, it already has my Acked-by.

Without knowing what the previous discussion was it's hard to comment,
it sounds like some prior review comments are just being ignored here
but since I'm not turning up anything with this subject line I've no
idea what that might have been (and that's very concerning in itself
given that this is apparently v7...).  I'm also concerned that there
isn't a version of this for sun6i, it's going to make all the
cut'n'pasting between the two drivers harder if we make changes in one
and not the other.

If the concern from the previous reviews to do with not using DMA is
there some reason it's hard to do DMA?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 473 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20161027/279cec73/attachment.sig>

^ permalink raw reply

* [PATCH 5/5] ARM: dts: Add LEGO MINDSTORTMS EV3 dts
From: Sekhar Nori @ 2016-10-27 11:18 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <38a3a10c-a269-6a9f-305c-7d38ec9b839b@lechnology.com>

On Thursday 27 October 2016 07:00 AM, David Lechner wrote:
> On 10/24/2016 06:58 AM, Sekhar Nori wrote:
>> On Saturday 22 October 2016 12:06 AM, David Lechner wrote:
>>> This adds a device tree definition file for LEGO MINDSTORMS EV3.
>>
>> Thanks for the patch!
>>
>>>
>>> What is working:
>>>
>>> * Pin muxing
>>> * MicroSD card reader
>>> * UART on input port 1
>>>
>>> What is partially working:
>>>
>>> * Buttons - working after GPIO fix
>>> * LEDs - working after GPIO fix
>>> * Poweroff/reset - working after GPIO fix
>>
>> Is the GPIO fix something that will go in v4.9-rc cycle ?
>>
> 
> FYI, the fix is in linux-gpio/fixes now.

Thanks, I added the patch to my 'acked' branch which has stuff useful
for testing on davinci but not going through my tree. It gets merged
into my master branch so that branch can be used for all mach-davinci
development.

Thanks,
Sekhar

^ permalink raw reply

* [PATCH] drivers: mfd: ti_am335x_tscadc: increase ADC ref clock to 24MHz
From: Mugunthan V N @ 2016-10-27 11:20 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161024060226.4170-1-mugunthanvnm@ti.com>

On Monday 24 October 2016 11:32 AM, Mugunthan V N wrote:
> Increase ADC reference clock from 3MHz to 24MHz so that the
> sampling rates goes up from 100K samples per second to 800K
> samples per second on AM335x and AM437x SoC.
> 
> Also increase opendelay for touchscreen configuration to
> equalize the increase in ADC reference clock frequency,
> which results in the same amount touch events reported via
> evtest on AM335x GP EVM.
> 
> Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>

Found issues with touchscreen usage with GUI on my AM335x GP EVM. Fixed
with change the "ti,charge-delay" DT property based on change in ADC
clocks. Will be sending v2 patch soon.

Regards
Mugunthan V N

^ permalink raw reply

* [PATCH 4/5] ARM: davinci: enable LEDs default-on trigger in default config
From: Sekhar Nori @ 2016-10-27 11:29 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1477075018-20176-5-git-send-email-david@lechnology.com>

On Saturday 22 October 2016 12:06 AM, David Lechner wrote:
> The LEDs default-on trigger is nice to have. For example, it can be used
> to configure a LED as a power indicator.
> 
> Signed-off-by: David Lechner <david@lechnology.com>
> ---
>  arch/arm/configs/davinci_all_defconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/configs/davinci_all_defconfig b/arch/arm/configs/davinci_all_defconfig
> index 9d7f0bc..e380743 100644
> --- a/arch/arm/configs/davinci_all_defconfig
> +++ b/arch/arm/configs/davinci_all_defconfig
> @@ -181,6 +181,7 @@ CONFIG_LEDS_GPIO=m
>  CONFIG_LEDS_TRIGGERS=y
>  CONFIG_LEDS_TRIGGER_TIMER=m
>  CONFIG_LEDS_TRIGGER_HEARTBEAT=m
> +CONFIG_LEDS_TRIGGER_DEFAULT_ON=y

Can this be module like rest of the triggers? It will come on later, but
not sure if you care about the difference that much. Having it a module
will be better for those boards that don't need it.

Thanks,
Sekhar

^ permalink raw reply

* [PATCH] ASoC: hdmi-codec: Fix hdmi_of_xlate_dai_name when #sound-dai-cells = <0>
From: Jon Medhurst (Tixy) @ 2016-10-27 11:42 UTC (permalink / raw)
  To: linux-arm-kernel

If a DAI specifies "#sound-dai-cells = <0>" in device-tree then
hdmi_of_xlate_dai_name() will be called with zero args, which it isn't
implemented to cope with. The resulting use of an uninitialised variable
for the id will usually result in an error like:

  asoc-simple-card sound: parse error -11
  asoc-simple-card: probe of sound failed with error -11

Fix this by using and id of zero if no arg is provided.

Fixes: 9731f82d6016 ("ASoC: hdmi-codec: enable multi probe for same device")

Signed-off-by: Jon Medhurst <tixy@linaro.org>
---

Jyri, FYI, I believe this issue broke your commit df0bd1e8f3c508 ("ARM: dts:
am335x-boneblack: Add HDMI audio support")

 sound/soc/codecs/hdmi-codec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/codecs/hdmi-codec.c b/sound/soc/codecs/hdmi-codec.c
index b904492..6661ace 100644
--- a/sound/soc/codecs/hdmi-codec.c
+++ b/sound/soc/codecs/hdmi-codec.c
@@ -364,7 +364,7 @@ static int hdmi_of_xlate_dai_name(struct snd_soc_component *component,
 				  struct of_phandle_args *args,
 				  const char **dai_name)
 {
-	int id = args->args[0];
+	int id = args->args_count ? args->args[0] : 0;
 
 	if (id < ARRAY_SIZE(hdmi_dai_name)) {
 		*dai_name = hdmi_dai_name[id];
-- 
2.1.4

^ permalink raw reply related

* [PATCH v3] ARM: davinci: da8xx: Fix some redefined symbol warnings
From: Sekhar Nori @ 2016-10-27 11:43 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1477485542-10555-2-git-send-email-abailon@baylibre.com>

On Wednesday 26 October 2016 06:09 PM, Alexandre Bailon wrote:
> Some macro for DA8xx CFGCHIP are defined in usb-davinci.h,
> but da8xx-cfgchip.h intend to replace them.
> The usb-da8xx.c is using both headers, causing redefined symbol warnings.

Looks like this is not true for v4.9-rc2 and so I don't see any warnings
in v4.9-rc2. This is probably introduced due to some changes you are
doing for v4.10 and so it is not a v4.9-rc candidate.

> Remove the macro and update the board files to use da8xx-cfgchip.h

This patch should probably be folded into the patch which actually
introduces da8xx-cfgchip.h into usb-da8xx.c

> Signed-off-by: Alexandre Bailon <abailon@baylibre.com>

Thanks,
Sekhar

^ permalink raw reply

* Pinctrl nodes missing for USB
From: Anand Moon @ 2016-10-27 11:49 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161026175827.GA10224@kozik-lap>

Hi Krzysztof,

On 26 October 2016 at 23:28, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> On Wed, Oct 26, 2016 at 11:15:51PM +0530, Anand Moon wrote:
>> Hi Krzysztof
>>
>> On 26 October 2016 at 22:23, Krzysztof Kozlowski <krzk@kernel.org> wrote:
>> > On Wed, Oct 26, 2016 at 05:56:54PM +0530, Anand Moon wrote:
>> >> Hi All,
>> >>
>> >> I have tried to enable CONFIG_DEBUG_PINCTRL=y on Odroid XU4.
>> >> Just to try to understand the feature.
>> >> Is this feature suppoted for USB nodes.
>> >>
>> >> Below is the output of failed to pase pinctrl for USB nodes via dts.
>> >
>> > I do not see any question here...
>> >
>> > Anyway the devices not instantiated from DT will have such warning and
>> > USB devices are not present in DT, from obvious reasons... However what
>> > surprises me is why pinctrl_dt_to_map() was called for USB devices?
>> >
>> > Best regards,
>> > Krzysztof
>> >
>> [snip]
>>
>> Sorry. I was just referring HK odroidxu3 dts for reference for dwc3 controller.
>>
>> https://github.com/hardkernel/linux/blob/odroidxu3-3.10.y/arch/arm/boot/dts/exynos5422-odroidxu3.dts#L525
>>
>> I am just trying to understand if such a configuration possible for
>> dwc3 controllers.
>
> What do you mean by "configuration"? Which configuration?

I am just elaborating what I have understood below.

>
> Best regards,
> Krzysztof
>
Adding Vivek Gautam +

Apologize for my poor English and explanation of what I am trying to
work on or understand the feature related to dwc3

For some time I am trying to figure out the performance issue of USB
3.0 on Odroid XU4 boards.

[1] https://lkml.org/lkml/2015/2/2/259

Following above patch helps registration of USB storage device to
"Super-Speed" ie

root at odroidcsh:/usr/src/odroidxu3-4.y-devel# lsusb -t
/:  Bus 06.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
/:  Bus 05.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 480M
    |__ Port 1: Dev 3, If 0, Class=Vendor Specific Class, Driver=r8152, 480M
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/2p, 5000M
        |__ Port 1: Dev 3, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
        |__ Port 2: Dev 4, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/2p, 480M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=exynos-ohci/3p, 12M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=exynos-ehci/3p, 480M

But the performance is not good enough to work on the storage device.
we cannot compile whole kernel or source code on this drive attached.

So did some investigation on this by comparing the dts nodes and driver.
I was looking into some logic to do some "gpio_reset" which will help
reset the driver.

But after studying the driver code of OdroidXU4 Hardkernel,
I tried to understand the requirement of gpio pin controlled by pinctrl.

Sorry I am not an expert in the internal of the the pinctrl and
internal gpio bus.
So I have just for my understanding created this small patch to help
elaborate this feature.

--------------------------------------------------------------------------------------------
root at odroidcsh:/usr/src/odroidxu3-4.y-devel# git diff
arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
diff --git a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
index 246d298..03e90b6 100644
--- a/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
+++ b/arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi
@@ -560,6 +560,24 @@
                samsung,pin-pud = <EXYNOS_PIN_PULL_NONE>;
                samsung,pin-drv = <EXYNOS5420_PIN_DRV_LV1>;
        };
+
+       b_sess0_irq: b-sess0-irq {
+               samsung,pins = "gpx3-5";
+               samsung,pin-function = <0xf>;
+               samsung,pin-pud = <0>;
+       };
+
+       b_sess1_irq: b-sess1-irq {
+               samsung,pins = "gpx3-4";
+               samsung,pin-function = <0xf>;
+               samsung,pin-pud = <0>;
+       };
+
+       id2_irq: id2-irq {
+               samsung,pins = "gpx1-1";
+               samsung,pin-function = <0xf>;
+               samsung,pin-pud = <0>;
+        };
 };

 &pinctrl_1 {
@@ -569,6 +587,12 @@
                samsung,pin-pud = <EXYNOS_PIN_PULL_NONE>;
                samsung,pin-drv = <EXYNOS5420_PIN_DRV_LV1>;
        };
+
+        id1_irq: id1-irq {
+                samsung,pins = "gpc1-0";
+                samsung,pin-function = <0xf>;
+                samsung,pin-pud = <3>;
+        };
 };

 &tmu_cpu0 {
@@ -604,11 +628,23 @@
 /* usbdrd_dwc3_1 mode customized in each board */

 &usbdrd3_0 {
+       samsung,bsess-gpio = <&gpx3 5 0xf>;
+       pinctrl-names = "default";
+       pinctrl-0 = <&b_sess0_irq>;
+       samsung,id-gpio = <&gpc1 0 0xf>;
+       pinctrl-names = "default";
+       pinctrl-0 = <&id1_irq>;
        vdd33-supply = <&ldo9_reg>;
        vdd10-supply = <&ldo11_reg>;
 };

 &usbdrd3_1 {
+       samsung,bsess-gpio = <&gpx3 4 0xf>;
+       pinctrl-names = "default";
+       pinctrl-0 = <&b_sess1_irq>;
+       samsung,id-gpio = <&gpx1 1 0xf>;
+       pinctrl-names = "default";
+       pinctrl-0 = <&id2_irq>;
        vdd33-supply = <&ldo9_reg>;
        vdd10-supply = <&ldo11_reg>;
 };
--------------------------------------------------------------------------------------------
Here is the core logic as I understood out of the driver code from
OdroidXU3 Hardkernel.
so we have samsung,bsess-gpio and samsung,id-gpio two gpio pins control
by the exynos-dwc3 driver in the Odroid Hardkernel

# drivers/usb/dwc3/dwc3-exynos.c

tries to register interrupts on these gpio pin to control the flow via gpio-irq.
Also It monitor the vbus controller changes

Below irq to monitor id-gpio thread.

https://github.com/hardkernel/linux/blob/odroidxu3-3.10.y/drivers/usb/dwc3/dwc3-exynos.c#L217

Below irq to monitor bsess-gpio thread.

https://github.com/hardkernel/linux/blob/odroidxu3-3.10.y/drivers/usb/dwc3/dwc3-exynos.c#L230

Once again I am not an expert in this configuration, I am just trying
to map the feature and the code.
Please let me know If my understanding is wrong or the feature should
work little bit different.

I am poor in English to explain technically details.
If any body have any other input on this please let me know.

-Best Regards
Anand Moon

^ permalink raw reply related

* [PATCH] ARM: shmobile: select errata 798181 for SoCs with CA15 cores
From: Simon Horman @ 2016-10-27 11:54 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CANqRtoTJHrXK4NBWWDShoF1RnxpBhheDnvsfeyDO6cvLdbOE=A@mail.gmail.com>

On Thu, Oct 27, 2016 at 04:39:24PM +0900, Magnus Damm wrote:
> Hi Geert and Simon,
> 
> On Thu, Oct 27, 2016 at 4:33 PM, Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
> > Hi Magnus,
> >
> > On Thu, Oct 27, 2016 at 9:28 AM, Magnus Damm <magnus.damm@gmail.com> wrote:
> >> On Thu, Oct 27, 2016 at 4:00 PM, Simon Horman
> >> <horms+renesas@verge.net.au> wrote:
> >>> Select ARM errata 798181 on SoCs cores affected CA15 cores.
> >>>
> >>> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
> >>> ---
> >>>  arch/arm/mach-shmobile/Kconfig | 6 ++++++
> >>>  1 file changed, 6 insertions(+)
> >>>
> >>> diff --git a/arch/arm/mach-shmobile/Kconfig b/arch/arm/mach-shmobile/Kconfig
> >>> index c48be1d332ed..6fbd9b7d2d67 100644
> >>> --- a/arch/arm/mach-shmobile/Kconfig
> >>> +++ b/arch/arm/mach-shmobile/Kconfig
> >>> @@ -60,6 +60,7 @@ config ARCH_R7S72100
> >>>  config ARCH_R8A73A4
> >>>         bool "R-Mobile APE6 (R8A73A40)"
> >>>         select ARCH_RMOBILE
> >>> +       select ARM_ERRATA_798181 if SMP
> >>>         select RENESAS_IRQC
> >>>
> >>>  config ARCH_R8A7740
> >>> @@ -70,6 +71,7 @@ config ARCH_R8A7740
> >>>  config ARCH_R8A7743
> >>>         bool "RZ/G1M (R8A77430)"
> >>>         select ARCH_RCAR_GEN2
> >>> +       select ARM_ERRATA_798181 if SMP
> >>>
> >>>  config ARCH_R8A7778
> >>>         bool "R-Car M1A (R8A77781)"
> >>> @@ -82,20 +84,24 @@ config ARCH_R8A7779
> >>>  config ARCH_R8A7790
> >>>         bool "R-Car H2 (R8A77900)"
> >>>         select ARCH_RCAR_GEN2
> >>> +       select ARM_ERRATA_798181 if SMP
> >>>         select I2C
> >>
> >> Thanks for your help.
> >>
> >> I'm probably misunderstanding what this patch does and how the errata
> >> effects the system, but the commit message says CA15 cores. The above
> >> R-Car Gen1 and r8a7740 SoCs are not using CA15 - instead they use CA9.
> >> Not sure if the errata still applies though.
> >
> > Please don't become misled by the @@ context ;-)
> 
> Ouch, right... Sorry for the noise. The patch looks good, please proceed.

Thanks, I have queued this up.

^ permalink raw reply

* [PATCH v3] ARM: davinci: da8xx: Fix some redefined symbol warnings
From: Sekhar Nori @ 2016-10-27 11:54 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <7026f012-ffce-cc28-85be-9a5cd26de60f@ti.com>

On Thursday 27 October 2016 05:13 PM, Sekhar Nori wrote:
> On Wednesday 26 October 2016 06:09 PM, Alexandre Bailon wrote:
>> Some macro for DA8xx CFGCHIP are defined in usb-davinci.h,
>> but da8xx-cfgchip.h intend to replace them.
>> The usb-da8xx.c is using both headers, causing redefined symbol warnings.
> 
> Looks like this is not true for v4.9-rc2 and so I don't see any warnings

Ah, just noticed that _this_ is the patch that introduces
da8xx-cfgchip.h into usb-da8xx.c. So this is the patch that introduces
the warnings (and fixes them).

I can queue this for v4.10 (with Greg's ack) if you change the
description to make it about cleaning up duplicated defines between
da8xx-cfgchip.h and usb-davinci.h and not talk about "redefined symbol
warnings".

Also, when adding a header file, can you please keep it sorted in
alphabetical order.

Thanks,
Sekhar

^ permalink raw reply

* [PATCH v2 3/4] arm64: arch_timer: Work around Erratum Hisilicon-161601
From: Ding Tianhong @ 2016-10-27 12:17 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <16d8f329-3413-11da-6f2a-8b1a95c85352@arm.com>



On 2016/10/27 18:58, Marc Zyngier wrote:
> On 27/10/16 08:34, Ding Tianhong wrote:
>> Erratum Hisilicon-161601 says that the ARM generic timer counter "has the
>> potential to contain an erroneous value when the timer value changes".
>> Accesses to TVAL (both read and write) are also affected due to the implicit counter
>> read.  Accesses to CVAL are not affected.
>>
>> The workaround is to reread the system count registers until the value of the second
>> read is larger than the first one by less than 32, the system counter can be guaranteed
>> not to return wrong value twice by back-to-back read and the error value is always larger
>> than the correct one by 32. Writes to TVAL are replaced with an equivalent write to CVAL.
>>
>> The workaround is enabled if the hisilicon,erratum-161601 property is found in
>> the timer node in the device tree.  This can be overridden with the
>> clocksource.arm_arch_timer.hisilicon-161601 boot parameter, which allows KVM
>> users to enable the workaround until a mechanism is implemented to
>> automatically communicate this information.
>>
>> Fix some description for fsl erratum a008585.
>>
>> v2: Significant rework based on feedback, including seperate the fsl erratum a008585
>>     to another patch, update the erratum name and remove unwanted code.
>>
>> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
>> ---
>>  Documentation/arm64/silicon-errata.txt |  1 +
>>  Documentation/kernel-parameters.txt    |  9 ++++
>>  arch/arm64/include/asm/arch_timer.h    | 28 ++++++++++-
>>  drivers/clocksource/Kconfig            | 14 +++++-
>>  drivers/clocksource/arm_arch_timer.c   | 88 ++++++++++++++++++++++++++--------
>>  5 files changed, 118 insertions(+), 22 deletions(-)
>>
>> diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
>> index 405da11..70c5d5e 100644
>> --- a/Documentation/arm64/silicon-errata.txt
>> +++ b/Documentation/arm64/silicon-errata.txt
>> @@ -63,3 +63,4 @@ stable kernels.
>>  | Cavium         | ThunderX SMMUv2 | #27704          | N/A		       |
>>  |                |                 |                 |                         |
>>  | Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585     |
>> +| Hisilicon      | Hip05/Hip06/Hip07 | #161601       | HISILICON_ERRATUM_161601|
> 
> I've already commented on the alignment. Please read my initial review.
> 

It sees misunderstood, fix it this time.

>> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
>> index 6fa1d8a..735b4b6 100644
>> --- a/Documentation/kernel-parameters.txt
>> +++ b/Documentation/kernel-parameters.txt
>> @@ -707,6 +707,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>>  			erratum.  If unspecified, the workaround is
>>  			enabled based on the device tree.
>>  
>> +	clocksource.arm_arch_timer.hisilicon-161601=
>> +			[ARM64]
>> +			Format: <bool>
>> +			Enable/disable the workaround of Hisilicon
>> +			erratum 161601.  This can be useful for KVM
>> +			guests, if the guest device tree doesn't show the
>> +			erratum.  If unspecified, the workaround is
>> +			enabled based on the device tree.
>> +
>>  	clearcpuid=BITNUM [X86]
>>  			Disable CPUID feature X for the kernel. See
>>  			arch/x86/include/asm/cpufeatures.h for the valid bit
>> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
>> index 118719d8..49b3041 100644
>> --- a/arch/arm64/include/asm/arch_timer.h
>> +++ b/arch/arm64/include/asm/arch_timer.h
>> @@ -29,7 +29,7 @@
>>  
>>  #include <clocksource/arm_arch_timer.h>
>>  
>> -#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585)
>> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>>  extern struct static_key_false arch_timer_read_ool_enabled;
>>  #define needs_timer_erratum_workaround() \
>>  	static_branch_unlikely(&arch_timer_read_ool_enabled)
>> @@ -65,11 +65,35 @@ extern struct arch_timer_erratum_workaround *erratum_workaround;
>>  	_new;						\
>>  })
>>  
>> +
>> +
>> +/*
>> + * The number of retries is an arbitrary value well beyond the highest number
>> + * of iterations the loop has been observed to take.
>> + * Verify whether the value of the second read is larger than the first by
>> + * less than 32 is the only way to confirm the value is correct, the system
>> + * counter can be guaranteed not to return wrong value twice by back-to-back read
>> + * and the error value is always larger than the correct one by 32.
>> + */
>> +#define __hisi_161601_read_reg(reg) ({				\
>> +	u64 _old, _new;						\
>> +	int _retries = 200;					\
> 
> Please document how this value was found (either in the code or in the
> commit message).

It really difficult to give the accurate standard, theoretically the error should not happened
twice together, so maybe 2 is enough here, I just give a arbitrary value.

> 
>> +								\
>> +	do {							\
>> +		_old = read_sysreg(reg);			\
>> +		_new = read_sysreg(reg);			\
>> +		_retries--;					\
>> +	} while (unlikely((_new - _old) >> 5) && _retries);	\
>> +								\
>> +	WARN_ON_ONCE(!_retries);				\
>> +	_new;							\
>> +})
> 
> Same remark as in the previous patch.
> 

I think the sentence *Verify whether the value of the second read is larger than the first by
less than 32 is the only way to confirm the value is correct* could explain why should *(_new - _old) >> 5*.
it is the same as (_new - _old)/32, also mean the _new should never bigger than _old more than 32.

>> +
>>  #define arch_timer_reg_read_stable(reg) 		\
>>  ({							\
>>  	u64 _val;					\
>>  	if (needs_timer_erratum_workaround())		\
>> -		_val = erratum_workaround->read_##reg();	\
>> +		_val = erratum_workaround->read_##reg();\
>>  	else						\
>>  		_val = read_sysreg(reg);		\
>>  	_val;						\
>> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
>> index 8a753fd..4aafb6a 100644
>> --- a/drivers/clocksource/Kconfig
>> +++ b/drivers/clocksource/Kconfig
>> @@ -312,8 +312,20 @@ config FSL_ERRATUM_A008585
>>  	help
>>  	  This option enables a workaround for Freescale/NXP Erratum
>>  	  A-008585 ("ARM generic timer may contain an erroneous
>> -	  value").  The workaround will only be active if the
>> +	  value").  The workaround will be active if the
>>  	  fsl,erratum-a008585 property is found in the timer node.
>> +	  This can be overridden with the clocksource.arm_arch_timer.fsl-a008585
>> +	  boot parameter.
> 
> Move this hunk to the previous patch.
> 
>> +
>> +config HISILICON_ERRATUM_161601
>> +	bool "Workaround for Hisilicon Erratum 161601"
>> +	default y
>> +	depends on ARM_ARCH_TIMER && ARM64
>> +	help
>> +	  This option enables a workaround for Hisilicon Erratum
>> +	  161601. The workaround will be active if the hisilicon,erratum-161601
>> +	  property is found in the timer node. This can be overridden with
>> +	  the clocksource.arm_arch_timer.hisilicon-161601 boot parameter.
>>  
>>  config ARM_GLOBAL_TIMER
>>  	bool "Support for the ARM global timer" if COMPILE_TEST
>> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
>> index e4f7fa1..89f1895 100644
>> --- a/drivers/clocksource/arm_arch_timer.c
>> +++ b/drivers/clocksource/arm_arch_timer.c
>> @@ -94,13 +94,14 @@ early_param("clocksource.arm_arch_timer.evtstrm", early_evtstrm_cfg);
>>   * Architected system timer support.
>>   */
>>  
>> -#ifdef CONFIG_FSL_ERRATUM_A008585
>> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>>  struct arch_timer_erratum_workaround *erratum_workaround = NULL;
>>  
>>  DEFINE_STATIC_KEY_FALSE(arch_timer_read_ool_enabled);
>>  EXPORT_SYMBOL_GPL(arch_timer_read_ool_enabled);
>> +#endif
>>  
>> -
>> +#ifdef CONFIG_FSL_ERRATUM_A008585
>>  static u32 fsl_a008585_read_cntp_tval_el0(void)
>>  {
>>  	return __fsl_a008585_read_reg(cntp_tval_el0);
>> @@ -139,6 +140,45 @@ static int __init early_fsl_a008585_cfg(char *buf)
>>  early_param("clocksource.arm_arch_timer.fsl-a008585", early_fsl_a008585_cfg);
>>  #endif /* CONFIG_FSL_ERRATUM_A008585 */
>>  
>> +#ifdef CONFIG_HISILICON_ERRATUM_161601
>> +static u32 hisi_161601_read_cntp_tval_el0(void)
>> +{
>> +	return __hisi_161601_read_reg(cntp_tval_el0);
>> +}
>> +
>> +static  u32 hisi_161601_read_cntv_tval_el0(void)
>> +{
>> +	return __hisi_161601_read_reg(cntv_tval_el0);
>> +}
>> +
>> +static u64 hisi_161601_read_cntvct_el0(void)
>> +{
>> +	return __hisi_161601_read_reg(cntvct_el0);
>> +}
>> +
>> +static struct arch_timer_erratum_workaround arch_timer_hisi_161601 = {
>> +	.read_cntp_tval_el0 = hisi_161601_read_cntp_tval_el0,
>> +	.read_cntv_tval_el0 = hisi_161601_read_cntv_tval_el0,
>> +	.read_cntvct_el0 = hisi_161601_read_cntvct_el0,
>> +};
>> +
>> +static int __init early_hisi_161601_cfg(char *buf)
>> +{
>> +	int ret;
>> +	bool val;
>> +
>> +	ret = strtobool(buf, &val);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (val)
>> +		erratum_workaround = &arch_timer_hisi_161601;
>> +
>> +	return 0;
>> +}
>> +early_param("clocksource.arm_arch_timer.hisilicon-161601", early_hisi_161601_cfg);
>> +#endif /* CONFIG_HISILICON_ERRATUM_161601 */
>> +
>>  static __always_inline
>>  void arch_timer_reg_write(int access, enum arch_timer_reg reg, u32 val,
>>  			  struct clock_event_device *clk)
>> @@ -288,8 +328,8 @@ static __always_inline void set_next_event(const int access, unsigned long evt,
>>  	arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl, clk);
>>  }
>>  
>> -#ifdef CONFIG_FSL_ERRATUM_A008585
>> -static __always_inline void fsl_a008585_set_next_event(const int access,
>> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>> +static __always_inline void erratum_set_next_event(const int access,
>>  		unsigned long evt, struct clock_event_device *clk)
>>  {
>>  	unsigned long ctrl;
>> @@ -307,20 +347,20 @@ static __always_inline void fsl_a008585_set_next_event(const int access,
>>  	arch_timer_reg_write(access, ARCH_TIMER_REG_CTRL, ctrl, clk);
>>  }
>>  
>> -static int fsl_a008585_set_next_event_virt(unsigned long evt,
>> +static int erratum_set_next_event_virt(unsigned long evt,
>>  					   struct clock_event_device *clk)
>>  {
>> -	fsl_a008585_set_next_event(ARCH_TIMER_VIRT_ACCESS, evt, clk);
>> +	erratum_set_next_event(ARCH_TIMER_VIRT_ACCESS, evt, clk);
>>  	return 0;
>>  }
>>  
>> -static int fsl_a008585_set_next_event_phys(unsigned long evt,
>> +static int erratum_set_next_event_phys(unsigned long evt,
>>  					   struct clock_event_device *clk)
>>  {
>> -	fsl_a008585_set_next_event(ARCH_TIMER_PHYS_ACCESS, evt, clk);
>> +	erratum_set_next_event(ARCH_TIMER_PHYS_ACCESS, evt, clk);
>>  	return 0;
>>  }
>> -#endif /* CONFIG_FSL_ERRATUM_A008585 */
>> +#endif
>>  
>>  static int arch_timer_set_next_event_virt(unsigned long evt,
>>  					  struct clock_event_device *clk)
>> @@ -350,16 +390,16 @@ static int arch_timer_set_next_event_phys_mem(unsigned long evt,
>>  	return 0;
>>  }
>>  
>> -static void fsl_a008585_set_sne(struct clock_event_device *clk)
>> +static void erratum_set_sne(struct clock_event_device *clk)
>>  {
>> -#ifdef CONFIG_FSL_ERRATUM_A008585
>> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>>  	if (!static_branch_unlikely(&arch_timer_read_ool_enabled))
>>  		return;
>>  
>>  	if (arch_timer_uses_ppi == VIRT_PPI)
>> -		clk->set_next_event = fsl_a008585_set_next_event_virt;
>> +		clk->set_next_event = erratum_set_next_event_virt;
>>  	else
>> -		clk->set_next_event = fsl_a008585_set_next_event_phys;
>> +		clk->set_next_event = erratum_set_next_event_phys;
>>  #endif
> 
> This should be in the previous patch as well, as it only messes with the
> FSL erratum.
> 

Ok.

>>  }
>>  
>> @@ -392,7 +432,7 @@ static void __arch_timer_setup(unsigned type,
>>  			BUG();
>>  		}
>>  
>> -		fsl_a008585_set_sne(clk);
>> +		erratum_set_sne(clk);
>>  	} else {
>>  		clk->features |= CLOCK_EVT_FEAT_DYNIRQ;
>>  		clk->name = "arch_mem_timer";
>> @@ -612,7 +652,7 @@ static void __init arch_counter_register(unsigned type)
>>  
>>  		clocksource_counter.archdata.vdso_direct = true;
>>  
>> -#ifdef CONFIG_FSL_ERRATUM_A008585
>> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>>  		/*
>>  		 * Don't use the vdso fastpath if errata require using
>>  		 * the out-of-line counter accessor.
>> @@ -899,12 +939,22 @@ static int __init arch_timer_of_init(struct device_node *np)
>>  	arch_timer_c3stop = !of_property_read_bool(np, "always-on");
>>  
>>  #ifdef CONFIG_FSL_ERRATUM_A008585
>> -	if (!erratum_workaround && of_property_read_bool(np, "fsl,erratum-a008585"))
>> +	if (!erratum_workaround && of_property_read_bool(np, "fsl,erratum-a008585")) {
>>  		erratum_workaround = &arch_timer_fsl_a008585;
>> +		if (erratum_workaround) {
> 
> Brilliant!
> 
>> +			static_branch_enable(&arch_timer_read_ool_enabled);
>> +			pr_info("Enabling workaround for FSL erratum A-008585\n");
>> +		}
>> +	}
>> +#endif
>>  
>> -	if (erratum_workaround) {
>> -		static_branch_enable(&arch_timer_read_ool_enabled);
>> -		pr_info("Enabling workaround for FSL erratum A-008585\n");
>> +#ifdef CONFIG_HISILICON_ERRATUM_161601
>> +	if (!erratum_workaround && of_property_read_bool(np, "hisilicon,erratum-161601")) {
>> +		erratum_workaround = &arch_timer_hisi_161601;
>> +		if (erratum_workaround) {
>> +			static_branch_enable(&arch_timer_read_ool_enabled);
>> +			pr_info("Enabling workaround for HISILICON erratum 161601\n");
>> +		}
>>  	}
>>  #endif
> 
> Do you see a pattern here? Surely you can factor a bit of that code (and
> remove the nonsensical bits).
> 

Yes, found it, try to fix it.

Thanks.
Ding

> Thanks,
> 
> 	M.
> 

^ permalink raw reply

* [PATCH] ARM: DT: stm32: move dma translation to board files
From: Alexandre Torgue @ 2016-10-27 12:21 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CAF3+TqdzESUNbkmAVv42pNv13wuyuPgGgksK3pJbORYw2a8ZFQ@mail.gmail.com>

Hi Bruno,

On 10/27/2016 12:43 PM, Bruno Herrera wrote:
> Hi Alex,
>
> On Wed, Oct 26, 2016 at 7:09 AM, Alexandre Torgue
> <alexandre.torgue@st.com> wrote:
>> Hi Bruno,
>>
>> On 10/25/2016 11:06 PM, Bruno Herrera wrote:
>>>
>>> Hi Alexandre,
>>>
>>>>
>>>> stm32f469-disco and stm32f429-eval boards use SDRAM start address
>>>> remapping
>>>> (to @0) to boost performances. A DMA translation through "dma-ranges"
>>>> property was needed for other masters than the M4 CPU.
>>>> stm32f429-disco doesn't use remapping so doesn't need this DMA
>>>> translation.
>>>> This patches moves this DMA translation definition from stm32f429 soc
>>>> file
>>>> to board files.
>>>>
>>>> Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
>>>>
>>>> diff --git a/arch/arm/boot/dts/stm32429i-eval.dts
>>>> b/arch/arm/boot/dts/stm32429i-eval.dts
>>>> index 13c7cd2..a763c15 100644
>>>> --- a/arch/arm/boot/dts/stm32429i-eval.dts
>>>> +++ b/arch/arm/boot/dts/stm32429i-eval.dts
>>>> @@ -82,6 +82,10 @@
>>>>                 };
>>>>         };
>>>>
>>>> +       soc {
>>>> +               dma-ranges = <0xc0000000 0x0 0x10000000>;
>>>> +       };
>>>> +
>>>>         usbotg_hs_phy: usbphy {
>>>>                 #phy-cells = <0>;
>>>>                 compatible = "usb-nop-xceiv";
>>>
>>>
>>> Shouldn't also the peripheral dma-ranges property move to board specific
>>> too?
>>> I  had this patch for while but I didn't had the time to submit:
>>
>>
>> Well spot I forgot it. Actually, discussing with Arnd ysterday on IIRC,
>> empty dma-ranges is not needed. Can you test on your side by removing
>> dma-ranges in usb node please ?
> Unfortunately will take a time for me to set up this environment on
> the STM32F4-EVAL board.
> And on the discovery boards we dont have this scenario. That was the
> main reason I did not submit the patch right away.
> My conclusion and I might be wrong but is based on the my tests with
> SDIO device at STM32F469I-DISCO board.
>
> I started this issue as discussion at ST Forum but Maxime gave me the hint.
>
> https://my.st.com/public/STe2ecommunities/mcu/Lists/cortex_mx_stm32/Flat.aspx?RootFolder=https%3a%2f%2fmy%2est%2ecom%2fpublic%2fSTe2ecommunities%2fmcu%2fLists%2fcortex_mx_stm32%2fDMA2%20and%20SYSCFG_MEMRMP%20relationship&FolderCTID=0x01200200770978C69A1141439FE559EB459D7580009C4E14902C3CDE46A77F0FFD06506F5B&currentviews=44
>
>> I will push a v2 by removing empty dma-ranges if tests are ok in your side.
>
> From my understating/conclusion is: when empty property(dma-ranges) is
> the device node, the mapping will be taken in consideration when using
> DMA otherwise the mapping is ignored.
> And in the SDIO case it is needed for DEV->MEM(SDRAM) and
> MEM(SDRAM)->DEV. If it is not the case for the devices in question so
> I suppose it can work without the property.

For sure translation has to be done but I'm not sure that an empty 
"dma-ranges" is needed in device node to activate it. For Ethernet empty 
"dma-ranges" is not needed. I will try with usb.

alex

>
>>
>> Thanks in advance
>> Alex
>>
>>
>>>
>>> Author: Bruno Herrera <bruherrera@gmail.com>
>>> Date:   Sun Oct 16 14:50:00 2016 -0200
>>>
>>>     ARM: DT: STM32: Use dma-ranges property per board not at dtsi file
>>>
>>> diff --git a/arch/arm/boot/dts/stm32429i-eval.dts
>>> b/arch/arm/boot/dts/stm32429i-eval.dts
>>> index 6bfc595..2a22a82 100644
>>> --- a/arch/arm/boot/dts/stm32429i-eval.dts
>>> +++ b/arch/arm/boot/dts/stm32429i-eval.dts
>>> @@ -52,6 +52,10 @@
>>>         model = "STMicroelectronics STM32429i-EVAL board";
>>>         compatible = "st,stm32429i-eval", "st,stm32f429";
>>>
>>> +       soc {
>>> +               dma-ranges = <0xC0000000 0x0 0x10000000>;
>>> +       };
>>> +
>>>         chosen {
>>>                 bootargs = "root=/dev/ram rdinit=/linuxrc";
>>>                 stdout-path = "serial0:115200n8";
>>> @@ -96,6 +100,7 @@
>>>
>>>  &ethernet0 {
>>>         status = "okay";
>>> +       dma-ranges;
>>>         pinctrl-0       = <&ethernet0_mii>;
>>>         pinctrl-names   = "default";
>>>         phy-mode        = "mii-id";
>>> @@ -116,6 +121,7 @@
>>>  };
>>>
>>>  &usbotg_hs {
>>> +       dma-ranges;
>>>         dr_mode = "host";
>>>         phys = <&usbotg_hs_phy>;
>>>         phy-names = "usb2-phy";
>>> diff --git a/arch/arm/boot/dts/stm32f429.dtsi
>>> b/arch/arm/boot/dts/stm32f429.dtsi
>>> index 7d624a2..697a133 100644
>>> --- a/arch/arm/boot/dts/stm32f429.dtsi
>>> +++ b/arch/arm/boot/dts/stm32f429.dtsi
>>> @@ -59,7 +59,6 @@
>>>         };
>>>
>>>         soc {
>>> -               dma-ranges = <0xc0000000 0x0 0x10000000>;
>>>
>>>                 timer2: timer at 40000000 {
>>>                         compatible = "st,stm32-timer";
>>> @@ -472,13 +471,11 @@
>>>                         st,syscon = <&syscfg 0x4>;
>>>                         snps,pbl = <8>;
>>>                         snps,mixed-burst;
>>> -                       dma-ranges;
>>>                         status = "disabled";
>>>                 };
>>>
>>>                 usbotg_hs: usb at 40040000 {
>>>                         compatible = "snps,dwc2";
>>> -                       dma-ranges;
>>>                         reg = <0x40040000 0x40000>;
>>>                         interrupts = <77>;
>>>                         clocks = <&rcc 0 29>;
>>>
>>>
>>>> diff --git a/arch/arm/boot/dts/stm32f429.dtsi
>>>> b/arch/arm/boot/dts/stm32f429.dtsi
>>>> index 0596d60..3a1cfdd 100644
>>>> --- a/arch/arm/boot/dts/stm32f429.dtsi
>>>> +++ b/arch/arm/boot/dts/stm32f429.dtsi
>>>> @@ -59,8 +59,6 @@
>>>>         };
>>>>
>>>>         soc {
>>>> -               dma-ranges = <0xc0000000 0x0 0x10000000>;
>>>> -
>>>>                 timer2: timer at 40000000 {
>>>>                         compatible = "st,stm32-timer";
>>>>                         reg = <0x40000000 0x400>;
>>>> diff --git a/arch/arm/boot/dts/stm32f469-disco.dts
>>>> b/arch/arm/boot/dts/stm32f469-disco.dts
>>>> index 9e73656..c2213c0 100644
>>>> --- a/arch/arm/boot/dts/stm32f469-disco.dts
>>>> +++ b/arch/arm/boot/dts/stm32f469-disco.dts
>>>> @@ -64,6 +64,10 @@
>>>>         aliases {
>>>>                 serial0 = &usart3;
>>>>         };
>>>> +
>>>> +       soc {
>>>> +               dma-ranges = <0xc0000000 0x0 0x10000000>;
>>>> +       };
>>>>  };
>>>>
>>>>  &clk_hse {
>>>> --
>>>
>>>
>>>
>>> Br.,
>>> Bruno
>>>
>>

^ permalink raw reply

* [PATCH v2 3/4] arm64: arch_timer: Work around Erratum Hisilicon-161601
From: Marc Zyngier @ 2016-10-27 12:23 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <44cf2916-7c81-5cdf-d113-f99326eae2c6@huawei.com>

On 27/10/16 13:17, Ding Tianhong wrote:
> 
> 
> On 2016/10/27 18:58, Marc Zyngier wrote:
>> On 27/10/16 08:34, Ding Tianhong wrote:
>>> Erratum Hisilicon-161601 says that the ARM generic timer counter "has the
>>> potential to contain an erroneous value when the timer value changes".
>>> Accesses to TVAL (both read and write) are also affected due to the implicit counter
>>> read.  Accesses to CVAL are not affected.
>>>
>>> The workaround is to reread the system count registers until the value of the second
>>> read is larger than the first one by less than 32, the system counter can be guaranteed
>>> not to return wrong value twice by back-to-back read and the error value is always larger
>>> than the correct one by 32. Writes to TVAL are replaced with an equivalent write to CVAL.
>>>
>>> The workaround is enabled if the hisilicon,erratum-161601 property is found in
>>> the timer node in the device tree.  This can be overridden with the
>>> clocksource.arm_arch_timer.hisilicon-161601 boot parameter, which allows KVM
>>> users to enable the workaround until a mechanism is implemented to
>>> automatically communicate this information.
>>>
>>> Fix some description for fsl erratum a008585.
>>>
>>> v2: Significant rework based on feedback, including seperate the fsl erratum a008585
>>>     to another patch, update the erratum name and remove unwanted code.
>>>
>>> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
>>> ---
>>>  Documentation/arm64/silicon-errata.txt |  1 +
>>>  Documentation/kernel-parameters.txt    |  9 ++++
>>>  arch/arm64/include/asm/arch_timer.h    | 28 ++++++++++-
>>>  drivers/clocksource/Kconfig            | 14 +++++-
>>>  drivers/clocksource/arm_arch_timer.c   | 88 ++++++++++++++++++++++++++--------
>>>  5 files changed, 118 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
>>> index 405da11..70c5d5e 100644
>>> --- a/Documentation/arm64/silicon-errata.txt
>>> +++ b/Documentation/arm64/silicon-errata.txt
>>> @@ -63,3 +63,4 @@ stable kernels.
>>>  | Cavium         | ThunderX SMMUv2 | #27704          | N/A		       |
>>>  |                |                 |                 |                         |
>>>  | Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585     |
>>> +| Hisilicon      | Hip05/Hip06/Hip07 | #161601       | HISILICON_ERRATUM_161601|
>>
>> I've already commented on the alignment. Please read my initial review.
>>
> 
> It sees misunderstood, fix it this time.
> 
>>> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
>>> index 6fa1d8a..735b4b6 100644
>>> --- a/Documentation/kernel-parameters.txt
>>> +++ b/Documentation/kernel-parameters.txt
>>> @@ -707,6 +707,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>>>  			erratum.  If unspecified, the workaround is
>>>  			enabled based on the device tree.
>>>  
>>> +	clocksource.arm_arch_timer.hisilicon-161601=
>>> +			[ARM64]
>>> +			Format: <bool>
>>> +			Enable/disable the workaround of Hisilicon
>>> +			erratum 161601.  This can be useful for KVM
>>> +			guests, if the guest device tree doesn't show the
>>> +			erratum.  If unspecified, the workaround is
>>> +			enabled based on the device tree.
>>> +
>>>  	clearcpuid=BITNUM [X86]
>>>  			Disable CPUID feature X for the kernel. See
>>>  			arch/x86/include/asm/cpufeatures.h for the valid bit
>>> diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h
>>> index 118719d8..49b3041 100644
>>> --- a/arch/arm64/include/asm/arch_timer.h
>>> +++ b/arch/arm64/include/asm/arch_timer.h
>>> @@ -29,7 +29,7 @@
>>>  
>>>  #include <clocksource/arm_arch_timer.h>
>>>  
>>> -#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585)
>>> +#if IS_ENABLED(CONFIG_FSL_ERRATUM_A008585) || IS_ENABLED(CONFIG_HISILICON_ERRATUM_161601)
>>>  extern struct static_key_false arch_timer_read_ool_enabled;
>>>  #define needs_timer_erratum_workaround() \
>>>  	static_branch_unlikely(&arch_timer_read_ool_enabled)
>>> @@ -65,11 +65,35 @@ extern struct arch_timer_erratum_workaround *erratum_workaround;
>>>  	_new;						\
>>>  })
>>>  
>>> +
>>> +
>>> +/*
>>> + * The number of retries is an arbitrary value well beyond the highest number
>>> + * of iterations the loop has been observed to take.
>>> + * Verify whether the value of the second read is larger than the first by
>>> + * less than 32 is the only way to confirm the value is correct, the system
>>> + * counter can be guaranteed not to return wrong value twice by back-to-back read
>>> + * and the error value is always larger than the correct one by 32.
>>> + */
>>> +#define __hisi_161601_read_reg(reg) ({				\
>>> +	u64 _old, _new;						\
>>> +	int _retries = 200;					\
>>
>> Please document how this value was found (either in the code or in the
>> commit message).
> 
> It really difficult to give the accurate standard, theoretically the error should not happened
> twice together, so maybe 2 is enough here, I just give a arbitrary value.
> 
>>
>>> +								\
>>> +	do {							\
>>> +		_old = read_sysreg(reg);			\
>>> +		_new = read_sysreg(reg);			\
>>> +		_retries--;					\
>>> +	} while (unlikely((_new - _old) >> 5) && _retries);	\
>>> +								\
>>> +	WARN_ON_ONCE(!_retries);				\
>>> +	_new;							\
>>> +})
>>
>> Same remark as in the previous patch.
>>
> 
> I think the sentence *Verify whether the value of the second read is larger than the first by
> less than 32 is the only way to confirm the value is correct* could explain why should *(_new - _old) >> 5*.
> it is the same as (_new - _old)/32, also mean the _new should never bigger than _old more than 32.

This is not about the explanation of the erratum, but about the location
of the #define, which can be made private to the .c file instead of
being globally visible.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* [PATCH] arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU
From: Christoffer Dall @ 2016-10-27 12:27 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <f00b6e69-7b45-5238-4d3a-af8d77e85670@arm.com>

On Thu, Oct 27, 2016 at 11:40:00AM +0100, Marc Zyngier wrote:
> On 27/10/16 11:04, Christoffer Dall wrote:
> > On Thu, Oct 27, 2016 at 10:49:00AM +0100, Marc Zyngier wrote:
> >> Hi Christoffer,
> >>
> >> On 27/10/16 10:19, Christoffer Dall wrote:
> >>> On Mon, Oct 24, 2016 at 04:31:28PM +0100, Marc Zyngier wrote:
> >>>> Architecturally, TLBs are private to the (physical) CPU they're
> >>>> associated with. But when multiple vcpus from the same VM are
> >>>> being multiplexed on the same CPU, the TLBs are not private
> >>>> to the vcpus (and are actually shared across the VMID).
> >>>>
> >>>> Let's consider the following scenario:
> >>>>
> >>>> - vcpu-0 maps PA to VA
> >>>> - vcpu-1 maps PA' to VA
> >>>>
> >>>> If run on the same physical CPU, vcpu-1 can hit TLB entries generated
> >>>> by vcpu-0 accesses, and access the wrong physical page.
> >>>>
> >>>> The solution to this is to keep a per-VM map of which vcpu ran last
> >>>> on each given physical CPU, and invalidate local TLBs when switching
> >>>> to a different vcpu from the same VM.
> >>>
> >>> Just making sure I understand this:  The reason you cannot rely on the
> >>> guest doing the necessary distinction with ASIDs or invalidating the TLB
> >>> is that a guest (which assumes it's running on hardware) can validly
> >>> defer any neccessary invalidation until it starts running on other
> >>> physical CPUs, but we do this transparently in KVM?
> >>
> >> The guest wouldn't have to do any invalidation at all on real HW,
> >> because the TLBs are strictly private to a physical CPU (only the
> >> invalidation can be broadcast to the Inner Shareable domain). But when
> >> we multiplex two vcpus on the same physical CPU, we break the private
> >> semantics, and a vcpu could hit in the TLB entries populated by the
> >> another one.
> > 
> > Such a guest would be using a mapping of the same VA with the same ASID
> > on two separate CPUs, each pointing to a separate PA.  If it ever were
> > to, say, migrate a task, it would have to do invalidations then.  Right?
> 
> This doesn't have to be ASID tagged. Actually, it is more likely to
> affect global mappings. Imagine for example that the kernel (which uses
> global mappings for its own page tables) decides to create per-cpu
> variable using this trick (all the CPUs have the same VA, but use
> different PAs). No invalidation at all, everything looks perfectly fine,
> until you start virtualizing it.
> 
> > Does Linux or other guests actually do this?
> 
> Linux may hit it with CPU hotplug, which uses global mappings (which a
> vcpu using an ASID tagged mapping could then hit if the VAs overlap).
> 

Right, ok, it's more threatening than I first thought.  Thanks for the
explanation.

> > 
> > I would suspect Linux has to eventually invalidate those mappins if it
> > wants the scheduler to be allowed to freely move things around.
> > 
> >>
> >> As we cannot segregate the TLB entries per vcpu (but only per VMID), the
> >> workaround is to nuke all the TLBs for this VMID (locally only - no
> >> broadcast) each time we find that two vcpus are sharing the same
> >> physical CPU.
> >>
> >> Is that clearer?
> > 
> > Yes, the fix is clear, just want to make sure I understand that it's a
> > valid circumstance where this actually happens.  But in either case, we
> > probably have to fix this to emulate the hardware correctly.
> > 
> > Another fix would be to allocate a VMID per VCPU I suppose, just to
> > introduce a terrible TLB hit ratio :)
> 
> But that would break TLB invalidations that are broadcast in the Inner
> Shareable domain. To do so, you'd have to trap every TBLI, and issue
> corresponding invalidations for all the vcpus. I'm not sure I want to
> see the performance number of that solution... ;-)
> 
Ah, yeah, that's ridiculous.  Forget what I said.

Thanks,
-Christoffer

^ permalink raw reply

* [PATCH] arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU
From: Christoffer Dall @ 2016-10-27 12:28 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161027104925.GC27135@leverpostej>

On Thu, Oct 27, 2016 at 11:51:26AM +0100, Mark Rutland wrote:
> Hi Christoffer,
> 
> On Thu, Oct 27, 2016 at 12:04:28PM +0200, Christoffer Dall wrote:
> > On Thu, Oct 27, 2016 at 10:49:00AM +0100, Marc Zyngier wrote:
> > > The guest wouldn't have to do any invalidation at all on real HW,
> > > because the TLBs are strictly private to a physical CPU (only the
> > > invalidation can be broadcast to the Inner Shareable domain). But when
> > > we multiplex two vcpus on the same physical CPU, we break the private
> > > semantics, and a vcpu could hit in the TLB entries populated by the
> > > another one.
> > 
> > Such a guest would be using a mapping of the same VA with the same ASID
> > on two separate CPUs, each pointing to a separate PA.  If it ever were
> > to, say, migrate a task, it would have to do invalidations then.  Right?
> 
> An OS (not Linux) could use a different ASID space per-cpu.
> 
> e.g. with two single-threaded tasks A and B, you could have ASIDS:
> 
> 	cpu0	cpu1
> A	0	1
> B	1	0
> 
> ... and this would not be a problem, so long as when mappings changed
> maintenance were performed appropriately (e.g. perhaps it uses IPIs to
> trigger the relevant local TLB invlidation, rather than using broadcast
> ops).
> 
> > Does Linux or other guests actually do this?
> 
> Linux currently doesn't use ASIDs that way, but does use global mappings
> in a potentially-confliciting way in the cold-return paths (hotplug-on
> and return from idle). With two vCPUs, you could have a sequence like:
> 
> 	cpu0				cpu1
> 	Task with ASID x started
> 					hotplug on
> 					install global TTBR0 mapping
> 					global entry allocated into TLB
> 	Task hits cpu1's global entry
> 
> ... which cannot happen bare-metal, and there's no point at which the
> guest can perform suitable maintenance.
> 
> > Another fix would be to allocate a VMID per VCPU I suppose, just to
> > introduce a terrible TLB hit ratio :)
> 
> That would break broadcast invalidation within the guest, no?
> 
> ... unless you also trapped all TLB maintenance, and did the IPI-based
> broadcast in SW.
> 

Thanks for explanations, I'm getting the full picture now.

-Christoffer

^ permalink raw reply

* [PATCH v3] drivers: psci: PSCI checker module
From: Kevin Brodsky @ 2016-10-27 12:51 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161027091306.GA16603@red-moon>

On 27/10/16 10:13, Lorenzo Pieralisi wrote:
> On Wed, Oct 26, 2016 at 11:11:48AM -0700, Paul E. McKenney wrote:
>> On Wed, Oct 26, 2016 at 06:35:34PM +0100, Lorenzo Pieralisi wrote:
>>> On Wed, Oct 26, 2016 at 10:22:52AM -0700, Paul E. McKenney wrote:
>>>> On Wed, Oct 26, 2016 at 06:10:06PM +0100, Lorenzo Pieralisi wrote:
>> [ . . . ]
>>
>>>>> Thanks a lot for your feedback, thoughts appreciated.
>>>> Let me ask the question more directly.
>>>>
>>>> Why on earth are we trying to run these tests concurrently?
>>> We must prevent that, no question about that, that's why I started
>>> this discussion. It is not fine to enable this checker and the
>>> RCU/LOCK torture hotplug tests at the same time.
>>>
>>>> After all, if we just run one at a time in isolation, there is no
>>>> problem.
>>> Fine by me, it was to understand if the current assumptions we made
>>> are correct and they are definitely not. If we enable the PSCI checker
>>> we must disable the torture rcu/lock hotplug tests either statically or
>>> dynamically.
>> What rcutorture, locktorture, and rcuperf do is to invoke
>> torture_init_begin(), which returns false if one of these tests
>> is already running.
>>
>> Perhaps we should extract this torture-test-exclusion and require
>> than conflicting torture tests invoke it?
> Yes if it can be extracted as a check (but it should also prevent the
> torture tests from running and vice versa), either that or Kconfig
> dependency (which we could do as a first step, waiting to add the
> required interface to the torture test code ?).
>
> Thanks !
> Lorenzo

That sounds like a reasonable idea, but then that would mean that the PSCI checker
would have to wait until the torture test is finished if it is already running (and
the other way around).

I wasn't aware that torture tests were hotplugging CPUs. I think that the most
sensible thing to do right now is to make CONFIG_PSCI_CHECKER depend on
!CONFIG_TORTURE_TEST (or maybe specifically !CONFIG_RCU_TORTURE_TEST &&
!CONFIG_LOCK_TORTURE_TEST). We can try to make them work together afterwards, but for
the sake of getting this patch merged in a reasonable amount of time, I think we
should just exclude the conflicting tests at the build level in this patch. I'll also
update the comment accordingly.

Thanks,
Kevin
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply

* [PATCH] ARM: DT: stm32: move dma translation to board files
From: Bruno Herrera @ 2016-10-27 12:57 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <65096fbe-3f06-c8db-fbe8-29f0be28cb61@st.com>

Hi Alex,

On Thu, Oct 27, 2016 at 10:21 AM, Alexandre Torgue
<alexandre.torgue@st.com> wrote:
> Hi Bruno,
>
>
> On 10/27/2016 12:43 PM, Bruno Herrera wrote:
>>
>> Hi Alex,
>>
>> On Wed, Oct 26, 2016 at 7:09 AM, Alexandre Torgue
>> <alexandre.torgue@st.com> wrote:
>>>
>>> Hi Bruno,
>>>
>>> On 10/25/2016 11:06 PM, Bruno Herrera wrote:
>>>>
>>>>
>>>> Hi Alexandre,
>>>>
>>>>>
>>>>> stm32f469-disco and stm32f429-eval boards use SDRAM start address
>>>>> remapping
>>>>> (to @0) to boost performances. A DMA translation through "dma-ranges"
>>>>> property was needed for other masters than the M4 CPU.
>>>>> stm32f429-disco doesn't use remapping so doesn't need this DMA
>>>>> translation.
>>>>> This patches moves this DMA translation definition from stm32f429 soc
>>>>> file
>>>>> to board files.
>>>>>
>>>>> Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
>>>>>
>>>>> diff --git a/arch/arm/boot/dts/stm32429i-eval.dts
>>>>> b/arch/arm/boot/dts/stm32429i-eval.dts
>>>>> index 13c7cd2..a763c15 100644
>>>>> --- a/arch/arm/boot/dts/stm32429i-eval.dts
>>>>> +++ b/arch/arm/boot/dts/stm32429i-eval.dts
>>>>> @@ -82,6 +82,10 @@
>>>>>                 };
>>>>>         };
>>>>>
>>>>> +       soc {
>>>>> +               dma-ranges = <0xc0000000 0x0 0x10000000>;
>>>>> +       };
>>>>> +
>>>>>         usbotg_hs_phy: usbphy {
>>>>>                 #phy-cells = <0>;
>>>>>                 compatible = "usb-nop-xceiv";
>>>>
>>>>
>>>>
>>>> Shouldn't also the peripheral dma-ranges property move to board specific
>>>> too?
>>>> I  had this patch for while but I didn't had the time to submit:
>>>
>>>
>>>
>>> Well spot I forgot it. Actually, discussing with Arnd ysterday on IIRC,
>>> empty dma-ranges is not needed. Can you test on your side by removing
>>> dma-ranges in usb node please ?
>>
>> Unfortunately will take a time for me to set up this environment on
>> the STM32F4-EVAL board.
>> And on the discovery boards we dont have this scenario. That was the
>> main reason I did not submit the patch right away.
>> My conclusion and I might be wrong but is based on the my tests with
>> SDIO device at STM32F469I-DISCO board.
>>
>> I started this issue as discussion at ST Forum but Maxime gave me the
>> hint.
>>
>>
>> https://my.st.com/public/STe2ecommunities/mcu/Lists/cortex_mx_stm32/Flat.aspx?RootFolder=https%3a%2f%2fmy%2est%2ecom%2fpublic%2fSTe2ecommunities%2fmcu%2fLists%2fcortex_mx_stm32%2fDMA2%20and%20SYSCFG_MEMRMP%20relationship&FolderCTID=0x01200200770978C69A1141439FE559EB459D7580009C4E14902C3CDE46A77F0FFD06506F5B&currentviews=44
>>
>>> I will push a v2 by removing empty dma-ranges if tests are ok in your
>>> side.
>>
>>
>> From my understating/conclusion is: when empty property(dma-ranges) is
>> the device node, the mapping will be taken in consideration when using
>> DMA otherwise the mapping is ignored.
>> And in the SDIO case it is needed for DEV->MEM(SDRAM) and
>> MEM(SDRAM)->DEV. If it is not the case for the devices in question so
>> I suppose it can work without the property.
>
>
> For sure translation has to be done but I'm not sure that an empty
> "dma-ranges" is needed in device node to activate it. For Ethernet empty
> "dma-ranges" is not needed. I will try with usb.

In the case of SDIO it is needed. As example this is my working SDIO node:

sdio: sdio at 40012c00 {
compatible = "arm,pl18x", "arm,primecell";
arm,primecell-periphid = <0x00480181>;
reg = <0x40012c00 0x400>;
dmas =  <&dma2 6 4 0x10400 0x3>, /* Logical - DevToMem */
<&dma2 3 4 0x10400 0x3>; /* Logical - MemToDev */
dma-names = "rx", "tx";
clocks = <&rcc 0 171>;
clock-names = "apb_pclk";
interrupts = <49>;
status = "disabled";
};

&sdio {
status = "okay";
vmmc-supply = <&wlan_en>;
bus-width = <4>;
max-frequency = <24000000>;
pinctrl-names = "default";
pinctrl-0 = <&sdio_pins>;
ti,non-removable;
ti,needs-special-hs-handling;
dma-ranges;
cap-power-off-card;
keep-power-in-suspend;

#address-cells = <1>;
#size-cells = <0>;
wlcore: wlcore at 0 {
compatible = "ti,wl1835";
reg = <2>;
interrupt-parent = <&gpioa>;
interrupts = <8 IRQ_TYPE_EDGE_RISING>;
};
};

>
> alex
>
>
>>
>>>
>>> Thanks in advance
>>> Alex
>>>
>>>
>>>>
>>>> Author: Bruno Herrera <bruherrera@gmail.com>
>>>> Date:   Sun Oct 16 14:50:00 2016 -0200
>>>>
>>>>     ARM: DT: STM32: Use dma-ranges property per board not at dtsi file
>>>>
>>>> diff --git a/arch/arm/boot/dts/stm32429i-eval.dts
>>>> b/arch/arm/boot/dts/stm32429i-eval.dts
>>>> index 6bfc595..2a22a82 100644
>>>> --- a/arch/arm/boot/dts/stm32429i-eval.dts
>>>> +++ b/arch/arm/boot/dts/stm32429i-eval.dts
>>>> @@ -52,6 +52,10 @@
>>>>         model = "STMicroelectronics STM32429i-EVAL board";
>>>>         compatible = "st,stm32429i-eval", "st,stm32f429";
>>>>
>>>> +       soc {
>>>> +               dma-ranges = <0xC0000000 0x0 0x10000000>;
>>>> +       };
>>>> +
>>>>         chosen {
>>>>                 bootargs = "root=/dev/ram rdinit=/linuxrc";
>>>>                 stdout-path = "serial0:115200n8";
>>>> @@ -96,6 +100,7 @@
>>>>
>>>>  &ethernet0 {
>>>>         status = "okay";
>>>> +       dma-ranges;
>>>>         pinctrl-0       = <&ethernet0_mii>;
>>>>         pinctrl-names   = "default";
>>>>         phy-mode        = "mii-id";
>>>> @@ -116,6 +121,7 @@
>>>>  };
>>>>
>>>>  &usbotg_hs {
>>>> +       dma-ranges;
>>>>         dr_mode = "host";
>>>>         phys = <&usbotg_hs_phy>;
>>>>         phy-names = "usb2-phy";
>>>> diff --git a/arch/arm/boot/dts/stm32f429.dtsi
>>>> b/arch/arm/boot/dts/stm32f429.dtsi
>>>> index 7d624a2..697a133 100644
>>>> --- a/arch/arm/boot/dts/stm32f429.dtsi
>>>> +++ b/arch/arm/boot/dts/stm32f429.dtsi
>>>> @@ -59,7 +59,6 @@
>>>>         };
>>>>
>>>>         soc {
>>>> -               dma-ranges = <0xc0000000 0x0 0x10000000>;
>>>>
>>>>                 timer2: timer at 40000000 {
>>>>                         compatible = "st,stm32-timer";
>>>> @@ -472,13 +471,11 @@
>>>>                         st,syscon = <&syscfg 0x4>;
>>>>                         snps,pbl = <8>;
>>>>                         snps,mixed-burst;
>>>> -                       dma-ranges;
>>>>                         status = "disabled";
>>>>                 };
>>>>
>>>>                 usbotg_hs: usb at 40040000 {
>>>>                         compatible = "snps,dwc2";
>>>> -                       dma-ranges;
>>>>                         reg = <0x40040000 0x40000>;
>>>>                         interrupts = <77>;
>>>>                         clocks = <&rcc 0 29>;
>>>>
>>>>
>>>>> diff --git a/arch/arm/boot/dts/stm32f429.dtsi
>>>>> b/arch/arm/boot/dts/stm32f429.dtsi
>>>>> index 0596d60..3a1cfdd 100644
>>>>> --- a/arch/arm/boot/dts/stm32f429.dtsi
>>>>> +++ b/arch/arm/boot/dts/stm32f429.dtsi
>>>>> @@ -59,8 +59,6 @@
>>>>>         };
>>>>>
>>>>>         soc {
>>>>> -               dma-ranges = <0xc0000000 0x0 0x10000000>;
>>>>> -
>>>>>                 timer2: timer at 40000000 {
>>>>>                         compatible = "st,stm32-timer";
>>>>>                         reg = <0x40000000 0x400>;
>>>>> diff --git a/arch/arm/boot/dts/stm32f469-disco.dts
>>>>> b/arch/arm/boot/dts/stm32f469-disco.dts
>>>>> index 9e73656..c2213c0 100644
>>>>> --- a/arch/arm/boot/dts/stm32f469-disco.dts
>>>>> +++ b/arch/arm/boot/dts/stm32f469-disco.dts
>>>>> @@ -64,6 +64,10 @@
>>>>>         aliases {
>>>>>                 serial0 = &usart3;
>>>>>         };
>>>>> +
>>>>> +       soc {
>>>>> +               dma-ranges = <0xc0000000 0x0 0x10000000>;
>>>>> +       };
>>>>>  };
>>>>>
>>>>>  &clk_hse {
>>>>> --
>>>>
>>>>
>>>>
>>>>
>>>> Br.,
>>>> Bruno
>>>>
>>>
>

^ permalink raw reply

* Add Allwinner Q8 tablets hardware manager
From: Pierre-Hugues Husson @ 2016-10-27 12:57 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <99dde1e0-26f0-cc7b-36bf-1181c47296f9@redhat.com>

2016-10-27 11:14 GMT+02:00 Hans de Goede <hdegoede@redhat.com>:
> In my experience with these cheap boards, there is a mix of auto-probing +
> device / revision specific os-image modifications. I keep coming back to
> the touchscreen controller firmware (but also the orientation), for the
> gsl1680 controller I need at least 2 different firmware files (per gsl1680
> revision) to make all q8 tablets I have working. This is simply not solved
> by the vendor android code, they just shove the right firmware into the
> os-image. Likewise for the touchscreen orientation (x-mirored, y-mirored,
> etc) too is just a hard-coded setting in the os-image.
Reading your patch, it looks like to handle the two different firmware
files, you're simply adding a command-line switch, there is no
detection involved.
Am I understanding correctly?
If this is the case, two things:
1. I'm not too sure having the user choose this via cmdline is the
right way. I think I'd rather have it set by userspace. (though that's
not a strong opinion).
Or if cmdline is being changed... how about having DTS (or just an
overlay on top of it) being changed instead?

2. This could still be declared by DTS. For instance, assuming your
i2c-probe-stop-at-first-match:
&i2c0 {
        touchscreen1: gsl1680 at 40 {
                reg = <0x40>;
                compatible = "silead,gsl1680";
                enable-gpios = <&pio 7 1 GPIO_ACTIVE_HIGH>; /* PH1 */
                touchscreen-size = <1024 600>;
                touchscreen-fw = "gsl1680-a082-q8-700.fw";
                filter-names = "touchscreen_variant";
                filter-0 = "none", "gsl1680-a082-q8-700";
                id = <0xa0820000>;
                status = "disabled";
        };
        touchscreen2: gsl1680 at 40 {
                reg = <0x40>;
                compatible = "silead,gsl1680";
                enable-gpios = <&pio 7 1 GPIO_ACTIVE_HIGH>; /* PH1 */
                touchscreen-size = <480 800>;
                touchscreen-fw = "gsl1680-a082-q8-a70.fw";
                filter-names = "touchscreen_variant";
                filter-0 = "gsl1680-a082-q8-a70";
                id = <0xa0820000>;
               status = "disabled";
        };
        touchscreen2: gsl1680 at 40 {
                reg = <0x40>;
                compatible = "silead,gsl1680";
                enable-gpios = <&pio 7 1 GPIO_ACTIVE_HIGH>; /* PH1 */
                touchscreen-size = <960 640>;
                touchscreen-fw = "gsl1680-b482-q8-d702.fw";
                filter-names = "touchscreen_variant";
                filter-0 = "gsl1680-b482-q8-d702";
                id = <0xb4820000>;
               status = "disabled";
        };
        i2c-probe-stop-at-first-match = <&touchscreen1>,
<&touchscreen2>, <&touchscreen3>;
}

With "none" value being the value when the "touchscreen_variant"
option is not defined in cmdline.

Please note that I'm not too sure whether SILEAD_REG_ID represents an
OTP which can be changed by OEM, or if it's more of a hardware
revision. Depending on this, this would either fit into a id =
<0xa0820000> DTS line, or a compatible = "silead,gsl1680_a082",
"silead,gsl1680"; DTS line.

> Sofar I've only seen this with one type of touchscreen so an easy cop-out
> would be to add an "optional-vddio-supply" to the the bindings for the
> specific touchscreen use and put all the necessary logic in the driver.
>
> This does require propagating the learned need for the regulator
> from the drivers detect() callback to probe() or alternatively I'm
> thinking we should just use probe() instead of detect()to begin with,
> that will save a lot of duplication with things
> like code for enable gpio-s and regulators.
>
> So assuming we go for the cop-out option for 3. (I'm ok with that),
> this would be a pretty clean solution adding just the 2 new:
> i2c-probe-stop-at-first-match and i2c-probe-all properties to
> the i2c-bus bindings. One problem here is that we may want to have
> multiple i2c-probe-stop-at-first-match phandle lists on a single bus
> (e.g. try 3 touchscreens + 6 accelerometers on the same bus, stop at
> first touchscreen / first accelerometer), anyone have any ideas for
> that?
How about something like:

&i2c1 {
    touchscreen1....
    touchscreen2....
    touchscreen3....
    accelerometer1....
    accelerometer2....
    accelerometer3....
    accelerometer4....

    select-one {
       compatible = "i2c-select;
       group-names = "touchscreen", "accelerometer";
       group-0 = <&touchscreen1>, <&touchscreen2>, <&touchscreen3>;
       group-1 = <&accelerometer1>, <&accelerometer2>,
<&accelerometer3>, <&accelerometer4>;
    };
};

>> When it comes to detection, I've witnessed various things.
>> It can be kernel-side or bootloader-side "global setting" reading (like an
>> ADC/resistor value, or an OTP), it can be bootloader doing the
>> "brute-force", or it can be the kernel doing all the probes.
>>
>> For instance, as of today, on a Spreadtrum ODM tree, the bootloader will
>> detect the screen by testing all knowns screens, the screen-drivers declare
>> a get_id function, and the bootloader probes until the get_id matches the id
>> declared by the screen driver.
>> And then the bootloader tells the kernel, via cmdline, which screen is
>> actually there (but auto-detection is also coded in kernel).
>> Finally all possible sensors/touchscreen/camera are declared in DTS, and
>> probe will filter-out N/C ones in the kernel.
>>
>> Now the big difference between my experience and what Hans is trying to
>> do, is that I've always worked with devices with "safely" queriable IDs,
>> either on i2c or dsi. I've never encountered SPI. This makes probing
>> inherently more dangerous, but I believe the question roughly remains the
>> same.
>
>
> I'm dealing with i2c too, Mark mistakenly used SPI in his reply,
> which I think is what got you thinking I've SPI.
Right, so let's concentrate on reasonable bus-es first then. (I can
think of I2C and DSI)

> See above, I think that we can make this work by delegating the actual
> detection to the driver (so each compatible can have a different detect
> method / code).
> So with this we can remove a big part of drivers/misc/q8-hardwaremgr.c, but
> not all
> of it. We still need board specific code somewhere to deal with things like
> picking
> the right touchscreen firmware and touchscreen orientation. This is all
> somewhat
> gsl1680 specific.
> I actually have the same problem on x86 where the ACPI description of the
> device
> basically says: "There is a gsl1680 at this bus at this address" and does
> not say
> anything about firmware / orientation (again this is simply hardcoded
> in the os-image these devices ship with).
>
> For x86 my plan is to have an array of configs in the driver and select the
> right
> one based on DMI strings, which is in essence putting board specific info in
> the
> driver.
>
> I can imagine mirroring this for ARM, and have an array of configs in the
> driver
> there too (for cases where cannot simply hardcode everything in dt only) and
> have
> some board specific code (activated by of_machine_is_compatible()) to select
> the
> right config.
I do believe this can all be done in DTS, and at the moment, what
you're describing seem to happen often enough to be worth writing
generic code for.
But then, I can't really tell which makes the most sense between
source-based and devicetree-based.
I prefer doing it in device-tree, since it means that any OEM can have
his device supported by only providing DTB, and won't need to provide
kernel patches.

> 2) miscellaneous extra config on top of figuring out which ICs are
> connected,
> basically the kind of stuff many vendors simply hard-code in their device
> specific os-image. This one is much more difficult to deal with and I think
> we need to figure this out on a case by case basis. This will require board
> specific code (just like the kernel has tons of DMI string activated board
> specific code on x86) and what is the best code for this place to live will
> be a case by case thing too.

With things like mount-matrix devicetree property, the goal is to have
such informations in the DTS.
I'm not seeing every cases should be handled in DTS, but ATM I see no
good reason.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox