Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v3 1/2] net: dt-bindings: add RGMII TX delay configuration to meson8b-dwmac
From: Rob Herring @ 2016-12-09 21:31 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161202233238.17561-2-martin.blumenstingl@googlemail.com>

On Sat, Dec 03, 2016 at 12:32:37AM +0100, Martin Blumenstingl wrote:
> This allows configuring the RGMII TX clock delay. The RGMII clock is
> generated by underlying hardware of the the Meson 8b / GXBB DWMAC glue.
> The configuration depends on the actual hardware (no delay may be
> needed due to the design of the actual circuit, the PHY might add this
> delay, etc.).
> 
> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
> Tested-by: Neil Armstrong <narmstrong@baylibre.com>
> ---
>  Documentation/devicetree/bindings/net/meson-dwmac.txt | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)

Acked-by: Rob Herring <robh@kernel.org>

^ permalink raw reply

* [PATCHv4 11/15] clk: ti: clockdomain: add clock provider support to clockdomains
From: Michael Turquette @ 2016-12-09 21:28 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161209204015.GL4920@atomide.com>

Quoting Tony Lindgren (2016-12-09 12:40:16)
> * Michael Turquette <mturquette@baylibre.com> [161209 12:02]:
> > Quoting Tony Lindgren (2016-12-05 07:25:34)
> > > * Tero Kristo <t-kristo@ti.com> [161205 02:09]:
> ...
> <snip>
> 
> > I had a recent conversation with Kevin Hilman about a related issue
> > (we were not discussing this thread or this series) and we both agreed
> > that most drivers don't even need to managed their clocks directly, so
> > much as they need to manage their on/off resources. Clocks are just one
> > part of that, and if we can hide that stuff inside of an attached genpd
> > then it would be better than having the driver manage clocks explicitly.
> > 
> > Obviously some devices such as audio codec or uart will need to manage
> > clocks directly, but this is mostly the exception, not the rule.
> 
> Yes. And we do that already for clkctrl clocks via PM runtime where
> hwmod manages them. Tero's series still has hwmod manage the clocks,
> but the clock register handling is moved to live under drivers/clock.
> 
> > > > > > There is certainly no API for that in the clock framework, but for genpd
> > > > > > your runtime_pm_get() callback for clkdm_A could call runtime_pm_get
> > > > > > against clkdm_B, which would satisfy the requirement. See section
> > > > > > 3.1.1.1.7 Clock Domain Dependency in the OMAP4 TRM, version AB.
> > > > 
> > > > For static dependencies the apis genpd_add/remove_subdomain could probably
> > > > be used.
> > > > 
> > > > > To me it seems the API is just clk_get() :) Do you have some
> > > > > specific example we can use to check? My guess is that the
> > > > > TRM "Clock Domain Dependency" is just the usual parent child
> > > > > relationship between clocks that are the clockdomains..
> > 
> > clk_get() only fetches a pointer to the clk. I guess you mean
> > clk_prepare_enable() to actually increment the use count?
> 
> Right, with clocks that's all we should need to do :)
> 
> > If we used the clk framework here is that it would look something like
> > this:
> > 
> > clk_enable(clk_a)
> > -> .enable(clk_a_hw)
> >    -> clk_enable(clk_b)
> > 
> > However, clk_a and clk_b do not have a parent-child relationship in the
> > clock tree. This is purely a functional relationship between IP blocks.
> > Modeling this sort of thing in the clk framework would be wrong, and
> > genpd is a much better place to establish these arbitrary relationships.
> 
> Hmm yes, and I don't mean the clock framework should do anything more
> complex beyond what it already does.
> 
> We just want to represent the clocks as clocks, then have the
> interconnect code manage those clocks. That's currently hwmod, eventually
> it will be genpd.
> 
> > > > > If there is something more magical there certainly that should
> > > > > be considered though.
> > > > 
> > > > The hwmods could be transformed to individual genpds also I guess. On DT
> > > > level though, we would still need a clock pointer to the main clock and a
> > > > genpd pointer in addition to that.
> > > 
> > > Hmm a genpd pointer to where exactly? AFAIK each interconnect
> > > instance should be a genpd provider, and the individual interconnect
> > > target modules should be consumers for that genpd.
> > 
> > I was thinking that the clock domains would be modeled as genpd objects
> > with the interconnect target modules attached as struct devices.
> 
> I think clock domains should be just clocks, then we let the interconnect
> code and eventually genpd manage them.
> 
> > > > Tony, any thoughts on that? Would this break up the plans for the
> > > > interconnect completely?
> > > 
> > > Does using genpd for clockdomains cause issues for using genpd for
> > > interconnect instances and the target modules?
> > 
> > Can they be the same object in Linux? If there is a one-to-one mapping
> > between clock domains and the interconnect port then maybe you can just
> > model them together.
> 
> I'm thinking that it should be the interconnect code implementing
> genpd, and use clk_request_enable().
> 
> > > The thing I'd be worried about there is that the clockdomains and
> > > their child clocks are just devices sitting on the interconnect,
> > > so we could easily end up with genpd modeling something that does
> > > not represent the hardware.
> > > 
> > > For example, on 4430 we have:
> > > 
> > > l4_cfg interconnect
> > >        ...
> > >        segment at 0
> > >                 ...
> > >                 target_module at 4000
> > >                         cm1: cm1 at 0
> > 
> > How about:
> > 
> > l4_cfg interconnect
> >        ...
> >        segment at 0
> >                 ...
> >                 cm1 at 4000
> >                       module: foo_module at 0
> 
> That's the wrong way around from hardware point of view. There's
> a generic interconnect wrapper module with it's own registers,
> then cm1 (and possibly other devices) are children of that target
> module.
> 
> > I don't know much about the segments. Do they map one-to-one with the
> > clock domains?
> 
> I need to check, it's been a while, but I recall some interconnects
> are partioned to segments based on voltages or clocks.
> 
> > If my quick-and-dirty DT above makes sense, then the target modules
> > (e.g. io controller) would not get clocks anymore, but just
> > pm_runtime_get(). The genpd backing object would call clk_enable/disable
> > as needed.
> 
> Yeah that's what we already have with hwmod and PM runtime for the
> clockctrl register. But hwmod currently directly manages the clkctrl
> register, we just want to move that part to be a clock driver.
> 
> The children of the interconnect target modules just need to use
> PM runtime, but the interconnect target module driver needs to know
> it's clkctrl clock.

OK, that sounds good to me but I'm not quite sure about the difference
between "children of interconnect target modules" versus just the
"interconnect target module".

Can you give an example on each? I just want to understand for my own
curiosity.

> 
> > If fine grained control of a clock is needed (e.g. for clk_set_rate)
> > then the driver can still clk_get it. Whether or not the clockdomain
> > provides that clock or if it comes from the clock generator (e.g. cm1,
> > cm2, prm, etc) isn't as important to me, but I prefer for the
> > clockdomain to not be a clock provider if possible.
> 
> Yeah I totally agree with that, and that's already what we mostly
> have.
> 
> > > I don't at least yet
> > > follow what we need to do with the clockdomains with genpd :)
> > 
> > Use the clockdomain genpd to call clk_enable/disable under the hood.
> > Don't use them as clock providers to the target modules. Clockdomain
> > genpds would be the clock consumers.
> 
> I don't think the clockdomain should be a genpd provider because
> that creates a genpd network of dependencies instead of a tree
> structure. If we end up setting the clockdomains with genpd, then
> only the other clockdomains should use them, but I don't know how
> we ever keep drivers from directly tinkering with them..

Genpd is set up as an arbitrary graph, not strictly a tree, so these
types of dependencies should be OK.

> 
> IMO, the clockdomain clock driver should just provides clocks, then
> we can have the interconnect target module driver deal with the
> clockdomain dependencies.
> 
> > > Wouldn't just doing clk_get() from one clockdomain clock to
> > > another clockdomain clock (or it's output) be enough to
> > > represent the clockdomain dependencies?
> > 
> > s/clk_get/clk_prepare_enable/
> > 
> > Yes, but you're stuffing functional dependencies into the clock tree,
> > which sucks. genpd was created to model these arbitrary dependencies.
> 
> Well let's not stuff anything beyond clock framework to the
> clockdomain clock drivers. We already have the clockdomain
> dependencies handled by the interconnect code (hwmod), and there
> should be no problem moving those to be handled by genpd and the
> interconnect target driver instances.
> 
> Care to take another look at Tero's patches with the assumption
> that the clockdomain clocks stay just as a clocks?

Sure.

Regards,
Mike

> 
> Regards,
> 
> Tony

^ permalink raw reply

* [PATCH v2 1/3] crypto: brcm: DT documentation for Broadcom SPU driver
From: Rob Herring @ 2016-12-09 21:26 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1480714499-1476-2-git-send-email-rob.rice@broadcom.com>

On Fri, Dec 02, 2016 at 04:34:57PM -0500, Rob Rice wrote:
> Device tree documentation for Broadcom Secure Processing Unit
> (SPU) crypto driver.
> 
> Signed-off-by: Steve Lin <steven.lin1@broadcom.com>
> Signed-off-by: Rob Rice <rob.rice@broadcom.com>
> ---
>  .../devicetree/bindings/crypto/brcm,spu-crypto.txt | 25 ++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/crypto/brcm,spu-crypto.txt
> 
> diff --git a/Documentation/devicetree/bindings/crypto/brcm,spu-crypto.txt b/Documentation/devicetree/bindings/crypto/brcm,spu-crypto.txt
> new file mode 100644
> index 0000000..e5fe942
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/crypto/brcm,spu-crypto.txt
> @@ -0,0 +1,25 @@
> +The Broadcom Secure Processing Unit (SPU) driver supports symmetric

Bindings describe h/w, not drivers.

> +cryptographic offload for Broadcom SoCs with SPU hardware. A SoC may have
> +multiple SPU hardware blocks.
> +
> +Required properties:
> +- compatible : Should be "brcm,spum-crypto" for devices with SPU-M hardware

Additionally, you should have SoC specific compatible here.

> +  (e.g., Northstar2) or "brcm,spum-nsp-crypto" for the Northstar Plus variant
> +  of the SPU-M hardware.
> +
> +- reg: Should contain SPU registers location and length.
> +- mboxes: A list of mailbox channels to be used by the kernel driver. Mailbox
> +channels correspond to DMA rings on the device.

What determines the mbox assignment?

Needs to specify how many.

> +
> +Example:
> +	spu-crypto at 612d0000 {

Just crypto at ...

Rob

^ permalink raw reply

* [PATCH 1/3] nvmem: imx-ocotp: Add support for i.MX6UL
From: Rob Herring @ 2016-12-09 21:02 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1480689949-17957-1-git-send-email-d.schultz@phytec.de>

On Fri, Dec 02, 2016 at 03:45:47PM +0100, Daniel Schultz wrote:
> This patch adds OCOTP support for the i.MX6UL SoC.
> 
> Signed-off-by: Daniel Schultz <d.schultz@phytec.de>
> ---
>  Documentation/devicetree/bindings/nvmem/imx-ocotp.txt | 5 +++--
>  drivers/nvmem/imx-ocotp.c                             | 1 +
>  2 files changed, 4 insertions(+), 2 deletions(-)

Acked-by: Rob Herring <robh@kernel.org>

^ permalink raw reply

* [RESEND PATCH v2 5/7] drm/vc4: Document VEC DT binding
From: Rob Herring @ 2016-12-09 21:00 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1480686493-4813-6-git-send-email-boris.brezillon@free-electrons.com>

On Fri, Dec 02, 2016 at 02:48:11PM +0100, Boris Brezillon wrote:
> Document the DT binding for the VEC (Video EnCoder) IP.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
> ---
>  Documentation/devicetree/bindings/display/brcm,bcm-vc4.txt | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)

Acked-by: Rob Herring <robh@kernel.org>

^ permalink raw reply

* [PATCH v4] arm64: fpsimd: improve stacking logic in non-interruptible context
From: Ard Biesheuvel @ 2016-12-09 20:57 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161209192932.GD1574@e103592.cambridge.arm.com>

On 9 December 2016 at 19:29, Dave Martin <Dave.Martin@arm.com> wrote:
> On Fri, Dec 09, 2016 at 06:21:55PM +0000, Catalin Marinas wrote:
>> On Fri, Dec 09, 2016 at 04:46:32PM +0000, Ard Biesheuvel wrote:
>> >  void kernel_neon_begin_partial(u32 num_regs)
>> >  {
>> > -   if (in_interrupt()) {
>> > -           struct fpsimd_partial_state *s = this_cpu_ptr(
>> > -                   in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
>> > +   struct fpsimd_partial_state *s;
>> > +   int level;
>> > +
>> > +   preempt_disable();
>> > +
>> > +   level = this_cpu_inc_return(kernel_neon_nesting_level);
>> > +   BUG_ON(level > 3);
>> > +
>> > +   if (level > 1) {
>> > +           s = this_cpu_ptr(nested_fpsimdstate);
>> >
>> > -           BUG_ON(num_regs > 32);
>> > -           fpsimd_save_partial_state(s, roundup(num_regs, 2));
>> > +           WARN_ON_ONCE(num_regs > 32);
>> > +           num_regs = min(roundup(num_regs, 2), 32U);
>> > +
>> > +           fpsimd_save_partial_state(&s[level - 2], num_regs);
>> >     } else {
>> >             /*
>> >              * Save the userland FPSIMD state if we have one and if we
>> > @@ -241,7 +256,6 @@ void kernel_neon_begin_partial(u32 num_regs)
>> >              * that there is no longer userland FPSIMD state in the
>> >              * registers.
>> >              */
>> > -           preempt_disable();
>> >             if (current->mm &&
>> >                 !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
>> >                     fpsimd_save_state(&current->thread.fpsimd_state);
>>
>> I wonder whether we could actually do this saving and flag/level setting
>> in reverse to simplify the races. Something like your previous patch but
>> only set TIF_FOREIGN_FPSTATE after saving:
>>
>>       level = this_cpu_read(kernel_neon_nesting_level);
>>       if (level > 0) {
>>               ...
>>               fpsimd_save_partial_state();
>>       } else {
>>               if (!test_thread_flag(TIF_FOREIGN_FPSTATE))
>>                       fpsimd_save_state();
>>               set_thread_flag(TIF_FOREIGN_FPSTATE);
>>       }
>>       this_cpu_inc(kernel_neon_nesting_level);
>>
>> There is a risk of extra saving if we get an interrupt after
>> test_thread_flag() and before set_thread_flag() but I don't think this
>> would corrupt any state, just writing things twice.
>
> I would worry that we can save two states over the same buffer and then
> restore an uninitialised buffer in this case unless we are careful.
> Because the level-dependent code is now misbracketed by the inc/dec,
> a preempting call races with the outer call and use the same value.
>
> I guess we could do
>
> if (!test_thread_flag(TIF_FOREIGN_FPSTATE))
>         fpsimd_save_state();
> clear_thread_flag(TIF_FOREIGN_FPSTATE);
>
> at the start unconditionally, before the _inc_return().
>
> The task state may then get saved in the middle of being saved, but
> as you say it shouldn't have changed in the meantime.

It /will/ have changed in the meantime: when the interrupted context
is resumed, it will happily proceed with saving the state where it
left off, but now the register file contains whatever was left after
the interrupt handler is done with the NEON.

> The nested
> save code may then do a partial save of the same state on top of that
> which could get restored at the inner kernel_neon_end() call.
>

I'm afraid the only way to deal with this correctly is to treat the
whole sequence as a critical section, which means execute it with
interrupts disabled.

^ permalink raw reply

* [PATCH] usb: dwc3: omap: remove devm_request_threaded_irq
From: Grygorii Strashko @ 2016-12-09 20:55 UTC (permalink / raw)
  To: linux-arm-kernel

Switch back from devm_request_threaded_irq() to request_threaded_irq() to
avoid races between interrupt handler execution and PM runtime in error
handling code path in probe and in dwc3_omap_remove():

in probe:
...
 err1:
	pm_runtime_put_sync(dev);
^^ PM runtime can race with IRQ handler when deferred probing happening
   due to extcon
	pm_runtime_disable(dev);

	return ret;

in dwc3_omap_remove:
...
	dwc3_omap_disable_irqs(omap);
^^ IRQs are disabled in HW, but handler may still run
	of_platform_depopulate(omap->dev);
	pm_runtime_put_sync(&pdev->dev);
^^ PM runtime can race with IRQ handler
	pm_runtime_disable(&pdev->dev);

	return 0;

So, OMAP DWC3 IRQ need to be disabled end freed before calling
pm_runtime_put() in probe and in dwc3_omap_remove().

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
---
Hi Tony,

This is reworked patch, so might be it need to be re-tested.

 drivers/usb/dwc3/dwc3-omap.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/dwc3/dwc3-omap.c b/drivers/usb/dwc3/dwc3-omap.c
index 29e80cc..79d74f6 100644
--- a/drivers/usb/dwc3/dwc3-omap.c
+++ b/drivers/usb/dwc3/dwc3-omap.c
@@ -511,18 +511,18 @@ static int dwc3_omap_probe(struct platform_device *pdev)
 	/* check the DMA Status */
 	reg = dwc3_omap_readl(omap->base, USBOTGSS_SYSCONFIG);
 
-	ret = devm_request_threaded_irq(dev, omap->irq, dwc3_omap_interrupt,
-					dwc3_omap_interrupt_thread, IRQF_SHARED,
-					"dwc3-omap", omap);
+	ret = request_threaded_irq(omap->irq, dwc3_omap_interrupt,
+				   dwc3_omap_interrupt_thread, IRQF_SHARED,
+				   "dwc3-omap", omap);
 	if (ret) {
 		dev_err(dev, "failed to request IRQ #%d --> %d\n",
 				omap->irq, ret);
-		goto err1;
+		goto err11;
 	}
 
 	ret = dwc3_omap_extcon_register(omap);
 	if (ret < 0)
-		goto err1;
+		goto err11;
 
 	ret = of_platform_populate(node, NULL, NULL, dev);
 	if (ret) {
@@ -538,6 +538,8 @@ static int dwc3_omap_probe(struct platform_device *pdev)
 	extcon_unregister_notifier(omap->edev, EXTCON_USB, &omap->vbus_nb);
 	extcon_unregister_notifier(omap->edev, EXTCON_USB_HOST, &omap->id_nb);
 
+err11:
+	free_irq(omap->irq, omap);
 err1:
 	pm_runtime_put_sync(dev);
 	pm_runtime_disable(dev);
@@ -552,6 +554,7 @@ static int dwc3_omap_remove(struct platform_device *pdev)
 	extcon_unregister_notifier(omap->edev, EXTCON_USB, &omap->vbus_nb);
 	extcon_unregister_notifier(omap->edev, EXTCON_USB_HOST, &omap->id_nb);
 	dwc3_omap_disable_irqs(omap);
+	free_irq(omap->irq, omap);
 	of_platform_depopulate(omap->dev);
 	pm_runtime_put_sync(&pdev->dev);
 	pm_runtime_disable(&pdev->dev);
-- 
2.10.1

^ permalink raw reply related

* [PATCHv4 11/15] clk: ti: clockdomain: add clock provider support to clockdomains
From: Tony Lindgren @ 2016-12-09 20:40 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <148131376868.16621.16082839810419290638@resonance>

* Michael Turquette <mturquette@baylibre.com> [161209 12:02]:
> Quoting Tony Lindgren (2016-12-05 07:25:34)
> > * Tero Kristo <t-kristo@ti.com> [161205 02:09]:
...
<snip>

> I had a recent conversation with Kevin Hilman about a related issue
> (we were not discussing this thread or this series) and we both agreed
> that most drivers don't even need to managed their clocks directly, so
> much as they need to manage their on/off resources. Clocks are just one
> part of that, and if we can hide that stuff inside of an attached genpd
> then it would be better than having the driver manage clocks explicitly.
> 
> Obviously some devices such as audio codec or uart will need to manage
> clocks directly, but this is mostly the exception, not the rule.

Yes. And we do that already for clkctrl clocks via PM runtime where
hwmod manages them. Tero's series still has hwmod manage the clocks,
but the clock register handling is moved to live under drivers/clock.

> > > > > There is certainly no API for that in the clock framework, but for genpd
> > > > > your runtime_pm_get() callback for clkdm_A could call runtime_pm_get
> > > > > against clkdm_B, which would satisfy the requirement. See section
> > > > > 3.1.1.1.7 Clock Domain Dependency in the OMAP4 TRM, version AB.
> > > 
> > > For static dependencies the apis genpd_add/remove_subdomain could probably
> > > be used.
> > > 
> > > > To me it seems the API is just clk_get() :) Do you have some
> > > > specific example we can use to check? My guess is that the
> > > > TRM "Clock Domain Dependency" is just the usual parent child
> > > > relationship between clocks that are the clockdomains..
> 
> clk_get() only fetches a pointer to the clk. I guess you mean
> clk_prepare_enable() to actually increment the use count?

Right, with clocks that's all we should need to do :)

> If we used the clk framework here is that it would look something like
> this:
> 
> clk_enable(clk_a)
> -> .enable(clk_a_hw)
>    -> clk_enable(clk_b)
> 
> However, clk_a and clk_b do not have a parent-child relationship in the
> clock tree. This is purely a functional relationship between IP blocks.
> Modeling this sort of thing in the clk framework would be wrong, and
> genpd is a much better place to establish these arbitrary relationships.

Hmm yes, and I don't mean the clock framework should do anything more
complex beyond what it already does.

We just want to represent the clocks as clocks, then have the
interconnect code manage those clocks. That's currently hwmod, eventually
it will be genpd.

> > > > If there is something more magical there certainly that should
> > > > be considered though.
> > > 
> > > The hwmods could be transformed to individual genpds also I guess. On DT
> > > level though, we would still need a clock pointer to the main clock and a
> > > genpd pointer in addition to that.
> > 
> > Hmm a genpd pointer to where exactly? AFAIK each interconnect
> > instance should be a genpd provider, and the individual interconnect
> > target modules should be consumers for that genpd.
> 
> I was thinking that the clock domains would be modeled as genpd objects
> with the interconnect target modules attached as struct devices.

I think clock domains should be just clocks, then we let the interconnect
code and eventually genpd manage them.

> > > Tony, any thoughts on that? Would this break up the plans for the
> > > interconnect completely?
> > 
> > Does using genpd for clockdomains cause issues for using genpd for
> > interconnect instances and the target modules?
> 
> Can they be the same object in Linux? If there is a one-to-one mapping
> between clock domains and the interconnect port then maybe you can just
> model them together.

I'm thinking that it should be the interconnect code implementing
genpd, and use clk_request_enable().

> > The thing I'd be worried about there is that the clockdomains and
> > their child clocks are just devices sitting on the interconnect,
> > so we could easily end up with genpd modeling something that does
> > not represent the hardware.
> > 
> > For example, on 4430 we have:
> > 
> > l4_cfg interconnect
> >        ...
> >        segment at 0
> >                 ...
> >                 target_module at 4000
> >                         cm1: cm1 at 0
> 
> How about:
> 
> l4_cfg interconnect
>        ...
>        segment at 0
>                 ...
>                 cm1 at 4000
>                 	module: foo_module at 0

That's the wrong way around from hardware point of view. There's
a generic interconnect wrapper module with it's own registers,
then cm1 (and possibly other devices) are children of that target
module.

> I don't know much about the segments. Do they map one-to-one with the
> clock domains?

I need to check, it's been a while, but I recall some interconnects
are partioned to segments based on voltages or clocks.

> If my quick-and-dirty DT above makes sense, then the target modules
> (e.g. io controller) would not get clocks anymore, but just
> pm_runtime_get(). The genpd backing object would call clk_enable/disable
> as needed.

Yeah that's what we already have with hwmod and PM runtime for the
clockctrl register. But hwmod currently directly manages the clkctrl
register, we just want to move that part to be a clock driver.

The children of the interconnect target modules just need to use
PM runtime, but the interconnect target module driver needs to know
it's clkctrl clock.

> If fine grained control of a clock is needed (e.g. for clk_set_rate)
> then the driver can still clk_get it. Whether or not the clockdomain
> provides that clock or if it comes from the clock generator (e.g. cm1,
> cm2, prm, etc) isn't as important to me, but I prefer for the
> clockdomain to not be a clock provider if possible.

Yeah I totally agree with that, and that's already what we mostly
have.

> > I don't at least yet
> > follow what we need to do with the clockdomains with genpd :)
> 
> Use the clockdomain genpd to call clk_enable/disable under the hood.
> Don't use them as clock providers to the target modules. Clockdomain
> genpds would be the clock consumers.

I don't think the clockdomain should be a genpd provider because
that creates a genpd network of dependencies instead of a tree
structure. If we end up setting the clockdomains with genpd, then
only the other clockdomains should use them, but I don't know how
we ever keep drivers from directly tinkering with them..

IMO, the clockdomain clock driver should just provides clocks, then
we can have the interconnect target module driver deal with the
clockdomain dependencies.

> > Wouldn't just doing clk_get() from one clockdomain clock to
> > another clockdomain clock (or it's output) be enough to
> > represent the clockdomain dependencies?
> 
> s/clk_get/clk_prepare_enable/
> 
> Yes, but you're stuffing functional dependencies into the clock tree,
> which sucks. genpd was created to model these arbitrary dependencies.

Well let's not stuff anything beyond clock framework to the
clockdomain clock drivers. We already have the clockdomain
dependencies handled by the interconnect code (hwmod), and there
should be no problem moving those to be handled by genpd and the
interconnect target driver instances.

Care to take another look at Tero's patches with the assumption
that the clockdomain clocks stay just as a clocks?

Regards,

Tony

^ permalink raw reply

* [PATCH 1/3] clkdev: add devm_get_clk_from_child()
From: Russell King - ARM Linux @ 2016-12-09 20:31 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <8737i3vtl7.wl%kuninori.morimoto.gx@renesas.com>

On Mon, Dec 05, 2016 at 05:23:20AM +0000, Kuninori Morimoto wrote:
> 
> From: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
> 
> Some driver is using this type of DT bindings for clock (more detail,
> see ${LINUX}/Documentation/devicetree/bindings/sound/simple-card.txt).
> 
> 	sound_soc {
> 		...
> 		cpu {
> 			clocks = <&xxx>;
> 			...
> 		};
> 		codec {
> 			clocks = <&xxx>;
> 			...
> 		};
> 	};
> 
> Current driver in this case uses of_clk_get() for each node, but there
> is no devm_of_clk_get() today.
> OTOH, the problem of having devm_of_clk_get() is that it encourages the
> use of of_clk_get() when clk_get() is more desirable.
> 
> Thus, this patch adds new devm_get_clk_from_chile() which explicitly
> reads as get a clock from a child node of this device.
> By this function, we can also use this type of DT bindings
> 
> 	sound_soc {
> 		clocks = <&xxx>, <&xxx>;
> 		clock-names = "cpu", "codec";
> 		...
> 		cpu {
> 			...
> 		};
> 		codec {
> 			...
> 		};
> 	};
> 
> Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>

This looks lots better, thanks.

Acked-by: Russell King <rmk+kernel@armlinux.org.uk>

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply

* [PATCH 4/4] ARM: versatile: support configuring versatile machine for no-MMU
From: Russell King - ARM Linux @ 2016-12-09 20:29 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <58481DD1.6090606@arm.com>

On Wed, Dec 07, 2016 at 02:33:53PM +0000, Vladimir Murzin wrote:
> Hi Linus,
> 
> On 07/12/16 14:11, Linus Walleij wrote:
> > Another target I had in mind was the Integrator which
> > incidentally supports a bunch of the old noMMU core
> > tiles where we can swap in an ARM946, which I guess
> > could work with this?
> 
> Do you mind trying my "Allow NOMMU for MULTIPLATFORM" series [1]? Greg just
> reported it did a trick for Versatile, there is a good chance it would work
> for Integrator too ;)

My views on gluing multiplatform and nommu together have already been
stated several times, and remain unchanged.

Greg's patch looks almost sane, but what I'd like to see is instead of
directly doing this:

 config ARCH_VERSATILE
        bool "ARM Ltd. Versatile family"
-       depends on ARCH_MULTI_V5
+       depends on ARCH_MULTI_V5 || ARM_SINGLE_ARMV5

we could do:

 config ARCH_VERSATILE
        bool "ARM Ltd. Versatile family" if ARCH_MULTI_V5
-       depends on ARCH_MULTI_V5
+       depends on ARCH_MULTI_V5 || ARM_SINGLE_ARMV5
	default y if ARM_SINGLE_ARCH_VERSATILE

adding the versatile entry in the upper level choice (where it always used
to be) but with "ARM_SINGLE_ARCH_VERSATILE" instead.

This would have the effect of allowing the multiplatform-converted stuff
to start being used on nommu (again) but without running into the problems
I've highlighted.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply

* [PATCH v2 1/2] firmware: arm_scpi: zero RX buffer before requesting data from the mbox
From: Martin Blumenstingl @ 2016-12-09 20:23 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <f544a379-a6a0-7839-85c3-a22449fecc46@arm.com>

On Wed, Dec 7, 2016 at 7:17 PM, Sudeep Holla <sudeep.holla@arm.com> wrote:
>
>
> On 25/11/16 00:54, Martin Blumenstingl wrote:
>> The original code was relying on the fact that the SCPI firmware
>> responds with the same number of bytes (or more, all extra data would be
>> ignored in that case) as requested.
>> However, we have some pre-v1.0 SCPI firmwares which are responding with
>> less data for some commands (sensor_value.hi_val did not exist in the
>> old implementation). This means that some data from the previous
>> command's RX buffer was leaked into the current command (as the RX
>> buffer is re-used for all commands on the same channel). Clearing the
>> RX buffer before (re-) using it ensures we get a consistent result, even
>> if the SCPI firmware returns less bytes than requested.
>>
>> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
>> ---
>>  drivers/firmware/arm_scpi.c | 19 ++++++++++++++++++-
>>  1 file changed, 18 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c
>> index 70e1323..8c183d8 100644
>> --- a/drivers/firmware/arm_scpi.c
>> +++ b/drivers/firmware/arm_scpi.c
>> @@ -259,6 +259,7 @@ struct scpi_chan {
>>       struct mbox_chan *chan;
>>       void __iomem *tx_payload;
>>       void __iomem *rx_payload;
>> +     resource_size_t max_payload_len;
>>       struct list_head rx_pending;
>>       struct list_head xfers_list;
>>       struct scpi_xfer *xfers;
>> @@ -470,6 +471,20 @@ static void scpi_tx_prepare(struct mbox_client *c, void *msg)
>>       if (t->rx_buf) {
>>               if (!(++ch->token))
>>                       ++ch->token;
>> +
>> +             /* clear the RX buffer as it is shared across all commands on
>> +              * the same channel (to make sure we're not leaking data from
>> +              * the previous response into the current command if the SCPI
>> +              * firmware writes less data than requested).
>> +              * This is especially important for pre-v1.0 SCPI firmwares
>> +              * where some fields in the responses do not exist (while they
>> +              * exist but are optional in newer versions). One example for
>> +              * this problem is sensor_value.hi_val, which would contain
>> +              * ("leak") the second 4 bytes of the RX buffer from the
>> +              * previous command.
>> +              */
>> +             memset_io(ch->rx_payload, 0, ch->max_payload_len);
>> +
>
> This looks like a overkill to me. I prefer your first approach over
> this, if it's only this command that's affected. I am still not sure
> why Neil Armstrong mentioned that it worked for him with 64-bit read.
OK, I'm fine with that. I also had a look at the patch you posted in
the cover-letter (I did not have time to test it yet though, I'll give
feedback tomorrow) - this looks better than my v1.

I think the explanation why it worked for Neil is pretty simple:
it depends on the SCPI command which was executed before the "read
sensor" command:
if that command returned [4 bytes with any value]0x00000000[any number
of bytes] then he got a valid result

^ permalink raw reply

* [PATCH v3 11/12] arm64: dts: marvell: add sdhci support for Armada 7K/8K
From: Russell King - ARM Linux @ 2016-12-09 20:07 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <923c97f32a828b579756bf86946689ed54a0a174.1481279228.git-series.gregory.clement@free-electrons.com>

On Fri, Dec 09, 2016 at 11:30:07AM +0100, Gregory CLEMENT wrote:
> Also enable it on the Armada 7040 DB board
> 
> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>

Hi,

Can this also be added to the cp110 on the 7k/8k SoCs as well, or does
it rely on other unmerged support?

Thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply

* [PATCHv4 11/15] clk: ti: clockdomain: add clock provider support to clockdomains
From: Michael Turquette @ 2016-12-09 20:02 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161205152534.GJ4705@atomide.com>

Quoting Tony Lindgren (2016-12-05 07:25:34)
> * Tero Kristo <t-kristo@ti.com> [161205 02:09]:
> > On 03/12/16 02:18, Tony Lindgren wrote:
> > > * Michael Turquette <mturquette@baylibre.com> [161202 15:52]:
> > > > Quoting Tony Lindgren (2016-12-02 15:12:40)
> > > > > * Michael Turquette <mturquette@baylibre.com> [161202 14:34]:
> > > > > > Quoting Tony Lindgren (2016-10-28 16:54:48)
> > > > > > > * Stephen Boyd <sboyd@codeaurora.org> [161028 16:37]:
> > > > > > > > On 10/28, Tony Lindgren wrote:
> > > > > > > > > * Tero Kristo <t-kristo@ti.com> [161028 00:43]:
> > > > > > > > > > On 28/10/16 03:50, Stephen Boyd wrote:
> > > > > > > > > > > I suppose a PRCM is
> > > > > > > > > > > like an MFD that has clocks and resets under it? On other
> > > > > > > > > > > platforms we've combined that all into one node and just had
> > > > > > > > > > > #clock-cells and #reset-cells in that node. Is there any reason
> > > > > > > > > > > we can't do that here?
> > > > > > > > > > 
> > > > > > > > > > For OMAPs, there are typically multiple instances of the PRCM around; OMAP4
> > > > > > > > > > for example has:
> > > > > > > > > > 
> > > > > > > > > > cm1 @ 0x4a004000 (clocks + clockdomains)
> > > > > > > > > > cm2 @ 0x4a008000 (clocks + clockdomains)
> > > > > > > > > > prm @ 0x4a306000 (few clocks + resets + power state handling)
> > > > > > > > > > scrm @ 0x4a30a000 (few external clocks + plenty of misc stuff)
> > > > > > > > > > 
> > > > > > > > > > These instances are also under different power/voltage domains which means
> > > > > > > > > > their PM behavior is different.
> > > > > > > > > > 
> > > > > > > > > > The idea behind having a clockdomain as a provider was mostly to have the
> > > > > > > > > > topology visible : prcm-instance -> clockdomain -> clocks
> > > > > > > > > 
> > > > > > > > > Yeah that's needed to get the interconnect hierarchy right for
> > > > > > > > > genpd :)
> > > > > > > > > 
> > > > > > > > > > ... but basically I think it would be possible to drop the clockdomain
> > > > > > > > > > representation and just mark the prcm-instance as a clock provider. Tony,
> > > > > > > > > > any thoughts on that?
> > > > > > > > > 
> > > > > > > > > No let's not drop the clockdomains as those will be needed when we
> > > > > > > > > move things into proper hierarchy within the interconnect instances.
> > > > > > > > > This will then help with getting things right with genpd.
> > > > > > > > > 
> > > > > > > > > In the long run we just want to specify clockdomain and the offset of
> > > > > > > > > the clock instance within the clockdomain in the dts files.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > Sorry, I have very little idea how OMAP hardware works. Do you
> > > > > > > > mean that you will have different nodes for each clockdomain so
> > > > > > > > that genpd can map 1:1 to the node in dts? But in hardware
> > > > > > > > there's a prcm that allows us to control many clock domains
> > > > > > > > through register read/writes? How is the interconnect involved?
> > > > > > > 
> > > > > > > There are multiple clockdomains, at least one for each interconnect
> > > > > > > instance. Once a clockdomain is idle, the related interconnect can
> > > > > > > idle too. So yeah genpd pretty much maps 1:1 with the clockdomains.
> > > > > > > 
> > > > > > > There's more info in for example omap4 TRM section "3.4.1 Device
> > > > > > > Power-Management Layout" that shows the voltage/power/clock domains.
> > > > > > > The interconnect instances are mostly named there too looking at
> > > > > > > the L4/L3 naming.
> > > > > > 
> > > > > > I'm confused on two points:
> > > > > > 
> > > > > > 1) why are the clkdm's acting as clock providers? I've always hated the
> > > > > > name "clock domain" since those bits are for managing module state, not
> > > > > > clock state. The PRM, CM1 and CM2 provide the clocks, not the
> > > > > > clockdomains.
> > > > > 
> > > > > The clock domains have multiple clock inputs that are routed to multiple
> > > > > child clocks. So it is a clock :)
> > > > > 
> > > > > See for example omap4430 TRM "3.6.4 CD_WKUP Clock Domain" on page
> > > > > 393 in my revision here.
> > > > > 
> > > > > On that page "Figure 3-48" shows CD_WKUP with the four input clocks.
> > > > > And then "Table 3-84. CD_WKUP Control and Status Parameters" shows
> > > > > the CD_WKUP clock domain specific registers. These registers show
> > > > > the status, I think they are all read-only registers. Then CD_WKUP
> > > > > has multiple child clocks with configurable registers.
> > > > > 
> > > > > From hardware register point of view, each clock domain has:
> > > > > 
> > > > > - Read-only clockdomain status registers in the beginning of
> > > > >   the address space
> > > > > 
> > > > > - Multiple similar clock instances register instances each
> > > > >   mapping to a specific interconnect target module
> > > > > 
> > > > > These are documented in "3.11.16.1 WKUP_CM Register Summary".
> > > > 
> > > > Oh, this is because you are treating the MODULEMODE bits like gate
> > > > clocks. I never really figured out if this was the best way to model
> > > > those bits since they do more than control a line toggling at a rate.
> > > > For instance this bit will affect the master/slave IDLE protocol between
> > > > the module and the PRCM.
> > > 
> > > Yes seems like there is some negotiation going on there with the
> > > target module. But from practical point of view the CLKCTRL
> > > register is the gate for a module functional clock.
> > 
> > There's some confusion on this, clockdomain is effectively a collection of
> > clocks, and can be used to force control that collection if needed. Chapter
> > "3.1.1.1.3 Clock Domain" in some OMAP4 TRM shows the relationship neatly.
> 
> Yeah that's my understanding too.

There is no clk api for this type of behavior. We keep clocks enabled so
long as consumers have a positive prepare_count or enable_count. The
concept of force-idle doesn't work very well here, unless any calls to
force the clkdm to idle also use a usecount.

> 
> > > > > From hardware point of view, we ideally want to map interconnect
> > > > > target modules to the clock instance offset from the clock domain
> > > > > for that interconnect segment. For example gptimer1 clocks would
> > > > > be just:
> > > > > 
> > > > > clocks = <&cd_wkup 0x40>;
> > > > > 
> > > > > > 2) why aren't the clock domains modeled as genpds with their associated
> > > > > > devices attached to them? Note that it is possible to "nest" genpd
> > > > > > objects. This would also allow for the "Clockdomain Dependency"
> > > > > > relationships to be properly modeled (see section 3.1.1.1.7 Clock Domain
> > > > > > Dependency in the OMAP4 TRM).
> > > > > 
> > > > > Clock domains only route clocks to child clocks. Power domains
> > > > > are different registers. The power domains map roughly to
> > > > > interconnect instances, there we have registers to disable the
> > > > > whole interconnect when idle.
> > > > 
> > > > I'm not talking about power islands at all, but the genpd object in
> > > > Linux. For instance, if we treat each clock domain like a clock
> > > > provider, how could the functional dependency between clkdm_A and
> > > > clkdm_B be asserted?
> > > 
> > > To me it seems that some output of a clockdomain is just a input
> > > of another clockdomain? So it's just the usual parent child
> > > relationship once we treat a clockdomain just as a clock. Tero
> > > probably has some input here.
> > 
> > A clockdomain should be modelled as a genpd, that I agree. However, it
> > doesn't prevent it from being a clock provider also, or does it?

It does not prevent it, but I don't understand why you would want both.

I had a recent conversation with Kevin Hilman about a related issue
(we were not discussing this thread or this series) and we both agreed
that most drivers don't even need to managed their clocks directly, so
much as they need to manage their on/off resources. Clocks are just one
part of that, and if we can hide that stuff inside of an attached genpd
then it would be better than having the driver manage clocks explicitly.

Obviously some devices such as audio codec or uart will need to manage
clocks directly, but this is mostly the exception, not the rule.

> > 
> > > > There is certainly no API for that in the clock framework, but for genpd
> > > > your runtime_pm_get() callback for clkdm_A could call runtime_pm_get
> > > > against clkdm_B, which would satisfy the requirement. See section
> > > > 3.1.1.1.7 Clock Domain Dependency in the OMAP4 TRM, version AB.
> > 
> > For static dependencies the apis genpd_add/remove_subdomain could probably
> > be used.
> > 
> > > To me it seems the API is just clk_get() :) Do you have some
> > > specific example we can use to check? My guess is that the
> > > TRM "Clock Domain Dependency" is just the usual parent child
> > > relationship between clocks that are the clockdomains..

clk_get() only fetches a pointer to the clk. I guess you mean
clk_prepare_enable() to actually increment the use count?

If we used the clk framework here is that it would look something like
this:

clk_enable(clk_a)
-> .enable(clk_a_hw)
   -> clk_enable(clk_b)

However, clk_a and clk_b do not have a parent-child relationship in the
clock tree. This is purely a functional relationship between IP blocks.
Modeling this sort of thing in the clk framework would be wrong, and
genpd is a much better place to establish these arbitrary relationships.

> > > 
> > > If there is something more magical there certainly that should
> > > be considered though.
> > 
> > The hwmods could be transformed to individual genpds also I guess. On DT
> > level though, we would still need a clock pointer to the main clock and a
> > genpd pointer in addition to that.
> 
> Hmm a genpd pointer to where exactly? AFAIK each interconnect
> instance should be a genpd provider, and the individual interconnect
> target modules should be consumers for that genpd.

I was thinking that the clock domains would be modeled as genpd objects
with the interconnect target modules attached as struct devices.

> 
> > Tony, any thoughts on that? Would this break up the plans for the
> > interconnect completely?
> 
> Does using genpd for clockdomains cause issues for using genpd for
> interconnect instances and the target modules?

Can they be the same object in Linux? If there is a one-to-one mapping
between clock domains and the interconnect port then maybe you can just
model them together.

> 
> The thing I'd be worried about there is that the clockdomains and
> their child clocks are just devices sitting on the interconnect,
> so we could easily end up with genpd modeling something that does
> not represent the hardware.
> 
> For example, on 4430 we have:
> 
> l4_cfg interconnect
>        ...
>        segment at 0
>                 ...
>                 target_module at 4000
>                         cm1: cm1 at 0

How about:

l4_cfg interconnect
       ...
       segment at 0
                ...
                cm1 at 4000
                	module: foo_module at 0

I don't know much about the segments. Do they map one-to-one with the
clock domains?

>                              ...
>                 ...
>                 target_module at 8000
>                         cm2: cm2 at 0
>                 ...
> 
> 
> l4_wkup interonnect
>         ...
>         segment at 0
>                 ...
>                 target_module at 6000
>                         prm: prm at 0
>                 ...
>                 target_module at a000
>                         scrm: scrm at 0
>                 ...
> 
> So what do you guys have in mind for using genpd in the above
> example for the clockdomains?

If my quick-and-dirty DT above makes sense, then the target modules
(e.g. io controller) would not get clocks anymore, but just
pm_runtime_get(). The genpd backing object would call clk_enable/disable
as needed.

If fine grained control of a clock is needed (e.g. for clk_set_rate)
then the driver can still clk_get it. Whether or not the clockdomain
provides that clock or if it comes from the clock generator (e.g. cm1,
cm2, prm, etc) isn't as important to me, but I prefer for the
clockdomain to not be a clock provider if possible.

> 
> To me it seems that the interconnect instances like l4_cfg and
> l4_wkup above should be genpd providers.

Agreed.

> I don't at least yet
> follow what we need to do with the clockdomains with genpd :)

Use the clockdomain genpd to call clk_enable/disable under the hood.
Don't use them as clock providers to the target modules. Clockdomain
genpds would be the clock consumers.

> 
> Wouldn't just doing clk_get() from one clockdomain clock to
> another clockdomain clock (or it's output) be enough to
> represent the clockdomain dependencies?

s/clk_get/clk_prepare_enable/

Yes, but you're stuffing functional dependencies into the clock tree,
which sucks. genpd was created to model these arbitrary dependencies.

Regards,
Mike

> 
> Regards,
> 
> Tony

^ permalink raw reply

* [PATCH] nommu: allow mmap when !CONFIG_MMU
From: Russell King - ARM Linux @ 2016-12-09 20:02 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1480600080-30196-1-git-send-email-benjamin.gaignard@linaro.org>

On Thu, Dec 01, 2016 at 02:48:00PM +0100, Benjamin Gaignard wrote:
> commit ab6494f0c96f ("nommu: Add noMMU support to the DMA API") have
> add CONFIG_MMU compilation flag but that prohibit to use dma_mmap_wc()
> when the platform doesn't have MMU.
> 
> This patch call vm_iomap_memory() in noMMU case to test if addresses
> are correct and set wma->vm_flags rather than all return an error.

ITYM vma.

I think this is fine, but I've no way to know as I don't run nommu here...
I've been hoping that others working on nommu would at least give an
ack or something, but I guess not.

So, please put it in the patch system, thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply

* [PATCH] ARM: soft-reboot into same mode that we entered the kernel
From: Russell King @ 2016-12-09 19:49 UTC (permalink / raw)
  To: linux-arm-kernel

When we soft-reboot (eg, kexec) from one kernel into the next, we need
to ensure that we enter the new kernel in the same processor mode as
when we were entered, so that (eg) the new kernel can install its own
hypervisor - the old kernel's hypervisor will have been overwritten.

In order to do this, we need to pass a flag to cpu_reset() so it knows
what to do, and we need to modify the kernel's own hypervisor stub to
allow it to handle a soft-reboot.

As we are always guaranteed to install our own hypervisor if we're
entered in HYP32 mode, and KVM will have moved itself out of the way
on kexec/normal reboot, we can assume that our hypervisor is in place
when we want to kexec, so changing our hypervisor API should not be a
problem.

Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
Mark,

Any opinions on this?

Thanks.

 arch/arm/include/asm/proc-fns.h |  4 ++--
 arch/arm/kernel/hyp-stub.S      | 36 ++++++++++++++++++++++++++++++------
 arch/arm/kernel/reboot.c        | 12 ++++++++++--
 arch/arm/mm/proc-v7.S           | 12 ++++++++----
 4 files changed, 50 insertions(+), 14 deletions(-)

diff --git a/arch/arm/include/asm/proc-fns.h b/arch/arm/include/asm/proc-fns.h
index 8877ad5ffe10..f2e1af45bd6f 100644
--- a/arch/arm/include/asm/proc-fns.h
+++ b/arch/arm/include/asm/proc-fns.h
@@ -43,7 +43,7 @@ extern struct processor {
 	/*
 	 * Special stuff for a reset
 	 */
-	void (*reset)(unsigned long addr) __attribute__((noreturn));
+	void (*reset)(unsigned long addr, bool hvc) __attribute__((noreturn));
 	/*
 	 * Idle the processor
 	 */
@@ -88,7 +88,7 @@ extern void cpu_set_pte_ext(pte_t *ptep, pte_t pte);
 #else
 extern void cpu_set_pte_ext(pte_t *ptep, pte_t pte, unsigned int ext);
 #endif
-extern void cpu_reset(unsigned long addr) __attribute__((noreturn));
+extern void cpu_reset(unsigned long addr, bool hvc) __attribute__((noreturn));
 
 /* These three are private to arch/arm/kernel/suspend.c */
 extern void cpu_do_suspend(void *);
diff --git a/arch/arm/kernel/hyp-stub.S b/arch/arm/kernel/hyp-stub.S
index 15d073ae5da2..cab1ac36939d 100644
--- a/arch/arm/kernel/hyp-stub.S
+++ b/arch/arm/kernel/hyp-stub.S
@@ -22,6 +22,9 @@
 #include <asm/assembler.h>
 #include <asm/virt.h>
 
+#define HVC_GET_VECTORS -1
+#define HVC_SOFT_RESTART 1
+
 #ifndef ZIMAGE
 /*
  * For the kernel proper, we need to find out the CPU boot mode long after
@@ -202,9 +205,18 @@ ARM_BE8(orr	r7, r7, #(1 << 25))     @ HSCTLR.EE
 ENDPROC(__hyp_stub_install_secondary)
 
 __hyp_stub_do_trap:
-	cmp	r0, #-1
-	mrceq	p15, 4, r0, c12, c0, 0	@ get HVBAR
-	mcrne	p15, 4, r0, c12, c0, 0	@ set HVBAR
+	cmp	r0, #HVC_GET_VECTORS
+	bne	1f
+	mrc	p15, 4, r0, c12, c0, 0	@ get HVBAR
+	b	__hyp_stub_exit
+
+1:	teq	r0, #HVC_SOFT_RESTART
+	bne	1f
+	mov	r0, r3
+	bx	r0
+
+1:	mcrne	p15, 4, r0, c12, c0, 0	@ set HVBAR
+__hyp_stub_exit:
 	__ERET
 ENDPROC(__hyp_stub_do_trap)
 
@@ -231,14 +243,26 @@ ENDPROC(__hyp_stub_do_trap)
  * initialisation entry point.
  */
 ENTRY(__hyp_get_vectors)
-	mov	r0, #-1
+	mov	r0, #HVC_GET_VECTORS
+	__HVC(0)
+	ret	lr
 ENDPROC(__hyp_get_vectors)
-	@ fall through
+
 ENTRY(__hyp_set_vectors)
+	tst	r0, #31
+	bne	1f
 	__HVC(0)
-	ret	lr
+1:	ret	lr
 ENDPROC(__hyp_set_vectors)
 
+ENTRY(__hyp_soft_restart)
+	mov	r3, r0
+	mov	r0, #HVC_SOFT_RESTART
+	__HVC(0)
+	mov	r0, r3
+	ret	lr
+ENDPROC(__hyp_soft_restart)
+
 #ifndef ZIMAGE
 .align 2
 .L__boot_cpu_mode_offset:
diff --git a/arch/arm/kernel/reboot.c b/arch/arm/kernel/reboot.c
index 3fa867a2aae6..f0e3c7f1ea21 100644
--- a/arch/arm/kernel/reboot.c
+++ b/arch/arm/kernel/reboot.c
@@ -12,10 +12,11 @@
 
 #include <asm/cacheflush.h>
 #include <asm/idmap.h>
+#include <asm/virt.h>
 
 #include "reboot.h"
 
-typedef void (*phys_reset_t)(unsigned long);
+typedef void (*phys_reset_t)(unsigned long, bool);
 
 /*
  * Function pointers to optional machine specific functions
@@ -36,6 +37,7 @@ static u64 soft_restart_stack[16];
 static void __soft_restart(void *addr)
 {
 	phys_reset_t phys_reset;
+	bool hvc = false;
 
 	/* Take out a flat memory mapping. */
 	setup_mm_for_reboot();
@@ -51,7 +53,13 @@ static void __soft_restart(void *addr)
 
 	/* Switch to the identity mapping. */
 	phys_reset = (phys_reset_t)virt_to_idmap(cpu_reset);
-	phys_reset((unsigned long)addr);
+
+#ifdef CONFIG_ARM_VIRT_EXT
+	/* original stub should be restored by kvm */
+	hvc = is_hyp_mode_available();
+#endif
+
+	phys_reset((unsigned long)addr, hvc);
 
 	/* Should never get here. */
 	BUG();
diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index d00d52c9de3e..1846ca4255d0 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -53,11 +53,15 @@ ENDPROC(cpu_v7_proc_fin)
 	.align	5
 	.pushsection	.idmap.text, "ax"
 ENTRY(cpu_v7_reset)
-	mrc	p15, 0, r1, c1, c0, 0		@ ctrl register
-	bic	r1, r1, #0x1			@ ...............m
- THUMB(	bic	r1, r1, #1 << 30 )		@ SCTLR.TE (Thumb exceptions)
-	mcr	p15, 0, r1, c1, c0, 0		@ disable MMU
+	mrc	p15, 0, r2, c1, c0, 0		@ ctrl register
+	bic	r2, r2, #0x1			@ ...............m
+ THUMB(	bic	r2, r2, #1 << 30 )		@ SCTLR.TE (Thumb exceptions)
+	mcr	p15, 0, r2, c1, c0, 0		@ disable MMU
 	isb
+#ifdef CONFIG_ARM_VIRT_EXT
+	teq	r1, #0
+	bne	__hyp_soft_restart
+#endif
 	bx	r0
 ENDPROC(cpu_v7_reset)
 	.popsection
-- 
2.7.4

^ permalink raw reply related

* [PATCH v4] arm64: fpsimd: improve stacking logic in non-interruptible context
From: Dave Martin @ 2016-12-09 19:29 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161209182155.GB21051@e104818-lin.cambridge.arm.com>

On Fri, Dec 09, 2016 at 06:21:55PM +0000, Catalin Marinas wrote:
> On Fri, Dec 09, 2016 at 04:46:32PM +0000, Ard Biesheuvel wrote:
> >  void kernel_neon_begin_partial(u32 num_regs)
> >  {
> > -	if (in_interrupt()) {
> > -		struct fpsimd_partial_state *s = this_cpu_ptr(
> > -			in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
> > +	struct fpsimd_partial_state *s;
> > +	int level;
> > +
> > +	preempt_disable();
> > +
> > +	level = this_cpu_inc_return(kernel_neon_nesting_level);
> > +	BUG_ON(level > 3);
> > +
> > +	if (level > 1) {
> > +		s = this_cpu_ptr(nested_fpsimdstate);
> >  
> > -		BUG_ON(num_regs > 32);
> > -		fpsimd_save_partial_state(s, roundup(num_regs, 2));
> > +		WARN_ON_ONCE(num_regs > 32);
> > +		num_regs = min(roundup(num_regs, 2), 32U);
> > +
> > +		fpsimd_save_partial_state(&s[level - 2], num_regs);
> >  	} else {
> >  		/*
> >  		 * Save the userland FPSIMD state if we have one and if we
> > @@ -241,7 +256,6 @@ void kernel_neon_begin_partial(u32 num_regs)
> >  		 * that there is no longer userland FPSIMD state in the
> >  		 * registers.
> >  		 */
> > -		preempt_disable();
> >  		if (current->mm &&
> >  		    !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
> >  			fpsimd_save_state(&current->thread.fpsimd_state);
> 
> I wonder whether we could actually do this saving and flag/level setting
> in reverse to simplify the races. Something like your previous patch but
> only set TIF_FOREIGN_FPSTATE after saving:
> 
> 	level = this_cpu_read(kernel_neon_nesting_level);
> 	if (level > 0) {
> 		...
> 		fpsimd_save_partial_state();
> 	} else {
> 		if (!test_thread_flag(TIF_FOREIGN_FPSTATE))
> 			fpsimd_save_state();
> 		set_thread_flag(TIF_FOREIGN_FPSTATE);
> 	}
> 	this_cpu_inc(kernel_neon_nesting_level);
> 
> There is a risk of extra saving if we get an interrupt after
> test_thread_flag() and before set_thread_flag() but I don't think this
> would corrupt any state, just writing things twice.

I would worry that we can save two states over the same buffer and then
restore an uninitialised buffer in this case unless we are careful.
Because the level-dependent code is now misbracketed by the inc/dec,
a preempting call races with the outer call and use the same value.

I guess we could do

if (!test_thread_flag(TIF_FOREIGN_FPSTATE))
	fpsimd_save_state();
clear_thread_flag(TIF_FOREIGN_FPSTATE);

at the start unconditionally, before the _inc_return().

The task state may then get saved in the middle of being saved, but
as you say it shouldn't have changed in the meantime.  The nested
save code may then do a partial save of the same state on top of that
which could get restored at the inner kernel_neon_end() call.

> (disclaimer: I haven't thought of all the possible races and I'm not
> entirely sure about the kernel_neon_end() part)

Cheers
---Dave

^ permalink raw reply

* [PATCH] mmc: sdhci-xenon: fix device_node_continue.cocci warnings
From: Julia Lawall @ 2016-12-09 19:08 UTC (permalink / raw)
  To: linux-arm-kernel

Device node iterators put the previous value of the index variable, so an
explicit put causes a double put.

Generated by: scripts/coccinelle/iterators/device_node_continue.cocci

CC: Hu Ziji <huziji@marvell.com>
Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---

Please check on this.  I have only seen the code shown below, but the rule
normally has a lot false positive rate.

url:
https://github.com/0day-ci/linux/commits/Gregory-CLEMENT/mmc-Add-support-to-Marvell-Xenon-SD-Host-Controller/20161210-002602
base:
:::::: branch date: 2 hours ago
:::::: commit date: 2 hours ago


Please take the patch only if it's a positive warning. Thanks!

 sdhci-xenon.c |    1 -
 1 file changed, 1 deletion(-)

--- a/drivers/mmc/host/sdhci-xenon.c
+++ b/drivers/mmc/host/sdhci-xenon.c
@@ -423,7 +423,6 @@ static int xenon_child_node_of_parse(str
 				      MMC_CAP2_NO_SD |
 				      MMC_CAP2_NO_SDIO;
 		}
-		of_node_put(child);
 	}

 	return 0;

^ permalink raw reply

* [PATCH] arm64: mm: Fix NOMAP page initialization
From: Russell King - ARM Linux @ 2016-12-09 19:07 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481307042-29773-1-git-send-email-rrichter@cavium.com>

On Fri, Dec 09, 2016 at 07:10:41PM +0100, Robert Richter wrote:
> On ThunderX systems with certain memory configurations we see the
> following BUG_ON():
> 
>  kernel BUG at mm/page_alloc.c:1848!
> 
> This happens for some configs with 64k page size enabled. The BUG_ON()
> checks if start and end page of a memmap range belongs to the same
> zone.
> 
> The BUG_ON() check fails if a memory zone contains NOMAP regions. In
> this case the node information of those pages is not initialized. This
> causes an inconsistency of the page links with wrong zone and node
> information for that pages. NOMAP pages from node 1 still point to the
> mem zone from node 0 and have the wrong nid assigned.
> 
> The reason for the mis-configuration is a change in pfn_valid() which
> reports pages marked NOMAP as invalid:
> 
>  68709f45385a arm64: only consider memblocks with NOMAP cleared for linear mapping
> 
> This causes pages marked as nomap being no longer reassigned to the
> new zone in memmap_init_zone() by calling __init_single_pfn().
> 
> Fixing this by implementing an arm64 specific early_pfn_valid(). This
> causes the whole mem range including NOMAP memory to be initialized by
> __init_single_page() and ensures consistency of page links to zone,
> node and section.
> 
> The HAVE_ARCH_PFN_VALID config option now requires an explicit
> definiton of early_pfn_valid() in the same way as pfn_valid(). This
> allows a customized implementation of early_pfn_valid() which
> redirects to memblock_is_memory() for arm64.
> 
> Signed-off-by: Robert Richter <rrichter@cavium.com>

Acked-by: Russell King <rmk+kernel@armlinux.org.uk>

Thanks.

> ---
>  arch/arm/include/asm/page.h   |  1 +
>  arch/arm64/include/asm/page.h |  2 ++
>  arch/arm64/mm/init.c          | 12 ++++++++++++
>  include/linux/mmzone.h        |  5 ++++-
>  4 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
> index 4355f0ec44d6..79761bd55f94 100644
> --- a/arch/arm/include/asm/page.h
> +++ b/arch/arm/include/asm/page.h
> @@ -158,6 +158,7 @@ typedef struct page *pgtable_t;
>  
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  extern int pfn_valid(unsigned long);
> +#define early_pfn_valid(pfn)	pfn_valid(pfn)
>  #endif
>  
>  #include <asm/memory.h>
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 8472c6def5ef..17ceb7435ded 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -49,6 +49,8 @@ typedef struct page *pgtable_t;
>  
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
>  extern int pfn_valid(unsigned long);
> +extern int early_pfn_valid(unsigned long);
> +#define early_pfn_valid early_pfn_valid
>  #endif
>  
>  #include <asm/memory.h>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 212c4d1e2f26..fbc136533472 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -145,11 +145,23 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  #endif /* CONFIG_NUMA */
>  
>  #ifdef CONFIG_HAVE_ARCH_PFN_VALID
> +
>  int pfn_valid(unsigned long pfn)
>  {
>  	return memblock_is_map_memory(pfn << PAGE_SHIFT);
>  }
>  EXPORT_SYMBOL(pfn_valid);
> +
> +/*
> + * We use memblock_is_memory() here to make sure all pages including
> + * NOMAP ranges are initialized with __init_single_page().
> + */
> +int early_pfn_valid(unsigned long pfn)
> +{
> +	return memblock_is_memory(pfn << PAGE_SHIFT);
> +}
> +EXPORT_SYMBOL(early_pfn_valid);
> +
>  #endif
>  
>  #ifndef CONFIG_SPARSEMEM
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 0f088f3a2fed..bedcf8a95881 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1170,12 +1170,16 @@ static inline struct mem_section *__pfn_to_section(unsigned long pfn)
>  }
>  
>  #ifndef CONFIG_HAVE_ARCH_PFN_VALID
> +
>  static inline int pfn_valid(unsigned long pfn)
>  {
>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>  		return 0;
>  	return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
>  }
> +
> +#define early_pfn_valid(pfn)	pfn_valid(pfn)
> +
>  #endif
>  
>  static inline int pfn_present(unsigned long pfn)
> @@ -1200,7 +1204,6 @@ static inline int pfn_present(unsigned long pfn)
>  #define pfn_to_nid(pfn)		(0)
>  #endif
>  
> -#define early_pfn_valid(pfn)	pfn_valid(pfn)
>  void sparse_init(void);
>  #else
>  #define sparse_init()	do {} while (0)
> -- 
> 2.1.4
> 

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply

* [PATCH v7 1/2] provide lock-less versions of clk_{enable|disable}
From: Uwe Kleine-König @ 2016-12-09 18:54 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481302773-14181-1-git-send-email-abailon@baylibre.com>

Hello,

On Fri, Dec 09, 2016 at 05:59:32PM +0100, Alexandre Bailon wrote:
> Rename __clk_{enable|disble} in davinci_clk_{enable|disable}.
> davinci_clk_{enable|disable} is a lock-less version of
> clk_{enable|disable}.
> This is useful to recursively enable clock without doing recursive call
> to clk_enable(), which would cause a recursive locking issue.
> The lock-less version could be used by example by the usb20 phy clock,
> that need to enable the usb20 clock before to start.

I wouldn't call that lock-less. The difference is that the newly exposed
funcion requires the caller to already hold the lock. So maybe a better
name would be clk_enable_locked.

Would it be more sensible to convert davinci to common-clk?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-K?nig            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

^ permalink raw reply

* [PATCH 6/9] dt-bindings: Document rk3399 Gru/Kevin
From: Heiko Stuebner @ 2016-12-09 18:28 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161209175402.gy4mqjaf2rsib7qf@rob-hp-laptop>

Am Freitag, 9. Dezember 2016, 11:54:02 CET schrieb Rob Herring:
> On Thu, Dec 01, 2016 at 06:27:30PM -0800, Brian Norris wrote:
> > Gru is a base dev board for a family of devices, including Kevin. Both
> > utilize Rockchip RK3399, and they share much of their design.
> > 
> > Signed-off-by: Brian Norris <briannorris@chromium.org>
> > ---
> > 
> >  Documentation/devicetree/bindings/arm/rockchip.txt | 20
> >  ++++++++++++++++++++ 1 file changed, 20 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/arm/rockchip.txt
> > b/Documentation/devicetree/bindings/arm/rockchip.txt index
> > cc4ace6397ab..830e13f5890c 100644
> > --- a/Documentation/devicetree/bindings/arm/rockchip.txt
> > +++ b/Documentation/devicetree/bindings/arm/rockchip.txt
> > @@ -99,6 +99,26 @@ Rockchip platforms device tree bindings
> > 
> >  		     "google,veyron-speedy-rev3", "google,veyron-speedy-rev2",
> >  		     "google,veyron-speedy", "google,veyron", "rockchip,rk3288";
> > 
> > +- Google Gru (dev-board):
> > +    Required root node properties:
> > +      - compatible = "google,gru-rev15", "google,gru-rev14",
> > +		     "google,gru-rev13", "google,gru-rev12",
> > +		     "google,gru-rev11", "google,gru-rev10",
> > +		     "google,gru-rev9", "google,gru-rev8",
> > +		     "google,gru-rev7", "google,gru-rev6",
> > +		     "google,gru-rev5", "google,gru-rev4",
> > +		     "google,gru-rev3", "google,gru-rev2",
> > +		     "google,gru", "rockchip,rk3399";
> 
> All of these are supposed to be specified or just one rev at a time?

All of them. I.e. the devicetree is supposed to be compatible with all of 
those, while the bootloader determines the actual board revision and sets the 
choosen compatible - at least that was the way with the previous series of 
devices based around the rk3288 (veyron) and I think it will be the same way 
still.

I.e. as you can see below, Kevin starts being compatible at -rev6, as 
(engineering) revisions before that probably had some differences on the 
board.


> 
> > +
> > +- Google Kevin:
> > +    Required root node properties:
> > +      - compatible = "google,kevin-rev15", "google,kevin-rev14",
> > +		     "google,kevin-rev13", "google,kevin-rev12",
> > +		     "google,kevin-rev11", "google,kevin-rev10",
> > +		     "google,kevin-rev9", "google,kevin-rev8",
> > +		     "google,kevin-rev7", "google,kevin-rev6",
> > +		     "google,kevin", "google,gru", "rockchip,rk3399";
> > +
> > 
> >  - mqmaker MiQi:
> >      Required root node properties:
> >        - compatible = "mqmaker,miqi", "rockchip,rk3288";

^ permalink raw reply

* [PATCH] mmc: sdhci-xenon: fix boolreturn.cocci warnings
From: kbuild test robot @ 2016-12-09 18:27 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <fd1e454d2d8c30c4b191ae78e911d481315352e0.1481279228.git-series.gregory.clement@free-electrons.com>

drivers/mmc/host/sdhci-xenon-phy.c:469:9-10: WARNING: return of 0/1 in function 'emmc_phy_slow_mode' with return type bool

 Return statements in functions returning bool should use
 true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci

CC: Hu Ziji <huziji@marvell.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---

 sdhci-xenon-phy.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/mmc/host/sdhci-xenon-phy.c
+++ b/drivers/mmc/host/sdhci-xenon-phy.c
@@ -466,11 +466,11 @@ static bool emmc_phy_slow_mode(struct sd
 
 	/* Skip temp stages from HS200 to HS400 */
 	if (temp_stage_hs200_to_hs400(host, priv))
-		return 0;
+		return false;
 
 	/* Skip temp stages from HS400 t0 HS200 */
 	if (temp_stage_hs400_to_h200(host, priv))
-		return 0;
+		return false;
 
 	reg = sdhci_readl(host, phy_regs->timing_adj);
 	/* Enable Slow Mode for SDIO in slower SDR mode */

^ permalink raw reply

* [PATCH v3 08/12] mmc: sdhci-xenon: Add support to PHYs of Marvell Xenon SDHC.
From: kbuild test robot @ 2016-12-09 18:27 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <fd1e454d2d8c30c4b191ae78e911d481315352e0.1481279228.git-series.gregory.clement@free-electrons.com>

Hi Hu,

[auto build test WARNING on ]

url:    https://github.com/0day-ci/linux/commits/Gregory-CLEMENT/mmc-Add-support-to-Marvell-Xenon-SD-Host-Controller/20161210-002602
base:    


coccinelle warnings: (new ones prefixed by >>)

>> drivers/mmc/host/sdhci-xenon-phy.c:469:9-10: WARNING: return of 0/1 in function 'emmc_phy_slow_mode' with return type bool

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply

* [PATCH] efi/libstub: arm*: Pass latest memory map to the kernel
From: Ard Biesheuvel @ 2016-12-09 18:24 UTC (permalink / raw)
  To: linux-arm-kernel

As reported by James, the current libstub code involving the annotated
memory map only works somewhat correctly by accident, due to the fact
that a pool allocation happens to be reused immediately, retaining its
former contents.

Instead of juggling memory maps, which makes the code more complex than
it needs to be, simply put a placholder value into the FDT, and only
write the actual value after ExitBootServices() has been called.

Reported-by: James Morse <james.morse@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 drivers/firmware/efi/libstub/fdt.c | 51 ++++++++++++++++++++++++--------------
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/drivers/firmware/efi/libstub/fdt.c b/drivers/firmware/efi/libstub/fdt.c
index a6a93116a8f0..5d39dff77f17 100644
--- a/drivers/firmware/efi/libstub/fdt.c
+++ b/drivers/firmware/efi/libstub/fdt.c
@@ -101,7 +101,7 @@ efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
 	if (status)
 		goto fdt_set_fail;
 
-	fdt_val64 = cpu_to_fdt64((u64)(unsigned long)memory_map);
+	fdt_val64 = U64_MAX; /* placeholder */
 	status = fdt_setprop(fdt, node, "linux,uefi-mmap-start",
 			     &fdt_val64,  sizeof(fdt_val64));
 	if (status)
@@ -148,6 +148,24 @@ efi_status_t update_fdt(efi_system_table_t *sys_table, void *orig_fdt,
 	return EFI_LOAD_ERROR;
 }
 
+static efi_status_t update_fdt_memmap(void *fdt, u64 memmap)
+{
+	int node = fdt_path_offset(fdt, "/chosen");
+	efi_status_t status;
+
+	if (node < 0)
+		return EFI_LOAD_ERROR;
+
+	memmap = cpu_to_fdt64(memmap);
+	status = fdt_setprop_inplace(fdt, node, "linux,uefi-mmap-start",
+				     &memmap, sizeof(memmap));
+
+	if (status)
+		return EFI_LOAD_ERROR;
+
+	return EFI_SUCCESS;
+}
+
 #ifndef EFI_FDT_ALIGN
 #define EFI_FDT_ALIGN EFI_PAGE_SIZE
 #endif
@@ -243,15 +261,6 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 			goto fail;
 		}
 
-		/*
-		 * Now that we have done our final memory allocation (and free)
-		 * we can get the memory map key  needed for
-		 * exit_boot_services().
-		 */
-		status = efi_get_memory_map(sys_table, &map);
-		if (status != EFI_SUCCESS)
-			goto fail_free_new_fdt;
-
 		status = update_fdt(sys_table,
 				    (void *)fdt_addr, fdt_size,
 				    (void *)*new_fdt_addr, new_fdt_size,
@@ -266,20 +275,16 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 			/*
 			 * We need to allocate more space for the new
 			 * device tree, so free existing buffer that is
-			 * too small.  Also free memory map, as we will need
-			 * to get new one that reflects the free/alloc we do
-			 * on the device tree buffer.
+			 * too small.
 			 */
 			efi_free(sys_table, new_fdt_size, *new_fdt_addr);
-			sys_table->boottime->free_pool(memory_map);
 			new_fdt_size += EFI_PAGE_SIZE;
 		} else {
 			pr_efi_err(sys_table, "Unable to construct new device tree.\n");
-			goto fail_free_mmap;
+			goto fail_free_new_fdt;
 		}
 	}
 
-	sys_table->boottime->free_pool(memory_map);
 	priv.runtime_map = runtime_map;
 	priv.runtime_entry_count = &runtime_entry_count;
 	status = efi_exit_boot_services(sys_table, handle, &map, &priv,
@@ -288,6 +293,17 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 	if (status == EFI_SUCCESS) {
 		efi_set_virtual_address_map_t *svam;
 
+		status = update_fdt_memmap((void *)*new_fdt_addr,
+					   (u64)memory_map);
+		if (status != EFI_SUCCESS) {
+			/*
+			 * The kernel won't get far without the memory map, but
+			 * may still be able to print something meaningful so
+			 * return success here.
+			 */
+			return EFI_SUCCESS;
+		}
+
 		/* Install the new virtual address map */
 		svam = sys_table->runtime->set_virtual_address_map;
 		status = svam(runtime_entry_count * desc_size, desc_size,
@@ -319,9 +335,6 @@ efi_status_t allocate_new_fdt_and_exit_boot(efi_system_table_t *sys_table,
 
 	pr_efi_err(sys_table, "Exit boot services failed.\n");
 
-fail_free_mmap:
-	sys_table->boottime->free_pool(memory_map);
-
 fail_free_new_fdt:
 	efi_free(sys_table, new_fdt_size, *new_fdt_addr);
 
-- 
2.7.4

^ permalink raw reply related

* Tearing down DMA transfer setup after DMA client has finished
From: Mason @ 2016-12-09 18:23 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161209175606.GM6408@localhost>

[ Dropping Mans to preserve his peace-of-mind ]

On 09/12/2016 18:56, Vinod Koul wrote:
> On Fri, Dec 09, 2016 at 06:34:15PM +0100, Mason wrote:
>> On 09/12/2016 18:17, Vinod Koul wrote:
>>
>>> On Fri, Dec 09, 2016 at 11:25:57AM +0100, Sebastian Frias wrote:
>>>>
>>>> What concrete solution do you propose?
>>>
>>> I have already proposed two solutions.
>>>
>>> A) Request a channel only when you need it. Obviously we can't do virtual
>>> channels with this (though we should still use virt-channels framework).
>>> The sbox setup and teardown can be done as part of channel request and
>>> freeup. PL08x already does this.
>>>
>>> Downside is that we can only have as many consumers at a time as channels.
>>>
>>> I have not heard any technical reason for not doing this apart from drivers
>>> grab the channel at probe, which is incorrect and needs to be fixed
>>> irrespective of the problem at hand.
>>>
>>> This is my preferred option.
>>
>> There is one important drawback with this solution. If a driver calls
>> dma_request_chan() when no channels are currently available, it will
>> get -EBUSY. If there were a flag in dma_request_chan to be put to
>> sleep (with timeout) until a channel is available, then it would
>> work. But busy waiting in the client driver is a waste of power.
> 
> Right, but in that case the fallback would be PIO mode, and if that is
> not availble (IIRC some f your devices don't) then reject the usage with
> EAGAIN.

Maybe I'm missing something, but I don't see how that would help.
Take the NAND Flash controller driver, for instance. PIO is not
an option, because the ECC engine is tied to DMA.

And failing with -EAGAIN doesn't help the busy looping situation.
The caller should be put on some kind of queue to wait for a
"channel ready" event.

Regards.

^ permalink raw reply

* [PATCH v4] arm64: fpsimd: improve stacking logic in non-interruptible context
From: Catalin Marinas @ 2016-12-09 18:21 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481301992-2344-1-git-send-email-ard.biesheuvel@linaro.org>

On Fri, Dec 09, 2016 at 04:46:32PM +0000, Ard Biesheuvel wrote:
>  void kernel_neon_begin_partial(u32 num_regs)
>  {
> -	if (in_interrupt()) {
> -		struct fpsimd_partial_state *s = this_cpu_ptr(
> -			in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
> +	struct fpsimd_partial_state *s;
> +	int level;
> +
> +	preempt_disable();
> +
> +	level = this_cpu_inc_return(kernel_neon_nesting_level);
> +	BUG_ON(level > 3);
> +
> +	if (level > 1) {
> +		s = this_cpu_ptr(nested_fpsimdstate);
>  
> -		BUG_ON(num_regs > 32);
> -		fpsimd_save_partial_state(s, roundup(num_regs, 2));
> +		WARN_ON_ONCE(num_regs > 32);
> +		num_regs = min(roundup(num_regs, 2), 32U);
> +
> +		fpsimd_save_partial_state(&s[level - 2], num_regs);
>  	} else {
>  		/*
>  		 * Save the userland FPSIMD state if we have one and if we
> @@ -241,7 +256,6 @@ void kernel_neon_begin_partial(u32 num_regs)
>  		 * that there is no longer userland FPSIMD state in the
>  		 * registers.
>  		 */
> -		preempt_disable();
>  		if (current->mm &&
>  		    !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
>  			fpsimd_save_state(&current->thread.fpsimd_state);

I wonder whether we could actually do this saving and flag/level setting
in reverse to simplify the races. Something like your previous patch but
only set TIF_FOREIGN_FPSTATE after saving:

	level = this_cpu_read(kernel_neon_nesting_level);
	if (level > 0) {
		...
		fpsimd_save_partial_state();
	} else {
		if (!test_thread_flag(TIF_FOREIGN_FPSTATE))
			fpsimd_save_state();
		set_thread_flag(TIF_FOREIGN_FPSTATE);
	}
	this_cpu_inc(kernel_neon_nesting_level);

There is a risk of extra saving if we get an interrupt after
test_thread_flag() and before set_thread_flag() but I don't think this
would corrupt any state, just writing things twice.

(disclaimer: I haven't thought of all the possible races and I'm not
entirely sure about the kernel_neon_end() part)

-- 
Catalin

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox