Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [GIT PULL] efi: Pass secure boot mode to kernel
From: David Howells @ 2016-12-08 11:45 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Matt, Ard,

Is it too late to request this for the upcoming merge window?  Also, I've made
Lukas's requested changes and reposted just that patch in my reply to him.  Do
you want me to repost the lot?

Here's a set of patches that can determine the secure boot state of the
UEFI BIOS and pass that along to the main kernel image.  This involves
generalising ARM's efi_get_secureboot() function and making it mixed-mode
safe.

Changes:

 Ver 6:
  - Removed unnecessary variable init and trimmed comment.
  - Return efi_secureboot_mode_disabled directly rather than going to a
    place that just returns it.
  - Switched the last two patches.

 Ver 5:
  - Fix i386 compilation error (rsi should've been changed to esi).
  - Fix arm64 compilation error ('sys_table_arg' is a hidden macro parameter).

 Ver 4:
  - Use an enum to tell the kernel whether secure boot mode is enabled,
    disabled, couldn't be determined or wasn't even tried due to not being
    in EFI mode.
  - Support the UEFI-2.6 DeployedMode flag.
  - Don't clear boot_params->secure_boot in x86 sanitize_boot_params().
  - Preclear the boot_params->secure_boot on x86 head_*.S entry if we may
    not go through efi_main().

David
---
The following changes since commit 018edcfac4c3b140366ad51b0907f3becb5bb624:

  efi/libstub: Make efi_random_alloc() allocate below 4 GB on 32-bit (2016-11-25 07:15:23 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/efi-secure-boot-20161208

for you to fetch changes up to e71dd6bffca41faf7b4458c230e5c3d3c2b16d3e:

  efi: Add EFI_SECURE_BOOT bit (2016-12-08 08:19:04 +0000)

----------------------------------------------------------------
EFI secure boot

----------------------------------------------------------------
Ard Biesheuvel (1):
      efi: use typed function pointers for runtime services table

David Howells (5):
      x86/efi: Allow invocation of arbitrary runtime services
      arm/efi: Allow invocation of arbitrary runtime services
      efi: Add SHIM and image security database GUID definitions
      efi: Get the secure boot status
      efi: Handle secure boot from UEFI-2.6

Josh Boyer (2):
      efi: Disable secure boot if shim is in insecure mode
      efi: Add EFI_SECURE_BOOT bit

 Documentation/x86/zero-page.txt           |  2 +
 arch/arm/include/asm/efi.h                |  1 +
 arch/arm64/include/asm/efi.h              |  1 +
 arch/x86/boot/compressed/eboot.c          |  3 +
 arch/x86/boot/compressed/head_32.S        |  7 ++-
 arch/x86/boot/compressed/head_64.S        |  9 +--
 arch/x86/include/asm/bootparam_utils.h    |  5 +-
 arch/x86/include/asm/efi.h                |  5 ++
 arch/x86/include/uapi/asm/bootparam.h     |  3 +-
 arch/x86/kernel/asm-offsets.c             |  1 +
 arch/x86/kernel/setup.c                   | 15 +++++
 drivers/firmware/efi/libstub/Makefile     |  2 +-
 drivers/firmware/efi/libstub/arm-stub.c   | 63 ++------------------
 drivers/firmware/efi/libstub/secureboot.c | 99 +++++++++++++++++++++++++++++++
 include/linux/efi.h                       | 52 ++++++++++------
 15 files changed, 182 insertions(+), 86 deletions(-)
 create mode 100644 drivers/firmware/efi/libstub/secureboot.c

^ permalink raw reply

* [PATCH] arm64: Work around Falkor erratum 1009
From: Mark Rutland @ 2016-12-08 11:45 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161207200431.4587-1-cov@codeaurora.org>

On Wed, Dec 07, 2016 at 03:04:31PM -0500, Christopher Covington wrote:
> From: Shanker Donthineni <shankerd@codeaurora.org>
> 
> During a TLB invalidate sequence targeting the inner shareable
> domain, Falkor may prematurely complete the DSB before all loads
> and stores using the old translation are observed; instruction
> fetches are not subject to the conditions of this erratum.
> 
> Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
> Signed-off-by: Christopher Covington <cov@codeaurora.org>
> ---
>  arch/arm64/Kconfig                | 10 +++++++++
>  arch/arm64/include/asm/cpucaps.h  |  3 ++-
>  arch/arm64/include/asm/tlbflush.h | 43 +++++++++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/cpu_errata.c    |  7 +++++++
>  arch/arm64/kvm/hyp/tlb.c          | 39 ++++++++++++++++++++++++++++++-----
>  5 files changed, 96 insertions(+), 6 deletions(-)

Please update Documentation/arm64/silicon-errata.txt respectively.

[...]

>  #include <linux/sched.h>
>  #include <asm/cputype.h>
> +#include <asm/alternative.h>

Nit: please keep includes (alphabetically) ordered (at least below the
linux/ or asm/ level).

[...]

> +	asm volatile(ALTERNATIVE(
> +		     "nop \n"
> +		     "nop \n",
> +		     "tlbi vmalle1is \n"
> +		     "dsb ish \n",

As a general note, perhaps we want a C compatible NOP_ALTERNATIVE() so
that the nop case can be implicitly generated for sequences like this.

Thanks,
Mark.

^ permalink raw reply

* Tearing down DMA transfer setup after DMA client has finished
From: Måns Rullgård @ 2016-12-08 11:44 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208103921.GC6408@localhost>

Vinod Koul <vinod.koul@intel.com> writes:

> On Wed, Dec 07, 2016 at 04:45:58PM +0000, M?ns Rullg?rd wrote:
>> Vinod Koul <vinod.koul@intel.com> writes:
>> 
>> > On Tue, Dec 06, 2016 at 01:14:20PM +0000, M?ns Rullg?rd wrote:
>> >> 
>> >> That's not going to work very well.  Device drivers typically request
>> >> dma channels in their probe functions or when the device is opened.
>> >> This means that reserving one of the few channels there will inevitably
>> >> make some other device fail to operate.
>> >
>> > No that doesnt make sense at all, you should get a channel only when you
>> > want to use it and not in probe!
>> 
>> Tell that to just about every single driver ever written.
>
> Not really, few do yes which is wrong but not _all_ do that.

Every driver I ever looked at does.  Name one you consider "correct."

-- 
M?ns Rullg?rd

^ permalink raw reply

* Tearing down DMA transfer setup after DMA client has finished
From: Måns Rullgård @ 2016-12-08 11:44 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208103730.GB6408@localhost>

Vinod Koul <vinod.koul@intel.com> writes:

> On Wed, Dec 07, 2016 at 04:44:55PM +0000, M?ns Rullg?rd wrote:
>> Vinod Koul <vinod.koul@intel.com> writes:
>> >> 
>> >> What you're proposing, Vinod, is to make a channel exclusive
>> >> to a driver, as long as the driver has not explicitly released
>> >> the channel, via dma_release_channel(), right?
>> >
>> > Precisely, but yes the downside of that is concurrent access are
>> > limited, but am not sure if driver implements virtual channels and
>> > allows that..
>> 
>> My driver implements virtual channels.  The problem is that the physical
>> dma channels signal completion slightly too soon, at least with
>> mem-to-device transfers.  Apparently we need to keep the sbox routing
>> until the peripheral indicates that it has actually received all the
>> data.
>
> So do we need concurrent accesses by all clients.

Depends on what people do with the chips.

-- 
M?ns Rullg?rd

^ permalink raw reply

* [GIT PULL v2] ARM: Xilinx Zynq DT changes for v4.10
From: Michal Simek @ 2016-12-08 11:43 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161207203919.GA11699@localhost>

On 7.12.2016 21:39, Olof Johansson wrote:
> On Tue, Dec 06, 2016 at 01:51:45PM +0100, Michal Simek wrote:
>> Hi,
>>
>> please add these changes to your arm-soc repo.
>> It fixes dtc warnings, small coding style changes and add support for
>> Microzed.
>>
>> Thanks,
>> Michal
>>
>> ---
>> v2: Use correct zynq-dt-for-4.10 tag instead of zynqmp-dt-for-4.10.
>>
>>
>> The following changes since commit 1001354ca34179f3db924eb66672442a173147dc:
>>
>>   Linux 4.9-rc1 (2016-10-15 12:17:50 -0700)
>>
>> are available in the git repository at:
>>
>>   https://github.com/Xilinx/linux-xlnx.git tags/zynq-dt-for-4.10
>>
>> for you to fetch changes up to df2f3c48b9cd51e2612a1598342769d09d849f39:
>>
>>   arm: dts: zynq: Add MicroZed board support (2016-12-06 13:17:39 +0100)
> 
> Merged. In the future it'd be nice to get these a few weeks earlier in the
> cycle.

I know and sorry about it - busy time before Christmas.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Xilinx Microblaze
Maintainer of Linux kernel - Xilinx Zynq ARM and ZynqMP ARM64 SoCs
U-Boot custodian - Xilinx Microblaze/Zynq/ZynqMP SoCs


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20161208/934e330c/attachment.sig>

^ permalink raw reply

* [GIT PULL] ARM: Xilinx Zynq SOC changes for v4.10
From: Michal Simek @ 2016-12-08 11:43 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161207203629.GD10158@localhost>

On 7.12.2016 21:36, Olof Johansson wrote:
> On Tue, Dec 06, 2016 at 01:46:01PM +0100, Michal Simek wrote:
>> Hi,
>>
>> please pull this fix to your tree.
>>
>> Thanks,
>> Michal
>>
>>
>> The following changes since commit 1001354ca34179f3db924eb66672442a173147dc:
>>
>>   Linux 4.9-rc1 (2016-10-15 12:17:50 -0700)
>>
>> are available in the git repository at:
>>
>>   https://github.com/Xilinx/linux-xlnx.git tags/zynq-soc-for-4.10
>>
>> for you to fetch changes up to 7a3cc2a7b2c723aa552028f4e66841cec183756d:
>>
>>   ARM: zynq: Reserve correct amount of non-DMA RAM (2016-11-14 16:07:58
>> +0100)
>>
>> ----------------------------------------------------------------
>> arm: Xilinx Zynq patches for v4.10
>>
>> - Fix dma issue
>>
>> ----------------------------------------------------------------
>> Kyle Roeschley (1):
>>       ARM: zynq: Reserve correct amount of non-DMA RAM
> 
> Merged, thanks.  Should this have been a fix, btw? I queued it for 4.10 merge
> window now.

yes it is also fix and should go to stable but it is not marked like
that why queue to 4.10 should be fine.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Xilinx Microblaze
Maintainer of Linux kernel - Xilinx Zynq ARM and ZynqMP ARM64 SoCs
U-Boot custodian - Xilinx Microblaze/Zynq/ZynqMP SoCs


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20161208/dce0d59b/attachment.sig>

^ permalink raw reply

* [PATCH v5 2/5] i2c: Add STM32F4 I2C driver
From: kbuild test robot @ 2016-12-08 11:42 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481185563-8735-3-git-send-email-cedric.madianga@gmail.com>

Hi M'boumba,

[auto build test WARNING on wsa/i2c/for-next]
[also build test WARNING on v4.9-rc8 next-20161208]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/M-boumba-Cedric-Madianga/Add-support-for-the-STM32F4-I2C/20161208-173240
base:   https://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git i2c/for-next
config: openrisc-allyesconfig (attached as .config)
compiler: or32-linux-gcc (GCC) 4.5.1-or32-1.0rc1
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=openrisc 

All warnings (new ones prefixed by >>):

   drivers/i2c/busses/i2c-stm32f4.c: In function 'stm32f4_i2c_set_periph_clk_freq':
>> drivers/i2c/busses/i2c-stm32f4.c:201:9: warning: comparison of distinct pointer types lacks a cast
>> drivers/i2c/busses/i2c-stm32f4.c:201:9: warning: comparison of distinct pointer types lacks a cast
>> drivers/i2c/busses/i2c-stm32f4.c:201:9: warning: comparison of distinct pointer types lacks a cast

vim +201 drivers/i2c/busses/i2c-stm32f4.c

   185	
   186	static void stm32f4_i2c_disable_it(struct stm32f4_i2c_dev *i2c_dev)
   187	{
   188		void __iomem *reg = i2c_dev->base + STM32F4_I2C_CR2;
   189	
   190		stm32f4_i2c_clr_bits(reg, STM32F4_I2C_CR2_IRQ_MASK);
   191	}
   192	
   193	static void stm32f4_i2c_set_periph_clk_freq(struct stm32f4_i2c_dev *i2c_dev)
   194	{
   195		u32 clk_rate, cr2, freq;
   196	
   197		cr2 = readl_relaxed(i2c_dev->base + STM32F4_I2C_CR2);
   198		cr2 &= ~STM32F4_I2C_CR2_FREQ_MASK;
   199		clk_rate = clk_get_rate(i2c_dev->clk);
   200		freq = clk_rate / MHZ_TO_HZ;
 > 201		freq = clamp(freq, STM32F4_I2C_MIN_FREQ, STM32F4_I2C_MAX_FREQ);
   202		cr2 |= STM32F4_I2C_CR2_FREQ(freq);
   203		writel_relaxed(cr2, i2c_dev->base + STM32F4_I2C_CR2);
   204	}
   205	
   206	static void stm32f4_i2c_set_rise_time(struct stm32f4_i2c_dev *i2c_dev)
   207	{
   208		u32 trise, freq, cr2, val;
   209	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
-------------- next part --------------
A non-text attachment was scrubbed...
Name: .config.gz
Type: application/gzip
Size: 39528 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20161208/970e614a/attachment-0001.gz>

^ permalink raw reply

* [PATCH] arm64: Work around Falkor erratum 1009
From: Marc Zyngier @ 2016-12-08 11:35 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208112042.GB706@arm.com>

On 08/12/16 11:20, Will Deacon wrote:
> On Wed, Dec 07, 2016 at 03:04:31PM -0500, Christopher Covington wrote:
>> From: Shanker Donthineni <shankerd@codeaurora.org>
>>
>> During a TLB invalidate sequence targeting the inner shareable
>> domain, Falkor may prematurely complete the DSB before all loads
>> and stores using the old translation are observed; instruction
>> fetches are not subject to the conditions of this erratum.
>>
>> Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
>> Signed-off-by: Christopher Covington <cov@codeaurora.org>
>> ---
>>  arch/arm64/Kconfig                | 10 +++++++++
>>  arch/arm64/include/asm/cpucaps.h  |  3 ++-
>>  arch/arm64/include/asm/tlbflush.h | 43 +++++++++++++++++++++++++++++++++++++++
>>  arch/arm64/kernel/cpu_errata.c    |  7 +++++++
>>  arch/arm64/kvm/hyp/tlb.c          | 39 ++++++++++++++++++++++++++++++-----
>>  5 files changed, 96 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 1004a3d..125440f 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -485,6 +485,16 @@ config QCOM_FALKOR_ERRATUM_E1003
>>  
>>  	  If unsure, say Y.
>>  
>> +config QCOM_FALKOR_ERRATUM_E1009
>> +	bool "Falkor E1009: Prematurely complete a DSB after a TLBI"
>> +	default y
>> +	help
>> +	  Falkor CPU may prematurely complete a DSB following a TLBI xxIS
>> +	  invalidate maintenance operations. Repeat the TLBI operation one
>> +	  more time to fix the issue.
>> +
>> +	  If unsure, say Y.
> 
> Call me perverse, but I like this workaround. People often tend to screw
> up TLBI and DVM sync, but the IPI-based workaround is horribly invasive
> and fragile. Simply repeating the operation tends to be enough to make
> the chance of failure small enough to be acceptable.
> 
>> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
>> index cb6a8c2..5357d7f 100644
>> --- a/arch/arm64/include/asm/cpucaps.h
>> +++ b/arch/arm64/include/asm/cpucaps.h
>> @@ -35,7 +35,8 @@
>>  #define ARM64_HYP_OFFSET_LOW			14
>>  #define ARM64_MISMATCHED_CACHE_LINE_SIZE	15
>>  #define ARM64_WORKAROUND_QCOM_FALKOR_E1003	16
>> +#define ARM64_WORKAROUND_QCOM_FALKOR_E1009	17
> 
> Could you rename this to something like ARM64_WORKAROUND_REPEAT_TLBI, so
> that it could potentially be used by others?

And add a parameter to it so that we can generate multiple TLBIs,
depending on the level of brokenness? ;-)

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* [PATCH 2/3] arm64: Work around Falkor erratum 1003
From: Mark Rutland @ 2016-12-08 11:31 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161207200028.4420-2-cov@codeaurora.org>

On Wed, Dec 07, 2016 at 03:00:26PM -0500, Christopher Covington wrote:
> From: Shanker Donthineni <shankerd@codeaurora.org>
> 
> On the Qualcomm Datacenter Technologies Falkor v1 CPU, memory accesses may
> allocate TLB entries using an incorrect ASID when TTBRx_EL1 is being
> updated. Changing the TTBRx_EL1[ASID] and TTBRx_EL1[BADDR] fields
> separately using a reserved ASID will ensure that there are no TLB entries
> with incorrect ASID after changing the the ASID.
> 
> Pseudo code:
>   write TTBRx_EL1[ASID] to a reserved value
>   ISB
>   write TTBRx_EL1[BADDR] to a desired value
>   ISB
>   write TTBRx_EL1[ASID] to a desired value
>   ISB
> 
> Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
> Signed-off-by: Christopher Covington <cov@codeaurora.org>
> ---
>  arch/arm64/Kconfig               | 11 +++++++++++
>  arch/arm64/include/asm/cpucaps.h |  3 ++-
>  arch/arm64/kernel/cpu_errata.c   |  7 +++++++
>  arch/arm64/mm/context.c          | 10 ++++++++++
>  arch/arm64/mm/proc.S             | 21 +++++++++++++++++++++
>  5 files changed, 51 insertions(+), 1 deletion(-)

This needs an update to Documentation/arm64/silicon-errata.txt.

> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
> index efcf1f7..f8d94ff 100644
> --- a/arch/arm64/mm/context.c
> +++ b/arch/arm64/mm/context.c
> @@ -87,6 +87,11 @@ static void flush_context(unsigned int cpu)
>  	/* Update the list of reserved ASIDs and the ASID bitmap. */
>  	bitmap_clear(asid_map, 0, NUM_USER_ASIDS);
>  
> +	/* Reserve ASID '1' for Falkor erratum E1003 */
> +	if (IS_ENABLED(CONFIG_QCOM_FALKOR_ERRATUM_E1003) &&
> +	    cpus_have_cap(ARM64_WORKAROUND_QCOM_FALKOR_E1003))
> +		__set_bit(1, asid_map);
> +
>  	/*
>  	 * Ensure the generation bump is observed before we xchg the
>  	 * active_asids.
> @@ -239,6 +244,11 @@ static int asids_init(void)
>  		panic("Failed to allocate bitmap for %lu ASIDs\n",
>  		      NUM_USER_ASIDS);
>  
> +	/* Reserve ASID '1' for Falkor erratum E1003 */
> +	if (IS_ENABLED(CONFIG_QCOM_FALKOR_ERRATUM_E1003) &&
> +	    cpus_have_cap(ARM64_WORKAROUND_QCOM_FALKOR_E1003))
> +		__set_bit(1, asid_map);
> +
>  	pr_info("ASID allocator initialised with %lu entries\n", NUM_USER_ASIDS);
>  	return 0;
>  }
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index 352c73b..b4d6508 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -134,6 +134,27 @@ ENDPROC(cpu_do_resume)
>  ENTRY(cpu_do_switch_mm)
>  	mmid	x1, x1				// get mm->context.id
>  	bfi	x0, x1, #48, #16		// set the ASID
> +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1003
> +alternative_if_not ARM64_WORKAROUND_QCOM_FALKOR_E1003
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +alternative_else
> +	mrs     x2, ttbr0_el1                   // get cuurent TTBR0_EL1
> +	mov     x3, #1                          // reserved ASID

It might be best to define a FALCOR_E1003_RESERVED_ASID constant
somewhere, rather than using 1 directly here and in the ASID allocator.

> +	bfi     x2, x3, #48, #16                // set the reserved ASID + old BADDR
> +	msr     ttbr0_el1, x2                   // update TTBR0_EL1
> +	isb
> +	bfi     x2, x0, #0, #48                 // set the desired BADDR + reserved ASID
> +	msr     ttbr0_el1, x2                   // update TTBR0_EL1
> +	isb
> +alternative_endif

Please use alternative_if and alternative_else_nop_endif.

As Catalin noted, there are issues with stale and/or conflicting TLB
entries allocated with the reserved ASID, so we likely have to
invalidate that after the final switch.

Thanks,
Mark.

^ permalink raw reply

* [Linaro-acpi] [PATCH v17 08/15] clocksource/drivers/arm_arch_timer: move arch_timer_needs_of_probing into DT init call
From: Fu Wei @ 2016-12-08 11:20 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208110410.GA9768@leverpostej>

Hi Mark,

On 8 December 2016 at 19:04, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Dec 08, 2016 at 11:16:21AM +0800, Fu Wei wrote:
>> Hi Timur,
>>
>> On 8 December 2016 at 01:25, Timur Tabi <timur@codeaurora.org> wrote:
>> > On Fri, Nov 25, 2016 at 9:06 AM, Fu Wei <fu.wei@linaro.org> wrote:
>> >>
>> >> a "+ int ret;" should be move from [12/15] to here, I have fix the
>> >> problem in my repo, it would happen in next patchset
>> >>
>> >> https://git.linaro.org/people/fu.wei/linux.git/log/?h=topic-gtdt-wakeup-timer_upstream_v18_devel
>> >
>> > Fu, please post v18 to the mailing list so that it can be picked up
>> > for 4.10 (if it's not too late already).
>
> Unfortunately, it's too late for v4.10. It hasn't been sat in linux-next
> at all, and we've seen kbuild test failures.
>
> Hopefully there's time to beat this into shape and get it into
> linux-next so that it's ready to queue for v4.11, though.

cross fingers for getting into v4.11 :-)

>
>> Great thanks for your suggestion!  :-)
>> yes, you are right, I would love to post v18 ASAP.
>>
>> But I am still waiting for more feedback from the maintainers.
>
> Please post a version which passes inspection by the kbuild test robot.
> I haven't had a chance to look at this yet, and it'll be better to look
> at a version that actually works.

OK, NP, will post them in several hours
Great thanks for your feedback.

>
> Thanks,
> Mark.



-- 
Best regards,

Fu Wei
Software Engineer
Red Hat

^ permalink raw reply

* [PATCH] arm64: Work around Falkor erratum 1009
From: Will Deacon @ 2016-12-08 11:20 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161207200431.4587-1-cov@codeaurora.org>

On Wed, Dec 07, 2016 at 03:04:31PM -0500, Christopher Covington wrote:
> From: Shanker Donthineni <shankerd@codeaurora.org>
> 
> During a TLB invalidate sequence targeting the inner shareable
> domain, Falkor may prematurely complete the DSB before all loads
> and stores using the old translation are observed; instruction
> fetches are not subject to the conditions of this erratum.
> 
> Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
> Signed-off-by: Christopher Covington <cov@codeaurora.org>
> ---
>  arch/arm64/Kconfig                | 10 +++++++++
>  arch/arm64/include/asm/cpucaps.h  |  3 ++-
>  arch/arm64/include/asm/tlbflush.h | 43 +++++++++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/cpu_errata.c    |  7 +++++++
>  arch/arm64/kvm/hyp/tlb.c          | 39 ++++++++++++++++++++++++++++++-----
>  5 files changed, 96 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 1004a3d..125440f 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -485,6 +485,16 @@ config QCOM_FALKOR_ERRATUM_E1003
>  
>  	  If unsure, say Y.
>  
> +config QCOM_FALKOR_ERRATUM_E1009
> +	bool "Falkor E1009: Prematurely complete a DSB after a TLBI"
> +	default y
> +	help
> +	  Falkor CPU may prematurely complete a DSB following a TLBI xxIS
> +	  invalidate maintenance operations. Repeat the TLBI operation one
> +	  more time to fix the issue.
> +
> +	  If unsure, say Y.

Call me perverse, but I like this workaround. People often tend to screw
up TLBI and DVM sync, but the IPI-based workaround is horribly invasive
and fragile. Simply repeating the operation tends to be enough to make
the chance of failure small enough to be acceptable.

> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index cb6a8c2..5357d7f 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -35,7 +35,8 @@
>  #define ARM64_HYP_OFFSET_LOW			14
>  #define ARM64_MISMATCHED_CACHE_LINE_SIZE	15
>  #define ARM64_WORKAROUND_QCOM_FALKOR_E1003	16
> +#define ARM64_WORKAROUND_QCOM_FALKOR_E1009	17

Could you rename this to something like ARM64_WORKAROUND_REPEAT_TLBI, so
that it could potentially be used by others?

>  
> -#define ARM64_NCAPS				17
> +#define ARM64_NCAPS				18
>  
>  #endif /* __ASM_CPUCAPS_H */
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index deab523..03bafc5 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -23,6 +23,7 @@
>  
>  #include <linux/sched.h>
>  #include <asm/cputype.h>
> +#include <asm/alternative.h>
>  
>  /*
>   * Raw TLBI operations.
> @@ -94,6 +95,13 @@ static inline void flush_tlb_all(void)
>  	dsb(ishst);
>  	__tlbi(vmalle1is);
>  	dsb(ish);
> +	asm volatile(ALTERNATIVE(
> +		     "nop \n"
> +		     "nop \n",
> +		     "tlbi vmalle1is \n"
> +		     "dsb ish \n",
> +		     ARM64_WORKAROUND_QCOM_FALKOR_E1009)
> +		     : :);

I'd much rather this was part of the __tlbi macro, which would hopefully
restrict this to one place in the code.

Will

^ permalink raw reply

* Tearing down DMA transfer setup after DMA client has finished
From: Geert Uytterhoeven @ 2016-12-08 11:18 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <91b0d10c-1bc2-c3e1-4088-f4ad9adcd6c0@free.fr>

On Thu, Dec 8, 2016 at 11:54 AM, Mason <slash.tmp@free.fr> wrote:
> On 08/12/2016 11:39, Vinod Koul wrote:
>> On Wed, Dec 07, 2016 at 04:45:58PM +0000, M?ns Rullg?rd wrote:
>>> Vinod Koul <vinod.koul@intel.com> writes:
>>>> On Tue, Dec 06, 2016 at 01:14:20PM +0000, M?ns Rullg?rd wrote:
>>>>> That's not going to work very well.  Device drivers typically request
>>>>> dma channels in their probe functions or when the device is opened.
>>>>> This means that reserving one of the few channels there will inevitably
>>>>> make some other device fail to operate.
>>>>
>>>> No that doesn't make sense at all, you should get a channel only when you
>>>> want to use it and not in probe!
>>>
>>> Tell that to just about every single driver ever written.
>>
>> Not really, few do yes which is wrong but not _all_ do that.
>
> Vinod,
>
> Could you explain something to me in layman's terms?
>
> I have a NAND Flash Controller driver that depends on the
> DMA driver under discussion.
>
> Suppose I move the dma_request_chan() call from the driver's
> probe function, to the actual DMA transfer function.
>
> I would want dma_request_chan() to put the calling thread
> to sleep until a channel becomes available (possibly with
> a timeout value).
>
> But Maxime told me dma_request_chan() will just return
> -EBUSY if no channels are available.
>
> Am I supposed to busy wait in my driver's DMA function
> until a channel becomes available?

Can you fall back to PIO if requesting a channel fails?

Alternatively, dma_request_chan() could always succeed, and
dmaengine_prep_slave_sg() could fail if the channel is currently not
available due to a limitation on the number of active channels, and
the driver could fall back to PIO for that transfer.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* [PATCH V8 1/3] ACPI: Add support for ResourceSource/IRQ domain mapping
From: Lorenzo Pieralisi @ 2016-12-08 11:05 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1480460259-8585-2-git-send-email-agustinv@codeaurora.org>

Hi Agustin,

please CC me for next version.

On Tue, Nov 29, 2016 at 05:57:37PM -0500, Agustin Vega-Frias wrote:
> When an Extended IRQ Resource contains a valid ResourceSource
> use it to map the IRQ on the domain associated with the ACPI
> device referenced.
> 
> With this in place an irqchip driver can create its domain using
> irq_domain_create_linear and pass the device fwnode to create
> the domain mapping. When dependent devices are probed these
> changes allow the ACPI core find the domain and map the IRQ.
> 
> Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
> ---
>  drivers/acpi/Makefile         |  2 +-
>  drivers/acpi/{gsi.c => irq.c} | 99 +++++++++++++++++++++++++++++++++++++------

[...]

> -static void acpi_dev_get_irqresource(struct resource *res, u32 gsi,
> +static void acpi_dev_get_irqresource(struct resource *res, u32 hwirq,
> +				     struct fwnode_handle *source,
>  				     u8 triggering, u8 polarity, u8 shareable,
>  				     bool legacy)
>  {
>  	int irq, p, t;
>  
> -	if (!valid_IRQ(gsi)) {
> -		acpi_dev_irqresource_disabled(res, gsi);
> +	if (!source && !valid_IRQ(hwirq)) {

If you make source a struct acpi_resource_source pointer it could be:

if (source || !valid_IRQ(hwirq))

Actually we would not even need to pass the pointer, if we detect
an acpi_resource_source dependency we can just disable the resource
(without even looking-up the fwnode_handle, see below), it is a design
choice we have to make.

> +		acpi_dev_irqresource_disabled(res, hwirq);
>  		return;
>  	}
>  
> @@ -402,25 +403,25 @@ static void acpi_dev_get_irqresource(struct resource *res, u32 gsi,
>  	 * using extended IRQ descriptors we take the IRQ configuration
>  	 * from _CRS directly.
>  	 */
> -	if (legacy && !acpi_get_override_irq(gsi, &t, &p)) {
> +	if (legacy && !acpi_get_override_irq(hwirq, &t, &p)) {
>  		u8 trig = t ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE;
>  		u8 pol = p ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH;
>  
>  		if (triggering != trig || polarity != pol) {
> -			pr_warning("ACPI: IRQ %d override to %s, %s\n", gsi,
> -				   t ? "level" : "edge", p ? "low" : "high");
> +			pr_warn("ACPI: IRQ %d override to %s, %s\n", hwirq,
> +				t ? "level" : "edge", p ? "low" : "high");
>  			triggering = trig;
>  			polarity = pol;
>  		}
>  	}
>  
>  	res->flags = acpi_dev_irq_flags(triggering, polarity, shareable);
> -	irq = acpi_register_gsi(NULL, gsi, triggering, polarity);
> +	irq = acpi_register_irq(source, hwirq, triggering, polarity);
>  	if (irq >= 0) {
>  		res->start = irq;
>  		res->end = irq;
>  	} else {
> -		acpi_dev_irqresource_disabled(res, gsi);
> +		acpi_dev_irqresource_disabled(res, hwirq);
>  	}
>  }
>  
> @@ -448,6 +449,7 @@ bool acpi_dev_resource_interrupt(struct acpi_resource *ares, int index,
>  {
>  	struct acpi_resource_irq *irq;
>  	struct acpi_resource_extended_irq *ext_irq;
> +	struct fwnode_handle *src;
>  
>  	switch (ares->type) {
>  	case ACPI_RESOURCE_TYPE_IRQ:
> @@ -460,7 +462,7 @@ bool acpi_dev_resource_interrupt(struct acpi_resource *ares, int index,
>  			acpi_dev_irqresource_disabled(res, 0);
>  			return false;
>  		}
> -		acpi_dev_get_irqresource(res, irq->interrupts[index],
> +		acpi_dev_get_irqresource(res, irq->interrupts[index], NULL,
>  					 irq->triggering, irq->polarity,
>  					 irq->sharable, true);
>  		break;
> @@ -470,7 +472,8 @@ bool acpi_dev_resource_interrupt(struct acpi_resource *ares, int index,
>  			acpi_dev_irqresource_disabled(res, 0);
>  			return false;
>  		}
> -		acpi_dev_get_irqresource(res, ext_irq->interrupts[index],
> +		src = acpi_get_irq_source_fwhandle(&ext_irq->resource_source);

I think the only pending question on my side for this series is whether
we still carry out the domain look-up here (as you are doing now), or,
if we detect a resource_source dependency, we just disable the resource
and leave the deferred probing mechanism to deal with it, this will
completely decouple the current resource parsing from the deferred probe
mechanism that you are introducing; basically this is equivalent to
saying "if the IRQ resource has a dependency let's resolve it at
acpi_irq_get() time, not now".

I am fine either way, I just think that leaving the domain look-up
in the middle of the IRQ resource parsing is not really clean-cut.

Thanks,
Lorenzo

> +		acpi_dev_get_irqresource(res, ext_irq->interrupts[index], src,
>  					 ext_irq->triggering, ext_irq->polarity,
>  					 ext_irq->sharable, false);
>  		break;
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index 61a3d90..154e576 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -321,6 +321,25 @@ void acpi_set_irq_model(enum acpi_irq_model_id model,
>   */
>  void acpi_unregister_gsi (u32 gsi);
>  
> +#ifdef CONFIG_ACPI_GENERIC_GSI
> +struct fwnode_handle *
> +acpi_get_irq_source_fwhandle(const struct acpi_resource_source *source);
> +int acpi_register_irq(struct fwnode_handle *source, u32 hwirq, int trigger,
> +		      int polarity);
> +void acpi_unregister_irq(struct fwnode_handle *source, u32 hwirq);
> +#else
> +#define acpi_get_irq_source_fwhandle(source) (NULL)
> +static inline int acpi_register_irq(struct fwnode_handle *source, u32 hwirq,
> +				    int trigger, int polarity)
> +{
> +	return acpi_register_gsi(NULL, hwirq, trigger, polarity);
> +}
> +static inline void acpi_unregister_irq(struct fwnode_handle *source, u32 hwirq)
> +{
> +	acpi_unregister_gsi(hwirq);
> +}
> +#endif
> +
>  struct pci_dev;
>  
>  int acpi_pci_irq_enable (struct pci_dev *dev);
> -- 
> Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [Linaro-acpi] [PATCH v17 08/15] clocksource/drivers/arm_arch_timer: move arch_timer_needs_of_probing into DT init call
From: Mark Rutland @ 2016-12-08 11:04 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CADyBb7vAfMGV5ERFqAbwo4v4BhJv5s_-x4DXmdvy=iUacBzJWQ@mail.gmail.com>

On Thu, Dec 08, 2016 at 11:16:21AM +0800, Fu Wei wrote:
> Hi Timur,
> 
> On 8 December 2016 at 01:25, Timur Tabi <timur@codeaurora.org> wrote:
> > On Fri, Nov 25, 2016 at 9:06 AM, Fu Wei <fu.wei@linaro.org> wrote:
> >>
> >> a "+ int ret;" should be move from [12/15] to here, I have fix the
> >> problem in my repo, it would happen in next patchset
> >>
> >> https://git.linaro.org/people/fu.wei/linux.git/log/?h=topic-gtdt-wakeup-timer_upstream_v18_devel
> >
> > Fu, please post v18 to the mailing list so that it can be picked up
> > for 4.10 (if it's not too late already).

Unfortunately, it's too late for v4.10. It hasn't been sat in linux-next
at all, and we've seen kbuild test failures.

Hopefully there's time to beat this into shape and get it into
linux-next so that it's ready to queue for v4.11, though.

> Great thanks for your suggestion!  :-)
> yes, you are right, I would love to post v18 ASAP.
> 
> But I am still waiting for more feedback from the maintainers.

Please post a version which passes inspection by the kbuild test robot.
I haven't had a chance to look at this yet, and it'll be better to look
at a version that actually works.

Thanks,
Mark.

^ permalink raw reply

* [PATCH v2] crypto: sun4i-ss: support the Security System PRNG
From: PrasannaKumar Muralidharan @ 2016-12-08 11:01 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208092400.GA9743@Red>

>> The hwrng interface was always meant to be an interface for real
>> hardware random number generators.  People rely on that so we
>> should not provide bogus entropy sources through this interface.
>>
>
> Why not adding a KCONFIG HW_RANDOM_ACCEPT_ALSO_PRNG with big warning ?
> Or a HW_PRNG Kconfig which do the same than hwrandom with /dev/prng ?
> With that it will be much easier to convert in-tree PRNG that you want to remove.

I do have driver for a PRNG that I was planning to post in sometime.
Upon seeing this thread I think the code has to be changed.

I completely agree with Corentin, adding /dev/prng or /dev/hw_prng
will make it easier to move existing code. It can be made explicit
that using new device will provide only pseudo random number.

Thanks,
PrasannaKumar

^ permalink raw reply

* [RFC PATCH v3 07/11] arm64: compat: Add a 32-bit vDSO
From: Kevin Brodsky @ 2016-12-08 10:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <87r35jmv3e.fsf@wedge.i-did-not-set--mail-host-address--so-tickle-me>

On 07/12/16 18:51, Nathan Lynch wrote:
> Hi Kevin,

Hi Nathan,

Thanks for taking a look at these patches!

> Kevin Brodsky <kevin.brodsky@arm.com> writes:
>> diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
>> new file mode 100644
>> index 000000000000..53c3d1f82b26
>> --- /dev/null
>> +++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
> As I said at LPC last month, I'm not excited to have arch/arm's
> vgettimeofday.c copied into arch/arm64 and tweaked; I'd rather see this
> part of the implementation shared between arch/arm and arch/arm64
> somehow, even if there's not an elegant way to do so.
>
> The situation which must be avoided is one where the arch/arm64 compat
> VDSO incompatibly diverges from the arch/arm VDSO.  That becomes much
> less likely if there's only one copy of the userspace-exposed code to
> maintain.

I still agree this is very suboptimal. However, I also think this is by far the most 
straightforward solution, and I would like to stick to it *as a first step*. If you 
diff this "tweaked" vgettimeofday.c with the original one, you'll see that it is 
really not obvious to share even parts of vgettimeofday.c in the current state of arm 
and arm64. Besides, arm's vDSO hasn't changed much since it has been introduced last 
year, and vgettimeofday.c hasn't changed at all, so I think it's fair to say 
divergences are unlikely to happen in the near future.

Nevertheless, I completely agree this is not a good situation in the long run, and I 
am not happy with leaving things in this state. After this series is merged, my plan 
is to try and align arm and arm64 both in terms of vDSO features and kernel-vDSO 
interface. This would notably involve:
* Aligning the vDSO data struct's (especially vdso_data's member names), so that 
vgettimeofday.c can be (mostly) shared.
* Non-trivial Makefile work to share the same one between the arm and arm64 compat 
vDSOs. The refactoring I did in v3 is a first step, as not using top-level flags 
makes it more feasible to share the same Makefile.
* Adding support for CLOCK_MONOTONIC_RAW, as I did in the 64-bit vDSO.
* Moving arm's sigreturn trampolines to its vDSO (if enabled), and convince glibc to 
use the kernel's trampolines like on arm64.

This represents a substantial amount of work, which I think should be in a separate 
series (this one has already grown bigger than I expected).

The final step would be to investigate whether the 64-bit vDSO could reasonably move 
to a C implementation, and if it is the case unify the 3 vDSOs.

> [...]
>
>> +/*
>> + * We use the hidden visibility to prevent the compiler from generating a GOT
>> + * relocation. Not only is going through a GOT useless (the entry couldn't and
>> + * musn't be overridden by another library), it does not even work: the linker
>> + * cannot generate an absolute address to the data page.
>> + *
>> + * With the hidden visibility, the compiler simply generates a PC-relative
>> + * relocation (R_ARM_REL32), and this is what we need.
>> + */
>> +extern const struct vdso_data _vdso_data __attribute__((visibility("hidden")));
>> +
>> +static inline const struct vdso_data *get_vdso_data(void)
>> +{
>> +	const struct vdso_data *ret;
>> +	/*
>> +	 * This simply puts &_vdso_data into ret. The reason why we don't use
>> +	 * `ret = &_vdso_data` is that the compiler tends to optimise this in a
>> +	 * very suboptimal way: instead of keeping &_vdso_data in a register,
>> +	 * it goes through a relocation almost every time _vdso_data must be
>> +	 * accessed (even in subfunctions). This is both time and space
>> +	 * consuming: each relocation uses a word in the code section, and it
>> +	 * has to be loaded at runtime.
>> +	 *
>> +	 * This trick hides the assignment from the compiler. Since it cannot
>> +	 * track where the pointer comes from, it will only use one relocation
>> +	 * where get_vdso_data() is called, and then keep the result in a
>> +	 * register.
>> +	 */
>> +	asm("mov %0, %1" : "=r"(ret) : "r"(&_vdso_data));
>> +	return ret;
>> +}
> I think this is an instructive case: this code looks like an improvement
> over how the arch/arm VDSO accesses the data page.  Enhancements like
> this (not to mention fixes) shouldn't require a patch-per-architecture;
> things inevitably get out of sync and it's more of a maintenance burden
> in the long term.

What arm does is suboptimal indeed, as it uses a whole function to derive vdso_data's 
address instead of a simple relocation. However both methods are functionally 
equivalent. The approach I used here is simply consistent with how vdso_data is 
accessed in the 64-bit vDSO (with those two tricks because of both an assembler 
difference and the fact we are using C here, and GCC is not being smart). Again, I 
agree the arm vDSO should use the same approach, but unification requires other 
changes beyond the scope of this first series.

Thanks,
Kevin

^ permalink raw reply

* Tearing down DMA transfer setup after DMA client has finished
From: Mason @ 2016-12-08 10:54 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208103921.GC6408@localhost>

On 08/12/2016 11:39, Vinod Koul wrote:

> On Wed, Dec 07, 2016 at 04:45:58PM +0000, M?ns Rullg?rd wrote:
>
>> Vinod Koul <vinod.koul@intel.com> writes:
>>
>>> On Tue, Dec 06, 2016 at 01:14:20PM +0000, M?ns Rullg?rd wrote:
>>>>
>>>> That's not going to work very well.  Device drivers typically request
>>>> dma channels in their probe functions or when the device is opened.
>>>> This means that reserving one of the few channels there will inevitably
>>>> make some other device fail to operate.
>>>
>>> No that doesn't make sense at all, you should get a channel only when you
>>> want to use it and not in probe!
>>
>> Tell that to just about every single driver ever written.
> 
> Not really, few do yes which is wrong but not _all_ do that.

Vinod,

Could you explain something to me in layman's terms?

I have a NAND Flash Controller driver that depends on the
DMA driver under discussion.

Suppose I move the dma_request_chan() call from the driver's
probe function, to the actual DMA transfer function.

I would want dma_request_chan() to put the calling thread
to sleep until a channel becomes available (possibly with
a timeout value).

But Maxime told me dma_request_chan() will just return
-EBUSY if no channels are available.

Am I supposed to busy wait in my driver's DMA function
until a channel becomes available?

I don't understand how the multiplexing of few memory
channels to many clients is supposed to happen efficiently?

Regards.

^ permalink raw reply

* [PATCH v5 2/5] i2c: Add STM32F4 I2C driver
From: kbuild test robot @ 2016-12-08 10:47 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481185563-8735-3-git-send-email-cedric.madianga@gmail.com>

Hi M'boumba,

[auto build test WARNING on wsa/i2c/for-next]
[also build test WARNING on v4.9-rc8 next-20161208]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/M-boumba-Cedric-Madianga/Add-support-for-the-STM32F4-I2C/20161208-173240
base:   https://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git i2c/for-next
config: i386-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All warnings (new ones prefixed by >>):

   In file included from include/linux/clk.h:16:0,
                    from drivers/i2c/busses/i2c-stm32f4.c:12:
   drivers/i2c/busses/i2c-stm32f4.c: In function 'stm32f4_i2c_set_periph_clk_freq':
   include/linux/kernel.h:749:16: warning: comparison of distinct pointer types lacks a cast
     (void) (&max1 == &max2);   \
                   ^
   include/linux/kernel.h:737:2: note: in definition of macro '__min'
     t1 min1 = (x);     \
     ^~
   include/linux/kernel.h:778:28: note: in expansion of macro 'min'
    #define clamp(val, lo, hi) min((typeof(val))max(val, lo), hi)
                               ^~~
>> include/linux/kernel.h:752:2: note: in expansion of macro '__max'
     __max(typeof(x), typeof(y),   \
     ^~~~~
>> include/linux/kernel.h:778:45: note: in expansion of macro 'max'
    #define clamp(val, lo, hi) min((typeof(val))max(val, lo), hi)
                                                ^~~
>> drivers/i2c/busses/i2c-stm32f4.c:201:9: note: in expansion of macro 'clamp'
     freq = clamp(freq, STM32F4_I2C_MIN_FREQ, STM32F4_I2C_MAX_FREQ);
            ^~~~~
   include/linux/kernel.h:749:16: warning: comparison of distinct pointer types lacks a cast
     (void) (&max1 == &max2);   \
                   ^
   include/linux/kernel.h:737:13: note: in definition of macro '__min'
     t1 min1 = (x);     \
                ^
   include/linux/kernel.h:778:28: note: in expansion of macro 'min'
    #define clamp(val, lo, hi) min((typeof(val))max(val, lo), hi)
                               ^~~
>> include/linux/kernel.h:752:2: note: in expansion of macro '__max'
     __max(typeof(x), typeof(y),   \
     ^~~~~
>> include/linux/kernel.h:778:45: note: in expansion of macro 'max'
    #define clamp(val, lo, hi) min((typeof(val))max(val, lo), hi)
                                                ^~~
>> drivers/i2c/busses/i2c-stm32f4.c:201:9: note: in expansion of macro 'clamp'
     freq = clamp(freq, STM32F4_I2C_MIN_FREQ, STM32F4_I2C_MAX_FREQ);
            ^~~~~
   include/linux/kernel.h:739:16: warning: comparison of distinct pointer types lacks a cast
     (void) (&min1 == &min2);   \
                   ^
   include/linux/kernel.h:742:2: note: in expansion of macro '__min'
     __min(typeof(x), typeof(y),   \
     ^~~~~
   include/linux/kernel.h:778:28: note: in expansion of macro 'min'
    #define clamp(val, lo, hi) min((typeof(val))max(val, lo), hi)
                               ^~~
>> drivers/i2c/busses/i2c-stm32f4.c:201:9: note: in expansion of macro 'clamp'
     freq = clamp(freq, STM32F4_I2C_MIN_FREQ, STM32F4_I2C_MAX_FREQ);
            ^~~~~

vim +/clamp +201 drivers/i2c/busses/i2c-stm32f4.c

   185	
   186	static void stm32f4_i2c_disable_it(struct stm32f4_i2c_dev *i2c_dev)
   187	{
   188		void __iomem *reg = i2c_dev->base + STM32F4_I2C_CR2;
   189	
   190		stm32f4_i2c_clr_bits(reg, STM32F4_I2C_CR2_IRQ_MASK);
   191	}
   192	
   193	static void stm32f4_i2c_set_periph_clk_freq(struct stm32f4_i2c_dev *i2c_dev)
   194	{
   195		u32 clk_rate, cr2, freq;
   196	
   197		cr2 = readl_relaxed(i2c_dev->base + STM32F4_I2C_CR2);
   198		cr2 &= ~STM32F4_I2C_CR2_FREQ_MASK;
   199		clk_rate = clk_get_rate(i2c_dev->clk);
   200		freq = clk_rate / MHZ_TO_HZ;
 > 201		freq = clamp(freq, STM32F4_I2C_MIN_FREQ, STM32F4_I2C_MAX_FREQ);
   202		cr2 |= STM32F4_I2C_CR2_FREQ(freq);
   203		writel_relaxed(cr2, i2c_dev->base + STM32F4_I2C_CR2);
   204	}
   205	
   206	static void stm32f4_i2c_set_rise_time(struct stm32f4_i2c_dev *i2c_dev)
   207	{
   208		u32 trise, freq, cr2, val;
   209	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
-------------- next part --------------
A non-text attachment was scrubbed...
Name: .config.gz
Type: application/gzip
Size: 56838 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20161208/776da99e/attachment-0001.gz>

^ permalink raw reply

* [PATCH 3/3] clk: keystone: Add sci-clk driver support
From: Tero Kristo @ 2016-12-08 10:45 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161208001348.GC5423@codeaurora.org>

On 08/12/16 02:13, Stephen Boyd wrote:
> On 10/21, Tero Kristo wrote:
>> diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig
>> index 6a8ac04..dce08a7 100644
>> --- a/drivers/clk/Kconfig
>> +++ b/drivers/clk/Kconfig
>> @@ -169,6 +169,15 @@ config COMMON_CLK_NXP
>>  	---help---
>>  	  Support for clock providers on NXP platforms.
>>
>> +config TI_SCI_CLK
>> +	tristate "TI System Control Interface clock drivers"
>> +	depends on (TI_SCI_PROTOCOL && COMMON_CLK_KEYSTONE) || COMPILE_TEST
>
> Given that we depend on COMMON_CLK_KEYSTONE (just for the
> Makefile dependency?) this should be moved to right below the
> COMMON_CLK_KEYSTONE config. And we should consider making a
> Kconfig file in drivers/clk/keystone/ to hold both those configs
> instead of having them at the toplevel.

Its a makefile dependency only right now. I'll have a look at how to 
handle this properly.

>
>> +	default TI_SCI_PROTOCOL
>> +	---help---
>> +	  This adds the clock driver support over TI System Control Interface.
>> +	  If you wish to use clock resources from the PMMC firmware, say Y.
>> +	  Otherwise, say N.
>> +
>>  config COMMON_CLK_PALMAS
>>  	tristate "Clock driver for TI Palmas devices"
>>  	depends on MFD_PALMAS
>> diff --git a/drivers/clk/keystone/Makefile b/drivers/clk/keystone/Makefile
>> index 0477cf6..0e7993d 100644
>> --- a/drivers/clk/keystone/Makefile
>> +++ b/drivers/clk/keystone/Makefile
>> @@ -1 +1,2 @@
>>  obj-y			+= pll.o gate.o
>> +obj-$(CONFIG_TI_SCI_CLK)	+= sci-clk.o
>> diff --git a/drivers/clk/keystone/sci-clk.c b/drivers/clk/keystone/sci-clk.c
>> new file mode 100644
>> index 0000000..f6af5bd
>> --- /dev/null
>> +++ b/drivers/clk/keystone/sci-clk.c
>> @@ -0,0 +1,589 @@
> [...]
>> +
>> +/**
>> + * sci_clk_recalc_rate - Get clock rate for a TI SCI clock
>> + * @hw: clock to get rate for
>> + * @parent_rate: parent rate provided by common clock framework, not used
>> + *
>> + * Gets the current clock rate of a TI SCI clock. Returns the current
>> + * clock rate, or zero in failure.
>> + */
>> +static unsigned long sci_clk_recalc_rate(struct clk_hw *hw,
>> +					 unsigned long parent_rate)
>> +{
>> +	struct sci_clk *clk = to_sci_clk(hw);
>> +	u64 freq;
>> +	int ret;
>> +
>> +	ret = clk->provider->ops->get_freq(clk->provider->sci, clk->dev_id,
>> +					   clk->clk_id, &freq);
>> +	if (ret) {
>> +		dev_err(clk->provider->dev,
>> +			"recalc-rate failed for dev=%d, clk=%d, ret=%d\n",
>> +			clk->dev_id, clk->clk_id, ret);
>> +		return 0;
>> +	}
>> +
>> +	return (u32)freq;
>
> Do we need the cast? sizeof(u32) doesn't always equal
> sizeof(unsigned long).

Hmm, not sure where that came from. But yea, can be dropped.


>
>> +
>> +/**
>> + * _sci_clk_get - Gets a handle for an SCI clock
>> + * @provider: Handle to SCI clock provider
>> + * @dev_id: device ID for the clock to register
>> + * @clk_id: clock ID for the clock to register
>> + *
>> + * Gets a handle to an existing TI SCI hw clock, or builds a new clock
>> + * entry and registers it with the common clock framework. Called from
>> + * the common clock framework, when a corresponding of_clk_get call is
>> + * executed, or recursively from itself when parsing parent clocks.
>> + * Returns a pointer to the hw clock struct, or ERR_PTR value in failure.
>> + */
>> +static struct clk_hw *_sci_clk_build(struct sci_clk_provider *provider,
>> +				     u16 dev_id, u8 clk_id)
>> +{
>> +	struct clk_init_data init = { NULL };
>> +	struct sci_clk *sci_clk = NULL;
>> +	char *name = NULL;
>> +	char **parent_names = NULL;
>> +	int i;
>> +	int ret;
>> +
>> +	sci_clk = devm_kzalloc(provider->dev, sizeof(*sci_clk), GFP_KERNEL);
>> +	if (!sci_clk)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	sci_clk->dev_id = dev_id;
>> +	sci_clk->clk_id = clk_id;
>> +	sci_clk->provider = provider;
>> +
>> +	ret = provider->ops->get_num_parents(provider->sci, dev_id,
>> +					     clk_id,
>> +					     &init.num_parents);
>> +	if (ret)
>> +		goto err;
>> +
>> +	name = kasprintf(GFP_KERNEL, "%s:%d:%d", dev_name(provider->dev),
>> +			 sci_clk->dev_id, sci_clk->clk_id);
>> +
>> +	init.name = name;
>> +
>> +	if (init.num_parents < 2)
>> +		init.num_parents = 0;
>
> This deserves a comment. Why is num_parents == 1 the same as
> num_parents == 0?

I'll add a comment on this. Basically some clocks can be root clocks 
which don't have parents, and we only want to have parent control for 
clocks that can switch their parent. This is kind of a quirk of the 
firmware.

>
>> +
>> +	if (init.num_parents) {
>> +		parent_names = devm_kcalloc(provider->dev, init.num_parents,
>> +					    sizeof(char *), GFP_KERNEL);
>> +
>> +		if (!parent_names) {
>> +			ret = -ENOMEM;
>> +			goto err;
>> +		}
>> +
>> +		for (i = 0; i < init.num_parents; i++) {
>> +			char *parent_name;
>> +
>> +			parent_name = kasprintf(GFP_KERNEL, "%s:%d:%d",
>> +						dev_name(provider->dev),
>> +						sci_clk->dev_id,
>> +						sci_clk->clk_id + 1 + i);
>> +			if (!parent_name) {
>> +				ret = -ENOMEM;
>> +				goto err;
>> +			}
>> +			parent_names[i] = parent_name;
>> +		}
>> +		init.parent_names = (const char * const *)parent_names;
>
> Does that really need a cast?

Doesn't seem like so... I think without this it was generating some 
checkpatch issue sometime back, but doesn't seem to be the case anymore.

>
>> +	}
>> +
>> +	init.ops = &sci_clk_ops;
>> +	sci_clk->hw.init = &init;
>> +
>> +	ret = devm_clk_hw_register(provider->dev, &sci_clk->hw);
>> +	if (ret) {
>> +		dev_err(provider->dev, "failed clk register with %d\n", ret);
>> +		goto err;
>> +	}
>> +	kfree(name);
>> +
>> +	return &sci_clk->hw;
>> +
>> +err:
>> +	if (parent_names) {
>> +		for (i = 0; i < init.num_parents; i++)
>> +			devm_kfree(provider->dev, parent_names[i]);
>> +
>> +		devm_kfree(provider->dev, parent_names);
>
> Shouldn't we be freeing the parent names all the time? It should
> be deep copied in the framework.

I'll check this.

>
>> +	}
>> +
>> +	devm_kfree(provider->dev, sci_clk);
>> +
>> +	kfree(name);
>> +
>> +	return ERR_PTR(ret);
>> +}
> [..]
>> +
>> +static int ti_sci_init_clocks(struct sci_clk_provider *p)
>> +{
>> +	struct sci_clk_data *data = p->clocks;
>> +	struct clk_hw *hw;
>> +	int i;
>> +
>> +	while (data->num_clks) {
>> +		data->clocks = devm_kcalloc(p->dev, data->num_clks,
>> +					    sizeof(struct sci_clk),
>> +					    GFP_KERNEL);
>> +		if (!data->clocks)
>> +			return -ENOMEM;
>> +
>> +		for (i = 0; i < data->num_clks; i++) {
>> +			hw = _sci_clk_build(p, data->dev, i);
>> +			if (!IS_ERR(hw)) {
>> +				data->clocks[i] = hw;
>> +				continue;
>> +			}
>> +
>> +			/* Skip any holes in the clock lists */
>> +			if (PTR_ERR(hw) == -ENODEV)
>
> Does this happen? I don't see where _sci_clk_build() returns
> -ENODEV.

Yes, it can and will happen. get_num_parents() called by sci_clk_build 
will return ENODEV for device/clock pairs that don't exist.

>
>> +				continue;
>> +
>> +			return PTR_ERR(hw);
>> +		}
>> +		data++;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>
>> +
>> +/**
>> + * ti_sci_clk_probe - Probe function for the TI SCI clock driver
>> + * @pdev: platform device pointer to be probed
>> + *
>> + * Probes the TI SCI clock device. Allocates a new clock provider
>> + * and registers this to the common clock framework. Also applies
>> + * any required flags to the identified clocks via clock lists
>> + * supplied from DT. Returns 0 for success, negative error value
>> + * for failure.
>> + */
>> +static int ti_sci_clk_probe(struct platform_device *pdev)
>> +{
>> +	struct device *dev = &pdev->dev;
>> +	struct device_node *np = dev->of_node;
>> +	struct sci_clk_provider *provider;
>> +	const struct ti_sci_handle *handle;
>> +	struct sci_clk_data *data;
>> +	int ret;
>> +
>> +	data = (struct sci_clk_data *)
>> +		of_match_node(ti_sci_clk_of_match, np)->data;
>
> Just use of_device_get_match_data() instead.

All righty.

>
>> +
>> +	handle = devm_ti_sci_get_handle(dev);
>> +	if (IS_ERR(handle))
>> +		return PTR_ERR(handle);
>> +
>> +	provider = devm_kzalloc(dev, sizeof(*provider), GFP_KERNEL);
>> +	if (!provider)
>> +		return -ENOMEM;
>> +
>> +	provider->clocks = data;
>> +
>> +	provider->sci = handle;
>> +	provider->ops = &handle->ops.clk_ops;
>> +	provider->dev = dev;
>> +
>> +	ti_sci_init_clocks(provider);
>
> And if this fails?

Yea this is kind of controversial. ti_sci_init_clocks() can fail if any 
of the clocks registered will fail. I decided to have it this way so 
that at least some clocks might work in failure cause, and you might 
have a booting device instead of total lock-up.

Obviously it could be done so that if any clock fails, we would 
de-register all clocks at that point, but personally I think this is a 
worse option.

ti_sci_init_clocks could probably be modified to continue registering 
clocks when a single clock fails though. Currently it aborts at first 
failure.

Thoughts on that?

>
>> +
>> +	ret = of_clk_add_hw_provider(np, sci_clk_get, provider);
>> +	if (ret)
>> +		return ret;
>> +
>> +	return 0;
>
> Just "return of_clk_add_hw_provider()" please.

True that, will fix.

-Tero

^ permalink raw reply

* crypto regression?
From: Herbert Xu @ 2016-12-08 10:44 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <10260816-d835-746e-d69e-da4cd7b6b88e@atmel.com>

On Thu, Dec 08, 2016 at 11:41:55AM +0100, Cyrille Pitchen wrote:
> Hi Herbert,
> 
> Let me report a potential regression I've noticed this morning when testing
> linux-next.
> 
> I've set CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=n when compiling both kernel
> images.
> 
> 
> On 4.9.0-rc2-next-20161028, /proc/crypto displays:
> driver       : atmel-xts-aes

Thanks for the report.  This should be fixed once I push this out

https://patchwork.kernel.org/patch/9465979/

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* crypto regression?
From: Cyrille Pitchen @ 2016-12-08 10:41 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Herbert,

Let me report a potential regression I've noticed this morning when testing
linux-next.

I've set CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=n when compiling both kernel
images.


On 4.9.0-rc2-next-20161028, /proc/crypto displays:
driver       : atmel-xts-aes
module       : kernel
priority     : 300
refcnt       : 1
selftest     : passed
internal     : no
type         : ablkcipher
async        : yes
blocksize    : 16
min keysize  : 32
max keysize  : 64
ivsize       : 16
geniv        : <default>

and no output from the test manager in the boot log.


Whereas on 4.9.0-rc8-next-20161208, we get:
driver       : atmel-xts-aes
module       : kernel
priority     : 300
refcnt       : 1
selftest     : unknown
internal     : no
type         : ablkcipher
async        : yes
blocksize    : 16
min keysize  : 32
max keysize  : 64
ivsize       : 16
geniv        : <default>


Also I see the following traces during the boot:

alg: skcipher: Chunk test 1 failed on encryption at page 0 for atmel-xts-aes
00000000: 1c 3b 3a 10 2f 77 03 86 e4 83 6c 99 e3 70 cf 9b
00000010: ea 00 80 3f 5e 48 23 57 a4 ae 12 d4 14 a3 e6 3b
00000020: 5d 31 e2 76 f8 fe 4a 8d 66 b3 17 f9 ac 68 3f 44
00000030: 68 0a 86 ac 35 ad fc 33 45 be fe cb 4b b1 88 fd
00000040: 57 76 92 6c 49 a3 09 5e b1 08 fd 10 98 ba ec 70
00000050: aa a6 69 99 a7 2a 82 f2 7d 84 8b 21 d4 a7 41 b0
00000060: c5 cd 4d 5f ff 9d ac 89 ae ba 12 29 61 d0 3a 75
00000070: 71 23 e9 87 0f 8a cf 10 00 02 08 87 89 14 29 ca
00000080: 2a 3e 7a 7d 7d f7 b1 03 55 16 5c 8b 9a 6d 0a 7d
00000090: e8 b0 62 c4 50 0d c4 cd 12 0c 0f 74 18 da e3 d0
000000a0: b5 78 1c 34 80 3f a7 54 21 c7 90 df e1 de 18 34
000000b0: f2 80 d7 66 7b 32 7f 6c 8c d7 55 7e 12 ac 3a 0f
000000c0: 93 ec 05 c5 2e 04 93 ef 31 a1 2d 3d 92 60 f7 9a
000000d0: 28 9d 6a 37 9b c7 0c 50 84 14 73 d1 a8 cc 81 ec
000000e0: 58 3e 96 45 e0 7b 8d 96 70 65 5b a5 bb cf ec c6
000000f0: dc 39 66 38 0a d8 fe cb 17 b6 ba 02 46 9a 02 0a
00000100: 84 e1 8e 8f 84 25 20 70 c1 3e 9f 1f 28 9b e5 4f
00000110: bc 48 14 57 77 8f 61 60 15 e1 32 7a 02 b1 40 f1
00000120: 50 5e b3 09 32 6d 68 37 8f 83 74 59 5c 84 9d 84
00000130: f4 c3 33 ec 44 23 88 51 43 cb 47 bd 71 c5 ed ae
00000140: 9b e6 9a 2f fe ce b1 be c9 de 24 4f be 15 99 2b
00000150: 11 b7 7c 04 0f 12 bd 8f 6a 97 5a 44 a0 f9 0c 29
00000160: a9 ab c3 d4 d8 93 92 72 84 c5 87 54 cc e2 94 52
00000170: 9f 86 14 dc d2 ab a9 91 92 5f ed c4 ae 74 ff ac
00000180: 6e 33 3b 93 eb 4a ff 04 79 da 9a 41 0e 44 50 e0
00000190: dd 7a e4 c6 e2 91 09 00 57 5d a4 01 fc 07 05 9f
000001a0: 64 5e 8b 7e 9b fd ef 33 94 30 54 ff 84 01 14 93
000001b0: c2 7b 34 29 ea ed b4 ed 53 76 44 1a 77 ed 43 85
000001c0: 1a d7 7f 16 f5 41 df d2 69 d5 0d 6a 5f 14 fb 0a
000001d0: b5 32 fd 6f 01 77 3d 53 f7 a4 70 83 46 bc 5f c4
000001e0: f3 6f fd a9 fc ea 70 b9 c6 e6 93 e1

The output is a little bit long for test 1, isn't it?
When I look at aes_xts_enc_tv_template[] from crypto/testmgr.h
I see .rlen = 32 .


I didn't bisect to find out exactly since when the regression is there. I
wanted to warn you quickly since we are close to the merge window.
Also if you have already been notified about this issue, please, sorry for
the noise!

Best regards,

Cyrille

^ permalink raw reply

* Tearing down DMA transfer setup after DMA client has finished
From: Vinod Koul @ 2016-12-08 10:39 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <yw1xr35j1yd5.fsf@unicorn.mansr.com>

On Wed, Dec 07, 2016 at 04:45:58PM +0000, M?ns Rullg?rd wrote:
> Vinod Koul <vinod.koul@intel.com> writes:
> 
> > On Tue, Dec 06, 2016 at 01:14:20PM +0000, M?ns Rullg?rd wrote:
> >> 
> >> That's not going to work very well.  Device drivers typically request
> >> dma channels in their probe functions or when the device is opened.
> >> This means that reserving one of the few channels there will inevitably
> >> make some other device fail to operate.
> >
> > No that doesnt make sense at all, you should get a channel only when you
> > want to use it and not in probe!
> 
> Tell that to just about every single driver ever written.

Not really, few do yes which is wrong but not _all_ do that.

-- 
~Vinod

^ permalink raw reply

* Tearing down DMA transfer setup after DMA client has finished
From: Vinod Koul @ 2016-12-08 10:37 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <yw1xvauv1yew.fsf@unicorn.mansr.com>

On Wed, Dec 07, 2016 at 04:44:55PM +0000, M?ns Rullg?rd wrote:
> Vinod Koul <vinod.koul@intel.com> writes:
> >> 
> >> What you're proposing, Vinod, is to make a channel exclusive
> >> to a driver, as long as the driver has not explicitly released
> >> the channel, via dma_release_channel(), right?
> >
> > Precisely, but yes the downside of that is concurrent access are
> > limited, but am not sure if driver implements virtual channels and
> > allows that..
> 
> My driver implements virtual channels.  The problem is that the physical
> dma channels signal completion slightly too soon, at least with
> mem-to-device transfers.  Apparently we need to keep the sbox routing
> until the peripheral indicates that it has actually received all the
> data.

So do we need concurrent accesses by all clients.

-- 
~Vinod

^ permalink raw reply

* [PATCH 2/3] arm64: Work around Falkor erratum 1003
From: Catalin Marinas @ 2016-12-08 10:31 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161207200028.4420-2-cov@codeaurora.org>

On Wed, Dec 07, 2016 at 03:00:26PM -0500, Christopher Covington wrote:
> From: Shanker Donthineni <shankerd@codeaurora.org>
> 
> On the Qualcomm Datacenter Technologies Falkor v1 CPU, memory accesses may
> allocate TLB entries using an incorrect ASID when TTBRx_EL1 is being
> updated. Changing the TTBRx_EL1[ASID] and TTBRx_EL1[BADDR] fields
> separately using a reserved ASID will ensure that there are no TLB entries
> with incorrect ASID after changing the the ASID.
> 
> Pseudo code:
>   write TTBRx_EL1[ASID] to a reserved value
>   ISB
>   write TTBRx_EL1[BADDR] to a desired value
>   ISB
>   write TTBRx_EL1[ASID] to a desired value
>   ISB

While the new ASID probably won't have incorrect TLB entries, the
reserved ASID will have random entries from all over the place. That's
because in step 1 you change the ASID to the reserved one while leaving
the old BADDR in place. There is a brief time before changing the ASID
when speculative page table walks will populate the TLB with entries
tagged with the reserved ASID. Such entries are never removed during TLB
shoot-down for the real ASID, so, depending on how this CPU implements
the walk cache, you could end up with intermediate level entries still
active and pointing to freed/reused pages. It will eventually hit an
entry that looks global with weird consequences.

We've been bitten by this in the past on arm32: 52af9c6cd863 ("ARM:
6943/1: mm: use TTBR1 instead of reserved context ID").

-- 
Catalin

^ permalink raw reply

* [PATCH 1/1] arm64: mm: add config options for page table configuration
From: Catalin Marinas @ 2016-12-08 10:00 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1481139600-24455-2-git-send-email-scott.branden@broadcom.com>

On Wed, Dec 07, 2016 at 11:40:00AM -0800, Scott Branden wrote:
> Make MAX_PHYSMEM_BITS and SECTIONS_SIZE_BITS configurable by adding
> config options.
> Default to current settings currently defined in sparesmem.h.
> For systems wishing to save memory the config options can be overridden.
> Example, changing MAX_PHYSMEM_BITS from 48 to 36 at the same time as
> changing SECTION_SIZE_BITS from 30 to 26 frees 13MB of memory.

I'm not keen on such change, it's a big departure from the single Image
aims. I would rather reduce SECTION_SIZE_BITS permanently where
feasible, like in this patch:

http://lkml.kernel.org/r/1465821119-3384-1-git-send-email-jszhang at marvell.com

-- 
Catalin

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox