* [PATCHv2 0/2] kvm: arm/arm64: Fixes for race conditions
From: Suzuki K Poulose @ 2017-05-03 14:17 UTC (permalink / raw)
To: linux-arm-kernel
These two patches fixes race conditions in the stage2 page table
handling of the KVM on arm/arm64.
Applies on next-20170503.
Changes since v1:
http://lists.infradead.org/pipermail/linux-arm-kernel/2017-April/502867.html
- Dropped patch for fixing mmu_notifier race condition, which couldn't be
reproduced.
- Added reviewed-by from Christoffer
- Added new patch to fix another race condition
Suzuki K Poulose (2):
kvm: arm/arm64: Fix race in resetting stage2 PGD
kvm: arm/arm64: Fix use after free of stage2 page table
arch/arm/kvm/mmu.c | 33 ++++++++++++++++++++-------------
1 file changed, 20 insertions(+), 13 deletions(-)
--
2.7.4
^ permalink raw reply
* [PATCH 1/2] kvm: arm/arm64: Fix race in resetting stage2 PGD
From: Suzuki K Poulose @ 2017-05-03 14:17 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1493821072-29713-1-git-send-email-suzuki.poulose@arm.com>
In kvm_free_stage2_pgd() we check the stage2 PGD before holding
the lock and proceed to take the lock if it is valid. And we unmap
the page tables, followed by releasing the lock. We reset the PGD
only after dropping this lock, which could cause a race condition
where another thread waiting on or even holding the lock, could
potentially see that the PGD is still valid and proceed to perform
a stage2 operation and later encounter a NULL PGD.
[223090.242280] Unable to handle kernel NULL pointer dereference at
virtual address 00000040
[223090.262330] PC is at unmap_stage2_range+0x8c/0x428
[223090.262332] LR is at kvm_unmap_hva_handler+0x2c/0x3c
[223090.262531] Call trace:
[223090.262533] [<ffff0000080adb78>] unmap_stage2_range+0x8c/0x428
[223090.262535] [<ffff0000080adf40>] kvm_unmap_hva_handler+0x2c/0x3c
[223090.262537] [<ffff0000080ace2c>] handle_hva_to_gpa+0xb0/0x104
[223090.262539] [<ffff0000080af988>] kvm_unmap_hva+0x5c/0xbc
[223090.262543] [<ffff0000080a2478>]
kvm_mmu_notifier_invalidate_page+0x50/0x8c
[223090.262547] [<ffff0000082274f8>]
__mmu_notifier_invalidate_page+0x5c/0x84
[223090.262551] [<ffff00000820b700>] try_to_unmap_one+0x1d0/0x4a0
[223090.262553] [<ffff00000820c5c8>] rmap_walk+0x1cc/0x2e0
[223090.262555] [<ffff00000820c90c>] try_to_unmap+0x74/0xa4
[223090.262557] [<ffff000008230ce4>] migrate_pages+0x31c/0x5ac
[223090.262561] [<ffff0000081f869c>] compact_zone+0x3fc/0x7ac
[223090.262563] [<ffff0000081f8ae0>] compact_zone_order+0x94/0xb0
[223090.262564] [<ffff0000081f91c0>] try_to_compact_pages+0x108/0x290
[223090.262569] [<ffff0000081d5108>] __alloc_pages_direct_compact+0x70/0x1ac
[223090.262571] [<ffff0000081d64a0>] __alloc_pages_nodemask+0x434/0x9f4
[223090.262572] [<ffff0000082256f0>] alloc_pages_vma+0x230/0x254
[223090.262574] [<ffff000008235e5c>] do_huge_pmd_anonymous_page+0x114/0x538
[223090.262576] [<ffff000008201bec>] handle_mm_fault+0xd40/0x17a4
[223090.262577] [<ffff0000081fb324>] __get_user_pages+0x12c/0x36c
[223090.262578] [<ffff0000081fb804>] get_user_pages_unlocked+0xa4/0x1b8
[223090.262579] [<ffff0000080a3ce8>] __gfn_to_pfn_memslot+0x280/0x31c
[223090.262580] [<ffff0000080a3dd0>] gfn_to_pfn_prot+0x4c/0x5c
[223090.262582] [<ffff0000080af3f8>] kvm_handle_guest_abort+0x240/0x774
[223090.262584] [<ffff0000080b2bac>] handle_exit+0x11c/0x1ac
[223090.262586] [<ffff0000080ab99c>] kvm_arch_vcpu_ioctl_run+0x31c/0x648
[223090.262587] [<ffff0000080a1d78>] kvm_vcpu_ioctl+0x378/0x768
[223090.262590] [<ffff00000825df5c>] do_vfs_ioctl+0x324/0x5a4
[223090.262591] [<ffff00000825e26c>] SyS_ioctl+0x90/0xa4
[223090.262595] [<ffff000008085d84>] el0_svc_naked+0x38/0x3c
This patch moves the stage2 PGD manipulation under the lock.
Reported-by: Alexander Graf <agraf@suse.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Kr?m?? <rkrcmar@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
arch/arm/kvm/mmu.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 313ee64..909a1a7 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -829,22 +829,22 @@ void stage2_unmap_vm(struct kvm *kvm)
* Walks the level-1 page table pointed to by kvm->arch.pgd and frees all
* underlying level-2 and level-3 tables before freeing the actual level-1 table
* and setting the struct pointer to NULL.
- *
- * Note we don't need locking here as this is only called when the VM is
- * destroyed, which can only be done once.
*/
void kvm_free_stage2_pgd(struct kvm *kvm)
{
- if (kvm->arch.pgd == NULL)
- return;
+ void *pgd = NULL;
spin_lock(&kvm->mmu_lock);
- unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
+ if (kvm->arch.pgd) {
+ unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
+ pgd = kvm->arch.pgd;
+ kvm->arch.pgd = NULL;
+ }
spin_unlock(&kvm->mmu_lock);
/* Free the HW pgd, one page at a time */
- free_pages_exact(kvm->arch.pgd, S2_PGD_SIZE);
- kvm->arch.pgd = NULL;
+ if (pgd)
+ free_pages_exact(pgd, S2_PGD_SIZE);
}
static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
--
2.7.4
^ permalink raw reply related
* [PATCH 2/2] kvm: arm/arm64: Fix use after free of stage2 page table
From: Suzuki K Poulose @ 2017-05-03 14:17 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1493821072-29713-1-git-send-email-suzuki.poulose@arm.com>
We yield the kvm->mmu_lock occassionaly while performing an operation
(e.g, unmap or permission changes) on a large area of stage2 mappings.
However this could possibly cause another thread to clear and free up
the stage2 page tables while we were waiting for regaining the lock and
thus the original thread could end up in accessing memory that was
freed. This patch fixes the problem by making sure that the stage2
pagetable is still valid after we regain the lock. The fact that
mmu_notifer->release() could be called twice (via __mmu_notifier_release
and mmu_notifier_unregsister) enhances the possibility of hitting
this race where there are two threads trying to unmap the entire guest
shadow pages.
While at it, cleanup the redudant checks around cond_resched_lock in
stage2_wp_range(), as cond_resched_lock already does the same checks.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Radim Kr?m?? <rkrcmar@redhat.com>
Cc: andreyknvl at google.com
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
arch/arm/kvm/mmu.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 909a1a7..5b3e0db 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -301,9 +301,14 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
/*
* If the range is too large, release the kvm->mmu_lock
* to prevent starvation and lockup detector warnings.
+ * Make sure the page table is still active when we regain
+ * the lock.
*/
- if (next != end)
+ if (next != end) {
cond_resched_lock(&kvm->mmu_lock);
+ if (!READ_ONCE(kvm->arch.pgd))
+ break;
+ }
} while (pgd++, addr = next, addr != end);
}
@@ -1170,11 +1175,13 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
* large. Otherwise, we may see kernel panics with
* CONFIG_DETECT_HUNG_TASK, CONFIG_LOCKUP_DETECTOR,
* CONFIG_LOCKDEP. Additionally, holding the lock too long
- * will also starve other vCPUs.
+ * will also starve other vCPUs. We have to also make sure
+ * that the page tables are not freed while we released
+ * the lock.
*/
- if (need_resched() || spin_needbreak(&kvm->mmu_lock))
- cond_resched_lock(&kvm->mmu_lock);
-
+ cond_resched_lock(&kvm->mmu_lock);
+ if (!READ_ONCE(kvm->arch.pgd))
+ break;
next = stage2_pgd_addr_end(addr, end);
if (stage2_pgd_present(*pgd))
stage2_wp_puds(pgd, addr, next);
--
2.7.4
^ permalink raw reply related
* [PATCH] ARM: dts: imx6sx-sdb: Remove cpufreq OPP override
From: Marek Vasut @ 2017-05-03 14:26 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20170503135715.GG18578@dragon>
On 05/03/2017 03:57 PM, Shawn Guo wrote:
> On Tue, Apr 25, 2017 at 07:28:06PM +0200, Marek Vasut wrote:
>> On 04/25/2017 07:23 PM, Leonard Crestez wrote:
>>> Anyway, that version also sets the supply for reg_arm and reg_soc. It
>>> is not necessary for fixing the crash I'm seeing but is good because it
>>> will result in the minimum voltage on VDD_ARM_SOC_IN rather than a fix
>>> 1375mv. I tested Marek's patch and it works fine on my rev B board
>>> (which otherwise fails to boot upstream).
>>
>> Oh that's nice , thanks ! I don't have SDB and I hacked it up after a
>> brief discussion with Fabio without even compile-testing it, thus RFC.
>> Glad to hear it works and thanks for testing it ! Can you add a formal
>> Tested-by please ?
>
> Hi Marek,
Hi Shawn,
> Thanks for your patch. But I prefer Leonard's version because: 1) it
> has a better commit log; 2) it sticks to one-patch-does-one-thing
> policy.
Well I'd prefer this patch because
1) It has T-B
2) It actually fixes a problem with the voltage rails such that the DVFS
works without leaving the system in unstable or dead state. You do
need the second part of my patch if you drop the OPP hackery, without
it the power framework cannot correctly configure the core voltages,
so the patch from Leonard makes things worse.
> But I'm going to wait for a while to get Peter's comment discussed,
> before I actually apply Leonard's patch.
>
> Shawn
>
--
Best regards,
Marek Vasut
^ permalink raw reply
* [PATCH] [RFC] ARM: dts: imx6sx-sdb: Drop OPP hackery
From: Henri Roosen @ 2017-05-03 14:27 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1493818353.11226.2.camel@nxp.com>
On 05/03/2017 03:32 PM, Leonard Crestez wrote:
> On Tue, 2017-04-25 at 22:39 +0300, Leonard Crestez wrote:
>> On Tue, 2017-04-25 at 21:30 +0200, Marek Vasut wrote:
>>>
>>> On 04/25/2017 06:42 PM, Marek Vasut wrote:
>>>>
>>>> The imx6sx-sdb has one power supply that drives both VDDARM_IN
>>>> and VDDSOC_IN, which is the sw1a regulator on the PFUZE PMIC.
>>>> Connect both inputs to the sw1a regulator on the PMIC and drop
>>>> the OPP hackery which is no longer needed as the power framework
>>>> will take care of the regulator configuration as needed.
>>>>
>>>> Signed-off-by: Marek Vasut <marex@denx.de>
>>>> Cc: Fabio Estevam <fabio.estevam@nxp.com>
>>>> Cc: Shawn Guo <shawnguo@kernel.org>
>>> +CC Leonard
>> Tested-by: Leonard Crestez <leonard.crestez@nxp.com>
>>
>> The OPP hack only applies to LDO bypass mode and that is not in
>> upstream. When LDOs are enabled the effect is to use higher voltages
>> than necessary for no good reason.
>>
>> Setting these higher voltages can make some boards (for example Rev B)
>> fail to boot with ugly semi-random crashes reminiscent of memory
>> corruption. These failures happen the first time the lowest idle state
>> is used. This patch fixes those crashes.
>>
>> It's not clear exactly why the crashes happen. Perhaps waking up from
>> idle draws more power than is available? I don't think it matters.
>>
>> I sent a very similar patch a few minutes after this one:
>>
>> http://lists.infradead.org/pipermail/linux-arm-kernel/2017-April/503241
>> .html
>
> Adding Henri Roosen because he reported something very similar.
>
> Henri: Can you please check if this patch fixes your issue?
I've been successfully running a similar patch (reverting operating
points, without reg_arm/reg_soc) as provided by Anson for about a month
on an iMX6SX-SDB RevC board. I'm sorry for not copying the mailing list
on this before.
This patch, with the regulators setup as provided by Marek, has been
tested successfully on the iMX6SX-SDB RevC board as well.
Feel free to add my
Tested-by: Henri Roosen <henri.roosen@ginzinger.com>
--
Best regards,
Henri Roosen
>
> --
> Regards,
> Leonard
>
^ permalink raw reply
* [PATCH][v2] arm64: dts: ls2081ardb: Add DTS support for NXP LS2081ARDB
From: Shawn Guo @ 2017-05-03 14:31 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1493367930-10612-1-git-send-email-priyanka.jain@nxp.com>
On Fri, Apr 28, 2017 at 01:55:30PM +0530, Priyanka Jain wrote:
> This patch add support for NXP LS2081ARDB board which has
> LS2081A SoC.
>
> LS2081A SoC is 40-pin derivative of LS2088A SoC
> So, from functional perspective both are same.
> Hence,ls2088a SoC dtsi files are included from ls2081ARDB dts
>
> Signed-off-by: Priyanka Jain <priyanka.jain@nxp.com>
> ---
> Changes for v2: Updated subject and NXP copyright
> based on Leo's comments
>
> arch/arm64/boot/dts/freescale/Makefile | 1 +
> arch/arm64/boot/dts/freescale/fsl-ls2081a-rdb.dts | 139 +++++++++++++++++++++
> 2 files changed, 140 insertions(+), 0 deletions(-)
> create mode 100644 arch/arm64/boot/dts/freescale/fsl-ls2081a-rdb.dts
>
> diff --git a/arch/arm64/boot/dts/freescale/Makefile b/arch/arm64/boot/dts/freescale/Makefile
> index 72c4b52..58b80de 100644
> --- a/arch/arm64/boot/dts/freescale/Makefile
> +++ b/arch/arm64/boot/dts/freescale/Makefile
> @@ -9,6 +9,7 @@ dtb-$(CONFIG_ARCH_LAYERSCAPE) += fsl-ls1088a-qds.dtb
> dtb-$(CONFIG_ARCH_LAYERSCAPE) += fsl-ls1088a-rdb.dtb
> dtb-$(CONFIG_ARCH_LAYERSCAPE) += fsl-ls2080a-qds.dtb
> dtb-$(CONFIG_ARCH_LAYERSCAPE) += fsl-ls2080a-rdb.dtb
> +dtb-$(CONFIG_ARCH_LAYERSCAPE) += fsl-ls2081a-rdb.dtb
> dtb-$(CONFIG_ARCH_LAYERSCAPE) += fsl-ls2080a-simu.dtb
> dtb-$(CONFIG_ARCH_LAYERSCAPE) += fsl-ls2088a-qds.dtb
> dtb-$(CONFIG_ARCH_LAYERSCAPE) += fsl-ls2088a-rdb.dtb
> diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2081a-rdb.dts b/arch/arm64/boot/dts/freescale/fsl-ls2081a-rdb.dts
> new file mode 100644
> index 0000000..bc1c4f6
> --- /dev/null
> +++ b/arch/arm64/boot/dts/freescale/fsl-ls2081a-rdb.dts
> @@ -0,0 +1,139 @@
> +/*
> + * Device Tree file for NXP LS2081A RDB Board.
> + *
> + * Copyright 2017, NXP
It still doesn't match what Leo's patch does.
https://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux.git/commit/?h=imx/dt64&id=827270124c75c7eca8d2e0b41e21fc668e04932e
Hint: there is no comma in the middle.
> + *
> + * Priyanka Jain <priyanka.jain@nxp.com>
> + *
> + * This file is dual-licensed: you can use it either under the terms
> + * of the GPLv2 or the X11 license, at your option. Note that this dual
> + * licensing only applies to this file, and not this project as a
> + * whole.
> + *
> + * a) This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation; either version 2 of the
> + * License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * Or, alternatively,
> + *
> + * b) Permission is hereby granted, free of charge, to any person
> + * obtaining a copy of this software and associated documentation
> + * files (the "Software"), to deal in the Software without
> + * restriction, including without limitation the rights to use,
> + * copy, modify, merge, publish, distribute, sublicense, and/or
> + * sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following
> + * conditions:
> + *
> + * The above copyright notice and this permission notice shall be
> + * included in all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
> + * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
> + * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
> + * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + */
> +
> +/dts-v1/;
> +
> +#include "fsl-ls2088a.dtsi"
> +
> +/ {
> + model = "NXP Layerscape 2081A RDB Board";
> + compatible = "fsl,ls2081a-rdb", "fsl,ls2081a";
> +
> + aliases {
> + serial0 = &serial0;
> + serial1 = &serial1;
> + };
> +
> + chosen {
> + stdout-path = "serial1:115200n8";
> + };
> +};
> +
> +&esdhc {
> + status = "okay";
> +};
> +
> +&ifc {
> + status = "disabled";
We should have those devices which will need board level connector/pinout
disabled in <soc>.dtsi, and enable the device in <board>.dts which
actually has the device pinout on board.
> +};
> +
> +&i2c0 {
> + status = "okay";
Have a newline between property list and child node.
> + pca9547 at 75 {
Node name should be as generic as possible. The name used in bindings
doc example is obviously better.
This comment applies to other nodes added by this patch.
> + compatible = "nxp,pca9547";
> + reg = <0x75>;
> + #address-cells = <1>;
> + #size-cells = <0>;
> + i2c at 1 {
> + #address-cells = <1>;
> + #size-cells = <0>;
> + reg = <0x01>;
> + rtc at 51 {
> + compatible = "nxp,pcf2129";
> + reg = <0x51>;
> + };
> + };
> +
> + i2c at 3 {
> + #address-cells = <1>;
> + #size-cells = <0>;
> + reg = <0x3>;
> +
> + adt7481 at 4c {
> + compatible = "adi,adt7461";
> + reg = <0x4c>;
> + };
> + };
> + };
> +};
> +
> +&dspi {
Sort these labeled nodes alphabetically.
> + status = "okay";
> + dflash0: n25q512a {
> + #address-cells = <1>;
> + #size-cells = <1>;
> + compatible = "st,m25p80";
> + spi-max-frequency = <3000000>;
> + reg = <0>;
> + };
> +};
> +
> +
> +&qspi {
> + status = "okay";
> + flash0: n25q512a at 0 {
> + #address-cells = <1>;
> + #size-cells = <1>;
> + compatible = "st,m25p80";
> + spi-max-frequency = <20000000>;
> + reg = <0>;
> + };
Have a newline between nodes.
Shawn
> + flash1: n25q512a at 1 {
> + #address-cells = <1>;
> + #size-cells = <1>;
> + compatible = "st,m25p80";
> + spi-max-frequency = <20000000>;
> + reg = <0>;
> + };
> +};
> +
> +&usb0 {
> + status = "okay";
> +};
> +
> +&usb1 {
> + status = "okay";
> +};
> --
> 1.7.4.1
>
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
* [PATCH] ARM: dts: imx6sx-sdb: Remove cpufreq OPP override
From: Marek Vasut @ 2017-05-03 14:32 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <76e22a7e-590f-00ee-04f8-87604303eaad@denx.de>
On 05/03/2017 04:26 PM, Marek Vasut wrote:
> On 05/03/2017 03:57 PM, Shawn Guo wrote:
>> On Tue, Apr 25, 2017 at 07:28:06PM +0200, Marek Vasut wrote:
>>> On 04/25/2017 07:23 PM, Leonard Crestez wrote:
>>>> Anyway, that version also sets the supply for reg_arm and reg_soc. It
>>>> is not necessary for fixing the crash I'm seeing but is good because it
>>>> will result in the minimum voltage on VDD_ARM_SOC_IN rather than a fix
>>>> 1375mv. I tested Marek's patch and it works fine on my rev B board
>>>> (which otherwise fails to boot upstream).
>>>
>>> Oh that's nice , thanks ! I don't have SDB and I hacked it up after a
>>> brief discussion with Fabio without even compile-testing it, thus RFC.
>>> Glad to hear it works and thanks for testing it ! Can you add a formal
>>> Tested-by please ?
>>
>> Hi Marek,
>
> Hi Shawn,
>
>> Thanks for your patch. But I prefer Leonard's version because: 1) it
>> has a better commit log; 2) it sticks to one-patch-does-one-thing
>> policy.
>
> Well I'd prefer this patch because
> 1) It has T-B
Correction, two TBs [1]
[1] https://patchwork.kernel.org/patch/9698749/
> 2) It actually fixes a problem with the voltage rails such that the DVFS
> works without leaving the system in unstable or dead state. You do
> need the second part of my patch if you drop the OPP hackery, without
> it the power framework cannot correctly configure the core voltages,
> so the patch from Leonard makes things worse.
>
>> But I'm going to wait for a while to get Peter's comment discussed,
>> before I actually apply Leonard's patch.
>>
>> Shawn
>>
>
>
--
Best regards,
Marek Vasut
^ permalink raw reply
* [PATCH v5 16/22] KVM: arm64: vgic-its: Add infrastructure for table lookup
From: Christoffer Dall @ 2017-05-03 14:38 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <8f162a7d-e629-5828-6230-f0b7b7a6f9db@redhat.com>
On Wed, May 03, 2017 at 03:40:34PM +0200, Auger Eric wrote:
> Hi Christoffer,
>
> On 30/04/2017 21:33, Christoffer Dall wrote:
> > On Fri, Apr 14, 2017 at 12:15:28PM +0200, Eric Auger wrote:
> >> Add a generic lookup_table() helper whose role consists in
> >> scanning a contiguous table located in guest RAM and applying
> >> a callback on each entry. Entries can be handled as linked lists
> >> since the callback may return an offset to the next entry and
> >> also tell that an entry is the last one.
> >>
> >> Helper functions also are added to compute the device/event ID
> >> offset to the next DTE/ITE.
> >>
> >> compute_next_devid_offset, compute_next_eventid_offset and
> >> lookup_table will become static in subsequent patches
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>
> >> ---
> >> v4 -> v5:
> >> - use kvm_read_guest
> >>
> >> v3 -> v4:
> >> - remove static to avoid compilation warning
> >> - correct size computation in looup_table()
> >> - defines now encode the number of bits used for devid and eventid offsets
> >> - use BIT() - 1 to encode the max offets
> >> ---
> >> virt/kvm/arm/vgic/vgic-its.c | 93 ++++++++++++++++++++++++++++++++++++++++++++
> >> 1 file changed, 93 insertions(+)
> >>
> >> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> >> index 56c5123..c22b35d 100644
> >> --- a/virt/kvm/arm/vgic/vgic-its.c
> >> +++ b/virt/kvm/arm/vgic/vgic-its.c
> >> @@ -195,6 +195,8 @@ static struct its_ite *find_ite(struct vgic_its *its, u32 device_id,
> >>
> >> #define VITS_TYPER_IDBITS 16
> >> #define VITS_TYPER_DEVBITS 16
> >> +#define VITS_DTE_MAX_DEVID_OFFSET (BIT(14) - 1)
> >> +#define VITS_ITE_MAX_EVENTID_OFFSET (BIT(16) - 1)
> >>
> >> /*
> >> * Finds and returns a collection in the ITS collection table.
> >> @@ -1674,6 +1676,97 @@ int vgic_its_attr_regs_access(struct kvm_device *dev,
> >> return ret;
> >> }
> >>
> >> +u32 compute_next_devid_offset(struct list_head *h, struct its_device *dev)
> >> +{
> >> + struct list_head *e = &dev->dev_list;
> >> + struct its_device *next;
> >> + u32 next_offset;
> >> +
> >> + if (e->next == h)
> >> + return 0;
> >> + next = list_entry(e->next, struct its_device, dev_list);
> >> + next_offset = next->device_id - dev->device_id;
> >> +
> >> + return min_t(u32, next_offset, VITS_DTE_MAX_DEVID_OFFSET);
> >> +}
> >> +
> >> +u32 compute_next_eventid_offset(struct list_head *h, struct its_ite *ite)
> >> +{
> >> + struct list_head *e = &ite->ite_list;
> >> + struct its_ite *next;
> >> + u32 next_offset;
> >> +
> >> + if (e->next == h)
> >> + return 0;
> >> + next = list_entry(e->next, struct its_ite, ite_list);
> >> + next_offset = next->event_id - ite->event_id;
> >> +
> >> + return min_t(u32, next_offset, VITS_ITE_MAX_EVENTID_OFFSET);
> >> +}
> >> +
> >> +/**
> >> + * entry_fn_t - Callback called on a table entry restore path
> >> + * @its: its handle
> >> + * @id: id of the entry
> >> + * @entry: pointer to the entry
> >> + * @opaque: pointer to an opaque data
> >> + * @next_offset: minimal ID offset to the next entry. 0 if this
> >> + * entry is the last one, 1 if the entry is invalid, >= 1 if an
> >> + * entry's next_offset field was truly decoded
> >> + *
> >> + * Return: < 0 on error, 0 otherwise
> >> + */
> >> +typedef int (*entry_fn_t)(struct vgic_its *its, u32 id, void *entry,
> >> + void *opaque, u32 *next_offset);
> >
> > Just noticed. All the table entries are 64-bit long at this point,
> > right? So why not make entry a u64 * instead? Could we end up with
> > some endianness mess with using void pointers the way it is now?
> the size of the entry is ABI dependent while this infrastructure is
> generic.
Yes, for a single version of the ABI where all the entries are 64-bit.
> In each of such function we use
>
> u64 entry = *(u64 *)addr;
> and we do a le64_to_cpu(entry).
>
> Do you see something wrong? Otherwise I would be tempted to leave as is.
>
I don't think there's anything wrong with the current version, and
you're right, this always points to an ITS data structure which is LE,
so there shouldn't be a problem. I always just quiver when I see void
pointers cast to a type in the caller and callee.
Just leave it for now.
Thanks,
-Christoffer
^ permalink raw reply
* [PATCH] ARM: dts: imx6sx-sdb: Remove cpufreq OPP override
From: Shawn Guo @ 2017-05-03 14:41 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <0f84e2ee-1ea1-acf8-8368-3fc848d33e23@denx.de>
On Wed, May 03, 2017 at 04:32:06PM +0200, Marek Vasut wrote:
> On 05/03/2017 04:26 PM, Marek Vasut wrote:
> > On 05/03/2017 03:57 PM, Shawn Guo wrote:
> >> On Tue, Apr 25, 2017 at 07:28:06PM +0200, Marek Vasut wrote:
> >>> On 04/25/2017 07:23 PM, Leonard Crestez wrote:
> >>>> Anyway, that version also sets the supply for reg_arm and reg_soc. It
> >>>> is not necessary for fixing the crash I'm seeing but is good because it
> >>>> will result in the minimum voltage on VDD_ARM_SOC_IN rather than a fix
> >>>> 1375mv. I tested Marek's patch and it works fine on my rev B board
> >>>> (which otherwise fails to boot upstream).
> >>>
> >>> Oh that's nice , thanks ! I don't have SDB and I hacked it up after a
> >>> brief discussion with Fabio without even compile-testing it, thus RFC.
> >>> Glad to hear it works and thanks for testing it ! Can you add a formal
> >>> Tested-by please ?
> >>
> >> Hi Marek,
> >
> > Hi Shawn,
> >
> >> Thanks for your patch. But I prefer Leonard's version because: 1) it
> >> has a better commit log; 2) it sticks to one-patch-does-one-thing
> >> policy.
> >
> > Well I'd prefer this patch because
> > 1) It has T-B
>
> Correction, two TBs [1]
>
> [1] https://patchwork.kernel.org/patch/9698749/
That doesn't mean Leonard's patch hasn't been tested by anyone.
> > 2) It actually fixes a problem with the voltage rails such that the DVFS
> > works without leaving the system in unstable or dead state. You do
> > need the second part of my patch if you drop the OPP hackery, without
> > it the power framework cannot correctly configure the core voltages,
> > so the patch from Leonard makes things worse.
If that's true, I will change my mind.
Shawn
^ permalink raw reply
* [PATCH] ARM: dts: imx6sx-sdb: Remove cpufreq OPP override
From: Marek Vasut @ 2017-05-03 14:51 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20170503144059.GL18578@dragon>
On 05/03/2017 04:41 PM, Shawn Guo wrote:
> On Wed, May 03, 2017 at 04:32:06PM +0200, Marek Vasut wrote:
>> On 05/03/2017 04:26 PM, Marek Vasut wrote:
>>> On 05/03/2017 03:57 PM, Shawn Guo wrote:
>>>> On Tue, Apr 25, 2017 at 07:28:06PM +0200, Marek Vasut wrote:
>>>>> On 04/25/2017 07:23 PM, Leonard Crestez wrote:
>>>>>> Anyway, that version also sets the supply for reg_arm and reg_soc. It
>>>>>> is not necessary for fixing the crash I'm seeing but is good because it
>>>>>> will result in the minimum voltage on VDD_ARM_SOC_IN rather than a fix
>>>>>> 1375mv. I tested Marek's patch and it works fine on my rev B board
>>>>>> (which otherwise fails to boot upstream).
>>>>>
>>>>> Oh that's nice , thanks ! I don't have SDB and I hacked it up after a
>>>>> brief discussion with Fabio without even compile-testing it, thus RFC.
>>>>> Glad to hear it works and thanks for testing it ! Can you add a formal
>>>>> Tested-by please ?
>>>>
>>>> Hi Marek,
>>>
>>> Hi Shawn,
>>>
>>>> Thanks for your patch. But I prefer Leonard's version because: 1) it
>>>> has a better commit log; 2) it sticks to one-patch-does-one-thing
>>>> policy.
>>>
>>> Well I'd prefer this patch because
>>> 1) It has T-B
>>
>> Correction, two TBs [1]
>>
>> [1] https://patchwork.kernel.org/patch/9698749/
>
> That doesn't mean Leonard's patch hasn't been tested by anyone.
That's what the T-B tags are for ...
>>> 2) It actually fixes a problem with the voltage rails such that the DVFS
>>> works without leaving the system in unstable or dead state. You do
>>> need the second part of my patch if you drop the OPP hackery, without
>>> it the power framework cannot correctly configure the core voltages,
>>> so the patch from Leonard makes things worse.
>
> If that's true, I will change my mind.
Well you can check the schematic ... the ARM and SOC rails share the
same upstream regulator.
--
Best regards,
Marek Vasut
^ permalink raw reply
* [PATCH 0/3] arm64: queued spinlocks and rw-locks
From: Yury Norov @ 2017-05-03 14:51 UTC (permalink / raw)
To: linux-arm-kernel
The patch 3 adds implementation for queued-based locking on
ARM64, and the option in kernel config to enable it. Patches
1 and 2 fix some mess in header files to apply patch 3 smoothly.
Tested on QDF2400 with huge improvements with these patches on
the torture tests, by Adam Wallis.
Tested on ThunderX, by Andrew Pinski:
120 thread (30 core - 4 thread/core) CN99xx (single socket):
benchmark Units qspinlocks vs ticket locks
sched/messaging s 73.91%
sched/pipe ops/s 104.18%
futex/hash ops/s 103.87%
futex/wake ms 71.04%
futex/wake-parallel ms 93.88%
futex/requeue ms 96.47%
futex/lock-pi ops/s 118.33%
Notice, there's the queued locks implementation for the Power PC introduced
by Pan Xinhui. He largely tested it and also found significant performance
gain. In arch part it is very similar to this patch though.
https://lwn.net/Articles/701137/
RFC: https://www.spinics.net/lists/arm-kernel/msg575575.html
v1:
- queued_spin_unlock_wait() and queued_spin_is_locked() are
re-implemented in arch part to add additional memory barriers;
- queued locks are made optional, ticket locks are enabled by default.
Jan Glauber (1):
arm64/locking: qspinlocks and qrwlocks support
Yury Norov (2):
kernel/locking: #include <asm/spinlock.h> in qrwlock.c
asm-generic: don't #include <linux/atomic.h> in qspinlock_types.h
arch/arm64/Kconfig | 24 +++++++++++++++++++
arch/arm64/include/asm/qrwlock.h | 7 ++++++
arch/arm64/include/asm/qspinlock.h | 42 +++++++++++++++++++++++++++++++++
arch/arm64/include/asm/spinlock.h | 12 ++++++++++
arch/arm64/include/asm/spinlock_types.h | 14 ++++++++---
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/qspinlock.c | 34 ++++++++++++++++++++++++++
include/asm-generic/qspinlock.h | 1 +
include/asm-generic/qspinlock_types.h | 8 -------
kernel/locking/qrwlock.c | 1 +
10 files changed, 133 insertions(+), 11 deletions(-)
create mode 100644 arch/arm64/include/asm/qrwlock.h
create mode 100644 arch/arm64/include/asm/qspinlock.h
create mode 100644 arch/arm64/kernel/qspinlock.c
--
2.11.0
^ permalink raw reply
* [PATCH 1/3] kernel/locking: #include <asm/spinlock.h> in qrwlock.c
From: Yury Norov @ 2017-05-03 14:51 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20170503145141.4966-1-ynorov@caviumnetworks.com>
qrwlock.c calls arch_spin_lock() and arch_spin_unlock() but doesn't
include the asm/spinlock.h, where those functions are defined. It
may produce "implicit declaration of function" errors. This patch
fixes it.
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
---
kernel/locking/qrwlock.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index cc3ed0ccdfa2..6fb42925b201 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -20,6 +20,7 @@
#include <linux/cpumask.h>
#include <linux/percpu.h>
#include <linux/hardirq.h>
+#include <asm/spinlock.h>
#include <asm/qrwlock.h>
/*
--
2.11.0
^ permalink raw reply related
* [PATCH 2/3] asm-generic: don't #include <linux/atomic.h> in qspinlock_types.h
From: Yury Norov @ 2017-05-03 14:51 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20170503145141.4966-1-ynorov@caviumnetworks.com>
The "qspinlock_types.h" doesn't need linux/atomic.h directly. So
because of this, and because including of it requires the protection
against recursive inclusion, it looks reasonable to move the
inclusion exactly where it is needed. This change affects the x86_64
arch, as the only user of qspinlocks at now. I have build-tested the
change on x86_64 with CONFIG_PARAVIRT enabled and disabled.
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
---
include/asm-generic/qspinlock.h | 1 +
include/asm-generic/qspinlock_types.h | 8 --------
2 files changed, 1 insertion(+), 8 deletions(-)
diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
index 9f0681bf1e87..5f4d42a09175 100644
--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -20,6 +20,7 @@
#define __ASM_GENERIC_QSPINLOCK_H
#include <asm-generic/qspinlock_types.h>
+#include <linux/atomic.h>
/**
* queued_spin_unlock_wait - wait until the _current_ lock holder releases the lock
diff --git a/include/asm-generic/qspinlock_types.h b/include/asm-generic/qspinlock_types.h
index 034acd0c4956..a13cc90c87fc 100644
--- a/include/asm-generic/qspinlock_types.h
+++ b/include/asm-generic/qspinlock_types.h
@@ -18,15 +18,7 @@
#ifndef __ASM_GENERIC_QSPINLOCK_TYPES_H
#define __ASM_GENERIC_QSPINLOCK_TYPES_H
-/*
- * Including atomic.h with PARAVIRT on will cause compilation errors because
- * of recursive header file incluson via paravirt_types.h. So don't include
- * it if PARAVIRT is on.
- */
-#ifndef CONFIG_PARAVIRT
#include <linux/types.h>
-#include <linux/atomic.h>
-#endif
typedef struct qspinlock {
atomic_t val;
--
2.11.0
^ permalink raw reply related
* [PATCH 3/3] arm64/locking: qspinlocks and qrwlocks support
From: Yury Norov @ 2017-05-03 14:51 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20170503145141.4966-1-ynorov@caviumnetworks.com>
From: Jan Glauber <jglauber@cavium.com>
Ported from x86_64 with paravirtualization support removed.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Note. This patch removes protection from direct inclusion of
arch/arm64/include/asm/spinlock_types.h. It's done because
kernel/locking/qrwlock.c file does it thru the header
include/asm-generic/qrwlock_types.h. Until now the only user
of qrwlock.c was x86, and there's no such protection too.
I'm not happy to remove the protection, but if it's OK for x86,
it should be also OK for arm64. If not, I think we'd fix it
for x86, and add the protection there too.
Yury
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
---
arch/arm64/Kconfig | 24 +++++++++++++++++++
arch/arm64/include/asm/qrwlock.h | 7 ++++++
arch/arm64/include/asm/qspinlock.h | 42 +++++++++++++++++++++++++++++++++
arch/arm64/include/asm/spinlock.h | 12 ++++++++++
arch/arm64/include/asm/spinlock_types.h | 14 ++++++++---
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/qspinlock.c | 34 ++++++++++++++++++++++++++
7 files changed, 131 insertions(+), 3 deletions(-)
create mode 100644 arch/arm64/include/asm/qrwlock.h
create mode 100644 arch/arm64/include/asm/qspinlock.h
create mode 100644 arch/arm64/kernel/qspinlock.c
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 22dbde97eefa..db24b3b3f3c6 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -25,6 +25,8 @@ config ARM64
select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
select ARCH_WANT_FRAME_POINTERS
select ARCH_HAS_UBSAN_SANITIZE_ALL
+ select ARCH_USE_QUEUED_RWLOCKS if QUEUED_LOCKS
+ select ARCH_USE_QUEUED_SPINLOCKS if QUEUED_LOCKS
select ARM_AMBA
select ARM_ARCH_TIMER
select ARM_GIC
@@ -692,6 +694,28 @@ config ARCH_WANT_HUGE_PMD_SHARE
config ARCH_HAS_CACHE_LINE_SIZE
def_bool y
+choice
+ prompt "Locking type"
+ default TICKET_LOCKS
+ help
+ Choose between traditional ticked-based locking mechanism and
+ queued-based mechanism.
+
+config TICKET_LOCKS
+ bool "Ticket locks"
+ help
+ Ticked-based locking implementation for ARM64
+
+config QUEUED_LOCKS
+ bool "Queued locks"
+ help
+ Queue-based locking mechanism. This option improves
+ locking performance in case of high-contended locking
+ on many-cpu machines. On low-cpu machines there is no
+ difference comparing to ticked-based locking.
+
+endchoice
+
source "mm/Kconfig"
config SECCOMP
diff --git a/arch/arm64/include/asm/qrwlock.h b/arch/arm64/include/asm/qrwlock.h
new file mode 100644
index 000000000000..626f6ebfb52d
--- /dev/null
+++ b/arch/arm64/include/asm/qrwlock.h
@@ -0,0 +1,7 @@
+#ifndef _ASM_ARM64_QRWLOCK_H
+#define _ASM_ARM64_QRWLOCK_H
+
+#include <asm-generic/qrwlock_types.h>
+#include <asm-generic/qrwlock.h>
+
+#endif /* _ASM_ARM64_QRWLOCK_H */
diff --git a/arch/arm64/include/asm/qspinlock.h b/arch/arm64/include/asm/qspinlock.h
new file mode 100644
index 000000000000..09ef4f13f549
--- /dev/null
+++ b/arch/arm64/include/asm/qspinlock.h
@@ -0,0 +1,42 @@
+#ifndef _ASM_ARM64_QSPINLOCK_H
+#define _ASM_ARM64_QSPINLOCK_H
+
+#include <asm-generic/qspinlock_types.h>
+#include <asm/atomic.h>
+
+extern void queued_spin_unlock_wait(struct qspinlock *lock);
+#define queued_spin_unlock_wait queued_spin_unlock_wait
+
+#define queued_spin_unlock queued_spin_unlock
+/**
+ * queued_spin_unlock - release a queued spinlock
+ * @lock : Pointer to queued spinlock structure
+ *
+ * A smp_store_release() on the least-significant byte.
+ */
+static inline void queued_spin_unlock(struct qspinlock *lock)
+{
+ smp_store_release((u8 *)lock, 0);
+}
+
+#define queued_spin_is_locked queued_spin_is_locked
+/**
+ * queued_spin_is_locked - is the spinlock locked?
+ * @lock: Pointer to queued spinlock structure
+ * Return: 1 if it is locked, 0 otherwise
+ */
+static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
+{
+ /*
+ * See queued_spin_unlock_wait().
+ *
+ * Any !0 state indicates it is locked, even if _Q_LOCKED_VAL
+ * isn't immediately observable.
+ */
+ smp_mb();
+ return atomic_read(&lock->val);
+}
+
+#include <asm-generic/qspinlock.h>
+
+#endif /* _ASM_ARM64_QSPINLOCK_H */
diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
index cae331d553f8..37713397e0c5 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -20,6 +20,10 @@
#include <asm/spinlock_types.h>
#include <asm/processor.h>
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#include <asm/qspinlock.h>
+#else
+
/*
* Spinlock implementation.
*
@@ -187,6 +191,12 @@ static inline int arch_spin_is_contended(arch_spinlock_t *lock)
}
#define arch_spin_is_contended arch_spin_is_contended
+#endif /* CONFIG_QUEUED_SPINLOCKS */
+
+#ifdef CONFIG_QUEUED_RWLOCKS
+#include <asm/qrwlock.h>
+#else
+
/*
* Write lock implementation.
*
@@ -351,6 +361,8 @@ static inline int arch_read_trylock(arch_rwlock_t *rw)
/* read_can_lock - would read_trylock() succeed? */
#define arch_read_can_lock(x) ((x)->lock < 0x80000000)
+#endif /* CONFIG_QUEUED_RWLOCKS */
+
#define arch_read_lock_flags(lock, flags) arch_read_lock(lock)
#define arch_write_lock_flags(lock, flags) arch_write_lock(lock)
diff --git a/arch/arm64/include/asm/spinlock_types.h b/arch/arm64/include/asm/spinlock_types.h
index 55be59a35e3f..0f0f1561ab6a 100644
--- a/arch/arm64/include/asm/spinlock_types.h
+++ b/arch/arm64/include/asm/spinlock_types.h
@@ -16,9 +16,9 @@
#ifndef __ASM_SPINLOCK_TYPES_H
#define __ASM_SPINLOCK_TYPES_H
-#if !defined(__LINUX_SPINLOCK_TYPES_H) && !defined(__ASM_SPINLOCK_H)
-# error "please don't include this file directly"
-#endif
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#include <asm-generic/qspinlock_types.h>
+#else
#include <linux/types.h>
@@ -36,10 +36,18 @@ typedef struct {
#define __ARCH_SPIN_LOCK_UNLOCKED { 0 , 0 }
+#endif /* CONFIG_QUEUED_SPINLOCKS */
+
+#ifdef CONFIG_QUEUED_RWLOCKS
+#include <asm-generic/qrwlock_types.h>
+#else
+
typedef struct {
volatile unsigned int lock;
} arch_rwlock_t;
#define __ARCH_RW_LOCK_UNLOCKED { 0 }
+#endif /* CONFIG_QUEUED_RWLOCKS */
+
#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 9d56467dc223..f48f6256e893 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -56,6 +56,7 @@ arm64-obj-$(CONFIG_KEXEC) += machine_kexec.o relocate_kernel.o \
arm64-obj-$(CONFIG_ARM64_RELOC_TEST) += arm64-reloc-test.o
arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
arm64-obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
+arm64-obj-$(CONFIG_QUEUED_SPINLOCKS) += qspinlock.o
obj-y += $(arm64-obj-y) vdso/ probes/
obj-$(CONFIG_ARM64_ILP32) += vdso-ilp32/
diff --git a/arch/arm64/kernel/qspinlock.c b/arch/arm64/kernel/qspinlock.c
new file mode 100644
index 000000000000..924f19953adb
--- /dev/null
+++ b/arch/arm64/kernel/qspinlock.c
@@ -0,0 +1,34 @@
+#include <asm/qspinlock.h>
+#include <asm/processor.h>
+
+void queued_spin_unlock_wait(struct qspinlock *lock)
+{
+ u32 val;
+
+ for (;;) {
+ smp_mb();
+ val = atomic_read(&lock->val);
+
+ if (!val) /* not locked, we're done */
+ goto done;
+
+ if (val & _Q_LOCKED_MASK) /* locked, go wait for unlock */
+ break;
+
+ /* not locked, but pending, wait until we observe the lock */
+ cpu_relax();
+ }
+
+ for (;;) {
+ smp_mb();
+ val = atomic_read(&lock->val);
+ if (!(val & _Q_LOCKED_MASK)) /* any unlock is good */
+ break;
+
+ cpu_relax();
+ }
+
+done:
+ smp_acquire__after_ctrl_dep();
+}
+EXPORT_SYMBOL(queued_spin_unlock_wait);
--
2.11.0
^ permalink raw reply related
* [PATCH v2] drivers: dma-mapping: Do not leave an invalid area->pages pointer in dma_common_contiguous_remap()
From: Catalin Marinas @ 2017-05-03 14:57 UTC (permalink / raw)
To: linux-arm-kernel
The dma_common_pages_remap() function allocates a vm_struct object and
initialises the pages pointer to value passed as argument. However, when
this function is called dma_common_contiguous_remap(), the pages array
is only temporarily allocated, being freed shortly after
dma_common_contiguous_remap() returns. Architecture code checking the
validity of an area->pages pointer would incorrectly dereference already
freed pointers. This has been exposed by the arm64 commit 44176bb38fa4
("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU").
Fixes: 513510ddba96 ("common: dma-mapping: introduce common remapping functions")
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
Greg,
Please merge this patch via your tree (and therefore I haven't added
your ack). Thanks.
Changes since v1:
- Kept the original comment near the dma_common_pages_remap() function
- Added acks/reviews
drivers/base/dma-mapping.c | 33 ++++++++++++++++++++++++---------
1 file changed, 24 insertions(+), 9 deletions(-)
diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
index efd71cf4fdea..ac224de462f6 100644
--- a/drivers/base/dma-mapping.c
+++ b/drivers/base/dma-mapping.c
@@ -273,6 +273,24 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
EXPORT_SYMBOL(dma_common_mmap);
#ifdef CONFIG_MMU
+static struct vm_struct *__dma_common_pages_remap(struct page **pages,
+ size_t size, unsigned long vm_flags, pgprot_t prot,
+ const void *caller)
+{
+ struct vm_struct *area;
+
+ area = get_vm_area_caller(size, vm_flags, caller);
+ if (!area)
+ return NULL;
+
+ if (map_vm_area(area, prot, pages)) {
+ vunmap(area->addr);
+ return NULL;
+ }
+
+ return area;
+}
+
/*
* remaps an array of PAGE_SIZE pages into another vm_area
* Cannot be used in non-sleeping contexts
@@ -283,17 +301,12 @@ void *dma_common_pages_remap(struct page **pages, size_t size,
{
struct vm_struct *area;
- area = get_vm_area_caller(size, vm_flags, caller);
+ area = __dma_common_pages_remap(pages, size, vm_flags, prot, caller);
if (!area)
return NULL;
area->pages = pages;
- if (map_vm_area(area, prot, pages)) {
- vunmap(area->addr);
- return NULL;
- }
-
return area->addr;
}
@@ -308,7 +321,7 @@ void *dma_common_contiguous_remap(struct page *page, size_t size,
{
int i;
struct page **pages;
- void *ptr;
+ struct vm_struct *area;
unsigned long pfn;
pages = kmalloc(sizeof(struct page *) << get_order(size), GFP_KERNEL);
@@ -318,11 +331,13 @@ void *dma_common_contiguous_remap(struct page *page, size_t size,
for (i = 0, pfn = page_to_pfn(page); i < (size >> PAGE_SHIFT); i++)
pages[i] = pfn_to_page(pfn + i);
- ptr = dma_common_pages_remap(pages, size, vm_flags, prot, caller);
+ area = __dma_common_pages_remap(pages, size, vm_flags, prot, caller);
kfree(pages);
- return ptr;
+ if (!area)
+ return NULL;
+ return area->addr;
}
/*
^ permalink raw reply related
* [PATCH] ARM: dts: imx6sx-sdb: Remove cpufreq OPP override
From: Leonard Crestez @ 2017-05-03 14:58 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <76e22a7e-590f-00ee-04f8-87604303eaad@denx.de>
On Wed, 2017-05-03 at 16:26 +0200, Marek Vasut wrote:
> On 05/03/2017 03:57 PM, Shawn Guo wrote:
> >
> > On Tue, Apr 25, 2017 at 07:28:06PM +0200, Marek Vasut wrote:
> > >
> > > On 04/25/2017 07:23 PM, Leonard Crestez wrote:
> > > >
> > > > Anyway, that version also sets the supply for reg_arm and reg_soc. It
> > > > is not necessary for fixing the crash I'm seeing but is good because it
> > > > will result in the minimum voltage on VDD_ARM_SOC_IN rather than a fix
> > > > 1375mv. I tested Marek's patch and it works fine on my rev B board
> > > > (which otherwise fails to boot upstream).
> > > Oh that's nice , thanks ! I don't have SDB and I hacked it up after a
> > > brief discussion with Fabio without even compile-testing it, thus RFC.
> > > Glad to hear it works and thanks for testing it ! Can you add a formal
> > > Tested-by please ?
> > Hi Marek,
> Hi Shawn,
>
> > Thanks for your patch.??But I prefer Leonard's version because: 1) it
> > has a better commit log; 2) it sticks to one-patch-does-one-thing
> > policy.
> 2) It actually fixes a problem with the voltage rails such that the DVFS
> ???works without leaving the system in unstable or dead state. You do
> ???need the second part of my patch if you drop the OPP hackery, without
> ???it the power framework cannot correctly configure the core voltages,
> ???so the patch from Leonard makes things worse.
No, I think there is a misunderstanding here. The second part of your
patch will cause cpufreq poking at LDOs to indirectly adjust the input
from the PMIC to the minimum required (this is LDO target +
min_dropout_uv). Without it by default VDD_ARM_SOC_IN will remain fixed
as 1375mV from boot.
That default value should be high enough for all cpufreq settings.
Setting the LDO parent will cause this voltage to be dynamically
reduced when possible (at low frequencies). This is not required for
proper operation, it is just an optimization to do more of the
regulation in the PMIC instead. It should work just fine without the
second part.
That OPP override exists for "LDO bypass" mode, a feature not present
in upstream. In that mode the internal regulators are set to bypass
mode and voltage is controlled directly from the PMIC. Since VDD_ARM
and VDD_SOC have different minimum requirements but are joined on the
board the OPP voltages are adjusted to max(VDD_ARM, VDD_SOC). If LDOs
are enabled there is no good reason to do this.
I don't care which patch goes in but the effect of the patch should be
clarified.
--?
Regards,
Leonard
^ permalink raw reply
* [PATCH] ARM: dts: imx: add DH2228FV DAC to Gateworks Ventana boards with SPI
From: Tim Harvey @ 2017-05-03 15:04 UTC (permalink / raw)
To: linux-arm-kernel
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
---
arch/arm/boot/dts/imx6qdl-gw52xx.dtsi | 6 ++++++
arch/arm/boot/dts/imx6qdl-gw54xx.dtsi | 6 ++++++
arch/arm/boot/dts/imx6qdl-gw560x.dtsi | 6 ++++++
3 files changed, 18 insertions(+)
diff --git a/arch/arm/boot/dts/imx6qdl-gw52xx.dtsi b/arch/arm/boot/dts/imx6qdl-gw52xx.dtsi
index 91991d6..b5c1a8f 100644
--- a/arch/arm/boot/dts/imx6qdl-gw52xx.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-gw52xx.dtsi
@@ -143,6 +143,12 @@
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_ecspi3>;
status = "okay";
+
+ spidev0: spidev at 0 {
+ compatible = "rohm,dh2228fv";
+ reg = <0>;
+ spi-max-frequency = <60000000>;
+ };
};
&fec {
diff --git a/arch/arm/boot/dts/imx6qdl-gw54xx.dtsi b/arch/arm/boot/dts/imx6qdl-gw54xx.dtsi
index 968fda9..9d26511 100644
--- a/arch/arm/boot/dts/imx6qdl-gw54xx.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-gw54xx.dtsi
@@ -154,6 +154,12 @@
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_ecspi2>;
status = "okay";
+
+ spidev0: spidev at 0 {
+ compatible = "rohm,dh2228fv";
+ reg = <0>;
+ spi-max-frequency = <60000000>;
+ };
};
&fec {
diff --git a/arch/arm/boot/dts/imx6qdl-gw560x.dtsi b/arch/arm/boot/dts/imx6qdl-gw560x.dtsi
index d894dde..bfb63e9 100644
--- a/arch/arm/boot/dts/imx6qdl-gw560x.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-gw560x.dtsi
@@ -208,6 +208,12 @@
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_ecspi3>;
status = "okay";
+
+ spidev0: spidev at 0 {
+ compatible = "rohm,dh2228fv";
+ reg = <0>;
+ spi-max-frequency = <60000000>;
+ };
};
&can1 {
--
2.7.4
^ permalink raw reply related
* [PATCH 1/3] kernel/locking: #include <asm/spinlock.h> in qrwlock.c
From: Geert Uytterhoeven @ 2017-05-03 15:05 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20170503145141.4966-2-ynorov@caviumnetworks.com>
On Wed, May 3, 2017 at 4:51 PM, Yury Norov <ynorov@caviumnetworks.com> wrote:
> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c
> @@ -20,6 +20,7 @@
> #include <linux/cpumask.h>
> #include <linux/percpu.h>
> #include <linux/hardirq.h>
> +#include <asm/spinlock.h>
linux/spinlock.h?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply
* [PATCH 0/6] arm64: inline assembly fixes + cleanup
From: Mark Rutland @ 2017-05-03 15:09 UTC (permalink / raw)
To: linux-arm-kernel
Recent attempts to make our inline assembly more clang-friendly [1,2]
made it clear that we have some latent problems. I've reviewed all the
inline assembly under arch/arm64/, and this series fixes the issues that
I noted.
The series is based on the arm64 for-next/core branch. I've built the
series with a Linaro 15,08 GCC 5.1.1 toolchain. I see no new warnings,
and the result boots happily on Juno R1.
The first four patches address latent bugs, with the final two patches
improving consistency and compatibility with clang. I believe that this
supersedes [2], with the GIC accessor having been fixed up by the recent
sysreg rework.
Thanks,
Mark.
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-April/503535.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-May/504072.html
Mark Rutland (6):
arm64: xchg: hazard against entire exchange variable
arm64: ensure extension of smp_store_release value
arm64: uaccess: ensure extension of access_ok() addr
arm64: armv8_deprecated: ensure extension of addr
arm64: atomic_lse: match asm register sizes
arm64: uaccess: suppress spurious clang warning
arch/arm64/include/asm/atomic_lse.h | 4 ++--
arch/arm64/include/asm/barrier.h | 20 +++++++++++++++-----
arch/arm64/include/asm/cmpxchg.h | 2 +-
arch/arm64/include/asm/uaccess.h | 7 ++++---
arch/arm64/kernel/armv8_deprecated.c | 3 ++-
5 files changed, 24 insertions(+), 12 deletions(-)
--
1.9.1
^ permalink raw reply
* [PATCH 1/6] arm64: xchg: hazard against entire exchange variable
From: Mark Rutland @ 2017-05-03 15:09 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1493824178-7399-1-git-send-email-mark.rutland@arm.com>
The inline assembly in __XCHG_CASE() uses a +Q constraint to hazard
against other accesses to the memory location being exchanged. However,
the pointer passed to the constraint is a u8 pointer, and thus the
hazard only applies to the first byte of the location.
GCC can take advantage of this, assuming that other portions of the
location are unchanged, as demonstrated with the following test case:
union u {
unsigned long l;
unsigned int i[2];
};
unsigned long update_char_hazard(union u *u)
{
unsigned int a, b;
a = u->i[1];
asm ("str %1, %0" : "+Q" (*(char *)&u->l) : "r" (0UL));
b = u->i[1];
return a ^ b;
}
unsigned long update_long_hazard(union u *u)
{
unsigned int a, b;
a = u->i[1];
asm ("str %1, %0" : "+Q" (*(long *)&u->l) : "r" (0UL));
b = u->i[1];
return a ^ b;
}
The linaro 15.08 GCC 5.1.1 toolchain compiles the above as follows when
using -O2 or above:
0000000000000000 <update_char_hazard>:
0: d2800001 mov x1, #0x0 // #0
4: f9000001 str x1, [x0]
8: d2800000 mov x0, #0x0 // #0
c: d65f03c0 ret
0000000000000010 <update_long_hazard>:
10: b9400401 ldr w1, [x0,#4]
14: d2800002 mov x2, #0x0 // #0
18: f9000002 str x2, [x0]
1c: b9400400 ldr w0, [x0,#4]
20: 4a000020 eor w0, w1, w0
24: d65f03c0 ret
This patch fixes the issue by passing an unsigned long pointer into the
+Q constraint, as we do for our cmpxchg code. This may hazard against
more than is necessary, but this is better than missing a necessary
hazard.
Fixes: 305d454aaa292be3 ("arm64: atomics: implement native {relaxed, acquire, release} atomics")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
arch/arm64/include/asm/cmpxchg.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/cmpxchg.h b/arch/arm64/include/asm/cmpxchg.h
index 91b26d2..ae852ad 100644
--- a/arch/arm64/include/asm/cmpxchg.h
+++ b/arch/arm64/include/asm/cmpxchg.h
@@ -46,7 +46,7 @@
" swp" #acq_lse #rel #sz "\t%" #w "3, %" #w "0, %2\n" \
__nops(3) \
" " #nop_lse) \
- : "=&r" (ret), "=&r" (tmp), "+Q" (*(u8 *)ptr) \
+ : "=&r" (ret), "=&r" (tmp), "+Q" (*(unsigned long *)ptr) \
: "r" (x) \
: cl); \
\
--
1.9.1
^ permalink raw reply related
* [PATCH 2/6] arm64: ensure extension of smp_store_release value
From: Mark Rutland @ 2017-05-03 15:09 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1493824178-7399-1-git-send-email-mark.rutland@arm.com>
When an inline assembly operand's type is narrower than the register it
is allocated to, the least significant bits of the register (up to the
operand type's width) are valid, and any other bits are permitted to
contain any arbitrary value. This aligns with the AAPCS64 parameter
passing rules.
Our __smp_store_release() implementation does not account for this, and
implicitly assumes that operands have been zero-extended to the width of
the type being stored to. Thus, we may store unknown values to memory
when the value type is narrower than the pointer type (e.g. when storing
a char to a long).
This patch fixes the issue by casting the value operand to the same
width as the pointer operand in all cases, which ensures that the value
is zero-extended as we expect. We use the same union trickery as
__smp_load_acquire and {READ,WRITE}_ONCE() to avoid GCC complaining that
pointers are potentially cast to narrower width integers in unreachable
paths.
A whitespace issue at the top of __smp_store_release() is also
corrected.
No changes are necessary for __smp_load_acquire(). Load instructions
implicitly clear any upper bits of the register, and the compiler will
only consider the least significant bits of the register as valid
regardless.
Fixes: 47933ad41a86a4a9 ("arch: Introduce smp_load_acquire(), smp_store_release()")
Fixes: 878a84d5a8a18a4a ("arm64: add missing data types in smp_load_acquire/smp_store_release")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Will Deacon <will.deacon@arm.com>
---
arch/arm64/include/asm/barrier.h | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 4e0497f..0fe7e43 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -42,25 +42,35 @@
#define __smp_rmb() dmb(ishld)
#define __smp_wmb() dmb(ishst)
-#define __smp_store_release(p, v) \
+#define __smp_store_release(p, v) \
do { \
+ union { typeof(*p) __val; char __c[1]; } __u = \
+ { .__val = (__force typeof(*p)) (v) }; \
compiletime_assert_atomic_type(*p); \
switch (sizeof(*p)) { \
case 1: \
asm volatile ("stlrb %w1, %0" \
- : "=Q" (*p) : "r" (v) : "memory"); \
+ : "=Q" (*p) \
+ : "r" (*(__u8 *)__u.__c) \
+ : "memory"); \
break; \
case 2: \
asm volatile ("stlrh %w1, %0" \
- : "=Q" (*p) : "r" (v) : "memory"); \
+ : "=Q" (*p) \
+ : "r" (*(__u16 *)__u.__c) \
+ : "memory"); \
break; \
case 4: \
asm volatile ("stlr %w1, %0" \
- : "=Q" (*p) : "r" (v) : "memory"); \
+ : "=Q" (*p) \
+ : "r" (*(__u32 *)__u.__c) \
+ : "memory"); \
break; \
case 8: \
asm volatile ("stlr %1, %0" \
- : "=Q" (*p) : "r" (v) : "memory"); \
+ : "=Q" (*p) \
+ : "r" (*(__u64 *)__u.__c) \
+ : "memory"); \
break; \
} \
} while (0)
--
1.9.1
^ permalink raw reply related
* [PATCH 3/6] arm64: uaccess: ensure extension of access_ok() addr
From: Mark Rutland @ 2017-05-03 15:09 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1493824178-7399-1-git-send-email-mark.rutland@arm.com>
Our access_ok() simply hands its arguments over to __range_ok(), which
implicitly assummes that the addr parameter is 64 bits wide. This isn't
necessarily true for compat code, which might pass down a 32-bit address
parameter.
In these cases, we don't have a guarantee that the address has been zero
extended to 64 bits, and the upper bits of the register may contain
unknown values, potentially resulting in a suprious failure.
Avoid this by explicitly casting the addr parameter to an unsigned long
(as is done on other architectures), ensuring that the parameter is
widened appropriately.
Fixes: 0aea86a2176c2264 ("arm64: User access library functions")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
arch/arm64/include/asm/uaccess.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 5308d69..ed3ecf1 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -95,11 +95,12 @@ static inline void set_fs(mm_segment_t fs)
*/
#define __range_ok(addr, size) \
({ \
+ unsigned long __addr = (unsigned long __force)(addr); \
unsigned long flag, roksum; \
__chk_user_ptr(addr); \
asm("adds %1, %1, %3; ccmp %1, %4, #2, cc; cset %0, ls" \
: "=&r" (flag), "=&r" (roksum) \
- : "1" (addr), "Ir" (size), \
+ : "1" (__addr), "Ir" (size), \
"r" (current_thread_info()->addr_limit) \
: "cc"); \
flag; \
--
1.9.1
^ permalink raw reply related
* [PATCH 4/6] arm64: armv8_deprecated: ensure extension of addr
From: Mark Rutland @ 2017-05-03 15:09 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1493824178-7399-1-git-send-email-mark.rutland@arm.com>
Our compat swp emulation holds the compat user address in an unsigned
int, which it passes to __user_swpX_asm(). When a 32-bit value is passed
in a register, the upper 32 bits of the register are unknown, and we
must extend the value to 64 bits before we can use it as a base address.
This patch casts the address to unsigned long to ensure it has been
suitably extended, avoiding the potential issue, and silencing a related
warning from clang.
Fixes: bd35a4adc4131c53 ("arm64: Port SWP/SWPB emulation support from arm")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Punit Agrawal <punit.agrawal@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
arch/arm64/kernel/armv8_deprecated.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index 657977e..f0e6d71 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -306,7 +306,8 @@ static void __init register_insn_emulation_sysctl(struct ctl_table *table)
_ASM_EXTABLE(0b, 4b) \
_ASM_EXTABLE(1b, 4b) \
: "=&r" (res), "+r" (data), "=&r" (temp), "=&r" (temp2) \
- : "r" (addr), "i" (-EAGAIN), "i" (-EFAULT), \
+ : "r" ((unsigned long)addr), "i" (-EAGAIN), \
+ "i" (-EFAULT), \
"i" (__SWP_LL_SC_LOOPS) \
: "memory"); \
uaccess_disable(); \
--
1.9.1
^ permalink raw reply related
* [PATCH 5/6] arm64: atomic_lse: match asm register sizes
From: Mark Rutland @ 2017-05-03 15:09 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1493824178-7399-1-git-send-email-mark.rutland@arm.com>
The LSE atomic code uses asm register variables to ensure that
parameters are allocated in specific registers. In the majority of cases
we specifically ask for an x register when using 64-bit values, but in a
couple of cases we use a w regsiter for a 64-bit value.
For asm register variables, the compiler only cares about the register
index, with wN and xN having the same meaning. The compiler determines
the register size to use based on the type of the variable. Thus, this
inconsistency is merely confusing, and not harmful to code generation.
For consistency, this patch updates those cases to use the x register
alias. There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
arch/arm64/include/asm/atomic_lse.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index 7457ce0..99fa69c 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -322,7 +322,7 @@ static inline void atomic64_and(long i, atomic64_t *v)
#define ATOMIC64_FETCH_OP_AND(name, mb, cl...) \
static inline long atomic64_fetch_and##name(long i, atomic64_t *v) \
{ \
- register long x0 asm ("w0") = i; \
+ register long x0 asm ("x0") = i; \
register atomic64_t *x1 asm ("x1") = v; \
\
asm volatile(ARM64_LSE_ATOMIC_INSN( \
@@ -394,7 +394,7 @@ static inline void atomic64_sub(long i, atomic64_t *v)
#define ATOMIC64_FETCH_OP_SUB(name, mb, cl...) \
static inline long atomic64_fetch_sub##name(long i, atomic64_t *v) \
{ \
- register long x0 asm ("w0") = i; \
+ register long x0 asm ("x0") = i; \
register atomic64_t *x1 asm ("x1") = v; \
\
asm volatile(ARM64_LSE_ATOMIC_INSN( \
--
1.9.1
^ permalink raw reply related
* [PATCH 6/6] arm64: uaccess: suppress spurious clang warning
From: Mark Rutland @ 2017-05-03 15:09 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1493824178-7399-1-git-send-email-mark.rutland@arm.com>
Clang tries to warn when there's a mismatch between an operand's size,
and the size of the register it is held in, as this may indicate a bug.
Specifically, clang warns when the operand's type is less than 64 bits
wide, and the register is used unqualified (i.e. %N rather than %xN or
%wN).
Unfortunately clang can generate these warnings for unreachable code.
For example, for code like:
do { \
typeof(*(ptr)) __v = (v); \
switch(sizeof(*(ptr))) { \
case 1: \
// assume __v is 1 byte wide \
asm ("{op}b %w0" : : "r" (v)); \
break; \
case 8: \
// assume __v is 8 bytes wide \
asm ("{op} %0" : : "r" (v)); \
break; \
}
while (0)
... if op() were passed a char value and pointer to char, clang may
produce a warning for the unreachable case where sizeof(*(ptr)) is 8.
For the same reasons, clang produces warnings when __put_user_err() is
used for types that are less than 64 bits wide.
We could avoid this with a cast to a fixed-width type in each of the
cases. However, GCC will then warn that pointer types are being cast to
mismatched integer sizes (in unreachable paths).
Another option would be to use the same union trickery as we do for
__smp_store_release() and __smp_load_acquire(), but this is fairly
invasive.
Instead, this patch suppresses the clang warning by using an x modifier
in the assembly for the 8 byte case of __put_user_err(). No additional
work is necessary as the value has been cast to typeof(*(ptr)), so the
compiler will have performed any necessary extension for the reachable
case.
For consistency, __get_user_err() is also updated to use the x modifier
for its 8 byte case.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Matthias Kaehlcke <mka@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
arch/arm64/include/asm/uaccess.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index ed3ecf1..2c7822c 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -257,7 +257,7 @@ static inline void uaccess_enable_not_uao(void)
(err), ARM64_HAS_UAO); \
break; \
case 8: \
- __get_user_asm("ldr", "ldtr", "%", __gu_val, (ptr), \
+ __get_user_asm("ldr", "ldtr", "%x", __gu_val, (ptr), \
(err), ARM64_HAS_UAO); \
break; \
default: \
@@ -324,7 +324,7 @@ static inline void uaccess_enable_not_uao(void)
(err), ARM64_HAS_UAO); \
break; \
case 8: \
- __put_user_asm("str", "sttr", "%", __pu_val, (ptr), \
+ __put_user_asm("str", "sttr", "%x", __pu_val, (ptr), \
(err), ARM64_HAS_UAO); \
break; \
default: \
--
1.9.1
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox