* [GIT PULL] arm64: fixes for -rc3
From: Will Deacon @ 2016-10-28 16:01 UTC (permalink / raw)
To: linux-arm-kernel
Hi Linus,
Please pull these three arm64 fixes for -rc3. They're all pretty
straightforward: a couple of NUMA issues from the Huawei folks and a
thinko in __page_to_voff that seems to be benign, but is certainly
better off fixed.
Thanks,
Will
--->8
The following changes since commit 07d9a380680d1c0eb51ef87ff2eab5c994949e69:
Linux 4.9-rc2 (2016-10-23 17:10:14 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git tags/arm64-fixes
for you to fetch changes up to 3fa72fe9c614717d22ae75b84d45f41da65c10fe:
arm64: mm: fix __page_to_voff definition (2016-10-26 18:22:42 +0100)
----------------------------------------------------------------
arm64 fixes:
- Couple of NUMA fixes
- Thinko in __page_to_voff
----------------------------------------------------------------
Hanjun Guo (1):
arm64/numa: fix incorrect log for memory-less node
Neeraj Upadhyay (1):
arm64: mm: fix __page_to_voff definition
Yisheng Xie (1):
arm64/numa: fix pcpu_cpu_distance() to get correct CPU proximity
arch/arm64/include/asm/memory.h | 2 +-
arch/arm64/mm/numa.c | 9 ++++++---
2 files changed, 7 insertions(+), 4 deletions(-)
^ permalink raw reply
* [PATCH v2 3/4] arm64: arch_timer: Work around Erratum Hisilicon-161601
From: Will Deacon @ 2016-10-28 16:00 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1477553651-13428-3-git-send-email-dingtianhong@huawei.com>
Hi Ding,
On Thu, Oct 27, 2016 at 03:34:10PM +0800, Ding Tianhong wrote:
> Erratum Hisilicon-161601 says that the ARM generic timer counter "has the
> potential to contain an erroneous value when the timer value changes".
> Accesses to TVAL (both read and write) are also affected due to the implicit counter
> read. Accesses to CVAL are not affected.
>
> The workaround is to reread the system count registers until the value of the second
> read is larger than the first one by less than 32, the system counter can be guaranteed
> not to return wrong value twice by back-to-back read and the error value is always larger
> than the correct one by 32. Writes to TVAL are replaced with an equivalent write to CVAL.
>
> The workaround is enabled if the hisilicon,erratum-161601 property is found in
> the timer node in the device tree. This can be overridden with the
> clocksource.arm_arch_timer.hisilicon-161601 boot parameter, which allows KVM
> users to enable the workaround until a mechanism is implemented to
> automatically communicate this information.
>
> Fix some description for fsl erratum a008585.
>
> v2: Significant rework based on feedback, including seperate the fsl erratum a008585
> to another patch, update the erratum name and remove unwanted code.
>
> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
> ---
[...]
> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
> index 8a753fd..4aafb6a 100644
> --- a/drivers/clocksource/Kconfig
> +++ b/drivers/clocksource/Kconfig
> @@ -312,8 +312,20 @@ config FSL_ERRATUM_A008585
> help
> This option enables a workaround for Freescale/NXP Erratum
> A-008585 ("ARM generic timer may contain an erroneous
> - value"). The workaround will only be active if the
> + value"). The workaround will be active if the
> fsl,erratum-a008585 property is found in the timer node.
> + This can be overridden with the clocksource.arm_arch_timer.fsl-a008585
> + boot parameter.
> +
> +config HISILICON_ERRATUM_161601
> + bool "Workaround for Hisilicon Erratum 161601"
> + default y
> + depends on ARM_ARCH_TIMER && ARM64
> + help
> + This option enables a workaround for Hisilicon Erratum
> + 161601. The workaround will be active if the hisilicon,erratum-161601
> + property is found in the timer node. This can be overridden with
> + the clocksource.arm_arch_timer.hisilicon-161601 boot parameter.
I'm really not keen on having a kernel commandline parameter for this.
It's not something we've done for other, similar errata (e.g. CNTFRQ
reporting the wrong value) and I think it's a slippery slope to having
more of these workarounds controlled at boot-time. If you have a board
that is affected by this, it's always going to need the workaround. Why
would you turn it off?
Will
^ permalink raw reply
* [PATCH 0/3] Support userspace irqchip with arch timers
From: Marc Zyngier @ 2016-10-28 15:57 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <7500DEDE-EC21-4287-9B11-04C058A48747@suse.de>
On 28/10/16 16:52, Alexander Graf wrote:
>
>
>> Am 28.10.2016 um 16:38 schrieb Marc Zyngier <marc.zyngier@arm.com>:
>>
>> Alex,
>>
>>> On 30/09/16 20:31, Alexander Graf wrote:
>>>
>>>
>>>> On 30.09.16 17:43, Christoffer Dall wrote:
>>>>> On Fri, Sep 30, 2016 at 05:38:11PM +0200, Alexander Graf wrote:
>>>>>
>>>>>
>>>>>> On 30.09.16 16:54, Alexander Graf wrote:
>>>>>>
>>>>>>
>>>>>>> On 27.09.16 21:08, Christoffer Dall wrote:
>>>>>>> Hi Alex,
>>>>>>>
>>>>>>> Marc and I have been looking at this during Linaro connect and have
>>>>>>> slightly reworked your patch into this small series.
>>>>>>>
>>>>>>> It would be good if you could have a look at it and test it out.
>>>>>>>
>>>>>>> I've tested it with your QEMU, and it works for UP, but secondary CPUs
>>>>>>> fail to come up, and it looks like the kernel never gets an IPI for
>>>>>>> those CPUs from userspace. Any chance you're willing to take a look at
>>>>>>> that?
>>>>>>
>>>>>> I still need to see whether I can come up with a prettier solution, but
>>>>>> for now this works:
>>>>>>
>>>>>> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
>>>>>
>>>>> Eh, no, not in i386 code :). But the problem seems to be a missing
>>>>> mpstate sync.
>>>>>
>>>> Yeah, that looked really dodgy. Have you tested it? :)
>>>
>>> This time around tested with the correct command line parameters I hope
>>> :). I'll send a pretty patch later.
>>>
>>> diff --git a/target-arm/kvm.c b/target-arm/kvm.c
>>> index b4c8fe2..b549f00 100644
>>> --- a/target-arm/kvm.c
>>> +++ b/target-arm/kvm.c
>>> @@ -173,6 +173,12 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>>> */
>>> kvm_async_interrupts_allowed = true;
>>>
>>> + /*
>>> + * PSCI wakes up secondary cores, so we always need to
>>> + * have vCPUs waiting in kernel space
>>> + */
>>> + kvm_halt_in_kernel_allowed = true;
>>> +
>>> cap_has_mp_state = kvm_check_extension(s, KVM_CAP_MP_STATE);
>>>
>>> type_register_static(&host_arm_cpu_type_info);
>>
>> What the status of userspace for this thing? Are QEMU patches being
>> posted and reviewed?
>
> I didn't see a notification that the patches were merged. Are they in
> Linus' tree yet? Then I can post enablement to qemu-devel.
I think you got it backward. I have no intention of merging them until I
see a vague consensus on the userspace API, and a set of patches ready
to be merged in QEMU.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply
* [PATCH 0/3] Support userspace irqchip with arch timers
From: Alexander Graf @ 2016-10-28 15:52 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <3928feee-cb23-8635-20bd-4c068b649a1d@arm.com>
> Am 28.10.2016 um 16:38 schrieb Marc Zyngier <marc.zyngier@arm.com>:
>
> Alex,
>
>> On 30/09/16 20:31, Alexander Graf wrote:
>>
>>
>>> On 30.09.16 17:43, Christoffer Dall wrote:
>>>> On Fri, Sep 30, 2016 at 05:38:11PM +0200, Alexander Graf wrote:
>>>>
>>>>
>>>>> On 30.09.16 16:54, Alexander Graf wrote:
>>>>>
>>>>>
>>>>>> On 27.09.16 21:08, Christoffer Dall wrote:
>>>>>> Hi Alex,
>>>>>>
>>>>>> Marc and I have been looking at this during Linaro connect and have
>>>>>> slightly reworked your patch into this small series.
>>>>>>
>>>>>> It would be good if you could have a look at it and test it out.
>>>>>>
>>>>>> I've tested it with your QEMU, and it works for UP, but secondary CPUs
>>>>>> fail to come up, and it looks like the kernel never gets an IPI for
>>>>>> those CPUs from userspace. Any chance you're willing to take a look at
>>>>>> that?
>>>>>
>>>>> I still need to see whether I can come up with a prettier solution, but
>>>>> for now this works:
>>>>>
>>>>> diff --git a/target-i386/kvm.c b/target-i386/kvm.c
>>>>
>>>> Eh, no, not in i386 code :). But the problem seems to be a missing
>>>> mpstate sync.
>>>>
>>> Yeah, that looked really dodgy. Have you tested it? :)
>>
>> This time around tested with the correct command line parameters I hope
>> :). I'll send a pretty patch later.
>>
>> diff --git a/target-arm/kvm.c b/target-arm/kvm.c
>> index b4c8fe2..b549f00 100644
>> --- a/target-arm/kvm.c
>> +++ b/target-arm/kvm.c
>> @@ -173,6 +173,12 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>> */
>> kvm_async_interrupts_allowed = true;
>>
>> + /*
>> + * PSCI wakes up secondary cores, so we always need to
>> + * have vCPUs waiting in kernel space
>> + */
>> + kvm_halt_in_kernel_allowed = true;
>> +
>> cap_has_mp_state = kvm_check_extension(s, KVM_CAP_MP_STATE);
>>
>> type_register_static(&host_arm_cpu_type_info);
>
> What the status of userspace for this thing? Are QEMU patches being
> posted and reviewed?
I didn't see a notification that the patches were merged. Are they in Linus' tree yet? Then I can post enablement to qemu-devel.
Alex
^ permalink raw reply
* [PATCH v2] irqchip/bcm2836: Prevent spurious interrupts
From: Eric Anholt @ 2016-10-28 15:52 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <alpine.DEB.2.20.1610281257230.4896@nanos>
Thomas Gleixner <tglx@linutronix.de> writes:
> On Thu, 27 Oct 2016, Eric Anholt wrote:
>
>> From: Phil Elwell <phil@raspberrypi.org>
>>
>> The old arch-specific IRQ macros included a dsb to ensure the
>> write to clear the mailbox interrupt completed before returning
>> from the interrupt. The BCM2836 irqchip driver needs the same
>> precaution to avoid spurious interrupts.
>
> This is missing a fixes tag. I have no idea when that problem was
> introduced, so I have no way to decide whether this needs to be tagged
> stable or not.
This code has been there since introduction of the driver, so:
Fixes: 1a15aaa998dc ("irqchip: Add bcm2836 interrupt controller for Raspberry Pi 2")
^ permalink raw reply
* [PATCH v6 09/16] drivers: acpi: iort: add support for ARM SMMU platform devices creation
From: Lorenzo Pieralisi @ 2016-10-28 15:50 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20161018160414.1228-10-lorenzo.pieralisi@arm.com>
On Tue, Oct 18, 2016 at 05:04:07PM +0100, Lorenzo Pieralisi wrote:
> In ARM ACPI systems, IOMMU components are specified through static
> IORT table entries. In order to create platform devices for the
> corresponding ARM SMMU components, IORT kernel code should be made
> able to parse IORT table entries and create platform devices
> dynamically.
>
> This patch adds the generic IORT infrastructure required to create
> platform devices for ARM SMMUs.
>
> ARM SMMU versions have different resources requirement therefore this
> patch also introduces an IORT specific structure (ie iort_iommu_config)
> that contains hooks (to be defined when the corresponding ARM SMMU
> driver support is added to the kernel) to be used to define the
> platform devices names, init the IOMMUs, count their resources and
> finally initialize them.
>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Hanjun Guo <hanjun.guo@linaro.org>
> Cc: Tomasz Nowicki <tn@semihalf.com>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> ---
> drivers/acpi/arm64/iort.c | 151 ++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 151 insertions(+)
>
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index 1433de3..2eda2f5 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -19,9 +19,11 @@
> #define pr_fmt(fmt) "ACPI: IORT: " fmt
>
> #include <linux/acpi_iort.h>
> +#include <linux/iommu.h>
> #include <linux/kernel.h>
> #include <linux/list.h>
> #include <linux/pci.h>
> +#include <linux/platform_device.h>
> #include <linux/slab.h>
>
> struct iort_its_msi_chip {
> @@ -457,6 +459,153 @@ struct irq_domain *iort_get_device_domain(struct device *dev, u32 req_id)
> return irq_find_matching_fwnode(handle, DOMAIN_BUS_PCI_MSI);
> }
>
> +struct iort_iommu_config {
> + const char *name;
> + int (*iommu_init)(struct acpi_iort_node *node);
> + bool (*iommu_is_coherent)(struct acpi_iort_node *node);
> + int (*iommu_count_resources)(struct acpi_iort_node *node);
> + void (*iommu_init_resources)(struct resource *res,
> + struct acpi_iort_node *node);
> +};
> +
> +static __init
> +const struct iort_iommu_config *iort_get_iommu_cfg(struct acpi_iort_node *node)
> +{
> + return NULL;
> +}
> +
> +/**
> + * iort_add_smmu_platform_device() - Allocate a platform device for SMMU
> + * @node: Pointer to SMMU ACPI IORT node
> + *
> + * Returns: 0 on success, <0 failure
> + */
> +static int __init iort_add_smmu_platform_device(struct acpi_iort_node *node)
> +{
> + struct fwnode_handle *fwnode;
> + struct platform_device *pdev;
> + struct resource *r;
> + enum dev_dma_attr attr;
> + int ret, count;
> + const struct iort_iommu_config *ops = iort_get_iommu_cfg(node);
> +
> + if (!ops)
> + return -ENODEV;
> +
> + pdev = platform_device_alloc(ops->name, PLATFORM_DEVID_AUTO);
> + if (!pdev)
> + return PTR_ERR(pdev);
> +
> + count = ops->iommu_count_resources(node);
> +
> + r = kcalloc(count, sizeof(*r), GFP_KERNEL);
> + if (!r) {
> + ret = -ENOMEM;
> + goto dev_put;
> + }
> +
> + ops->iommu_init_resources(r, node);
> +
> + ret = platform_device_add_resources(pdev, r, count);
> + /*
> + * Resources are duplicated in platform_device_add_resources,
> + * free their allocated memory
> + */
> + kfree(r);
> +
> + if (ret)
> + goto dev_put;
> +
> + /*
> + * Add a copy of IORT node pointer to platform_data to
> + * be used to retrieve IORT data information.
> + */
> + ret = platform_device_add_data(pdev, &node, sizeof(node));
> + if (ret)
> + goto dev_put;
> +
> + /*
> + * We expect the dma masks to be equivalent for
> + * all SMMUs set-ups
> + */
> + pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask;
> +
> + fwnode = iort_get_fwnode(node);
> +
> + if (!fwnode) {
> + ret = -ENODEV;
> + goto dev_put;
> + }
> +
> + pdev->dev.fwnode = fwnode;
> +
> + attr = ops->iommu_is_coherent(node) ?
> + DEV_DMA_COHERENT : DEV_DMA_NON_COHERENT;
> +
> + /* Configure DMA for the page table walker */
> + acpi_dma_configure(&pdev->dev, attr);
> +
> + ret = platform_device_add(pdev);
> + if (ret)
> + goto dma_deconfigure;
> +
> + return 0;
> +
> +dma_deconfigure:
> + acpi_dma_deconfigure(&pdev->dev);
> +dev_put:
> + platform_device_put(pdev);
> +
> + return ret;
> +}
> +
> +static void __init iort_init_platform_devices(void)
> +{
> + struct acpi_iort_node *iort_node, *iort_end;
> + struct acpi_table_iort *iort;
> + struct fwnode_handle *fwnode;
> + int i, ret;
> +
> + /*
> + * iort_table and iort both point to the start of IORT table, but
> + * have different struct types
> + */
> + iort = (struct acpi_table_iort *)iort_table;
> +
> + /* Get the first IORT node */
> + iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort,
> + iort->node_offset);
> + iort_end = ACPI_ADD_PTR(struct acpi_iort_node, iort,
> + iort_table->length);
> +
> + for (i = 0; i < iort->node_count; i++) {
> + if (iort_node >= iort_end) {
> + pr_err("iort node pointer overflows, bad table\n");
> + return;
> + }
> +
> + if ((iort_node->type == ACPI_IORT_NODE_SMMU) ||
> + (iort_node->type == ACPI_IORT_NODE_SMMU_V3)) {
> +
> + fwnode = acpi_alloc_fwnode_static();
> + if (!fwnode)
> + return;
> +
> + iort_set_fwnode(iort_node, fwnode);
> +
> + ret = iort_add_smmu_platform_device(iort_node);
> + if (ret) {
> + iort_delete_fwnode(iort_node);
> + acpi_free_fwnode_static(fwnode);
> + return;
> + }
> + }
> +
> + iort_node = ACPI_ADD_PTR(struct acpi_iort_node, iort_node,
> + iort_node->length);
> + }
> +}
> +
> void __init acpi_iort_init(void)
> {
> acpi_status status;
> @@ -468,5 +617,7 @@ void __init acpi_iort_init(void)
> return;
> }
>
Slipped through the cracks while rebasing v6, I should add code that
returns if no IORT table found here to prevent calling:
iort_init_platform_devices()
I will update the next version, it is just a heads-up for testers,
I can push an updated/fixed branch if needed.
Thanks,
Lorenzo
> + iort_init_platform_devices();
> +
> acpi_probe_device_table(iort);
> }
> --
> 2.10.0
>
^ permalink raw reply
* [PATCH] fpga zynq: Check the bitstream for validity
From: Jason Gunthorpe @ 2016-10-28 15:48 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <7950dec2-cc69-6304-ea2d-92b1a890214e@suse.com>
On Fri, Oct 28, 2016 at 01:12:06PM +0200, Matthias Brugger wrote:
> >Zynq is also only able to DMA dword quantities, so bitstreams must be
> >a multiple of 4 bytes. This also fixes a DMA-past the end bug.
>
> The you can also fix the transfer_length calculation in zynq_fpga_ops_write,
> as we don't allow buffers which are not a multiple of 4.
Didn't I do that? Did you see something else to change in the dma
part?
Jason
^ permalink raw reply
* [PATCH] fpga zynq: Check the bitstream for validity
From: Jason Gunthorpe @ 2016-10-28 15:47 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <8bed213a-96e3-1891-a46a-234253a2561e@suse.com>
On Fri, Oct 28, 2016 at 01:06:08PM +0200, Matthias Brugger wrote:
> The only case we don't check is, if count == 0. If we check that here, we
> can get rid of the count <= 4 check.
You don't think
if (count == 0 || buf[3] = 'x')
looks weird and wrong? I do.
> >The count <= 4 should stay here since it is primarily guarding against
> >read past the buffer in the if.
>
> If you insist in doing this check, it should be count < 4, because we check
> the first four elements of buf, or do I miss something?
count = 4 and count = 0 are both invalid. A bitstream consisting of
only the sync word is also going to fail programming.
As Michal said, the actual min bitstream length is probably >> 50 bytes
Jason
^ permalink raw reply
* [PATCH v12 RESEND 0/4] generic TEE subsystem
From: Andrew F. Davis @ 2016-10-28 15:43 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1477649984-16777-1-git-send-email-jens.wiklander@linaro.org>
On 10/28/2016 05:19 AM, Jens Wiklander wrote:
> Hi,
>
> This patch set introduces a generic TEE subsystem. The TEE subsystem will
> contain drivers for various TEE implementations. A TEE (Trusted Execution
> Environment) is a trusted OS running in some secure environment, for
> example, TrustZone on ARM CPUs, or a separate secure co-processor etc.
>
> Regarding use cases, TrustZone has traditionally been used for
> offloading secure tasks to the secure world. Examples include:
> - Secure key handling where the OS may or may not have direct access to key
> material.
> - E-commerce and payment technologies. Credentials, credit card numbers etc
> could be stored in a more secure environment.
> - Trusted User Interface (TUI) to ensure that no-one can snoop PIN-codes
> etc.
> - Secure boot to ensure that loaded binaries haven?t been tampered with.
> It?s not strictly needed for secure boot, but you could enhance security
> by leveraging a TEE during boot.
> - Digital Rights Management (DRM), the studios provides content with
> different resolution depending on the security of the device. Higher
> security means higher resolution.
>
> A TEE could also be used in existing and new technologies. For example IMA
> (Integrity Measurement Architecture) which has been in the kernel for quite
> a while. Today you can enhance security by using a TPM-chip to sign the IMA
> measurement list. This is something that you also could do by leveraging a
> TEE.
>
> Another example could be in 2-factor authentication which is becoming
> increasingly more important. FIDO (https://fidoalliance.org) for example
> are using public key cryptography in their 2-factor authentication standard
> (U2F). With FIDO, a private and public key pair will be generated for every
> site you visit and the private key should never leave the local device.
> This is an example where you could use secure storage in a TEE for the
> private key.
>
> Today you will find a quite a few different out of tree implementations of
> TEE drivers which tends to fragment the TEE ecosystem and development. We
> think it would be a good idea to have a generic TEE driver integrated in
> the kernel which would serve as a base for several different TEE solutions,
> no matter if they are on-chip like TrustZone or if they are on a separate
> crypto co-processor.
>
> To develop this TEE subsystem we have been using the open source TEE called
> OP-TEE (https://github.com/OP-TEE/optee_os) and therefore this would be the
> first TEE solution supported by this new subsystem. OP-TEE is a
> GlobalPlatform compliant TEE, however this TEE subsystem is not limited to
> only GlobalPlatform TEEs, instead we have tried to design it so that it
> should work with other TEE solutions also.
>
The above is my biggest concern with this whole subsystem, to me it
still feels very OPTEE specific. As much as I would love to believe
OPTEE will be the end-all TEE, I'm sure we soon will start to see wider
use of vendor TEEs (like TI's own legacy Trustzone thing we are hoping
to depreciate with OPTEE moving forward), possibly Google's Trusty TEE,
and whatever Intel/AMD are cooking up for x86.
As we all know when things are upstreamed we lose the ability to make
radical changes easily, especially to full subsystems. What happens when
this framework, built with only one existing TEE, built by the one
existing TEE's devs, is not as flexible as we need when other TEEs start
rolling out?
Do we see this as a chicken and egg situation, or is there any harm
beyond the pains of supporting an out-of-tree driver for a while, to
wait until we have at least one other TEE to add to this subsystem
before merging?
This may also help with the perceived lack of reviewers for this series.
Thanks,
Andrew
> "tee: generic TEE subsystem" brings in the generic TEE subsystem which
> helps when writing a driver for a specific TEE, for example, OP-TEE.
>
> "tee: add OP-TEE driver" is an OP-TEE driver which uses the subsystem to do
> its work.
>
> This patch set has been prepared in cooperation with Javier Gonz?lez who
> proposed "Generic TrustZone Driver in Linux Kernel" patches 28 Nov 2014,
> https://lwn.net/Articles/623380/ . We've since then changed the scope to
> TEE instead of TrustZone.
>
> We have discussed the design on tee-dev at lists.linaro.org (archive at
> https://lists.linaro.org/pipermail/tee-dev/) with people from other
> companies, including Valentin Manea <valentin.manea@huawei.com>,
> Emmanuel MICHEL <emmanuel.michel@st.com>,
> Jean-michel DELORME <jean-michel.delorme@st.com>,
> and Joakim Bech <joakim.bech@linaro.org>. Our main concern has been to
> agree on something that is generic enough to support many different
> TEEs while still keeping the interface together.
>
> v12-resend:
> * Rebased on v4.9-rc2
>
> v12:
> * Rebased on v4.8-rc5
> * Addressed review comments from Andrew F. Davis
> * Removed Acked-by: Andreas Dannenberg <dannenberg@ti.com> as the
> mail bounces
> * Bugfix possible null dereference in error cleanup path of
> optee_probe().
> * Bugfix optee_from_msg_param() when calculating offset of memref
> into a shared memory object
>
> v11:
> * Rebased on v4.8-rc3
> * Addressed review comments from Nishanth Menon
> * Made the TEE framework available as a loadable module.
> * Reviewed-by: Javier Gonz?lez <javier@javigon.com>
> * Zeroes shared memory on allocation to avoid information leakage
> * Links shared memory objects to context to avoid stealing of shared memory
> object from an unrelated process
> * Allow RPC interruption if supplicant is unavailable
>
> v10:
> * Rebased on v4.7-rc1
> * Addressed private review comments from Nishanth Menon
> * Optee driver only accepts one supplicant process on the privileged device
> * Optee driver avoids long delayed releases of shm objects
> * Added more comments on functions and structs
>
> v9:
> * Rebased on v4.6-rc1
> * Acked-by: Andreas Dannenberg <dannenberg@ti.com>
> * Addressed comments from Al Viro how file descriptors are passed to
> user space
> * Addressed comments from Randy Dunlap on documentation
> * Changed license for include/uapi/linux/tee.h
>
> v8:
> * Rebased on v4.5-rc3
> * dt/bindings: add bindings for optee
> Acked-by: Rob Herring <robh@kernel.org>
> * Fixes build error for X86
> * Fixes spell error in "dt/bindings: add bindings for optee"
>
> v7:
> * Rebased on v4.5-rc2
> * Moved the ARM SMC Calling Convention support into a separate patch
> set, which is now merged
>
> v6:
> * Rebased on v4.3-rc7
> * Changed smccc interface to let the compiler marshal most of the
> parameters
> * Added ARCH64 capability for smccc interface
> * Changed the PSCI firmware calls (both arm and arm64) to use the new
> generic smccc interface instead instead of own assembly functions.
> * Move optee DT bindings to below arm/firmware
> * Defines method for OP-TEE driver to call secure world in DT, smc or hvc
> * Exposes implementation id of a TEE driver in sysfs
> to easily spawn corresponding tee-supplicant when device is ready
> * Update OP-TEE Message Protocol to better cope with fragmented physical
> memory
> * Read time directly from OP-TEE driver instead of forwarding the RPC
> request to tee-supplicant
>
> v5:
> * Replaced kref reference counting for the device with a size_t instead as
> the counter is always protected by a mutex
>
> v4:
> * Rebased on 4.1
> * Redesigned the synchronization around entry exit of normal SMC
> * Replaced rwsem on the driver instance with kref and completion since
> rwsem wasn't intended to be used in this way
> * Expanded the TEE_IOCTL_PARAM_ATTR_TYPE_MASK to make room for
> future additional parameter types
> * Documents TEE subsystem and OP-TEE driver
> * Replaced TEE_IOC_CMD with TEE_IOC_OPEN_SESSION, TEE_IOC_INVOKE,
> TEE_IOC_CANCEL and TEE_IOC_CLOSE_SESSION
> * DT bindings in a separate patch
> * Assembly parts moved to arch/arm and arch/arm64 respectively, in a
> separate patch
> * Redefined/clarified the meaning of OPTEE_SMC_SHM_CACHED
> * Removed CMA usage to limit the scope of the patch set
>
> v3:
> * Rebased on 4.1-rc3 (dma_buf_export() API change)
> * A couple of small sparse fixes
> * Documents bindings for OP-TEE driver
> * Updated MAINTAINERS
>
> v2:
> * Replaced the stubbed OP-TEE driver with a real OP-TEE driver
> * Removed most APIs not needed by OP-TEE in current state
> * Update Documentation/ioctl/ioctl-number.txt with correct path to tee.h
> * Rename tee_shm_pool_alloc_cma() to tee_shm_pool_alloc()
> * Moved tee.h into include/uapi/linux/
> * Redefined tee.h IOCTL macros to be directly based on _IOR and friends
> * Removed version info on the API to user space, a data blob which
> can contain an UUID is left for user space to be able to tell which
> protocol to use in TEE_IOC_CMD
> * Changed user space exposed structures to only have types with __ prefix
> * Dropped THIS_MODULE from tee_fops
> * Reworked how the driver is registered and ref counted:
> - moved from using an embedded struct miscdevice to an embedded struct
> device.
> - uses an struct rw_semaphore as synchronization for driver detachment
> - uses alloc/register pattern from TPM
>
> Thanks,
> Jens
>
> Jens Wiklander (4):
> dt/bindings: add bindings for optee
> tee: generic TEE subsystem
> tee: add OP-TEE driver
> Documentation: tee subsystem and op-tee driver
>
> Documentation/00-INDEX | 2 +
> .../bindings/arm/firmware/linaro,optee-tz.txt | 31 +
> .../devicetree/bindings/vendor-prefixes.txt | 1 +
> Documentation/ioctl/ioctl-number.txt | 1 +
> Documentation/tee.txt | 118 +++
> MAINTAINERS | 13 +
> drivers/Kconfig | 2 +
> drivers/Makefile | 1 +
> drivers/tee/Kconfig | 18 +
> drivers/tee/Makefile | 5 +
> drivers/tee/optee/Kconfig | 7 +
> drivers/tee/optee/Makefile | 5 +
> drivers/tee/optee/call.c | 435 ++++++++++
> drivers/tee/optee/core.c | 598 ++++++++++++++
> drivers/tee/optee/optee_msg.h | 435 ++++++++++
> drivers/tee/optee/optee_private.h | 185 +++++
> drivers/tee/optee/optee_smc.h | 446 ++++++++++
> drivers/tee/optee/rpc.c | 404 +++++++++
> drivers/tee/optee/supp.c | 273 +++++++
> drivers/tee/tee_core.c | 901 +++++++++++++++++++++
> drivers/tee/tee_private.h | 129 +++
> drivers/tee/tee_shm.c | 357 ++++++++
> drivers/tee/tee_shm_pool.c | 158 ++++
> include/linux/tee_drv.h | 278 +++++++
> include/uapi/linux/tee.h | 403 +++++++++
> 25 files changed, 5206 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/arm/firmware/linaro,optee-tz.txt
> create mode 100644 Documentation/tee.txt
> create mode 100644 drivers/tee/Kconfig
> create mode 100644 drivers/tee/Makefile
> create mode 100644 drivers/tee/optee/Kconfig
> create mode 100644 drivers/tee/optee/Makefile
> create mode 100644 drivers/tee/optee/call.c
> create mode 100644 drivers/tee/optee/core.c
> create mode 100644 drivers/tee/optee/optee_msg.h
> create mode 100644 drivers/tee/optee/optee_private.h
> create mode 100644 drivers/tee/optee/optee_smc.h
> create mode 100644 drivers/tee/optee/rpc.c
> create mode 100644 drivers/tee/optee/supp.c
> create mode 100644 drivers/tee/tee_core.c
> create mode 100644 drivers/tee/tee_private.h
> create mode 100644 drivers/tee/tee_shm.c
> create mode 100644 drivers/tee/tee_shm_pool.c
> create mode 100644 include/linux/tee_drv.h
> create mode 100644 include/uapi/linux/tee.h
>
^ permalink raw reply
* [PATCH v3 0/5] Cavium ThunderX uncore PMU support
From: Mark Rutland @ 2016-10-28 15:42 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20161028153646.GA11371@hardcore>
On Fri, Oct 28, 2016 at 05:36:46PM +0200, Jan Glauber wrote:
> On Fri, Oct 28, 2016 at 04:17:49PM +0100, Will Deacon wrote:
> > On Thu, Oct 20, 2016 at 01:23:51PM +0200, Jan Glauber wrote:
> > > On Thu, Oct 20, 2016 at 12:37:07PM +0200, Peter Zijlstra wrote:
> > > > On Thu, Oct 20, 2016 at 11:30:36AM +0200, Jan Glauber wrote:
> > > > > Note:
> > > > > I'm using perf_sw_context in difference to perf_invalid_context
> > > > > (see WARN_ON in perf_pmu_register). Reason is that with perf_invalid_context
> > > > > add() is never called and the counter results are shown as "unsupported" by
> > > > > perf. With perf_sw_context everything works as expected.
> > > >
> > > > What?! All the uncore PMUs use perf_invalid_context. What doesn't work
> > > > for you?
> > >
> > > OK, so using perf_invalid_context and "-a" seems to work.
> > >
> > > But I must say that I hate that from a user perspective. The user needs to know about
> > > the type of PMU behind the event and then provide "-a" or get a "<not supported"
> > > as counter value?
> >
> > Sure, but in the interest of getting *something* merged, can we start
> > off using perf_invalid_context and then have the discussion about whether
> > or not this can be extended later on, please? If your PMU is a shared
> > resource amongst CPUs, it maybe that all you want is a better error
> > message from the perf tool (but again, this can come later!).
>
> If that is the only obstacle I can repost with perf_sw_context (or do a
> follow-up patch). After all it works, it is just "clueless" people like
> me that are not aware of the required switches.
Please send a version using perf_invalid_context (which I assume is what
you meant above).
Thanks,
Mark.
^ permalink raw reply
* [PATCH] staging: vc04_services: tie up loose ends with dma_map_sg conversion
From: Greg KH @ 2016-10-28 15:42 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1477668994.12378.1.camel@crowfest.net>
On Fri, Oct 28, 2016 at 08:36:34AM -0700, Michael Zoran wrote:
> On Fri, 2016-10-28 at 11:31 -0400, Greg KH wrote:
> > On Fri, Oct 28, 2016 at 08:16:51AM -0700, Michael Zoran wrote:
> > > The conversion to dma_map_sg left a few loose ends.??This change
> > > ties up those loose ends.
> > >
> > > 1. Settings the DMA mask is mandatory on 64 bit even though it
> > > is optional on 32 bit.??Set the mask so that DMA space is always
> > > in the lower 32 bit range of data on both platforms.
> > >
> > > 2. The scatterlist was not completely initialized correctly.
> > > Initialize the list with sg_init_table so that DMA works correctly
> > > if scatterlist debugging is enabled in the build configuration.
> > >
> > > 3. The error paths in create_pagelist were not consistent.??Make
> > > them all consistent by calling a helper function called
> > > cleanup_pagelistinfo to cleanup regardless of what state the
> > > pagelist
> > > is in.
> > >
> > > 4. create_pagelist and free_pagelist had a very large amount of
> > > duplication in computing offsets into a single allocation of memory
> > > in the DMA area.??Introduce a new structure called the pagelistinfo
> > > that is appened to the end of the allocation to store necessary
> > > information to prevent duplication of code and make cleanup on
> > > errors
> > > easier.
> > >
> > > When combined with a fix for vchiq_copy_from_user which is not
> > > included at this time, both functional and pings tests of
> > > vchiq_test
> > > now pass in both 32 bit and 64 bit modes.
> > >
> > > Even though this cleanup could have been broken down to chunks,
> > > all the changes are to a single file and submitting it as a single
> > > related change should make reviewing the diff much easier then if
> > > it
> > > were submitted piecemeal.
> >
> > No, it's harder.??A patch should only do one type of thing, this
> > patch
> > has to be reviewed thinking of 4 different things all at once, making
> > it
> > much more difficult to do so.
> >
> > We write patches to be read easily, and make them easy to review.??We
> > don't write them in a way to be easy for the developer to create :)
> >
> > Can you please break this up into a patch series?
> >
> > thanks,
> >
> > greg k-h
>
> Point #1 and #2 would be very easy to seperate. Point #3 and #4 are
> essentually a redo of two major functions and are where most of the
> changes are.
>
> Would making #1 and #2 independent but combining #3 and #4 sufficient?
I don't know, try it and see what the patches look like.
Think about it from my point of view, which would be easier to review?
thanks,
greg k-h
^ permalink raw reply
* specifying order of /dev/mmcblk devices via device-tree?
From: Mark Rutland @ 2016-10-28 15:37 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <CAJ+vNU1g8nQedd06LeaD10YWUQ_UmdRToUMFKaTNT1aNY56b0g@mail.gmail.com>
On Fri, Oct 28, 2016 at 08:23:04AM -0700, Tim Harvey wrote:
> Greetings,
>
> I have an IMX6 board that has the following:
> sdhc1: mmc0: sdio radio
> sdhc2: mmc1: /dev/mmcblk1: microSD connector
> sdhc3: mmc2: /dev/mmcblk2: on-board eMMC
>
> I would like to have sdhc3 registered as /dev/mmcblk0 and sdhc2
> registered as /dev/mmcblk1 so that permanent storage is the first
> mmcblk device as I think this is more intuitive however currently
> these get instanced in the order they appear in the imx6qdl.dtsi
> device-tree configuration and are not able to be mapped the way I want
> them in my dts file.
>
> Is there a way, or if not is there a desire for a way, to specify the
> order of /dev/mmcblk devices via device-tree?
As with many other devices, there is no standard way of controlling the
Linux enumeration (and given the ID space is shared with other dynamic
devices it's not something that could generally work).
These should be refererd to by UUID if possible.
If not, we could cosider adding a by-dt-path or something like that.
Thanks,
Mark.
^ permalink raw reply
* specifying order of /dev/mmcblk devices via device-tree?
From: Javier Martinez Canillas @ 2016-10-28 15:37 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <CAJ+vNU1g8nQedd06LeaD10YWUQ_UmdRToUMFKaTNT1aNY56b0g@mail.gmail.com>
Hello Tim,
On Fri, Oct 28, 2016 at 12:23 PM, Tim Harvey <tharvey@gateworks.com> wrote:
> Greetings,
>
> I have an IMX6 board that has the following:
> sdhc1: mmc0: sdio radio
> sdhc2: mmc1: /dev/mmcblk1: microSD connector
> sdhc3: mmc2: /dev/mmcblk2: on-board eMMC
>
> I would like to have sdhc3 registered as /dev/mmcblk0 and sdhc2
> registered as /dev/mmcblk1 so that permanent storage is the first
> mmcblk device as I think this is more intuitive however currently
> these get instanced in the order they appear in the imx6qdl.dtsi
> device-tree configuration and are not able to be mapped the way I want
> them in my dts file.
>
One should not rely on a specific MMC block device numbering, since
this could change. Look at commit 9aaf3437aa72 ("mmc: block: Use the
mmc host device index as the mmcblk device index").
> Is there a way, or if not is there a desire for a way, to specify the
> order of /dev/mmcblk devices via device-tree?
>
There was an attempt to use aliasses for this but got nacked:
https://lkml.org/lkml/2016/4/29/610
As Fabio said, you should either use UUID or labels instead:
https://wiki.archlinux.org/index.php/persistent_block_device_naming
> Regards,
>
> Tim
>
Best regards,
Javier
^ permalink raw reply
* [PATCH v3 0/5] Cavium ThunderX uncore PMU support
From: Jan Glauber @ 2016-10-28 15:36 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20161028151749.GG14402@arm.com>
On Fri, Oct 28, 2016 at 04:17:49PM +0100, Will Deacon wrote:
> On Thu, Oct 20, 2016 at 01:23:51PM +0200, Jan Glauber wrote:
> > On Thu, Oct 20, 2016 at 12:37:07PM +0200, Peter Zijlstra wrote:
> > > On Thu, Oct 20, 2016 at 11:30:36AM +0200, Jan Glauber wrote:
> > > > Note:
> > > > I'm using perf_sw_context in difference to perf_invalid_context
> > > > (see WARN_ON in perf_pmu_register). Reason is that with perf_invalid_context
> > > > add() is never called and the counter results are shown as "unsupported" by
> > > > perf. With perf_sw_context everything works as expected.
> > >
> > > What?! All the uncore PMUs use perf_invalid_context. What doesn't work
> > > for you?
> >
> > OK, so using perf_invalid_context and "-a" seems to work.
> >
> > But I must say that I hate that from a user perspective. The user needs to know about
> > the type of PMU behind the event and then provide "-a" or get a "<not supported"
> > as counter value?
>
> Sure, but in the interest of getting *something* merged, can we start
> off using perf_invalid_context and then have the discussion about whether
> or not this can be extended later on, please? If your PMU is a shared
> resource amongst CPUs, it maybe that all you want is a better error
> message from the perf tool (but again, this can come later!).
If that is the only obstacle I can repost with perf_sw_context (or do a
follow-up patch). After all it works, it is just "clueless" people like
me that are not aware of the required switches.
--Jan
> Will
^ permalink raw reply
* [PATCH] staging: vc04_services: tie up loose ends with dma_map_sg conversion
From: Michael Zoran @ 2016-10-28 15:36 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20161028153124.GA29906@kroah.com>
On Fri, 2016-10-28 at 11:31 -0400, Greg KH wrote:
> On Fri, Oct 28, 2016 at 08:16:51AM -0700, Michael Zoran wrote:
> > The conversion to dma_map_sg left a few loose ends.??This change
> > ties up those loose ends.
> >
> > 1. Settings the DMA mask is mandatory on 64 bit even though it
> > is optional on 32 bit.??Set the mask so that DMA space is always
> > in the lower 32 bit range of data on both platforms.
> >
> > 2. The scatterlist was not completely initialized correctly.
> > Initialize the list with sg_init_table so that DMA works correctly
> > if scatterlist debugging is enabled in the build configuration.
> >
> > 3. The error paths in create_pagelist were not consistent.??Make
> > them all consistent by calling a helper function called
> > cleanup_pagelistinfo to cleanup regardless of what state the
> > pagelist
> > is in.
> >
> > 4. create_pagelist and free_pagelist had a very large amount of
> > duplication in computing offsets into a single allocation of memory
> > in the DMA area.??Introduce a new structure called the pagelistinfo
> > that is appened to the end of the allocation to store necessary
> > information to prevent duplication of code and make cleanup on
> > errors
> > easier.
> >
> > When combined with a fix for vchiq_copy_from_user which is not
> > included at this time, both functional and pings tests of
> > vchiq_test
> > now pass in both 32 bit and 64 bit modes.
> >
> > Even though this cleanup could have been broken down to chunks,
> > all the changes are to a single file and submitting it as a single
> > related change should make reviewing the diff much easier then if
> > it
> > were submitted piecemeal.
>
> No, it's harder.??A patch should only do one type of thing, this
> patch
> has to be reviewed thinking of 4 different things all at once, making
> it
> much more difficult to do so.
>
> We write patches to be read easily, and make them easy to review.??We
> don't write them in a way to be easy for the developer to create :)
>
> Can you please break this up into a patch series?
>
> thanks,
>
> greg k-h
Point #1 and #2 would be very easy to seperate. Point #3 and #4 are
essentually a redo of two major functions and are where most of the
changes are.
Would making #1 and #2 independent but combining #3 and #4 sufficient?
^ permalink raw reply
* specifying order of /dev/mmcblk devices via device-tree?
From: Fabio Estevam @ 2016-10-28 15:33 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <CAJ+vNU1g8nQedd06LeaD10YWUQ_UmdRToUMFKaTNT1aNY56b0g@mail.gmail.com>
Hi Tim,
On Fri, Oct 28, 2016 at 1:23 PM, Tim Harvey <tharvey@gateworks.com> wrote:
> Greetings,
>
> I have an IMX6 board that has the following:
> sdhc1: mmc0: sdio radio
> sdhc2: mmc1: /dev/mmcblk1: microSD connector
> sdhc3: mmc2: /dev/mmcblk2: on-board eMMC
>
> I would like to have sdhc3 registered as /dev/mmcblk0 and sdhc2
> registered as /dev/mmcblk1 so that permanent storage is the first
> mmcblk device as I think this is more intuitive however currently
> these get instanced in the order they appear in the imx6qdl.dtsi
> device-tree configuration and are not able to be mapped the way I want
> them in my dts file.
>
> Is there a way, or if not is there a desire for a way, to specify the
> order of /dev/mmcblk devices via device-tree?
You could use UUID to specify the rootfs location on your board.
Please check:
http://git.denx.de/?p=u-boot.git;a=commit;h=ca4f338e2efece5196eb2178e5f7d07be828da6e
^ permalink raw reply
* [PATCH 1/2] PCI: iproc: Support DT property for ignoring aborts when probing
From: Rafał Miłecki @ 2016-10-28 15:31 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <acccb06d-14b9-33bd-466b-47dd3689e537@broadcom.com>
On 20 April 2016 at 20:18, Ray Jui <ray.jui@broadcom.com> wrote:
> Hi Rafal/Florian/Arnd,
>
> After a couple days of email exchange with the ASIC team, I think I've
> figured out the behavior on all of the Broadcom SoCs that use this iProc
> PCIe controller.
>
> On NSP, Cygnus, and NS2:
> - There's an APB error enable register at offset 0xf40 from the iProc PCIe
> controller's base address. If one clears bit 0 (enabled by default after
> chip POR) of that register, one can stop this from being forwarded to "iProc
> host" as an APB error/external imprecise abort
> - I will submit a patch to the iProc PCIe driver to disable this error
> forwarding
>
> On NS:
> - Unfortunately, there's no such control register in NS. In other words, we
> cannot disable this error at the PCIe controller level
> - FSR code corresponds to external (bit[12] = '1'), read (bit[11] = '0'),
> imprecise abort (bits[10][3:0] = '1''0110'), i.e., external imprecise abort
> triggered by read access. Our ASIC team believes a read access to a
> non-exist APB register can also trigger an abort with the same FSR code.
> Note this is the tricky part, by registering an abort hook that skips this
> particular FSR, one has a chance of skipping other aborts triggered by
> accessing invalid APB registers. But given that this cannot be disabled for
> the PCIe controller NS, I'm not sure what approach we should take. Any
> thoughts?
It's really late reply but I wanted to finally handle this problem.
>From Ray's e-mail it seems Northstar is the only platform requiring
this workaround. So we don't have to worry about arm64.
We have two options then:
1) Add workaround in arch/arm/mach-bcm/bcm_5301x.c
2) Add workaround into built-in drivers/pci/host/pcie-iproc-fault.c
Do you have any preference about that?
^ permalink raw reply
* [PATCH] staging: vc04_services: tie up loose ends with dma_map_sg conversion
From: Greg KH @ 2016-10-28 15:31 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20161028151651.3430-1-mzoran@crowfest.net>
On Fri, Oct 28, 2016 at 08:16:51AM -0700, Michael Zoran wrote:
> The conversion to dma_map_sg left a few loose ends. This change
> ties up those loose ends.
>
> 1. Settings the DMA mask is mandatory on 64 bit even though it
> is optional on 32 bit. Set the mask so that DMA space is always
> in the lower 32 bit range of data on both platforms.
>
> 2. The scatterlist was not completely initialized correctly.
> Initialize the list with sg_init_table so that DMA works correctly
> if scatterlist debugging is enabled in the build configuration.
>
> 3. The error paths in create_pagelist were not consistent. Make
> them all consistent by calling a helper function called
> cleanup_pagelistinfo to cleanup regardless of what state the pagelist
> is in.
>
> 4. create_pagelist and free_pagelist had a very large amount of
> duplication in computing offsets into a single allocation of memory
> in the DMA area. Introduce a new structure called the pagelistinfo
> that is appened to the end of the allocation to store necessary
> information to prevent duplication of code and make cleanup on errors
> easier.
>
> When combined with a fix for vchiq_copy_from_user which is not
> included at this time, both functional and pings tests of vchiq_test
> now pass in both 32 bit and 64 bit modes.
>
> Even though this cleanup could have been broken down to chunks,
> all the changes are to a single file and submitting it as a single
> related change should make reviewing the diff much easier then if it
> were submitted piecemeal.
No, it's harder. A patch should only do one type of thing, this patch
has to be reviewed thinking of 4 different things all at once, making it
much more difficult to do so.
We write patches to be read easily, and make them easy to review. We
don't write them in a way to be easy for the developer to create :)
Can you please break this up into a patch series?
thanks,
greg k-h
^ permalink raw reply
* specifying order of /dev/mmcblk devices via device-tree?
From: Tim Harvey @ 2016-10-28 15:23 UTC (permalink / raw)
To: linux-arm-kernel
Greetings,
I have an IMX6 board that has the following:
sdhc1: mmc0: sdio radio
sdhc2: mmc1: /dev/mmcblk1: microSD connector
sdhc3: mmc2: /dev/mmcblk2: on-board eMMC
I would like to have sdhc3 registered as /dev/mmcblk0 and sdhc2
registered as /dev/mmcblk1 so that permanent storage is the first
mmcblk device as I think this is more intuitive however currently
these get instanced in the order they appear in the imx6qdl.dtsi
device-tree configuration and are not able to be mapped the way I want
them in my dts file.
Is there a way, or if not is there a desire for a way, to specify the
order of /dev/mmcblk devices via device-tree?
Regards,
Tim
^ permalink raw reply
* [PATCH v3 0/5] Cavium ThunderX uncore PMU support
From: Will Deacon @ 2016-10-28 15:17 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20161020112351.GC13708@hardcore>
On Thu, Oct 20, 2016 at 01:23:51PM +0200, Jan Glauber wrote:
> On Thu, Oct 20, 2016 at 12:37:07PM +0200, Peter Zijlstra wrote:
> > On Thu, Oct 20, 2016 at 11:30:36AM +0200, Jan Glauber wrote:
> > > Note:
> > > I'm using perf_sw_context in difference to perf_invalid_context
> > > (see WARN_ON in perf_pmu_register). Reason is that with perf_invalid_context
> > > add() is never called and the counter results are shown as "unsupported" by
> > > perf. With perf_sw_context everything works as expected.
> >
> > What?! All the uncore PMUs use perf_invalid_context. What doesn't work
> > for you?
>
> OK, so using perf_invalid_context and "-a" seems to work.
>
> But I must say that I hate that from a user perspective. The user needs to know about
> the type of PMU behind the event and then provide "-a" or get a "<not supported"
> as counter value?
Sure, but in the interest of getting *something* merged, can we start
off using perf_invalid_context and then have the discussion about whether
or not this can be extended later on, please? If your PMU is a shared
resource amongst CPUs, it maybe that all you want is a better error
message from the perf tool (but again, this can come later!).
Will
^ permalink raw reply
* [PATCH] staging: vc04_services: tie up loose ends with dma_map_sg conversion
From: Michael Zoran @ 2016-10-28 15:16 UTC (permalink / raw)
To: linux-arm-kernel
The conversion to dma_map_sg left a few loose ends. This change
ties up those loose ends.
1. Settings the DMA mask is mandatory on 64 bit even though it
is optional on 32 bit. Set the mask so that DMA space is always
in the lower 32 bit range of data on both platforms.
2. The scatterlist was not completely initialized correctly.
Initialize the list with sg_init_table so that DMA works correctly
if scatterlist debugging is enabled in the build configuration.
3. The error paths in create_pagelist were not consistent. Make
them all consistent by calling a helper function called
cleanup_pagelistinfo to cleanup regardless of what state the pagelist
is in.
4. create_pagelist and free_pagelist had a very large amount of
duplication in computing offsets into a single allocation of memory
in the DMA area. Introduce a new structure called the pagelistinfo
that is appened to the end of the allocation to store necessary
information to prevent duplication of code and make cleanup on errors
easier.
When combined with a fix for vchiq_copy_from_user which is not
included at this time, both functional and pings tests of vchiq_test
now pass in both 32 bit and 64 bit modes.
Even though this cleanup could have been broken down to chunks,
all the changes are to a single file and submitting it as a single
related change should make reviewing the diff much easier then if it
were submitted piecemeal.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
---
.../interface/vchiq_arm/vchiq_2835_arm.c | 239 +++++++++++----------
1 file changed, 129 insertions(+), 110 deletions(-)
diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
index a5afcc5..06df77a 100644
--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
+++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
@@ -62,6 +62,18 @@ typedef struct vchiq_2835_state_struct {
VCHIQ_ARM_STATE_T arm_state;
} VCHIQ_2835_ARM_STATE_T;
+struct vchiq_pagelist_info {
+ PAGELIST_T *pagelist;
+ size_t pagelist_buffer_size;
+ dma_addr_t dma_addr;
+ enum dma_data_direction dma_dir;
+ unsigned int num_pages;
+ unsigned int pages_need_release;
+ struct page **pages;
+ struct scatterlist *scatterlist;
+ unsigned int scatterlist_mapped;
+};
+
static void __iomem *g_regs;
static unsigned int g_cache_line_size = sizeof(CACHE_LINE_SIZE);
static unsigned int g_fragments_size;
@@ -77,13 +89,13 @@ static DEFINE_SEMAPHORE(g_free_fragments_mutex);
static irqreturn_t
vchiq_doorbell_irq(int irq, void *dev_id);
-static int
+static struct vchiq_pagelist_info *
create_pagelist(char __user *buf, size_t count, unsigned short type,
- struct task_struct *task, PAGELIST_T **ppagelist,
- dma_addr_t *dma_addr);
+ struct task_struct *task);
static void
-free_pagelist(dma_addr_t dma_addr, PAGELIST_T *pagelist, int actual);
+free_pagelist(struct vchiq_pagelist_info *pagelistinfo,
+ int actual);
int vchiq_platform_init(struct platform_device *pdev, VCHIQ_STATE_T *state)
{
@@ -97,6 +109,16 @@ int vchiq_platform_init(struct platform_device *pdev, VCHIQ_STATE_T *state)
int slot_mem_size, frag_mem_size;
int err, irq, i;
+ /*
+ * Setting the DMA mask is necessary in the 64 bit environment.
+ * It isn't necessary in a 32 bit environment, but is considered
+ * a good practice.
+ */
+ err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
+
+ if (err < 0)
+ return err;
+
(void)of_property_read_u32(dev->of_node, "cache-line-size",
&g_cache_line_size);
g_fragments_size = 2 * g_cache_line_size;
@@ -226,29 +248,27 @@ VCHIQ_STATUS_T
vchiq_prepare_bulk_data(VCHIQ_BULK_T *bulk, VCHI_MEM_HANDLE_T memhandle,
void *offset, int size, int dir)
{
- PAGELIST_T *pagelist;
- int ret;
- dma_addr_t dma_addr;
+ struct vchiq_pagelist_info *pagelistinfo;
WARN_ON(memhandle != VCHI_MEM_HANDLE_INVALID);
- ret = create_pagelist((char __user *)offset, size,
- (dir == VCHIQ_BULK_RECEIVE)
- ? PAGELIST_READ
- : PAGELIST_WRITE,
- current,
- &pagelist,
- &dma_addr);
+ pagelistinfo = create_pagelist((char __user *)offset, size,
+ (dir == VCHIQ_BULK_RECEIVE)
+ ? PAGELIST_READ
+ : PAGELIST_WRITE,
+ current);
- if (ret != 0)
+ if (!pagelistinfo)
return VCHIQ_ERROR;
bulk->handle = memhandle;
- bulk->data = (void *)(unsigned long)dma_addr;
+ bulk->data = (void *)(unsigned long)pagelistinfo->dma_addr;
- /* Store the pagelist address in remote_data, which isn't used by the
- slave. */
- bulk->remote_data = pagelist;
+ /*
+ * Store the pagelistinfo address in remote_data,
+ * which isn't used by the slave.
+ */
+ bulk->remote_data = pagelistinfo;
return VCHIQ_SUCCESS;
}
@@ -257,8 +277,8 @@ void
vchiq_complete_bulk(VCHIQ_BULK_T *bulk)
{
if (bulk && bulk->remote_data && bulk->actual)
- free_pagelist((dma_addr_t)(unsigned long)bulk->data,
- (PAGELIST_T *)bulk->remote_data, bulk->actual);
+ free_pagelist((struct vchiq_pagelist_info *)bulk->remote_data,
+ bulk->actual);
}
void
@@ -346,6 +366,25 @@ vchiq_doorbell_irq(int irq, void *dev_id)
return ret;
}
+static void
+cleaup_pagelistinfo(struct vchiq_pagelist_info *pagelistinfo)
+{
+ if (pagelistinfo->scatterlist_mapped) {
+ dma_unmap_sg(g_dev, pagelistinfo->scatterlist,
+ pagelistinfo->num_pages, pagelistinfo->dma_dir);
+ }
+
+ if (pagelistinfo->pages_need_release) {
+ unsigned int i;
+
+ for (i = 0; i < pagelistinfo->num_pages; i++)
+ put_page(pagelistinfo->pages[i]);
+ }
+
+ dma_free_coherent(g_dev, pagelistinfo->pagelist_buffer_size,
+ pagelistinfo->pagelist, pagelistinfo->dma_addr);
+}
+
/* There is a potential problem with partial cache lines (pages?)
** at the ends of the block when reading. If the CPU accessed anything in
** the same line (page?) then it may have pulled old data into the cache,
@@ -354,52 +393,64 @@ vchiq_doorbell_irq(int irq, void *dev_id)
** cached area.
*/
-static int
+static struct vchiq_pagelist_info *
create_pagelist(char __user *buf, size_t count, unsigned short type,
- struct task_struct *task, PAGELIST_T **ppagelist,
- dma_addr_t *dma_addr)
+ struct task_struct *task)
{
PAGELIST_T *pagelist;
+ struct vchiq_pagelist_info *pagelistinfo;
struct page **pages;
u32 *addrs;
unsigned int num_pages, offset, i, k;
int actual_pages;
- unsigned long *need_release;
size_t pagelist_size;
struct scatterlist *scatterlist, *sg;
int dma_buffers;
- int dir;
+ dma_addr_t dma_addr;
offset = ((unsigned int)(unsigned long)buf & (PAGE_SIZE - 1));
num_pages = (count + offset + PAGE_SIZE - 1) / PAGE_SIZE;
pagelist_size = sizeof(PAGELIST_T) +
- (num_pages * sizeof(unsigned long)) +
- sizeof(unsigned long) +
+ (num_pages * sizeof(u32)) +
(num_pages * sizeof(pages[0]) +
- (num_pages * sizeof(struct scatterlist)));
-
- *ppagelist = NULL;
-
- dir = (type == PAGELIST_WRITE) ? DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ (num_pages * sizeof(struct scatterlist))) +
+ sizeof(struct vchiq_pagelist_info);
/* Allocate enough storage to hold the page pointers and the page
** list
*/
pagelist = dma_zalloc_coherent(g_dev,
pagelist_size,
- dma_addr,
+ &dma_addr,
GFP_KERNEL);
vchiq_log_trace(vchiq_arm_log_level, "create_pagelist - %pK",
pagelist);
if (!pagelist)
- return -ENOMEM;
+ return NULL;
+
+ addrs = pagelist->addrs;
+ pages = (struct page **)(addrs + num_pages);
+ scatterlist = (struct scatterlist *)(pages + num_pages);
+ pagelistinfo = (struct vchiq_pagelist_info *)
+ (scatterlist + num_pages);
+
+ pagelist->length = count;
+ pagelist->type = type;
+ pagelist->offset = offset;
- addrs = pagelist->addrs;
- need_release = (unsigned long *)(addrs + num_pages);
- pages = (struct page **)(addrs + num_pages + 1);
- scatterlist = (struct scatterlist *)(pages + num_pages);
+ /* Populate the fields of the pagelistinfo structure */
+ pagelistinfo->pagelist = pagelist;
+ pagelistinfo->pagelist_buffer_size = pagelist_size;
+ pagelistinfo->dma_addr = dma_addr;
+ pagelistinfo->dma_dir = (type == PAGELIST_WRITE) ?
+ DMA_TO_DEVICE : DMA_FROM_DEVICE;
+ pagelistinfo->num_pages = num_pages;
+ pagelistinfo->pages_need_release = 0;
+ pagelistinfo->pages = pages;
+ pagelistinfo->scatterlist = scatterlist;
+ pagelistinfo->scatterlist_mapped = 0;
if (is_vmalloc_addr(buf)) {
unsigned long length = count;
@@ -417,7 +468,7 @@ create_pagelist(char __user *buf, size_t count, unsigned short type,
length -= bytes;
off = 0;
}
- *need_release = 0; /* do not try and release vmalloc pages */
+ /* do not try and release vmalloc pages */
} else {
down_read(&task->mm->mmap_sem);
actual_pages = get_user_pages(
@@ -441,34 +492,34 @@ create_pagelist(char __user *buf, size_t count, unsigned short type,
actual_pages--;
put_page(pages[actual_pages]);
}
- dma_free_coherent(g_dev, pagelist_size,
- pagelist, *dma_addr);
- if (actual_pages == 0)
- actual_pages = -ENOMEM;
- return actual_pages;
+ cleaup_pagelistinfo(pagelistinfo);
+ return NULL;
}
- *need_release = 1; /* release user pages */
+ /* release user pages */
+ pagelistinfo->pages_need_release = 1;
}
- pagelist->length = count;
- pagelist->type = type;
- pagelist->offset = offset;
-
+ /*
+ * Not completely necessary, initialize the scatterlist
+ * so that the magic cookie is filled if debugging is enabled
+ */
+ sg_init_table(scatterlist, num_pages);
+ /* Now set the pages for each scatterlist */
for (i = 0; i < num_pages; i++)
sg_set_page(scatterlist + i, pages[i], PAGE_SIZE, 0);
dma_buffers = dma_map_sg(g_dev,
scatterlist,
num_pages,
- dir);
+ pagelistinfo->dma_dir);
if (dma_buffers == 0) {
- dma_free_coherent(g_dev, pagelist_size,
- pagelist, *dma_addr);
-
- return -EINTR;
+ cleaup_pagelistinfo(pagelistinfo);
+ return NULL;
}
+ pagelistinfo->scatterlist_mapped = 1;
+
/* Combine adjacent blocks for performance */
k = 0;
for_each_sg(scatterlist, sg, dma_buffers, i) {
@@ -500,11 +551,8 @@ create_pagelist(char __user *buf, size_t count, unsigned short type,
char *fragments;
if (down_interruptible(&g_free_fragments_sema) != 0) {
- dma_unmap_sg(g_dev, scatterlist, num_pages,
- DMA_BIDIRECTIONAL);
- dma_free_coherent(g_dev, pagelist_size,
- pagelist, *dma_addr);
- return -EINTR;
+ cleaup_pagelistinfo(pagelistinfo);
+ return NULL;
}
WARN_ON(g_free_fragments == NULL);
@@ -518,42 +566,28 @@ create_pagelist(char __user *buf, size_t count, unsigned short type,
(fragments - g_fragments_base) / g_fragments_size;
}
- *ppagelist = pagelist;
-
- return 0;
+ return pagelistinfo;
}
static void
-free_pagelist(dma_addr_t dma_addr, PAGELIST_T *pagelist, int actual)
+free_pagelist(struct vchiq_pagelist_info *pagelistinfo,
+ int actual)
{
- unsigned long *need_release;
- struct page **pages;
- unsigned int num_pages, i;
- size_t pagelist_size;
- struct scatterlist *scatterlist;
- int dir;
+ unsigned int i;
+ PAGELIST_T *pagelist = pagelistinfo->pagelist;
+ struct page **pages = pagelistinfo->pages;
+ unsigned int num_pages = pagelistinfo->num_pages;
vchiq_log_trace(vchiq_arm_log_level, "free_pagelist - %pK, %d",
- pagelist, actual);
+ pagelistinfo->pagelist, actual);
- dir = (pagelist->type == PAGELIST_WRITE) ? DMA_TO_DEVICE :
- DMA_FROM_DEVICE;
-
- num_pages =
- (pagelist->length + pagelist->offset + PAGE_SIZE - 1) /
- PAGE_SIZE;
-
- pagelist_size = sizeof(PAGELIST_T) +
- (num_pages * sizeof(unsigned long)) +
- sizeof(unsigned long) +
- (num_pages * sizeof(pages[0]) +
- (num_pages * sizeof(struct scatterlist)));
-
- need_release = (unsigned long *)(pagelist->addrs + num_pages);
- pages = (struct page **)(pagelist->addrs + num_pages + 1);
- scatterlist = (struct scatterlist *)(pages + num_pages);
-
- dma_unmap_sg(g_dev, scatterlist, num_pages, dir);
+ /*
+ * NOTE: dma_unmap_sg must be called before the
+ * cpu can touch and of the data/pages.
+ */
+ dma_unmap_sg(g_dev, pagelistinfo->scatterlist,
+ pagelistinfo->num_pages, pagelistinfo->dma_dir);
+ pagelistinfo->scatterlist_mapped = 0;
/* Deal with any partial cache lines (fragments) */
if (pagelist->type >= PAGELIST_READ_WITH_FRAGMENTS) {
@@ -591,27 +625,12 @@ free_pagelist(dma_addr_t dma_addr, PAGELIST_T *pagelist, int actual)
up(&g_free_fragments_sema);
}
- if (*need_release) {
- unsigned int length = pagelist->length;
- unsigned int offset = pagelist->offset;
-
- for (i = 0; i < num_pages; i++) {
- struct page *pg = pages[i];
-
- if (pagelist->type != PAGELIST_WRITE) {
- unsigned int bytes = PAGE_SIZE - offset;
-
- if (bytes > length)
- bytes = length;
-
- length -= bytes;
- offset = 0;
- set_page_dirty(pg);
- }
- put_page(pg);
- }
+ /* Need to mark all the pages dirty. */
+ if (pagelist->type != PAGELIST_WRITE &&
+ pagelistinfo->pages_need_release) {
+ for (i = 0; i < num_pages; i++)
+ set_page_dirty(pages[i]);
}
- dma_free_coherent(g_dev, pagelist_size,
- pagelist, dma_addr);
+ cleaup_pagelistinfo(pagelistinfo);
}
--
2.10.1
^ permalink raw reply related
* [PATCH 2/2] irqchip/gic-v3: Use nops macro for Cavium ThunderX erratum 23154
From: Marc Zyngier @ 2016-10-28 15:16 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1477653838-21569-2-git-send-email-will.deacon@arm.com>
Hi Will,
On 28/10/16 12:23, Will Deacon wrote:
> The workaround for Cavium ThunderX erratum 23154 has a homebrew
> pipeflush built out of NOP sequences around the read of the IAR.
>
> This patch converts the code to use the new nops macro, which makes it
> a little easier to read.
>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
Both patches queued for 4.10.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply
* [PATCH 2/2] irqchip/gic-v3: Use nops macro for Cavium ThunderX erratum 23154
From: Mark Rutland @ 2016-10-28 15:10 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1477653838-21569-2-git-send-email-will.deacon@arm.com>
On Fri, Oct 28, 2016 at 12:23:58PM +0100, Will Deacon wrote:
> The workaround for Cavium ThunderX erratum 23154 has a homebrew
> pipeflush built out of NOP sequences around the read of the IAR.
>
> This patch converts the code to use the new nops macro, which makes it
> a little easier to read.
>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
(yes, I counted them)
Thanks,
Mark.
> ---
> arch/arm64/include/asm/arch_gicv3.h | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
> index fdf34f8b4ee0..0313670a3e3f 100644
> --- a/arch/arm64/include/asm/arch_gicv3.h
> +++ b/arch/arm64/include/asm/arch_gicv3.h
> @@ -122,14 +122,9 @@ static inline u64 gic_read_iar_cavium_thunderx(void)
> {
> u64 irqstat;
>
> - asm volatile(
> - "nop;nop;nop;nop\n\t"
> - "nop;nop;nop;nop");
> -
> + nops(8);
> irqstat = read_sysreg_s(ICC_IAR1_EL1);
> -
> - asm volatile(
> - "nop;nop;nop;nop");
> + nops(4);
> mb();
>
> return irqstat;
> --
> 2.1.4
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
^ permalink raw reply
* [PATCH 1/2] irqchip/gic-v3: Convert arm64 GIC accessors to {read, write}_sysreg_s
From: Mark Rutland @ 2016-10-28 14:54 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1477653838-21569-1-git-send-email-will.deacon@arm.com>
On Fri, Oct 28, 2016 at 12:23:57PM +0100, Will Deacon wrote:
> The GIC system registers are accessed using open-coded wrappers around
> the mrs_s/msr_s asm macros.
>
> This patch moves the code over to the {read,wrote}_sysreg_s accessors
> instead, reducing the amount of explicit asm blocks in the arch headers.
It's nice to see more of this going away!
[...]
> @@ -134,10 +124,12 @@ static inline u64 gic_read_iar_cavium_thunderx(void)
>
> asm volatile(
> "nop;nop;nop;nop\n\t"
> - "nop;nop;nop;nop\n\t"
> - "mrs_s %0, " __stringify(ICC_IAR1_EL1) "\n\t"
> - "nop;nop;nop;nop"
> - : "=r" (irqstat));
> + "nop;nop;nop;nop");
> +
> + irqstat = read_sysreg_s(ICC_IAR1_EL1);
> +
> + asm volatile(
> + "nop;nop;nop;nop");
This looks odd, but I see that it disappears in the next patch anyway,
and mirrors the above.
Otherwise, all the transformations look correct to me. FWIW:
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Thanks,
Mark.
^ permalink raw reply
* [RFC][PATCH] arm64: Add support for CONFIG_DEBUG_VIRTUAL
From: Mark Rutland @ 2016-10-28 14:49 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1477613892-26076-1-git-send-email-labbott@redhat.com>
Hi Laura,
On Thu, Oct 27, 2016 at 05:18:12PM -0700, Laura Abbott wrote:
> x86 has an option CONFIG_DEBUG_VIRTUAL to do additional checks
> on virt_to_phys calls. The goal is to catch users who are calling
> virt_to_phys on non-linear addresses immediately. As features
> such as CONFIG_VMAP_STACK get enabled for arm64, this becomes
> increasingly important. Add checks to catch bad virt_to_phys
> usage.
>
> Signed-off-by: Laura Abbott <labbott@redhat.com>
> ---
> This has been on my TODO list for a while. It caught a few bugs with
> CONFIG_VMAP_STACK on x86 so when that eventually gets added
> for arm64 it will be useful to have. This caught one driver calling __pa on an
> ioremapped address already.
This is fantastic; thanks for putting this together!
> RFC for a couple of reasons:
>
> 1) This is basically a direct port of the x86 approach.
> 2) I needed some #ifndef __ASSEMBLY__ which I don't like to throw around.
> 3) I'm not quite sure about the bounds check for the VIRTUAL_BUG_ON with KASLR,
> specifically the _end check.
> 4) Is it worth actually keeping this as DEBUG_VIRTUAL vs. folding it into
> another option?
I think it's worth aligning with x86, so modulo a couple of comments
below, (1) and (4) seem fine. I think (2) just requires an additional
shuffle.
> ---
> arch/arm64/include/asm/memory.h | 11 ++++++++++-
> arch/arm64/mm/Makefile | 2 +-
> arch/arm64/mm/physaddr.c | 17 +++++++++++++++++
> lib/Kconfig.debug | 2 +-
> mm/cma.c | 2 +-
> 5 files changed, 30 insertions(+), 4 deletions(-)
> create mode 100644 arch/arm64/mm/physaddr.c
>
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index ba62df8..9805adc 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -106,11 +106,19 @@
> * private definitions which should NOT be used outside memory.h
> * files. Use virt_to_phys/phys_to_virt/__pa/__va instead.
> */
> -#define __virt_to_phys(x) ({ \
> +#define __virt_to_phys_nodebug(x) ({ \
> phys_addr_t __x = (phys_addr_t)(x); \
> __x & BIT(VA_BITS - 1) ? (__x & ~PAGE_OFFSET) + PHYS_OFFSET : \
> (__x - kimage_voffset); })
>
> +#ifdef CONFIG_DEBUG_VIRTUAL
> +#ifndef __ASSEMBLY__
> +extern unsigned long __virt_to_phys(unsigned long x);
> +#endif
> +#else
> +#define __virt_to_phys(x) __virt_to_phys_nodebug(x)
> +#endif
> +
> #define __phys_to_virt(x) ((unsigned long)((x) - PHYS_OFFSET) | PAGE_OFFSET)
> #define __phys_to_kimg(x) ((unsigned long)((x) + kimage_voffset))
I think we can move all the existing conversion logic here (including
into page_to_phys/phys_to_page) into the existing #ifndef __ASSEMBLY__
block at the end of the file. Assembly clearly can't use these, and we
already have virt_to_phys and others in that #ifndef.
Assuming that works, would you mind doing that as a preparatory patch?
> @@ -202,6 +210,7 @@ static inline void *phys_to_virt(phys_addr_t x)
> * Drivers should NOT use these either.
> */
> #define __pa(x) __virt_to_phys((unsigned long)(x))
> +#define __pa_nodebug(x) __virt_to_phys_nodebug((unsigned long)(x))
> #define __va(x) ((void *)__phys_to_virt((phys_addr_t)(x)))
> #define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT)
> #define virt_to_pfn(x) __phys_to_pfn(__virt_to_phys(x))
> diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
> index 54bb209..bcea84e 100644
> --- a/arch/arm64/mm/Makefile
> +++ b/arch/arm64/mm/Makefile
> @@ -5,6 +5,6 @@ obj-y := dma-mapping.o extable.o fault.o init.o \
> obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
> obj-$(CONFIG_ARM64_PTDUMP) += dump.o
> obj-$(CONFIG_NUMA) += numa.o
> -
> +obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
> obj-$(CONFIG_KASAN) += kasan_init.o
> KASAN_SANITIZE_kasan_init.o := n
> diff --git a/arch/arm64/mm/physaddr.c b/arch/arm64/mm/physaddr.c
> new file mode 100644
> index 0000000..6c271e2
> --- /dev/null
> +++ b/arch/arm64/mm/physaddr.c
> @@ -0,0 +1,17 @@
> +#include <linux/mm.h>
> +
> +#include <asm/memory.h>
> +
> +unsigned long __virt_to_phys(unsigned long x)
> +{
> + phys_addr_t __x = (phys_addr_t)x;
> +
> + if (__x & BIT(VA_BITS - 1)) {
> + /* The bit check ensures this is the right range */
> + return (__x & ~PAGE_OFFSET) + PHYS_OFFSET;
> + } else {
> + VIRTUAL_BUG_ON(x < kimage_vaddr || x > (unsigned long)_end);
IIUC, in (3) you were asking if the last check should be '>' or '>='?
To match high_memory, I suspect the latter, as _end doesn't fall within
the mapped virtual address space.
> + return (__x - kimage_voffset);
> + }
> +}
> +EXPORT_SYMBOL(__virt_to_phys);
It's a bit annoying that we have to duplicate the logic here to add the
VIRTUAL_BUG_ON(), but I see that x86 also do this, and I don't have a
better suggestion.
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 33bc56c..e5634bb 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -604,7 +604,7 @@ config DEBUG_VM_PGFLAGS
>
> config DEBUG_VIRTUAL
> bool "Debug VM translations"
> - depends on DEBUG_KERNEL && X86
> + depends on DEBUG_KERNEL && (X86 || ARM64)
I have no strong feelings about this, but perhaps it's nicer to have
something like ARCH_HAS_DEBUG_VIRTUAL?
> help
> Enable some costly sanity checks in virtual to page code. This can
> catch mistakes with virt_to_page() and friends.
> diff --git a/mm/cma.c b/mm/cma.c
> index 384c2cb..2345803 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -235,7 +235,7 @@ int __init cma_declare_contiguous(phys_addr_t base,
> phys_addr_t highmem_start;
> int ret = 0;
>
> -#ifdef CONFIG_X86
> +#if defined(CONFIG_X86) || defined(CONFIG_ARM64)
Rather than an architecture list, can we depend on DEBUG_VIRTUAL? Are
there other checks that we're trying to avoid?
... or could we avoid ifdeffery entirely with something like:
/*
* We can't use __pa(high_memory) directly, since high_memory
* isn't a valid direct map VA, and DEBUG_VIRTUAL will (validly)
* complain. Find the boundary by adding one to the last valid
* address.
*/
highmem_start = __pa(high_memory - 1) + 1;
Thanks,
Mark.
> /*
> * high_memory isn't direct mapped memory so retrieving its physical
> * address isn't appropriate. But it would be useful to check the
> --
> 2.7.4
>
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox