* [PATCH v2 03/40] iommu/sva: Manage process address spaces
From: Ilias Apalodimas @ 2018-05-24 11:50 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <f73b4a0e-669e-8483-88d7-1b2c8a2b9934@arm.com>
> Interesting, I hadn't thought about this use-case before. At first I
> thought you were talking about mdev devices assigned to VMs, but I think
> you're referring to mdevs assigned to userspace drivers instead? Out of
> curiosity, is it only theoretical or does someone actually need this?
There has been some non upstreamed efforts to have mdev and produce userspace
drivers. Huawei is using it on what they call "wrapdrive" for crypto devices and
we did a proof of concept for ethernet interfaces. At the time we choose not to
involve the IOMMU for the reason you mentioned, but having it there would be
good.
Thanks
Ilias
^ permalink raw reply
* [PATCH v2 13/40] vfio: Add support for Shared Virtual Addressing
From: Jean-Philippe Brucker @ 2018-05-24 11:44 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <5B0536A3.1000304@huawei.com>
Hi,
On 23/05/18 10:38, Xu Zaibo wrote:
>> +static int vfio_iommu_bind_group(struct vfio_iommu *iommu,
>> +???????????????? struct vfio_group *group,
>> +???????????????? struct vfio_mm *vfio_mm)
>> +{
>> +??? int ret;
>> +??? bool enabled_sva = false;
>> +??? struct vfio_iommu_sva_bind_data data = {
>> +??????? .vfio_mm??? = vfio_mm,
>> +??????? .iommu??????? = iommu,
>> +??????? .count??????? = 0,
>> +??? };
>> +
>> +??? if (!group->sva_enabled) {
>> +??????? ret = iommu_group_for_each_dev(group->iommu_group, NULL,
>> +?????????????????????????? vfio_iommu_sva_init);
> Do we need to do *sva_init here or do anything to avoid repeated
> initiation?
> while another process already did initiation at this device, I think
> that current process will get an EEXIST.
Right, sva_init() must be called once for any device that intends to use
bind(). For the second process though, group->sva_enabled will be true
so we won't call sva_init() again, only bind().
Thanks,
Jean
^ permalink raw reply
* [PATCH v2 07/40] iommu: Add a page fault handler
From: Jean-Philippe Brucker @ 2018-05-24 11:44 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180522163521.413e60c6@jacob-builder>
On 23/05/18 00:35, Jacob Pan wrote:
>>>> + /* Insert *before* the last fault */
>>>> + list_move(&fault->head, &group->faults);
>>>> + }
>>>> +
>>> If you already sorted the group list with last fault at the end,
>>> why do you need a separate entry to track the last fault?
>>
>> Not sure I understand your question, sorry. Do you mean why the
>> iopf_group.last_fault? Just to avoid one more kzalloc.
>>
> kind of :) what i thought was that why can't the last_fault naturally
> be the last entry in your group fault list? then there is no need for
> special treatment in terms of allocation of the last fault. just my
> preference.
But we need a kzalloc for the last fault anyway, so I thought I'd just
piggy-back on the group allocation. And if I don't add the last fault at
the end of group->faults, then it's iopf_handle_group that requires
special treatment. I'm still not sure I understood your question though,
could you send me a patch that does it?
>>>> +
>>>> + queue->flush(queue->flush_arg, dev);
>>>> +
>>>> + /*
>>>> + * No need to clear the partial list. All PRGs containing
>>>> the PASID that
>>>> + * needs to be decommissioned are whole (the device driver
>>>> made sure of
>>>> + * it before this function was called). They have been
>>>> submitted to the
>>>> + * queue by the above flush().
>>>> + */
>>> So you are saying device driver need to make sure LPIG PRQ is
>>> submitted in the flush call above such that partial list is
>>> cleared?
>>
>> Not exactly, it's the IOMMU driver that makes sure all LPIG in its
>> queues are submitted by the above flush call. In more details the
>> flow is:
>>
>> * Either device driver calls unbind()/sva_device_shutdown(), or the
>> process exits.
>> * If the device driver called, then it already told the device to stop
>> using the PASID. Otherwise we use the mm_exit() callback to tell the
>> device driver to stop using the PASID.
>> * In either case, when receiving a stop request from the driver, the
>> device sends the LPIGs to the IOMMU queue.
>> * Then, the flush call above ensures that the IOMMU reports the LPIG
>> with iommu_report_device_fault.
>> * While submitting all LPIGs for this PASID to the work queue,
>> ipof_queue_fault also picked up all partial faults, so the partial
>> list is clean.
>>
>> Maybe I should improve this comment?
>>
> thanks for explaining. LPIG submission is done by device asynchronously
> w.r.t. driver stopping/decommission PASID.
Hmm, it should really be synchronous, otherwise there is no way to know
when the PASID can be decommissioned. We need a guarantee such as the
one in 6.20.1 of the PCIe spec, "Managing PASID TLP Prefix Usage":
"When the stop request mechanism indicates completion, the Function has:
* Completed all Non-Posted Requests associated with this PASID.
* Flushed to the host all Posted Requests addressing host memory in all
TCs that were used by the PASID."
That's in combination with "The function shall [...] finish transmitting
any multi-page Page Request Messages for this PASID (i.e. send the Page
Request Message with the L bit Set)." from the ATS spec.
(If I remember correctly a PRI Page Request is a Posted Request.) Only
after this stop request completes can the driver call unbind(), or
return from exit_mm(). Then we know that if there was a LPIG for that
PASID, it is in the IOMMU's PRI queue (or already completed) once we
call flush().
> so if we were to use this
> flow on vt-d, which does not stall page request queue, then we should
> use the iommu model specific flush() callback to ensure LPIG is
> received? There could be queue full condition and retry. I am just
> trying to understand how and where we can make sure LPIG is
> received and all groups are whole.
For SMMU in patch 30, the flush() callback waits until the PRI queue is
empty or, when the PRI thread is running in parallel, until the thread
has done a full circle (handled as many faults as the queue size). It's
really unpleasant to implement because the flush() callback takes a lock
to inspect the hardware state, but I don't think we have a choice.
Thanks,
Jean
^ permalink raw reply
* [PATCH v2 03/40] iommu/sva: Manage process address spaces
From: Jean-Philippe Brucker @ 2018-05-24 11:44 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180522094334.71f0e36b@jacob-builder>
On 22/05/18 17:43, Jacob Pan wrote:
> On Thu, 17 May 2018 11:02:42 +0100
> Jean-Philippe Brucker <jean-philippe.brucker@arm.com> wrote:
>
>> On 17/05/18 00:31, Jacob Pan wrote:
>>> On Fri, 11 May 2018 20:06:04 +0100
>>> I am a little confused about domain vs. pasid relationship. If
>>> each domain represents a address space, should there be a domain for
>>> each pasid?
>>
>> I don't think there is a formal definition, but from previous
>> discussion the consensus seems to be: domains are a collection of
>> devices that have the same virtual address spaces (one or many).
>>
>> Keeping that definition makes things easier, in my opinion. Some time
>> ago, I did try to represent PASIDs using "subdomains" (introducing a
>> hierarchy of struct iommu_domain), but it required invasive changes in
>> the IOMMU subsystem and probably all over the tree.
>>
>> You do need some kind of "root domain" for each device, so that
>> "iommu_get_domain_for_dev()" still makes sense. That root domain
>> doesn't have a single address space but a collection of subdomains.
>> If you need this anyway, representing a PASID with an iommu_domain
>> doesn't seem preferable than using a different structure (io_mm),
>> because they don't have anything in common.
>>
> My main concern is the PASID table storage. If PASID table storage
> is tied to domain, it is ok to scale up, i.e. multiple devices in a
> domain share a single PASID table. But to scale down, e.g. further
> partition device with VFIO mdev for assignment, each mdev may get its
> own domain via vfio. But there is no IOMMU storage for PASID table at
> mdev device level. Perhaps we do need root domain or some parent-child
> relationship to locate PASID table.
Interesting, I hadn't thought about this use-case before. At first I
thought you were talking about mdev devices assigned to VMs, but I think
you're referring to mdevs assigned to userspace drivers instead? Out of
curiosity, is it only theoretical or does someone actually need this?
I don't think mdev for VM assignment are compatible with PASID, at least
not when the IOMMU is involved. I usually ignore mdev in my reasoning
because, as far as I know, currently they only affect devices that have
their own MMU, and IOMMU domains don't come into play. However, if a
device was backed by the IOMMU, and the device driver wanted to
partition it into mdevs, then users would be tempted to assign mdev1 to
VM1 and mdev2 to VM2.
It doesn't work with PASID, because the PASID spaces of VM1 and VM2 will
conflict. If both VM1 and VM2 allocate PASID #1, then the host has to
somehow arbitrate device accesses, for example scheduling first mdev1
then mdev2. That's possible if the device driver is in charge of the
MMU, but not really compatible with the IOMMU.
So in the IOMMU subsystem, for assigning devices to VMs the only
model that makes sense is SR-IOV, where each VF/mdev has its own RID and
its own PASID table. In that case you'd get one IOMMU domain per VF.
But considering userspace drivers in the host alone, it might make sense
to partition a device into mdevs and assign them to multiple processes.
Interestingly this scheme still doesn't work with the classic MAP/UNMAP
ioctl: since there is a single device context entry for all mdevs, the
mediating driver would have to catch all MAP/UNMAP ioctls and reject
those with IOVAs that overlap those of another mdev. It's doesn't seem
viable. But when using PASID then each mdev has its own address space,
and since PASIDs are allocated by the kernel there is no such conflict.
Anyway, I think this use-case can work with the current structures, if
mediating driver does the bind() instead of VFIO. That's necessary
because you can't let userspace program the PASID into the device, or
they would be able to access address spaces owned by other mdevs. Then
the mediating driver does the bind(), and keeps internal structures to
associate the process to the given mdev.
Thanks,
Jean
^ permalink raw reply
* [PATCH 07/14] arm64: ssbd: Skip apply_ssbd if not using dynamic mitigation
From: Mark Rutland @ 2018-05-24 11:43 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180522150648.28297-8-marc.zyngier@arm.com>
On Tue, May 22, 2018 at 04:06:41PM +0100, Marc Zyngier wrote:
> In order to avoid checking arm64_ssbd_callback_required on each
> kernel entry/exit even if no mitigation is required, let's
> add yet another alternative that by default jumps over the mitigation,
> and that gets nop'ed out if we're doing dynamic mitigation.
>
> Think of it as a poor man's static key...
I guess in future we can magic up a more general asm static key if we
need them elsewhere.
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Mark.
> ---
> arch/arm64/kernel/cpu_errata.c | 14 ++++++++++++++
> arch/arm64/kernel/entry.S | 3 +++
> 2 files changed, 17 insertions(+)
>
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index f1d4e75b0ddd..8f686f39b9c1 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -283,6 +283,20 @@ void __init arm64_update_smccc_conduit(struct alt_instr *alt,
> *updptr = cpu_to_le32(insn);
> }
>
> +void __init arm64_enable_wa2_handling(struct alt_instr *alt,
> + __le32 *origptr, __le32 *updptr,
> + int nr_inst)
> +{
> + BUG_ON(nr_inst != 1);
> + /*
> + * Only allow mitigation on EL1 entry/exit and guest
> + * ARCH_WORKAROUND_2 handling if the SSBD state allows it to
> + * be flipped.
> + */
> + if (arm64_get_ssbd_state() == ARM64_SSBD_EL1_ENTRY)
> + *updptr = cpu_to_le32(aarch64_insn_gen_nop());
> +}
> +
> static void do_ssbd(bool state)
> {
> switch (psci_ops.conduit) {
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 29ad672a6abd..e6f6e2339b22 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -142,6 +142,9 @@ alternative_else_nop_endif
> // to save/restore them if required.
> .macro apply_ssbd, state, targ, tmp1, tmp2
> #ifdef CONFIG_ARM64_SSBD
> +alternative_cb arm64_enable_wa2_handling
> + b \targ
> +alternative_cb_end
> ldr_this_cpu \tmp2, arm64_ssbd_callback_required, \tmp1
> cbz \tmp2, \targ
> mov w0, #ARM_SMCCC_ARCH_WORKAROUND_2
> --
> 2.14.2
>
^ permalink raw reply
* [PATCH V2] PCI/portdrv: do not disable device on reboot/shutdown
From: Sinan Kaya @ 2018-05-24 11:43 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <61f70fd6-52fd-da07-ce73-303f95132131@codeaurora.org>
On 5/23/2018 6:57 PM, Sinan Kaya wrote:
>> The crash seems to indicate that the hpsa device attempted a DMA after
>> we cleared the Root Port's PCI_COMMAND_MASTER, which means
>> hpsa_shutdown() didn't stop DMA from the device (it looks like *most*
>> shutdown methods don't disable device DMA, so it's in good company).
> All drivers are expected to shutdown DMA and interrupts in their shutdown()
> routines. They can skip removing threads, data structures etc. but DMA and
> interrupt disabling are required. This is the difference between shutdown()
> and remove() callbacks.
I found this note yesterday to see why we are not disabling the devices in the
PCI core itself.
pci_device_remove()
/*
* We would love to complain here if pci_dev->is_enabled is set, that
* the driver should have called pci_disable_device(), but the
* unfortunate fact is there are too many odd BIOS and bridge setups
* that don't like drivers doing that all of the time.
* Oh well, we can dream of sane hardware when we sleep, no matter how
* horrible the crap we have to deal with is when we are awake...
*/
Ryan, can you discard the previous patch and test this one instead? remove() path
in hpsa driver seems to call pci_disable_device() via
hpsa_remove_one()
hpsa_free_pci_init()
but nothing on the shutdown path.
diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 4ed3d26..3823f04 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -8651,6 +8651,7 @@ static void hpsa_shutdown(struct pci_dev *pdev)
h->access.set_intr_mask(h, HPSA_INTR_OFF);
hpsa_free_irqs(h); /* init_one 4 */
hpsa_disable_interrupt_mode(h); /* pci_init 2 */
+ pci_disable_device(h->pdev);
}
--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
^ permalink raw reply related
* [PATCH 06/14] arm64: ssbd: Add global mitigation state accessor
From: Mark Rutland @ 2018-05-24 11:41 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180522150648.28297-7-marc.zyngier@arm.com>
On Tue, May 22, 2018 at 04:06:40PM +0100, Marc Zyngier wrote:
> We're about to need the mitigation state in various parts of the
> kernel in order to do the right thing for userspace and guests.
>
> Let's expose an accessor that will let other subsystems know
> about the state.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Mark.
> ---
> arch/arm64/include/asm/cpufeature.h | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 9bc548e22784..1bacdf57f0af 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -543,6 +543,16 @@ static inline u64 read_zcr_features(void)
> #define ARM64_SSBD_FORCE_ENABLE 2
> #define ARM64_SSBD_MITIGATED 3
>
> +static inline int arm64_get_ssbd_state(void)
> +{
> +#ifdef CONFIG_ARM64_SSBD
> + extern int ssbd_state;
> + return ssbd_state;
> +#else
> + return ARM64_SSBD_UNKNOWN;
> +#endif
> +}
> +
> #endif /* __ASSEMBLY__ */
>
> #endif
> --
> 2.14.2
>
^ permalink raw reply
* [PATCH 05/14] arm64: Add 'ssbd' command-line option
From: Mark Rutland @ 2018-05-24 11:40 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180522150648.28297-6-marc.zyngier@arm.com>
On Tue, May 22, 2018 at 04:06:39PM +0100, Marc Zyngier wrote:
> On a system where the firmware implements ARCH_WORKAROUND_2,
> it may be useful to either permanently enable or disable the
> workaround for cases where the user decides that they'd rather
> not get a trap overhead, and keep the mitigation permanently
> on or off instead of switching it on exception entry/exit.
>
> In any case, default to the mitigation being enabled.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 17 ++++
> arch/arm64/include/asm/cpufeature.h | 6 ++
> arch/arm64/kernel/cpu_errata.c | 102 ++++++++++++++++++++----
> 3 files changed, 109 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index f2040d46f095..646e112c6f63 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4092,6 +4092,23 @@
> expediting. Set to zero to disable automatic
> expediting.
>
> + ssbd= [ARM64,HW]
> + Speculative Store Bypass Disable control
> +
> + On CPUs that are vulnerable to the Speculative
> + Store Bypass vulnerability and offer a
> + firmware based mitigation, this parameter
> + indicates how the mitigation should be used:
> +
> + force-on: Unconditionnaly enable mitigation for
> + for both kernel and userspace
> + force-off: Unconditionnaly disable mitigation for
> + for both kernel and userspace
> + kernel: Always enable mitigation in the
> + kernel, and offer a prctl interface
> + to allow userspace to register its
> + interest in being mitigated too.
> +
> stack_guard_gap= [MM]
> override the default stack gap protection. The value
> is in page units and it defines how many pages prior
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 09b0f2a80c8f..9bc548e22784 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -537,6 +537,12 @@ static inline u64 read_zcr_features(void)
> return zcr;
> }
>
> +#define ARM64_SSBD_UNKNOWN -1
> +#define ARM64_SSBD_FORCE_DISABLE 0
> +#define ARM64_SSBD_EL1_ENTRY 1
The EL1_ENTRY part of the name is a bit misleading, since this doesn't
apply to EL1->EL1 exceptions (and as with many other bits of the arm64
code, it's arguably misleading in the VHE case).
Perhaps ARM64_SSBD_KERNEL, which would align with the parameter name?
Not a big deal either way, and otherwise this looks good to me.
Regardless:
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Mark.
> +#define ARM64_SSBD_FORCE_ENABLE 2
> +#define ARM64_SSBD_MITIGATED 3
> +
> #endif /* __ASSEMBLY__ */
>
> #endif
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 7fd6d5b001f5..f1d4e75b0ddd 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -235,6 +235,38 @@ enable_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *entry)
> #ifdef CONFIG_ARM64_SSBD
> DEFINE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);
>
> +int ssbd_state __read_mostly = ARM64_SSBD_EL1_ENTRY;
> +
> +static const struct ssbd_options {
> + const char *str;
> + int state;
> +} ssbd_options[] = {
> + { "force-on", ARM64_SSBD_FORCE_ENABLE, },
> + { "force-off", ARM64_SSBD_FORCE_DISABLE, },
> + { "kernel", ARM64_SSBD_EL1_ENTRY, },
> +};
> +
> +static int __init ssbd_cfg(char *buf)
> +{
> + int i;
> +
> + if (!buf || !buf[0])
> + return -EINVAL;
> +
> + for (i = 0; i < ARRAY_SIZE(ssbd_options); i++) {
> + int len = strlen(ssbd_options[i].str);
> +
> + if (strncmp(buf, ssbd_options[i].str, len))
> + continue;
> +
> + ssbd_state = ssbd_options[i].state;
> + return 0;
> + }
> +
> + return -EINVAL;
> +}
> +early_param("ssbd", ssbd_cfg);
> +
> void __init arm64_update_smccc_conduit(struct alt_instr *alt,
> __le32 *origptr, __le32 *updptr,
> int nr_inst)
> @@ -272,44 +304,82 @@ static bool has_ssbd_mitigation(const struct arm64_cpu_capabilities *entry,
> int scope)
> {
> struct arm_smccc_res res;
> - bool supported = true;
> + bool required = true;
> + s32 val;
>
> WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
>
> - if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
> + if (psci_ops.smccc_version == SMCCC_VERSION_1_0) {
> + ssbd_state = ARM64_SSBD_UNKNOWN;
> return false;
> + }
>
> - /*
> - * The probe function return value is either negative
> - * (unsupported or mitigated), positive (unaffected), or zero
> - * (requires mitigation). We only need to do anything in the
> - * last case.
> - */
> switch (psci_ops.conduit) {
> case PSCI_CONDUIT_HVC:
> arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> ARM_SMCCC_ARCH_WORKAROUND_2, &res);
> - if ((int)res.a0 != 0)
> - supported = false;
> break;
>
> case PSCI_CONDUIT_SMC:
> arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> ARM_SMCCC_ARCH_WORKAROUND_2, &res);
> - if ((int)res.a0 != 0)
> - supported = false;
> break;
>
> default:
> - supported = false;
> + ssbd_state = ARM64_SSBD_UNKNOWN;
> + return false;
> }
>
> - if (supported) {
> - __this_cpu_write(arm64_ssbd_callback_required, 1);
> + val = (s32)res.a0;
> +
> + switch (val) {
> + case SMCCC_RET_NOT_SUPPORTED:
> + ssbd_state = ARM64_SSBD_UNKNOWN;
> + return false;
> +
> + case SMCCC_RET_NOT_REQUIRED:
> + ssbd_state = ARM64_SSBD_MITIGATED;
> + return false;
> +
> + case SMCCC_RET_SUCCESS:
> + required = true;
> + break;
> +
> + case 1: /* Mitigation not required on this CPU */
> + required = false;
> + break;
> +
> + default:
> + WARN_ON(1);
> + return false;
> + }
> +
> + switch (ssbd_state) {
> + case ARM64_SSBD_FORCE_DISABLE:
> + pr_info_once("%s disabled from command-line\n", entry->desc);
> + do_ssbd(false);
> + required = false;
> + break;
> +
> + case ARM64_SSBD_EL1_ENTRY:
> + if (required) {
> + __this_cpu_write(arm64_ssbd_callback_required, 1);
> + do_ssbd(true);
> + }
> + break;
> +
> + case ARM64_SSBD_FORCE_ENABLE:
> + pr_info_once("%s forced from command-line\n", entry->desc);
> do_ssbd(true);
> + required = true;
> + break;
> +
> + default:
> + WARN_ON(1);
> + break;
> }
>
> - return supported;
> + return required;
> }
> #endif /* CONFIG_ARM64_SSBD */
>
> --
> 2.14.2
>
^ permalink raw reply
* [PATCH 04/14] arm64: Add ARCH_WORKAROUND_2 probing
From: Will Deacon @ 2018-05-24 11:39 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <6c3e7b31-0bb6-353b-1b82-7ebdf3be1323@arm.com>
On Thu, May 24, 2018 at 10:58:43AM +0100, Suzuki K Poulose wrote:
> On 22/05/18 16:06, Marc Zyngier wrote:
> >As for Spectre variant-2, we rely on SMCCC 1.1 to provide the
> >discovery mechanism for detecting the SSBD mitigation.
> >
> >A new capability is also allocated for that purpose, and a
> >config option.
> >
> >Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>
>
> >+static bool has_ssbd_mitigation(const struct arm64_cpu_capabilities *entry,
> >+ int scope)
> >+{
> >+ struct arm_smccc_res res;
> >+ bool supported = true;
> >+
> >+ WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
> >+
> >+ if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
> >+ return false;
> >+
> >+ /*
> >+ * The probe function return value is either negative
> >+ * (unsupported or mitigated), positive (unaffected), or zero
> >+ * (requires mitigation). We only need to do anything in the
> >+ * last case.
> >+ */
> >+ switch (psci_ops.conduit) {
> >+ case PSCI_CONDUIT_HVC:
> >+ arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> >+ ARM_SMCCC_ARCH_WORKAROUND_2, &res);
> >+ if ((int)res.a0 != 0)
> >+ supported = false;
> >+ break;
> >+
> >+ case PSCI_CONDUIT_SMC:
> >+ arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> >+ ARM_SMCCC_ARCH_WORKAROUND_2, &res);
> >+ if ((int)res.a0 != 0)
> >+ supported = false;
> >+ break;
> >+
> >+ default:
> >+ supported = false;
> >+ }
> >+
> >+ if (supported) {
> >+ __this_cpu_write(arm64_ssbd_callback_required, 1);
> >+ do_ssbd(true);
> >+ }
>
>
> Marc,
>
> As discussed, we have minor issue with the "corner case". If a CPU
> is hotplugged in which requires the mitigation, after the system has
> finalised the cap to "not available", the CPU could go ahead and
> do the "work around" as above, while not effectively doing anything
> about it at runtime for KVM guests (as thats the only place where
> we rely on the CAP being set).
>
> But, yes this is real corner case. There is no easy way to solve it
> other than
>
> 1) Allow late modifications to CPU hwcaps
>
> OR
>
> 2) Penalise the fastpath to always check per-cpu setting.
Shouldn't we just avoid bring up CPUs that require the mitigation after
we've finalised the capability to say that it's not required? Assuming this
is just another issue with maxcpus=, then I don't much care for it.
Will
^ permalink raw reply
* [PATCH v6 05/17] media: rkisp1: add Rockchip ISP1 subdev driver
From: Baruch Siach @ 2018-05-24 11:30 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <CAAFQd5A7gAK4fXH9YVrobR5_QX_5f8xa272R_56v5YUghV6Sxw@mail.gmail.com>
Hi Tomasz,
On Mon, May 07, 2018 at 06:41:50AM +0000, Tomasz Figa wrote:
> On Mon, May 7, 2018 at 3:38 PM Baruch Siach <baruch@tkos.co.il> wrote:
> > On Mon, May 07, 2018 at 06:13:27AM +0000, Tomasz Figa wrote:
> > > On Thu, May 3, 2018 at 6:09 PM Baruch Siach <baruch@tkos.co.il> wrote:
> > > > On Thu, Mar 08, 2018 at 05:47:55PM +0800, Jacob Chen wrote:
> > > > > +static int rkisp1_isp_sd_s_power(struct v4l2_subdev *sd, int on)
> > > > > +{
> > > > > + struct rkisp1_device *isp_dev = sd_to_isp_dev(sd);
> > > > > + int ret;
> > > > > +
> > > > > + v4l2_dbg(1, rkisp1_debug, &isp_dev->v4l2_dev, "s_power: %d\n",
> > > on);
> > > > > +
> > > > > + if (on) {
> > > > > + ret = pm_runtime_get_sync(isp_dev->dev);
> > > > > + if (ret < 0)
> > > > > + return ret;
> > > > > +
> > > > > + rkisp1_config_clk(isp_dev);
> > > > > + } else {
> > > > > + ret = pm_runtime_put(isp_dev->dev);
> > >
> > > > I commented this line out to make more than one STREAMON work.
> Otherwise,
> > > > the second STREAMON hangs. I guess the bug is not this driver.
> Probably
> > > > something in drivers/soc/rockchip/pm_domains.c. Just noting that in
> case
> > > > you or someone on Cc would like to investigate it further.
> > > >
> > > > I tested v4.16-rc4 on the Tinkerboard.
> > >
> > > Looks like that version doesn't include the IOMMU PM and clock handling
> > > rework [1], which should fix a lot of runtime PM issues. FWIW,
> linux-next
> > > seems to already include it.
> > >
> > > [1] https://lkml.org/lkml/2018/3/23/44
>
> > Thanks for the reference.
>
> > It looks like the iommu driver part is in Linus' tree already. The DT
> part is
> > in the v4.18-armsoc/dts32 branch of Heiko's tree. Am I missing anything?
>
> You're right, most of the series made it in time for 4.17. However, the DT
> part (precisely, the clocks properties added to IOMMU nodes) is crucial for
> the fixes to be effective.
>
> > Anyway, I'll take a look.
>
> Thanks for testing. :) (Forgot to mention in my previous email...)
I finally got around to testing. Unfortunately, kernel v4.17-rc6 with
cherry-pick of commit c78751f91c0b (ARM: dts: rockchip: add clocks in iommu
nodes) from Heiko's tree still exhibit the same problem. STREAMON hangs on
second try. The same workaround "fixes" it.
baruch
--
http://baruch.siach.name/blog/ ~. .~ Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
- baruch at tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il -
^ permalink raw reply
* [PATCH 02/14] arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1
From: Marc Zyngier @ 2018-05-24 11:28 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180524112305.ld3jghzmb6a35h73@lakrids.cambridge.arm.com>
On 24/05/18 12:23, Mark Rutland wrote:
> On Thu, May 24, 2018 at 12:00:58PM +0100, Mark Rutland wrote:
>> On Tue, May 22, 2018 at 04:06:36PM +0100, Marc Zyngier wrote:
>>> In order for the kernel to protect itself, let's call the SSBD mitigation
>>> implemented by the higher exception level (either hypervisor or firmware)
>>> on each transition between userspace and kernel.
>>>
>>> We must take the PSCI conduit into account in order to target the
>>> right exception level, hence the introduction of a runtime patching
>>> callback.
>>>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> ---
>>> arch/arm64/kernel/cpu_errata.c | 18 ++++++++++++++++++
>>> arch/arm64/kernel/entry.S | 22 ++++++++++++++++++++++
>>> include/linux/arm-smccc.h | 5 +++++
>>> 3 files changed, 45 insertions(+)
>>>
>>> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
>>> index a900befadfe8..46b3aafb631a 100644
>>> --- a/arch/arm64/kernel/cpu_errata.c
>>> +++ b/arch/arm64/kernel/cpu_errata.c
>>> @@ -232,6 +232,24 @@ enable_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *entry)
>>> }
>>> #endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */
>>>
>>> +#ifdef CONFIG_ARM64_SSBD
>>> +void __init arm64_update_smccc_conduit(struct alt_instr *alt,
>>> + __le32 *origptr, __le32 *updptr,
>>> + int nr_inst)
>>> +{
>>> + u32 insn;
>>> +
>>> + BUG_ON(nr_inst != 1);
>>> +
>>> + if (psci_ops.conduit == PSCI_CONDUIT_HVC)
>>> + insn = aarch64_insn_get_hvc_value();
>>> + else
>>> + insn = aarch64_insn_get_smc_value();
>>
>> Shouldn't this also handle the case where there is no conduit?
>
> Due to the config symbol not being defined yet, and various other fixups
> in later patches, this is actually benign.
>
> However, if you make this:
>
> switch (psci_ops.conduit) {
> case PSCI_CONDUIT_NONE:
> return;
> case PSCI_CONDUIT_HVC:
> insn = aarch64_insn_get_hvc_value();
> break;
> case PSCI_CONDUIT_SMC:
> insn = aarch64_insn_get_smc_value();
> break;
> }
>
> ... then we won't even bother patching the nop in the default case
> regardless, which is nicer, IMO.
Yup, looks better to me too. I'll fold that in.
Thanks,
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply
* [PATCH 04/14] arm64: Add ARCH_WORKAROUND_2 probing
From: Mark Rutland @ 2018-05-24 11:27 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180522150648.28297-5-marc.zyngier@arm.com>
On Tue, May 22, 2018 at 04:06:38PM +0100, Marc Zyngier wrote:
> As for Spectre variant-2, we rely on SMCCC 1.1 to provide the
> discovery mechanism for detecting the SSBD mitigation.
>
> A new capability is also allocated for that purpose, and a
> config option.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
[...]
> +static void do_ssbd(bool state)
> +{
> + switch (psci_ops.conduit) {
> + case PSCI_CONDUIT_HVC:
> + arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_2, state, NULL);
> + break;
> +
> + case PSCI_CONDUIT_SMC:
> + arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2, state, NULL);
> + break;
> +
> + default:
> + WARN_ON_ONCE(1);
> + break;
> + }
> +}
> +
> +static bool has_ssbd_mitigation(const struct arm64_cpu_capabilities *entry,
> + int scope)
> +{
> + struct arm_smccc_res res;
> + bool supported = true;
> +
> + WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
> +
> + if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
> + return false;
> +
> + /*
> + * The probe function return value is either negative
> + * (unsupported or mitigated), positive (unaffected), or zero
> + * (requires mitigation). We only need to do anything in the
> + * last case.
> + */
> + switch (psci_ops.conduit) {
> + case PSCI_CONDUIT_HVC:
> + arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> + ARM_SMCCC_ARCH_WORKAROUND_2, &res);
> + if ((int)res.a0 != 0)
> + supported = false;
> + break;
> +
> + case PSCI_CONDUIT_SMC:
> + arm_smccc_1_1_smc(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
> + ARM_SMCCC_ARCH_WORKAROUND_2, &res);
> + if ((int)res.a0 != 0)
> + supported = false;
> + break;
Once this is merged, I'll rebase my SMCCCC conduit cleanup atop.
Mark.
^ permalink raw reply
* [PATCH v12 0/8] Clock for CPU scaling support for msm8996
From: Amit Kucheria @ 2018-05-24 11:25 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1527148218-16540-1-git-send-email-ilialin@codeaurora.org>
On Thu, May 24, 2018 at 10:50 AM, Ilia Lin <ilialin@codeaurora.org> wrote:
> [v12]
> * Addressed the kbuild fail on arm architecture
>
> [v11]
> * Split the series into domains
>
> [v9]
> * Addressed comments from Viresh and Russel about the error handling
>
> [v8]
> * Reordered the patch series into 4 groups
> * Addressed comments from Amit about the comments and commit messages
> * Addressed comments from Amit and Viresh about the resourses deallocation
>
> [v7]
> * Addressed comments from Viresh about resourses deallocation
> and DT compatible
>
> [v6]
> * Addressed comments from Viresh about:
> ** Comments style
> ** Kconfig bool instead of tristate
> ** DT and documentation style
> ** Resourses deallocation on an error
> ** Typos
>
> [v5]
> * Rebased
> * Addressed comments from Bjorn about SPDX style,
> functions and parameters naming
> * Addressed comments from Viresh DT properties and style, comments style,
> resourses deallocation, documentation placement
> * Addressed comments from Sricharan about unnessesary include
> * Addressed comments from Nicolas
> * Addressed comments from Rob about the commit messages and acks
> * Addressed comments from Mark
>
> [v4]
> * Adressed all comments from Stephen
> * Added CPU regulator support
> * Added qcom-cpufreq-kryo driver
>
> [v3]
> * Rebased on top of the latest PLL driver changes
> * Addressed comment from Rob Herring for bindings
>
> [v2]
> * Addressed comments from Rob Herring for bindings
> * Addressed comments from Mark Rutland for memory barrier
> * Addressed comments from Julien Thierry for clock reenabling condition
> * Tuned the HW configuration for clock frequencies below 600MHz
>
> SOC (1/15):
> Extracts the kryo l2 accessors driver from the QCOM PMU driver
>
> Clocks (2/15-9/15):
> This series adds support for the CPU clocks on msm8996 devices.
> The driver uses the existing PLL drivers and is required to control
> the CPU frequency scaling on the MSM8996.
>
> Ilia Lin (6):
> soc: qcom: Separate kryo l2 accessors from PMU driver
> clk: Use devm_ in the register fixed factor clock
> clk: qcom: Add CPU clock driver for msm8996
> dt-bindings: clk: qcom: Add bindings for CPU clock for msm8996
> clk: qcom: cpu-8996: Add support to switch below 600Mhz
> clk: qcom: Add ACD path to CPU clock driver for msm8996
>
> Rajendra Nayak (2):
> clk: qcom: Make clk_alpha_pll_configure available to modules
> clk: qcom: cpu-8996: Add support to switch to alternate PLL
>
> .../devicetree/bindings/clock/qcom,kryocc.txt | 17 +
> drivers/clk/clk-fixed-factor.c | 2 +-
> drivers/clk/qcom/Kconfig | 10 +
> drivers/clk/qcom/Makefile | 1 +
> drivers/clk/qcom/clk-alpha-pll.c | 1 +
> drivers/clk/qcom/clk-alpha-pll.h | 6 +
> drivers/clk/qcom/clk-cpu-8996.c | 510 +++++++++++++++++++++
> drivers/perf/Kconfig | 1 +
> drivers/perf/qcom_l2_pmu.c | 90 +---
> drivers/soc/qcom/Kconfig | 3 +
> drivers/soc/qcom/Makefile | 1 +
> drivers/soc/qcom/kryo-l2-accessors.c | 56 +++
> include/soc/qcom/kryo-l2-accessors.h | 12 +
For the perf/l2-accessors part, Reviewed-by: Amit Kucheria
<amit.kucheria@linaro.org>
For the entire series, Tested-by: Amit Kucheria <amit.kucheria@linaro.org>
^ permalink raw reply
* [PATCH 02/14] arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1
From: Mark Rutland @ 2018-05-24 11:23 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180524110058.mrkwc3rx5l3v6wad@lakrids.cambridge.arm.com>
On Thu, May 24, 2018 at 12:00:58PM +0100, Mark Rutland wrote:
> On Tue, May 22, 2018 at 04:06:36PM +0100, Marc Zyngier wrote:
> > In order for the kernel to protect itself, let's call the SSBD mitigation
> > implemented by the higher exception level (either hypervisor or firmware)
> > on each transition between userspace and kernel.
> >
> > We must take the PSCI conduit into account in order to target the
> > right exception level, hence the introduction of a runtime patching
> > callback.
> >
> > Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> > ---
> > arch/arm64/kernel/cpu_errata.c | 18 ++++++++++++++++++
> > arch/arm64/kernel/entry.S | 22 ++++++++++++++++++++++
> > include/linux/arm-smccc.h | 5 +++++
> > 3 files changed, 45 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> > index a900befadfe8..46b3aafb631a 100644
> > --- a/arch/arm64/kernel/cpu_errata.c
> > +++ b/arch/arm64/kernel/cpu_errata.c
> > @@ -232,6 +232,24 @@ enable_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *entry)
> > }
> > #endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */
> >
> > +#ifdef CONFIG_ARM64_SSBD
> > +void __init arm64_update_smccc_conduit(struct alt_instr *alt,
> > + __le32 *origptr, __le32 *updptr,
> > + int nr_inst)
> > +{
> > + u32 insn;
> > +
> > + BUG_ON(nr_inst != 1);
> > +
> > + if (psci_ops.conduit == PSCI_CONDUIT_HVC)
> > + insn = aarch64_insn_get_hvc_value();
> > + else
> > + insn = aarch64_insn_get_smc_value();
>
> Shouldn't this also handle the case where there is no conduit?
Due to the config symbol not being defined yet, and various other fixups
in later patches, this is actually benign.
However, if you make this:
switch (psci_ops.conduit) {
case PSCI_CONDUIT_NONE:
return;
case PSCI_CONDUIT_HVC:
insn = aarch64_insn_get_hvc_value();
break;
case PSCI_CONDUIT_SMC:
insn = aarch64_insn_get_smc_value();
break;
}
... then we won't even bother patching the nop in the default case
regardless, which is nicer, IMO.
With that:
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Thanks,
Mark.
>
> See below comment in apply_ssbd for rationale.
>
> > +
> > + *updptr = cpu_to_le32(insn);
> > +}
> > +#endif /* CONFIG_ARM64_SSBD */
> > +
> > #define CAP_MIDR_RANGE(model, v_min, r_min, v_max, r_max) \
> > .matches = is_affected_midr_range, \
> > .midr_range = MIDR_RANGE(model, v_min, r_min, v_max, r_max)
> > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > index ec2ee720e33e..f33e6aed3037 100644
> > --- a/arch/arm64/kernel/entry.S
> > +++ b/arch/arm64/kernel/entry.S
> > @@ -18,6 +18,7 @@
> > * along with this program. If not, see <http://www.gnu.org/licenses/>.
> > */
> >
> > +#include <linux/arm-smccc.h>
> > #include <linux/init.h>
> > #include <linux/linkage.h>
> >
> > @@ -137,6 +138,18 @@ alternative_else_nop_endif
> > add \dst, \dst, #(\sym - .entry.tramp.text)
> > .endm
> >
> > + // This macro corrupts x0-x3. It is the caller's duty
> > + // to save/restore them if required.
> > + .macro apply_ssbd, state
> > +#ifdef CONFIG_ARM64_SSBD
> > + mov w0, #ARM_SMCCC_ARCH_WORKAROUND_2
> > + mov w1, #\state
> > +alternative_cb arm64_update_smccc_conduit
> > + nop // Patched to SMC/HVC #0
> > +alternative_cb_end
> > +#endif
> > + .endm
>
> If my system doesn't have SMCCC1.1, or the FW doesn't have an
> implementation of ARCH_WORKAROUND_2, does this stay as a NOP?
>
> It looks like this would be patched to an SMC, which would be fatal on
> systems without EL3 FW.
>
> > +
> > .macro kernel_entry, el, regsize = 64
> > .if \regsize == 32
> > mov w0, w0 // zero upper 32 bits of x0
> > @@ -163,6 +176,13 @@ alternative_else_nop_endif
> > ldr x19, [tsk, #TSK_TI_FLAGS] // since we can unmask debug
> > disable_step_tsk x19, x20 // exceptions when scheduling.
> >
> > + apply_ssbd 1
>
>
> ... and thus kernel_entry would be fatal.
>
> Thanks,
> Mark.
^ permalink raw reply
* [PATCH v10 13/18] arm64/sve: Move sve_pffr() to fpsimd.h and make inline
From: Dave Martin @ 2018-05-24 11:22 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <8736yhtkvo.fsf@linaro.org>
On Thu, May 24, 2018 at 11:20:59AM +0100, Alex Benn?e wrote:
>
> Dave Martin <Dave.Martin@arm.com> writes:
>
> > In order to make sve_save_state()/sve_load_state() more easily
> > reusable and to get rid of a potential branch on context switch
> > critical paths, this patch makes sve_pffr() inline and moves it to
> > fpsimd.h.
> >
> > <asm/processor.h> must be included in fpsimd.h in order to make
> > this work, and this creates an #include cycle that is tricky to
> > avoid without modifying core code, due to the way the PR_SVE_*()
> > prctl helpers are included in the core prctl implementation.
> >
> > Instead of breaking the cycle, this patch defers inclusion of
> > <asm/fpsimd.h> in <asm/processor.h> until the point where it is
> > actually needed: i.e., immediately before the prctl definitions.
> >
> > No functional change.
> >
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> > Acked-by: Marc Zyngier <marc.zyngier@arm.com>
> > ---
> > arch/arm64/include/asm/fpsimd.h | 13 +++++++++++++
> > arch/arm64/include/asm/processor.h | 3 ++-
> > arch/arm64/kernel/fpsimd.c | 12 ------------
> > 3 files changed, 15 insertions(+), 13 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> > index fb60b22..fa92747 100644
> > --- a/arch/arm64/include/asm/fpsimd.h
> > +++ b/arch/arm64/include/asm/fpsimd.h
> > @@ -18,6 +18,8 @@
> >
> > #include <asm/ptrace.h>
> > #include <asm/errno.h>
> > +#include <asm/processor.h>
> > +#include <asm/sigcontext.h>
> >
> > #ifndef __ASSEMBLY__
> >
> > @@ -61,6 +63,17 @@ extern void sve_flush_cpu_state(void);
> > /* Maximum VL that SVE VL-agnostic software can transparently support */
> > #define SVE_VL_ARCH_MAX 0x100
> >
> > +/* Offset of FFR in the SVE register dump */
> > +static inline size_t sve_ffr_offset(int vl)
> > +{
> > + return SVE_SIG_FFR_OFFSET(sve_vq_from_vl(vl)) - SVE_SIG_REGS_OFFSET;
> > +}
> > +
> > +static inline void *sve_pffr(struct thread_struct *thread)
> > +{
> > + return (char *)thread->sve_state + sve_ffr_offset(thread->sve_vl);
> > +}
> > +
> > extern void sve_save_state(void *state, u32 *pfpsr);
> > extern void sve_load_state(void const *state, u32 const *pfpsr,
> > unsigned long vq_minus_1);
> > diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> > index f902b6d..ebaadb1 100644
> > --- a/arch/arm64/include/asm/processor.h
> > +++ b/arch/arm64/include/asm/processor.h
> > @@ -40,7 +40,6 @@
> >
> > #include <asm/alternative.h>
> > #include <asm/cpufeature.h>
> > -#include <asm/fpsimd.h>
> > #include <asm/hw_breakpoint.h>
> > #include <asm/lse.h>
> > #include <asm/pgtable-hwdef.h>
> > @@ -245,6 +244,8 @@ void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused);
> > void cpu_enable_cache_maint_trap(const struct arm64_cpu_capabilities *__unused);
> > void cpu_clear_disr(const struct arm64_cpu_capabilities *__unused);
> >
> > +#include <asm/fpsimd.h>
> > +
>
> You really need a one-liner comment to note why the include is in a
> funny place to save someone just moving it back and then getting really
> confused. Maybe:
>
> /* included just in time to avoid circular inclusion issues */
> #include <asm/fpsimd.h>
>
> It still seems weird to me though :-/
How about
/*
* Not at the top of the file due to a direct #include cycle between
* <asm/fpsimd.h> and <asm/processor.h>. Deferring this #include
* ensures that contents of processor.h are visible to fpsimd.h even if
* processor.h is included first.
*
* These prctl helpers are the only things in this file that require
* fpsimd.h. The core code expects them to be in this header.
*/
?
Cheers
---Dave
>
> Otherwise:
>
> Reviewed-by: Alex Benn?e <alex.bennee@linaro.org>
^ permalink raw reply
* [PATCH v4] arm64: allwinner: a64: Add Amarula A64-Relic initial support
From: Jagan Teki @ 2018-05-24 11:16 UTC (permalink / raw)
To: linux-arm-kernel
Amarula A64-Relic is Allwinner A64 based IoT device, which support
- Allwinner A64 Cortex-A53
- Mali-400MP2 GPU
- AXP803 PMIC
- 1GB DDR3 RAM
- 8GB eMMC
- AP6330 Wifi/BLE
- MIPI-DSI
- CSI: OV5640 sensor
- USB OTG
- 12V DC power supply
Signed-off-by: Jagan Teki <jagan@amarulasolutions.com>
---
Changes for v4:
- update dr_mode as 'otg'
Changes for v3:
- Use sun50i-a64-amarula-relic.dts name
- add eldo3 for dvdd-csi
- update dldo4 min voltage as 3.3v as per schematics
- use dldo3 name as dovdd-csi
- update aldo1, aldo2 voltages as per schematics
Changes for v2:
- Rename dts name to sun50i-a64-relic.dts which is simple to use
- Update dldo4 min voltage as 1.8
- Use licence year as 2018
arch/arm64/boot/dts/allwinner/Makefile | 1 +
.../dts/allwinner/sun50i-a64-amarula-relic.dts | 188 +++++++++++++++++++++
2 files changed, 189 insertions(+)
create mode 100644 arch/arm64/boot/dts/allwinner/sun50i-a64-amarula-relic.dts
diff --git a/arch/arm64/boot/dts/allwinner/Makefile b/arch/arm64/boot/dts/allwinner/Makefile
index c31f90a49481..67ce8c500b2e 100644
--- a/arch/arm64/boot/dts/allwinner/Makefile
+++ b/arch/arm64/boot/dts/allwinner/Makefile
@@ -1,4 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
+dtb-$(CONFIG_ARCH_SUNXI) += sun50i-a64-amarula-relic.dtb
dtb-$(CONFIG_ARCH_SUNXI) += sun50i-a64-bananapi-m64.dtb
dtb-$(CONFIG_ARCH_SUNXI) += sun50i-a64-nanopi-a64.dtb
dtb-$(CONFIG_ARCH_SUNXI) += sun50i-a64-olinuxino.dtb
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-amarula-relic.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-amarula-relic.dts
new file mode 100644
index 000000000000..ce4a256ff086
--- /dev/null
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-amarula-relic.dts
@@ -0,0 +1,188 @@
+// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+/*
+ * Copyright (C) 2018 Amarula Solutions B.V.
+ * Author: Jagan Teki <jagan@amarulasolutions.com>
+ */
+
+/dts-v1/;
+
+#include "sun50i-a64.dtsi"
+
+#include <dt-bindings/gpio/gpio.h>
+
+/ {
+ model = "Amarula A64-Relic";
+ compatible = "amarula,a64-relic", "allwinner,sun50i-a64";
+
+ aliases {
+ serial0 = &uart0;
+ };
+
+ chosen {
+ stdout-path = "serial0:115200n8";
+ };
+};
+
+&ehci0 {
+ status = "okay";
+};
+
+&mmc2 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&mmc2_pins>;
+ vmmc-supply = <®_dcdc1>;
+ bus-width = <8>;
+ non-removable;
+ cap-mmc-hw-reset;
+ status = "okay";
+};
+
+&ohci0 {
+ status = "okay";
+};
+
+&r_rsb {
+ status = "okay";
+
+ axp803: pmic at 3a3 {
+ compatible = "x-powers,axp803";
+ reg = <0x3a3>;
+ interrupt-parent = <&r_intc>;
+ interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
+ x-powers,drive-vbus-en; /* set N_VBUSEN as output pin */
+ };
+};
+
+#include "axp803.dtsi"
+
+®_aldo1 {
+ regulator-always-on;
+ regulator-min-microvolt = <2800000>;
+ regulator-max-microvolt = <2800000>;
+ regulator-name = "avdd-csi";
+};
+
+®_aldo2 {
+ regulator-always-on;
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <3300000>;
+ regulator-name = "vcc-pl";
+};
+
+®_aldo3 {
+ regulator-always-on;
+ regulator-min-microvolt = <3000000>;
+ regulator-max-microvolt = <3000000>;
+ regulator-name = "vcc-pll-avcc";
+};
+
+®_dcdc1 {
+ regulator-always-on;
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+ regulator-name = "vcc-3v3";
+};
+
+®_dcdc2 {
+ regulator-always-on;
+ regulator-min-microvolt = <1040000>;
+ regulator-max-microvolt = <1300000>;
+ regulator-name = "vdd-cpux";
+};
+
+/* DCDC3 is polyphased with DCDC2 */
+
+®_dcdc5 {
+ regulator-always-on;
+ regulator-min-microvolt = <1500000>;
+ regulator-max-microvolt = <1500000>;
+ regulator-name = "vcc-dram";
+};
+
+®_dcdc6 {
+ regulator-always-on;
+ regulator-min-microvolt = <1100000>;
+ regulator-max-microvolt = <1100000>;
+ regulator-name = "vdd-sys";
+};
+
+®_dldo1 {
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+ regulator-name = "vcc-hdmi-dsi-sensor";
+};
+
+®_dldo2 {
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+ regulator-name = "vcc-mipi";
+};
+
+®_dldo3 {
+ regulator-min-microvolt = <2800000>;
+ regulator-max-microvolt = <2800000>;
+ regulator-name = "dovdd-csi";
+};
+
+®_dldo4 {
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+ regulator-name = "vcc-wifi-io";
+};
+
+®_drivevbus {
+ regulator-name = "usb0-vbus";
+ status = "okay";
+};
+
+®_eldo1 {
+ regulator-always-on;
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-name = "cpvdd";
+};
+
+®_eldo3 {
+ regulator-min-microvolt = <1800000>;
+ regulator-max-microvolt = <1800000>;
+ regulator-name = "dvdd-csi";
+};
+
+®_fldo1 {
+ regulator-min-microvolt = <1200000>;
+ regulator-max-microvolt = <1200000>;
+ regulator-name = "vcc-1v2-hsic";
+};
+
+/*
+ * The A64 chip cannot work without this regulator off, although
+ * it seems to be only driving the AR100 core.
+ * Maybe we don't still know well about CPUs domain.
+ */
+®_fldo2 {
+ regulator-always-on;
+ regulator-min-microvolt = <1100000>;
+ regulator-max-microvolt = <1100000>;
+ regulator-name = "vdd-cpus";
+};
+
+®_rtc_ldo {
+ regulator-name = "vcc-rtc";
+};
+
+&uart0 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&uart0_pins_a>;
+ status = "okay";
+};
+
+&usb_otg {
+ dr_mode = "otg";
+ status = "okay";
+};
+
+&usbphy {
+ usb0_id_det-gpios = <&pio 7 9 GPIO_ACTIVE_HIGH>; /* PH9 */
+ usb0_vbus-supply = <®_drivevbus>;
+ status = "okay";
+};
--
2.14.3
^ permalink raw reply related
* [PATCH 03/14] arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2
From: Mark Rutland @ 2018-05-24 11:14 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180522150648.28297-4-marc.zyngier@arm.com>
On Tue, May 22, 2018 at 04:06:37PM +0100, Marc Zyngier wrote:
> In a heterogeneous system, we can end up with both affected and
> unaffected CPUs. Let's check their status before calling into the
> firmware.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Ah, I guess this may fix the issue I noted with the prior patch,
assuming we only set arm64_ssbd_callback_required for a CPU when the FW
supports the mitigation.
If so, if you fold this together with the prior patch:
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Thanks,
Mark.
> ---
> arch/arm64/kernel/cpu_errata.c | 2 ++
> arch/arm64/kernel/entry.S | 11 +++++++----
> 2 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 46b3aafb631a..0288d6cf560e 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -233,6 +233,8 @@ enable_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *entry)
> #endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */
>
> #ifdef CONFIG_ARM64_SSBD
> +DEFINE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);
> +
> void __init arm64_update_smccc_conduit(struct alt_instr *alt,
> __le32 *origptr, __le32 *updptr,
> int nr_inst)
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index f33e6aed3037..29ad672a6abd 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -140,8 +140,10 @@ alternative_else_nop_endif
>
> // This macro corrupts x0-x3. It is the caller's duty
> // to save/restore them if required.
> - .macro apply_ssbd, state
> + .macro apply_ssbd, state, targ, tmp1, tmp2
> #ifdef CONFIG_ARM64_SSBD
> + ldr_this_cpu \tmp2, arm64_ssbd_callback_required, \tmp1
> + cbz \tmp2, \targ
> mov w0, #ARM_SMCCC_ARCH_WORKAROUND_2
> mov w1, #\state
> alternative_cb arm64_update_smccc_conduit
> @@ -176,12 +178,13 @@ alternative_cb_end
> ldr x19, [tsk, #TSK_TI_FLAGS] // since we can unmask debug
> disable_step_tsk x19, x20 // exceptions when scheduling.
>
> - apply_ssbd 1
> + apply_ssbd 1, 1f, x22, x23
>
> #ifdef CONFIG_ARM64_SSBD
> ldp x0, x1, [sp, #16 * 0]
> ldp x2, x3, [sp, #16 * 1]
> #endif
> +1:
>
> mov x29, xzr // fp pointed to user-space
> .else
> @@ -323,8 +326,8 @@ alternative_if ARM64_WORKAROUND_845719
> alternative_else_nop_endif
> #endif
> 3:
> - apply_ssbd 0
> -
> + apply_ssbd 0, 5f, x0, x1
> +5:
> .endif
>
> msr elr_el1, x21 // set up the return data
> --
> 2.14.2
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply
* Nokia N900: refcount_t underflow, use after free
From: Pavel Machek @ 2018-05-24 11:05 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <5aa1e2de-96d1-0e88-01ca-49df9ae25457@ti.com>
Hi!
> >>
> >> OK, see if the following fixes the issue for you, only build tested.
> >
> > Word-wrapped, so I applied by hand. And yes, the oops at boot is
> > gone. Thanks!
>
> Sorry about that, have to check my mail settings. Anyway will post the
> patch again, glad that it fixed your issue.
Any news here? AFAICT the bug creeped into v4.17-rcX in the
meantime...
Pavel
> regards
> Suman
>
> >
> > (Camera still does not work in -next... kills system. Oh well. Lets
> > debug that some other day.)
> >
> >> 8< ---------------------
> >> >From bac9a48fb646dc51f2030d676a0dbe3298c3b134 Mon Sep 17 00:00:00 2001
> >> From: Suman Anna <s-anna@ti.com>
> >> Date: Fri, 9 Mar 2018 16:39:59 -0600
> >> Subject: [PATCH] media: omap3isp: fix unbalanced dma_iommu_mapping
> >>
> >> The OMAP3 ISP driver manages its MMU mappings through the IOMMU-aware
> >> ARM DMA backend. The current code creates a dma_iommu_mapping and
> >> attaches this to the ISP device, but never detaches the mapping in
> >> either the probe failure paths or the driver remove path resulting
> >> in an unbalanced mapping refcount and a memory leak. Fix this properly.
> >>
> >> Reported-by: Pavel Machek <pavel@ucw.cz>
> >> Signed-off-by: Suman Anna <s-anna@ti.com>
> >
> > Tested-by: Pavel Machek <pavel@ucw.cz>
> > Pavel
> >
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: Digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20180524/38b64362/attachment-0001.sig>
^ permalink raw reply
* [PATCH 02/14] arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1
From: Mark Rutland @ 2018-05-24 11:00 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180522150648.28297-3-marc.zyngier@arm.com>
On Tue, May 22, 2018 at 04:06:36PM +0100, Marc Zyngier wrote:
> In order for the kernel to protect itself, let's call the SSBD mitigation
> implemented by the higher exception level (either hypervisor or firmware)
> on each transition between userspace and kernel.
>
> We must take the PSCI conduit into account in order to target the
> right exception level, hence the introduction of a runtime patching
> callback.
>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
> arch/arm64/kernel/cpu_errata.c | 18 ++++++++++++++++++
> arch/arm64/kernel/entry.S | 22 ++++++++++++++++++++++
> include/linux/arm-smccc.h | 5 +++++
> 3 files changed, 45 insertions(+)
>
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index a900befadfe8..46b3aafb631a 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -232,6 +232,24 @@ enable_smccc_arch_workaround_1(const struct arm64_cpu_capabilities *entry)
> }
> #endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */
>
> +#ifdef CONFIG_ARM64_SSBD
> +void __init arm64_update_smccc_conduit(struct alt_instr *alt,
> + __le32 *origptr, __le32 *updptr,
> + int nr_inst)
> +{
> + u32 insn;
> +
> + BUG_ON(nr_inst != 1);
> +
> + if (psci_ops.conduit == PSCI_CONDUIT_HVC)
> + insn = aarch64_insn_get_hvc_value();
> + else
> + insn = aarch64_insn_get_smc_value();
Shouldn't this also handle the case where there is no conduit?
See below comment in apply_ssbd for rationale.
> +
> + *updptr = cpu_to_le32(insn);
> +}
> +#endif /* CONFIG_ARM64_SSBD */
> +
> #define CAP_MIDR_RANGE(model, v_min, r_min, v_max, r_max) \
> .matches = is_affected_midr_range, \
> .midr_range = MIDR_RANGE(model, v_min, r_min, v_max, r_max)
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index ec2ee720e33e..f33e6aed3037 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -18,6 +18,7 @@
> * along with this program. If not, see <http://www.gnu.org/licenses/>.
> */
>
> +#include <linux/arm-smccc.h>
> #include <linux/init.h>
> #include <linux/linkage.h>
>
> @@ -137,6 +138,18 @@ alternative_else_nop_endif
> add \dst, \dst, #(\sym - .entry.tramp.text)
> .endm
>
> + // This macro corrupts x0-x3. It is the caller's duty
> + // to save/restore them if required.
> + .macro apply_ssbd, state
> +#ifdef CONFIG_ARM64_SSBD
> + mov w0, #ARM_SMCCC_ARCH_WORKAROUND_2
> + mov w1, #\state
> +alternative_cb arm64_update_smccc_conduit
> + nop // Patched to SMC/HVC #0
> +alternative_cb_end
> +#endif
> + .endm
If my system doesn't have SMCCC1.1, or the FW doesn't have an
implementation of ARCH_WORKAROUND_2, does this stay as a NOP?
It looks like this would be patched to an SMC, which would be fatal on
systems without EL3 FW.
> +
> .macro kernel_entry, el, regsize = 64
> .if \regsize == 32
> mov w0, w0 // zero upper 32 bits of x0
> @@ -163,6 +176,13 @@ alternative_else_nop_endif
> ldr x19, [tsk, #TSK_TI_FLAGS] // since we can unmask debug
> disable_step_tsk x19, x20 // exceptions when scheduling.
>
> + apply_ssbd 1
... and thus kernel_entry would be fatal.
Thanks,
Mark.
^ permalink raw reply
* [PATCH 9/9] arm64: bitops: Include <asm-generic/bitops/ext2-atomic-setbit.h>
From: Will Deacon @ 2018-05-24 10:59 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1527159586-8578-1-git-send-email-will.deacon@arm.com>
asm-generic/bitops/ext2-atomic-setbit.h provides the ext2 atomic bitop
definitions, so we don't need to define our own.
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm64/include/asm/bitops.h | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/arch/arm64/include/asm/bitops.h b/arch/arm64/include/asm/bitops.h
index 13501460be6b..10d536b1af74 100644
--- a/arch/arm64/include/asm/bitops.h
+++ b/arch/arm64/include/asm/bitops.h
@@ -38,11 +38,6 @@
#include <asm-generic/bitops/lock.h>
#include <asm-generic/bitops/non-atomic.h>
#include <asm-generic/bitops/le.h>
-
-/*
- * Ext2 is defined to use little-endian byte ordering.
- */
-#define ext2_set_bit_atomic(lock, nr, p) test_and_set_bit_le(nr, p)
-#define ext2_clear_bit_atomic(lock, nr, p) test_and_clear_bit_le(nr, p)
+#include <asm-generic/bitops/ext2-atomic-setbit.h>
#endif /* __ASM_BITOPS_H */
--
2.1.4
^ permalink raw reply related
* [PATCH 8/9] arm64: Replace our atomic/lock bitop implementations with asm-generic
From: Will Deacon @ 2018-05-24 10:59 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1527159586-8578-1-git-send-email-will.deacon@arm.com>
The asm-generic/bitops/{atomic,lock}.h implementations are built around
the atomic-fetch ops, which we implement efficiently for both LSE and
LL/SC systems. Use that instead of our hand-rolled, out-of-line bitops.S.
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm64/include/asm/bitops.h | 14 ++------
arch/arm64/lib/Makefile | 2 +-
arch/arm64/lib/bitops.S | 76 -----------------------------------------
3 files changed, 3 insertions(+), 89 deletions(-)
delete mode 100644 arch/arm64/lib/bitops.S
diff --git a/arch/arm64/include/asm/bitops.h b/arch/arm64/include/asm/bitops.h
index 9c19594ce7cb..13501460be6b 100644
--- a/arch/arm64/include/asm/bitops.h
+++ b/arch/arm64/include/asm/bitops.h
@@ -17,22 +17,11 @@
#define __ASM_BITOPS_H
#include <linux/compiler.h>
-#include <asm/barrier.h>
#ifndef _LINUX_BITOPS_H
#error only <linux/bitops.h> can be included directly
#endif
-/*
- * Little endian assembly atomic bitops.
- */
-extern void set_bit(int nr, volatile unsigned long *p);
-extern void clear_bit(int nr, volatile unsigned long *p);
-extern void change_bit(int nr, volatile unsigned long *p);
-extern int test_and_set_bit(int nr, volatile unsigned long *p);
-extern int test_and_clear_bit(int nr, volatile unsigned long *p);
-extern int test_and_change_bit(int nr, volatile unsigned long *p);
-
#include <asm-generic/bitops/builtin-__ffs.h>
#include <asm-generic/bitops/builtin-ffs.h>
#include <asm-generic/bitops/builtin-__fls.h>
@@ -44,8 +33,9 @@ extern int test_and_change_bit(int nr, volatile unsigned long *p);
#include <asm-generic/bitops/sched.h>
#include <asm-generic/bitops/hweight.h>
-#include <asm-generic/bitops/lock.h>
+#include <asm-generic/bitops/atomic.h>
+#include <asm-generic/bitops/lock.h>
#include <asm-generic/bitops/non-atomic.h>
#include <asm-generic/bitops/le.h>
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 137710f4dac3..68755fd70dcf 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-lib-y := bitops.o clear_user.o delay.o copy_from_user.o \
+lib-y := clear_user.o delay.o copy_from_user.o \
copy_to_user.o copy_in_user.o copy_page.o \
clear_page.o memchr.o memcpy.o memmove.o memset.o \
memcmp.o strcmp.o strncmp.o strlen.o strnlen.o \
diff --git a/arch/arm64/lib/bitops.S b/arch/arm64/lib/bitops.S
deleted file mode 100644
index 43ac736baa5b..000000000000
--- a/arch/arm64/lib/bitops.S
+++ /dev/null
@@ -1,76 +0,0 @@
-/*
- * Based on arch/arm/lib/bitops.h
- *
- * Copyright (C) 2013 ARM Ltd.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <linux/linkage.h>
-#include <asm/assembler.h>
-#include <asm/lse.h>
-
-/*
- * x0: bits 5:0 bit offset
- * bits 31:6 word offset
- * x1: address
- */
- .macro bitop, name, llsc, lse
-ENTRY( \name )
- and w3, w0, #63 // Get bit offset
- eor w0, w0, w3 // Clear low bits
- mov x2, #1
- add x1, x1, x0, lsr #3 // Get word offset
-alt_lse " prfm pstl1strm, [x1]", "nop"
- lsl x3, x2, x3 // Create mask
-
-alt_lse "1: ldxr x2, [x1]", "\lse x3, [x1]"
-alt_lse " \llsc x2, x2, x3", "nop"
-alt_lse " stxr w0, x2, [x1]", "nop"
-alt_lse " cbnz w0, 1b", "nop"
-
- ret
-ENDPROC(\name )
- .endm
-
- .macro testop, name, llsc, lse
-ENTRY( \name )
- and w3, w0, #63 // Get bit offset
- eor w0, w0, w3 // Clear low bits
- mov x2, #1
- add x1, x1, x0, lsr #3 // Get word offset
-alt_lse " prfm pstl1strm, [x1]", "nop"
- lsl x4, x2, x3 // Create mask
-
-alt_lse "1: ldxr x2, [x1]", "\lse x4, x2, [x1]"
- lsr x0, x2, x3
-alt_lse " \llsc x2, x2, x4", "nop"
-alt_lse " stlxr w5, x2, [x1]", "nop"
-alt_lse " cbnz w5, 1b", "nop"
-alt_lse " dmb ish", "nop"
-
- and x0, x0, #1
- ret
-ENDPROC(\name )
- .endm
-
-/*
- * Atomic bit operations.
- */
- bitop change_bit, eor, steor
- bitop clear_bit, bic, stclr
- bitop set_bit, orr, stset
-
- testop test_and_change_bit, eor, ldeoral
- testop test_and_clear_bit, bic, ldclral
- testop test_and_set_bit, orr, ldsetal
--
2.1.4
^ permalink raw reply related
* [PATCH 7/9] asm-generic/bitops/lock.h: Rewrite using atomic_fetch_*
From: Will Deacon @ 2018-05-24 10:59 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1527159586-8578-1-git-send-email-will.deacon@arm.com>
The lock bitops can be implemented more efficiently using the atomic_fetch_*
ops, which provide finer-grained control over the memory ordering semantics
than the bitops.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
include/asm-generic/bitops/lock.h | 68 ++++++++++++++++++++++++++++++++-------
1 file changed, 56 insertions(+), 12 deletions(-)
diff --git a/include/asm-generic/bitops/lock.h b/include/asm-generic/bitops/lock.h
index 67ab280ad134..3ae021368f48 100644
--- a/include/asm-generic/bitops/lock.h
+++ b/include/asm-generic/bitops/lock.h
@@ -2,6 +2,10 @@
#ifndef _ASM_GENERIC_BITOPS_LOCK_H_
#define _ASM_GENERIC_BITOPS_LOCK_H_
+#include <linux/atomic.h>
+#include <linux/compiler.h>
+#include <asm/barrier.h>
+
/**
* test_and_set_bit_lock - Set a bit and return its old value, for lock
* @nr: Bit to set
@@ -11,7 +15,20 @@
* the returned value is 0.
* It can be used to implement bit locks.
*/
-#define test_and_set_bit_lock(nr, addr) test_and_set_bit(nr, addr)
+static inline int test_and_set_bit_lock(unsigned int nr,
+ volatile unsigned long *p)
+{
+ long old;
+ unsigned long mask = BIT_MASK(nr);
+
+ p += BIT_WORD(nr);
+ if (READ_ONCE(*p) & mask)
+ return 1;
+
+ old = atomic_long_fetch_or_acquire(mask, (atomic_long_t *)p);
+ return !!(old & mask);
+}
+
/**
* clear_bit_unlock - Clear a bit in memory, for unlock
@@ -20,11 +37,11 @@
*
* This operation is atomic and provides release barrier semantics.
*/
-#define clear_bit_unlock(nr, addr) \
-do { \
- smp_mb__before_atomic(); \
- clear_bit(nr, addr); \
-} while (0)
+static inline void clear_bit_unlock(unsigned int nr, volatile unsigned long *p)
+{
+ p += BIT_WORD(nr);
+ atomic_long_fetch_andnot_release(BIT_MASK(nr), (atomic_long_t *)p);
+}
/**
* __clear_bit_unlock - Clear a bit in memory, for unlock
@@ -37,11 +54,38 @@ do { \
*
* See for example x86's implementation.
*/
-#define __clear_bit_unlock(nr, addr) \
-do { \
- smp_mb__before_atomic(); \
- clear_bit(nr, addr); \
-} while (0)
+static inline void __clear_bit_unlock(unsigned int nr,
+ volatile unsigned long *p)
+{
+ unsigned long old;
-#endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */
+ p += BIT_WORD(nr);
+ old = READ_ONCE(*p);
+ old &= ~BIT_MASK(nr);
+ atomic_long_set_release((atomic_long_t *)p, old);
+}
+
+/**
+ * clear_bit_unlock_is_negative_byte - Clear a bit in memory and test if bottom
+ * byte is negative, for unlock.
+ * @nr: the bit to clear
+ * @addr: the address to start counting from
+ *
+ * This is a bit of a one-trick-pony for the filemap code, which clears
+ * PG_locked and tests PG_waiters,
+ */
+#ifndef clear_bit_unlock_is_negative_byte
+static inline bool clear_bit_unlock_is_negative_byte(unsigned int nr,
+ volatile unsigned long *p)
+{
+ long old;
+ unsigned long mask = BIT_MASK(nr);
+
+ p += BIT_WORD(nr);
+ old = atomic_long_fetch_andnot_release(mask, (atomic_long_t *)p);
+ return !!(old & BIT(7));
+}
+#define clear_bit_unlock_is_negative_byte clear_bit_unlock_is_negative_byte
+#endif
+#endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */
--
2.1.4
^ permalink raw reply related
* [PATCH 6/9] asm-generic/bitops/atomic.h: Rewrite using atomic_fetch_*
From: Will Deacon @ 2018-05-24 10:59 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1527159586-8578-1-git-send-email-will.deacon@arm.com>
The atomic bitops can actually be implemented pretty efficiently using
the atomic_fetch_* ops, rather than explicit use of spinlocks.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
include/asm-generic/bitops/atomic.h | 188 +++++++-----------------------------
1 file changed, 33 insertions(+), 155 deletions(-)
diff --git a/include/asm-generic/bitops/atomic.h b/include/asm-generic/bitops/atomic.h
index 04deffaf5f7d..bca92586c2f6 100644
--- a/include/asm-generic/bitops/atomic.h
+++ b/include/asm-generic/bitops/atomic.h
@@ -2,189 +2,67 @@
#ifndef _ASM_GENERIC_BITOPS_ATOMIC_H_
#define _ASM_GENERIC_BITOPS_ATOMIC_H_
-#include <asm/types.h>
-#include <linux/irqflags.h>
-
-#ifdef CONFIG_SMP
-#include <asm/spinlock.h>
-#include <asm/cache.h> /* we use L1_CACHE_BYTES */
-
-/* Use an array of spinlocks for our atomic_ts.
- * Hash function to index into a different SPINLOCK.
- * Since "a" is usually an address, use one spinlock per cacheline.
- */
-# define ATOMIC_HASH_SIZE 4
-# define ATOMIC_HASH(a) (&(__atomic_hash[ (((unsigned long) a)/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))
-
-extern arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned;
-
-/* Can't use raw_spin_lock_irq because of #include problems, so
- * this is the substitute */
-#define _atomic_spin_lock_irqsave(l,f) do { \
- arch_spinlock_t *s = ATOMIC_HASH(l); \
- local_irq_save(f); \
- arch_spin_lock(s); \
-} while(0)
-
-#define _atomic_spin_unlock_irqrestore(l,f) do { \
- arch_spinlock_t *s = ATOMIC_HASH(l); \
- arch_spin_unlock(s); \
- local_irq_restore(f); \
-} while(0)
-
-
-#else
-# define _atomic_spin_lock_irqsave(l,f) do { local_irq_save(f); } while (0)
-# define _atomic_spin_unlock_irqrestore(l,f) do { local_irq_restore(f); } while (0)
-#endif
+#include <linux/atomic.h>
+#include <linux/compiler.h>
+#include <asm/barrier.h>
/*
- * NMI events can occur at any time, including when interrupts have been
- * disabled by *_irqsave(). So you can get NMI events occurring while a
- * *_bit function is holding a spin lock. If the NMI handler also wants
- * to do bit manipulation (and they do) then you can get a deadlock
- * between the original caller of *_bit() and the NMI handler.
- *
- * by Keith Owens
+ * Implementation of atomic bitops using atomic-fetch ops.
+ * See Documentation/atomic_bitops.txt for details.
*/
-/**
- * set_bit - Atomically set a bit in memory
- * @nr: the bit to set
- * @addr: the address to start counting from
- *
- * This function is atomic and may not be reordered. See __set_bit()
- * if you do not require the atomic guarantees.
- *
- * Note: there are no guarantees that this function will not be reordered
- * on non x86 architectures, so if you are writing portable code,
- * make sure not to rely on its reordering guarantees.
- *
- * Note that @nr may be almost arbitrarily large; this function is not
- * restricted to acting on a single-word quantity.
- */
-static inline void set_bit(int nr, volatile unsigned long *addr)
+static inline void set_bit(unsigned int nr, volatile unsigned long *p)
{
- unsigned long mask = BIT_MASK(nr);
- unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
- unsigned long flags;
-
- _atomic_spin_lock_irqsave(p, flags);
- *p |= mask;
- _atomic_spin_unlock_irqrestore(p, flags);
+ p += BIT_WORD(nr);
+ atomic_long_fetch_or_relaxed(BIT_MASK(nr), (atomic_long_t *)p);
}
-/**
- * clear_bit - Clears a bit in memory
- * @nr: Bit to clear
- * @addr: Address to start counting from
- *
- * clear_bit() is atomic and may not be reordered. However, it does
- * not contain a memory barrier, so if it is used for locking purposes,
- * you should call smp_mb__before_atomic() and/or smp_mb__after_atomic()
- * in order to ensure changes are visible on other processors.
- */
-static inline void clear_bit(int nr, volatile unsigned long *addr)
+static inline void clear_bit(unsigned int nr, volatile unsigned long *p)
{
- unsigned long mask = BIT_MASK(nr);
- unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
- unsigned long flags;
-
- _atomic_spin_lock_irqsave(p, flags);
- *p &= ~mask;
- _atomic_spin_unlock_irqrestore(p, flags);
+ p += BIT_WORD(nr);
+ atomic_long_fetch_andnot_relaxed(BIT_MASK(nr), (atomic_long_t *)p);
}
-/**
- * change_bit - Toggle a bit in memory
- * @nr: Bit to change
- * @addr: Address to start counting from
- *
- * change_bit() is atomic and may not be reordered. It may be
- * reordered on other architectures than x86.
- * Note that @nr may be almost arbitrarily large; this function is not
- * restricted to acting on a single-word quantity.
- */
-static inline void change_bit(int nr, volatile unsigned long *addr)
+static inline void change_bit(unsigned int nr, volatile unsigned long *p)
{
- unsigned long mask = BIT_MASK(nr);
- unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
- unsigned long flags;
-
- _atomic_spin_lock_irqsave(p, flags);
- *p ^= mask;
- _atomic_spin_unlock_irqrestore(p, flags);
+ p += BIT_WORD(nr);
+ atomic_long_fetch_xor_relaxed(BIT_MASK(nr), (atomic_long_t *)p);
}
-/**
- * test_and_set_bit - Set a bit and return its old value
- * @nr: Bit to set
- * @addr: Address to count from
- *
- * This operation is atomic and cannot be reordered.
- * It may be reordered on other architectures than x86.
- * It also implies a memory barrier.
- */
-static inline int test_and_set_bit(int nr, volatile unsigned long *addr)
+static inline int test_and_set_bit(unsigned int nr, volatile unsigned long *p)
{
+ long old;
unsigned long mask = BIT_MASK(nr);
- unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
- unsigned long old;
- unsigned long flags;
- _atomic_spin_lock_irqsave(p, flags);
- old = *p;
- *p = old | mask;
- _atomic_spin_unlock_irqrestore(p, flags);
+ p += BIT_WORD(nr);
+ if (READ_ONCE(*p) & mask)
+ return 1;
- return (old & mask) != 0;
+ old = atomic_long_fetch_or(mask, (atomic_long_t *)p);
+ return !!(old & mask);
}
-/**
- * test_and_clear_bit - Clear a bit and return its old value
- * @nr: Bit to clear
- * @addr: Address to count from
- *
- * This operation is atomic and cannot be reordered.
- * It can be reorderdered on other architectures other than x86.
- * It also implies a memory barrier.
- */
-static inline int test_and_clear_bit(int nr, volatile unsigned long *addr)
+static inline int test_and_clear_bit(unsigned int nr, volatile unsigned long *p)
{
+ long old;
unsigned long mask = BIT_MASK(nr);
- unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
- unsigned long old;
- unsigned long flags;
- _atomic_spin_lock_irqsave(p, flags);
- old = *p;
- *p = old & ~mask;
- _atomic_spin_unlock_irqrestore(p, flags);
+ p += BIT_WORD(nr);
+ if (!(READ_ONCE(*p) & mask))
+ return 0;
- return (old & mask) != 0;
+ old = atomic_long_fetch_andnot(mask, (atomic_long_t *)p);
+ return !!(old & mask);
}
-/**
- * test_and_change_bit - Change a bit and return its old value
- * @nr: Bit to change
- * @addr: Address to count from
- *
- * This operation is atomic and cannot be reordered.
- * It also implies a memory barrier.
- */
-static inline int test_and_change_bit(int nr, volatile unsigned long *addr)
+static inline int test_and_change_bit(unsigned int nr, volatile unsigned long *p)
{
+ long old;
unsigned long mask = BIT_MASK(nr);
- unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
- unsigned long old;
- unsigned long flags;
-
- _atomic_spin_lock_irqsave(p, flags);
- old = *p;
- *p = old ^ mask;
- _atomic_spin_unlock_irqrestore(p, flags);
- return (old & mask) != 0;
+ p += BIT_WORD(nr);
+ old = atomic_long_fetch_xor(mask, (atomic_long_t *)p);
+ return !!(old & mask);
}
#endif /* _ASM_GENERIC_BITOPS_ATOMIC_H */
--
2.1.4
^ permalink raw reply related
* [PATCH 5/9] sh: Don't pull in all of linux/bitops.h in asm/cmpxchg-xchg.h
From: Will Deacon @ 2018-05-24 10:59 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1527159586-8578-1-git-send-email-will.deacon@arm.com>
The sh implementation of asm/cmpxchg-xchg.h pulls in linux/bitops.h
so that it can refer to BITS_PER_BYTE. It also transitively relies on
this pulling in linux/compiler.h for READ_ONCE.
Replace the #include with linux/bits.h and linux/compiler.h
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/sh/include/asm/cmpxchg-xchg.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/sh/include/asm/cmpxchg-xchg.h b/arch/sh/include/asm/cmpxchg-xchg.h
index 1e881f5db659..593a9704782b 100644
--- a/arch/sh/include/asm/cmpxchg-xchg.h
+++ b/arch/sh/include/asm/cmpxchg-xchg.h
@@ -8,7 +8,8 @@
* This work is licensed under the terms of the GNU GPL, version 2. See the
* file "COPYING" in the main directory of this archive for more details.
*/
-#include <linux/bitops.h>
+#include <linux/bits.h>
+#include <linux/compiler.h>
#include <asm/byteorder.h>
/*
--
2.1.4
^ permalink raw reply related
* [PATCH 4/9] openrisc: Don't pull in all of linux/bitops.h in asm/cmpxchg.h
From: Will Deacon @ 2018-05-24 10:59 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1527159586-8578-1-git-send-email-will.deacon@arm.com>
The openrisc implementation of asm/cmpxchg.h pulls in linux/bitops.h
so that it can refer to BITS_PER_BYTE. It also transitively relies on
this pulling in linux/compiler.h for READ_ONCE.
Replace the #include with linux/bits.h and linux/compiler.h
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/openrisc/include/asm/cmpxchg.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/openrisc/include/asm/cmpxchg.h b/arch/openrisc/include/asm/cmpxchg.h
index d29f7db53906..f9cd43a39d72 100644
--- a/arch/openrisc/include/asm/cmpxchg.h
+++ b/arch/openrisc/include/asm/cmpxchg.h
@@ -16,8 +16,9 @@
#ifndef __ASM_OPENRISC_CMPXCHG_H
#define __ASM_OPENRISC_CMPXCHG_H
+#include <linux/bits.h>
+#include <linux/compiler.h>
#include <linux/types.h>
-#include <linux/bitops.h>
#define __HAVE_ARCH_CMPXCHG 1
--
2.1.4
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox