* [PATCH v4 03/26] arm64: cpufeature: Use alternatives for VHE cpu_enable
From: Julien Thierry @ 2018-05-25 9:49 UTC
To: linux-arm-kernel
In-Reply-To: <1527241772-48007-1-git-send-email-julien.thierry@arm.com>
The cpu_enable callback for VHE feature requires all alternatives to have
been applied. This prevents applying VHE alternative separately from the
rest.
Use an alternative depending on VHE feature to know whether VHE
alternatives have already been applied.
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Marc Zyngier <Marc.Zyngier@arm.com>
Cc: Christoffer Dall <Christoffer.Dall@arm.com>
---
arch/arm64/kernel/cpufeature.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index a177104..a3a5585d 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1013,6 +1013,8 @@ static bool runs_at_el2(const struct arm64_cpu_capabilities *entry, int __unused
static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
{
+ u64 tmp = 0;
+
/*
* Copy register values that aren't redirected by hardware.
*
@@ -1021,8 +1023,15 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
* that, freshly-onlined CPUs will set tpidr_el2, so we don't need to
* do anything here.
*/
- if (!alternatives_applied)
- write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);
+ asm volatile(ALTERNATIVE(
+ "mrs %0, tpidr_el1\n"
+ "msr tpidr_el2, %0",
+ "nop\n"
+ "nop",
+ ARM64_HAS_VIRT_HOST_EXTN)
+ : "+r" (tmp)
+ :
+ : "memory");
}
#endif
--
1.9.1
* [PATCH v4 02/26] arm64: cpufeature: Add cpufeature for IRQ priority masking
From: Julien Thierry @ 2018-05-25 9:49 UTC
To: linux-arm-kernel
In-Reply-To: <1527241772-48007-1-git-send-email-julien.thierry@arm.com>
Add a cpufeature indicating whether a cpu supports masking interrupts
by priority.
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
---
arch/arm64/include/asm/cpucaps.h | 3 ++-
arch/arm64/kernel/cpufeature.c | 15 +++++++++++++++
2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index bc51b72..cd8f9ed 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -48,7 +48,8 @@
#define ARM64_HAS_CACHE_IDC 27
#define ARM64_HAS_CACHE_DIC 28
#define ARM64_HW_DBM 29
+#define ARM64_HAS_IRQ_PRIO_MASKING 30
-#define ARM64_NCAPS 30
+#define ARM64_NCAPS 31
#endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index e03e897..a177104 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1202,6 +1202,21 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
.cpu_enable = cpu_enable_hw_dbm,
},
#endif
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+ {
+ /*
+ * Depends on having GICv3
+ */
+ .desc = "IRQ priority masking",
+ .capability = ARM64_HAS_IRQ_PRIO_MASKING,
+ .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
+ .matches = has_useable_gicv3_cpuif,
+ .sys_reg = SYS_ID_AA64PFR0_EL1,
+ .field_pos = ID_AA64PFR0_GIC_SHIFT,
+ .sign = FTR_UNSIGNED,
+ .min_field_value = 1,
+ },
+#endif
{},
};
--
1.9.1
* [PATCH v4 01/26] arm64: cpufeature: Set SYSREG_GIC_CPUIF as a boot system feature
From: Julien Thierry @ 2018-05-25 9:49 UTC
To: linux-arm-kernel
In-Reply-To: <1527241772-48007-1-git-send-email-julien.thierry@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
arch/arm64/kernel/cpufeature.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9d1b06d..e03e897 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1030,7 +1030,7 @@ static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
{
.desc = "GIC system register CPU interface",
.capability = ARM64_HAS_SYSREG_GIC_CPUIF,
- .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
.matches = has_useable_gicv3_cpuif,
.sys_reg = SYS_ID_AA64PFR0_EL1,
.field_pos = ID_AA64PFR0_GIC_SHIFT,
--
1.9.1
* [PATCH v4 00/26] arm64: provide pseudo NMI with GICv3
From: Julien Thierry @ 2018-05-25 9:49 UTC
To: linux-arm-kernel
This series is a continuation of the work started by Daniel [1]. The goal
is to use GICv3 interrupt priorities to simulate an NMI.
To achieve this, set two priorities, one for standard interrupts and
another, higher priority, for NMIs. Whenever we want to disable interrupts,
we mask the standard priority instead so NMIs can still be raised. Some
corner cases, though, still require actually masking all interrupts,
effectively disabling the NMI.
Currently, only PPIs and SPIs can be set as NMIs. IPIs being currently
hardcoded IRQ numbers, there isn't a generic interface to set SGIs as NMI
for now. I don't think there is any reason LPIs should be allowed to be set
as NMI as they do not have an active state.
When an NMI is active on a CPU, no other NMI can be triggered on the CPU.
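The masking scheme described above can be modelled in a few lines of plain
C. This is only a user-space sketch, not code from the series: the priority
and mask values, the struct and the sim_* helper names are all hypothetical.
It assumes the usual GIC convention that a numerically lower value means a
higher priority, and that an interrupt is signalled only when its priority
is strictly higher than the mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical priorities: a lower number means a higher priority. */
#define SIM_PRIO_NMI	0x20u
#define SIM_PRIO_IRQ	0xa0u

/* Hypothetical values for the simulated priority mask register. */
#define SIM_PMR_UNMASKED	0xf0u	/* both priorities pass */
#define SIM_PMR_IRQ_OFF		0x70u	/* only the NMI priority passes */

struct sim_cpu {
	uint8_t pmr;		/* stands in for ICC_PMR_EL1 */
	bool psr_i;		/* stands in for the PSR.I bit */
	bool nmi_active;	/* an NMI is currently being handled */
};

/* "Disable interrupts": mask the standard priority, leave NMIs deliverable. */
static void sim_local_irq_disable(struct sim_cpu *cpu)
{
	cpu->pmr = SIM_PMR_IRQ_OFF;
}

/* "Enable interrupts": let both priorities through again. */
static void sim_local_irq_enable(struct sim_cpu *cpu)
{
	cpu->pmr = SIM_PMR_UNMASKED;
}

/* Would an interrupt of priority prio be taken by this CPU right now? */
static bool sim_can_take(const struct sim_cpu *cpu, uint8_t prio)
{
	if (cpu->psr_i)		/* corner case: everything masked, NMI included */
		return false;
	if (cpu->nmi_active)	/* nothing preempts an active NMI */
		return false;
	return prio < cpu->pmr;
}
```

With the mask at SIM_PMR_IRQ_OFF, sim_can_take() rejects SIM_PRIO_IRQ but
still accepts SIM_PRIO_NMI; setting psr_i rejects both, which corresponds
to the corner cases where the NMI is effectively disabled.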
After the big refactoring I get performance similar to what I had
in v3[2], reposting old results here:
- "hackbench 200 process 1000" (average over 20 runs)
+-----------+----------+------------+------------------+
| | native | PMR guest | v4.17-rc6 guest |
+-----------+----------+------------+------------------+
| PMR host | 40.0336s | 39.3039s | 39.2044s |
| v4.17-rc6 | 40.4040s | 39.6011s | 39.1147s |
+-----------+----------+------------+------------------+
- Kernel build from defconfig:
PMR host: 13m45.743s
v4.17-rc6: 13m40.400s
I'll try to post more detailed benchmarks later if I find notable
differences with the previous version.
Requirements to use this:
- Have GICv3
- SCR_EL3.FIQ is set to 1 when Linux runs, or have a single security state
- Select Kernel Feature -> Use ICC system registers for IRQ masking
* Patches 1 to 4 aim at applying some alternatives early in the boot
process, including the feature for priority masking.
* Patches 5 to 7 and 17 lightly refactor bits of GIC driver to make things
nicer for the rest of the series.
* Patches 8 to 10 and 16 ensure the logic of daifflags remains valid
after arch_local_irq flags use ICC_PMR_EL1.
* Patches 11 to 14 do some required PMR treatment in order for things to
work when the system uses priority masking.
* Patches 15, 18, 19, 20 and 21 actually make the changes to use
ICC_PMR_EL1 for priority masking/unmasking when disabling/enabling
interrupts.
* Patches 22 to 26 provide support for pseudo-NMI in the GICv3 driver
when priority masking is enabled.
Changes since V3[2]:
* Big refactoring. As suggested by Marc Z., some of the bigger patches
needed to be split into smaller ones.
* Try to reduce the amount of #ifdef for the new feature by introducing
an individual cpufeature for priority masking
* Do not track which alternatives have been applied (was a bit dodgy
anyway), and use an alternative for VHE cpu_enable callback
* Fix a build failure with arm by adding the correct RPR accessors
* Added Suggested-by tags for changes coming from or inspired by Daniel's
series. Do let me know if you feel I missed something and am not giving
you due credit.
Changes since V2[3]:
* Series rebase to v4.17-rc6
* Adapt patches 1 and 2 to the rework of the cpufeatures framework
* Use the group0 detection scheme in the GICv3 driver to identify
the priority view, and drop the use of a fake interrupt
* Add the case for a GIC configured in a single security state
* Use local_daif_restore instead of local_irq_enable the first time
we enable interrupts after a bp hardening in the handling of a kernel
entry. Otherwise PSR.I remains set...
Changes since V1[4]:
* Series rebased to v4.15-rc8.
* Check for arm64_early_features in this_cpu_has_cap (spotted by Suzuki).
* Fix issue where debug exceptions were not masked when enabling debug in
mdscr_el1.
Changes since RFC[5]:
* The series was rebased to v4.15-rc2 which implied some changes mainly
related to the work on exception entries and daif flags by James Morse.
- The first patch in the previous series was dropped because it is no
longer applicable.
- With the semantics James introduced of "inheriting" daif flags,
handling of PMR on exception entry is simplified as PMR is not altered
by taking an exception and already inherited from previous state.
- James pointed out that taking a PseudoNMI before reading the FAR_EL1
register should not be allowed as per the TRM (D10.2.29):
"FAR_EL1 is made UNKNOWN on an exception return from EL1."
So in this submission PSR.I bit is cleared only after FAR_EL1 is read.
* For KVM, only deal with PMR unmasking/restoring in common code, and VHE
specific code makes sure PSR.I bit is set when necessary.
* When detecting the GIC priority view (patch 5), wait for an actual
interrupt instead of trying only once.
[1] http://www.spinics.net/lists/arm-kernel/msg525077.html
[2] https://lkml.org/lkml/2018/5/21/276
[3] https://lkml.org/lkml/2018/1/17/335
[4] https://www.spinics.net/lists/arm-kernel/msg620763.html
[5] https://www.spinics.net/lists/arm-kernel/msg610736.html
Cheers,
Julien
-->
Daniel Thompson (1):
arm64: alternative: Apply alternatives early in boot process
Julien Thierry (25):
arm64: cpufeature: Set SYSREG_GIC_CPUIF as a boot system feature
arm64: cpufeature: Add cpufeature for IRQ priority masking
arm64: cpufeature: Use alternatives for VHE cpu_enable
irqchip/gic: Unify GIC priority definitions
irqchip/gic: Lower priority of GIC interrupts
irqchip/gic-v3: Remove acknowledge loop
arm64: daifflags: Use irqflags functions for daifflags
arm64: Use daifflag_restore after bp_hardening
arm64: Delay daif masking for user return
arm64: Make PMR part of task context
arm64: Unmask PMR before going idle
arm/arm64: gic-v3: Add helper functions to manage IRQ priorities
arm64: kvm: Unmask PMR before entering guest
arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking
arm64: daifflags: Include PMR in daifflags restore operations
irqchip/gic-v3: Factor group0 detection into functions
irqchip/gic-v3: Do not overwrite PMR value
irqchip/gic-v3: Switch to PMR masking after IRQ acknowledge
arm64: Switch to PMR masking when starting CPUs
arm64: Add build option for IRQ masking via priority
arm64: Detect current view of GIC priorities
irqchip/gic: Add functions to access irq priorities
irqchip/gic-v3: Add base support for pseudo-NMI
irqchip/gic-v3: Provide NMI handlers
irqchip/gic-v3: Allow interrupts to be set as pseudo-NMI
Documentation/arm64/booting.txt | 5 +
arch/arm/include/asm/arch_gicv3.h | 33 ++++
arch/arm64/Kconfig | 15 ++
arch/arm64/include/asm/alternative.h | 3 +-
arch/arm64/include/asm/arch_gicv3.h | 32 ++++
arch/arm64/include/asm/assembler.h | 17 +-
arch/arm64/include/asm/cpucaps.h | 3 +-
arch/arm64/include/asm/cpufeature.h | 2 +
arch/arm64/include/asm/daifflags.h | 32 ++--
arch/arm64/include/asm/efi.h | 3 +-
arch/arm64/include/asm/irqflags.h | 100 ++++++++---
arch/arm64/include/asm/kvm_host.h | 12 ++
arch/arm64/include/asm/processor.h | 1 +
arch/arm64/include/asm/ptrace.h | 13 +-
arch/arm64/kernel/alternative.c | 30 +++-
arch/arm64/kernel/asm-offsets.c | 1 +
arch/arm64/kernel/cpufeature.c | 35 +++-
arch/arm64/kernel/entry.S | 67 ++++++-
arch/arm64/kernel/head.S | 35 ++++
arch/arm64/kernel/process.c | 2 +
arch/arm64/kernel/smp.c | 12 ++
arch/arm64/kvm/hyp/switch.c | 17 ++
arch/arm64/mm/fault.c | 5 +-
arch/arm64/mm/proc.S | 18 ++
drivers/irqchip/irq-gic-common.c | 10 ++
drivers/irqchip/irq-gic-common.h | 2 +
drivers/irqchip/irq-gic-v3-its.c | 2 +-
drivers/irqchip/irq-gic-v3.c | 318 +++++++++++++++++++++++++++------
include/linux/interrupt.h | 1 +
include/linux/irqchip/arm-gic-common.h | 6 +
include/linux/irqchip/arm-gic.h | 5 -
31 files changed, 719 insertions(+), 118 deletions(-)
--
1.9.1
* REGRESSION: iommu fails to take address limit into account
From: Ard Biesheuvel @ 2018-05-25 9:48 UTC
To: linux-arm-kernel
Hello all,
I am looking into an issue where a platform device is wired to a
MMU-500, and for some reason (which is under investigation) the
platform device can not drive all address bits. I can work around this
by limiting the DMA mask to 40 bits in the driver. However, the IORT
table allows me to set the address limit as well, and so I was
expecting this to be taken into account by the SMMU driver.
When the iort/iommu layer sets up the DMA operations,
iommu_dma_init_domain() is entered with the expected values:
base == 0
size == 0x100_0000_0000
However, the iommu layer ends up generating IOVA addresses that have
bits [47:40] set (which is what the MMU-500 supports). Looking closer,
this is not surprising, given that the end_pfn variable that is
calculated in iommu_dma_init_domain() is no longer used after Zhen's
patch aa3ac9469c185 ("iommu/iova: Make dma_32bit_pfn implicit") was
applied.
So effectively, this is a regression, and I would like your help
figuring out how to go about fixing this.
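To make the expectation concrete, here is a small stand-alone sketch of the
check I would expect the limit to impose. It is not the iommu-dma code: the
sketch_* names and the 4K page shift are assumptions for illustration, and
it assumes size is non-zero.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4K pages */

/* Last page frame number covered by the [base, base + size) window. */
static uint64_t sketch_end_pfn(uint64_t base, uint64_t size)
{
	return (base + size - 1) >> SKETCH_PAGE_SHIFT;
}

/* True if iova falls inside the window the firmware table described. */
static bool sketch_iova_within_limit(uint64_t iova, uint64_t base,
				     uint64_t size)
{
	uint64_t pfn = iova >> SKETCH_PAGE_SHIFT;

	return pfn >= (base >> SKETCH_PAGE_SHIFT) &&
	       pfn <= sketch_end_pfn(base, size);
}
```

With base == 0 and size == 0x100_0000_0000, any IOVA that has one of bits
[47:40] set fails this check, which is the kind of address the allocator
currently hands out.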
Thanks,
Ard.
* [PATCH V2 1/4] clk: bulk: add of_clk_bulk_get()
From: A.s. Dong @ 2018-05-25 9:48 UTC
To: linux-arm-kernel
In-Reply-To: <152182399342.178046.13139513462419815903@swboyd.mtv.corp.google.com>
Hi Stephen,
> -----Original Message-----
> From: Stephen Boyd [mailto:sboyd at kernel.org]
> Sent: Saturday, March 24, 2018 12:53 AM
> To: A.s. Dong <aisheng.dong@nxp.com>; linux-clk at vger.kernel.org
> Cc: linux-kernel at vger.kernel.org; linux-arm-kernel at lists.infradead.org;
> mturquette at baylibre.com; hdegoede at redhat.com;
> b.zolnierkie at samsung.com; linux at armlinux.org.uk; linux-
> fbdev at vger.kernel.org; dl-linux-imx <linux-imx@nxp.com>; A.s. Dong
> <aisheng.dong@nxp.com>; Stephen Boyd <sboyd@codeaurora.org>; Russell
> King <linux@arm.linux.org.uk>
> Subject: Re: [PATCH V2 1/4] clk: bulk: add of_clk_bulk_get()
>
> Quoting Dong Aisheng (2018-03-20 20:19:48)
> > diff --git a/drivers/clk/clk-bulk.c b/drivers/clk/clk-bulk.c index
> > 4c10456..4b357b2 100644
> > --- a/drivers/clk/clk-bulk.c
> > +++ b/drivers/clk/clk-bulk.c
> > @@ -19,6 +19,38 @@
> > #include <linux/clk.h>
> > #include <linux/device.h>
> > #include <linux/export.h>
> > +#include <linux/of.h>
> > +
> > +#if defined(CONFIG_OF) && defined(CONFIG_COMMON_CLK)
>
> Do we need these defines? of_clk_get() is a stub function when these
> configs are false.
>
You're right. Will drop it.
> > +static int __must_check of_clk_bulk_get(struct device_node *np, int
> num_clks,
> > + struct clk_bulk_data *clks) {
> > + int ret;
> > + int i;
> > +
> > + for (i = 0; i < num_clks; i++)
> > + clks[i].clk = NULL;
> > +
> > + for (i = 0; i < num_clks; i++) {
> > + clks[i].clk = of_clk_get(np, i);
> > + if (IS_ERR(clks[i].clk)) {
> > + ret = PTR_ERR(clks[i].clk);
> > + pr_err("%pOF: Failed to get clk index: %d ret: %d\n",
> > + np, i, ret);
> > + clks[i].clk = NULL;
> > + goto err;
> > + }
> > + }
> > +
> > + return 0;
> > +
> > +err:
> > + clk_bulk_put(i, clks);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL(of_clk_bulk_get);
>
> It's static, so don't export it.
Got it.
Sorry for the mistake.
Will fix and send V3.
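For reference, the get-all-or-unwind pattern the helper follows can be
sketched in plain user-space C as below. This is not the clk API: the
sketch_* names, the handle encoding and the fail_at parameter are
hypothetical stand-ins used only to exercise the error path.

```c
/* A held resource; handle 0 means "not held". */
struct sketch_res {
	int handle;
};

/* Counts releases so the unwind path can be observed. */
static int sketch_released;

/* Stand-in for of_clk_get(): fails at index fail_at, else returns a handle. */
static int sketch_acquire(int index, int fail_at)
{
	return index == fail_at ? -1 : index + 1;
}

/* Stand-in for clk_put(): releasing an unheld resource is a no-op. */
static void sketch_release(struct sketch_res *res)
{
	if (res->handle) {
		res->handle = 0;
		sketch_released++;
	}
}

/* Acquire n resources, or release everything acquired so far on error. */
static int sketch_bulk_get(struct sketch_res *res, int n, int fail_at)
{
	int i, ret;

	for (i = 0; i < n; i++)
		res[i].handle = 0;

	for (i = 0; i < n; i++) {
		ret = sketch_acquire(i, fail_at);
		if (ret < 0)
			goto err;
		res[i].handle = ret;
	}

	return 0;

err:
	/* Only entries [0, i) were acquired; unwind them in reverse order. */
	while (--i >= 0)
		sketch_release(&res[i]);

	return ret;
}
```

The caller either ends up with all n resources held or with none, so it
never has to track a partial acquisition itself.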
Regards
Dong Aisheng
* [PATCH v2 13/40] vfio: Add support for Shared Virtual Addressing
From: Jean-Philippe Brucker @ 2018-05-25 9:47 UTC
To: linux-arm-kernel
In-Reply-To: <5B077765.30703@huawei.com>
On 25/05/18 03:39, Xu Zaibo wrote:
> Hi,
>
> On 2018/5/24 23:04, Jean-Philippe Brucker wrote:
>> On 24/05/18 13:35, Xu Zaibo wrote:
>>>> Right, sva_init() must be called once for any device that intends to use
>>>> bind(). For the second process though, group->sva_enabled will be true
>>>> so we won't call sva_init() again, only bind().
>>> Well, while I create mediated devices based on one parent device to support multiple
>>> processes(a new process will create a new 'vfio_group' for the corresponding mediated device,
>>> and 'sva_enabled' cannot work any more), in fact, *sva_init and *sva_shutdown are basically
>>> working on parent device, so, as a result, I just only need sva initiation and shutdown on the
>>> parent device only once. So I change the two as following:
>>>
>>> @@ -551,8 +565,18 @@ int iommu_sva_device_init(struct device *dev, unsigned long features,
>>> if (features & ~IOMMU_SVA_FEAT_IOPF)
>>> return -EINVAL;
>>>
>>> + /* If already exists, do nothing */
>>> + mutex_lock(&dev->iommu_param->lock);
>>> + if (dev->iommu_param->sva_param) {
>>> + mutex_unlock(&dev->iommu_param->lock);
>>> + return 0;
>>> + }
>>> + mutex_unlock(&dev->iommu_param->lock);
>>>
>>> if (features & IOMMU_SVA_FEAT_IOPF) {
>>> ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf,
>>>
>>>
>>> @@ -621,6 +646,14 @@ int iommu_sva_device_shutdown(struct device *dev)
>>> if (!domain)
>>> return -ENODEV;
>>>
>>> + /* If any other process is working on the device, shut down does nothing. */
>>> + mutex_lock(&dev->iommu_param->lock);
>>> + if (!list_empty(&dev->iommu_param->sva_param->mm_list)) {
>>> + mutex_unlock(&dev->iommu_param->lock);
>>> + return 0;
>>> + }
>>> + mutex_unlock(&dev->iommu_param->lock);
>> I don't think iommu-sva.c is the best place for this, it's probably
>> better to implement an intermediate layer (the mediating driver), that
>> calls iommu_sva_device_init() and iommu_sva_device_shutdown() once. Then
>> vfio-pci would still call these functions itself, but for mdev the
>> mediating driver keeps a refcount of groups, and calls device_shutdown()
>> only when freeing the last mdev.
>>
>> A device driver (non mdev in this example) expects to be able to free
>> all its resources after sva_device_shutdown() returns. Imagine the
>> mm_list isn't empty (mm_exit() is running late), and instead of waiting
>> in unbind_dev_all() below, we return 0 immediately. Then the calling
>> driver frees its resources, and the mm_exit callback along with private
>> data passed to bind() disappear. If a mm_exit() is still running in
>> parallel, then it will try to access freed data and corrupt memory. So
>> in this function if mm_list isn't empty, the only thing we can do is wait.
>>
> I still don't understand why we should 'unbind_dev_all', is it possible
> to do a 'unbind_dev_pasid'?
Not in sva_device_shutdown(), it needs to clean up everything. For
example you want to physically unplug the device, or assign it to a VM.
To prevent any leak sva_device_shutdown() needs to remove all bonds. In
theory there shouldn't be any, since either the driver did unbind_dev(),
or all processes exited. This is a safety net.
> Then we can do other things instead of waiting that user may not like. :)
They may not like it, but it's for their own good :) At the moment we're
waiting until:
* All exit_mm() callbacks for this device have finished. If we don't wait
then the caller will free the private data passed to bind and the
mm_exit() callback while they are still being used.
* All page requests targeting this device are dealt with. If we don't
wait then some requests, that are lingering in the IOMMU PRI queue,
may hit the next contexts bound to this device, possibly in a
different VM. It may not be too risky (though probably exploitable in
some way), but is incredibly messy.
All of this is bounded in time, and normally should be over pretty fast
unless the device driver's exit_mm() does something strange. If the
driver did the right thing, there shouldn't be any wait here (although
there may be one in unbind_dev() for the same reasons - prevent use
after free).
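As a rough illustration of the intermediate layer suggested above, the
mediating driver could keep a user count on the parent device along these
lines. This is only a sketch: the sketch_* names are hypothetical, the two
stubs stand in for iommu_sva_device_init() and iommu_sva_device_shutdown(),
and a real driver would need a lock around the count.

```c
/* Call counters, only here so the behaviour can be observed. */
static int sketch_init_calls;
static int sketch_shutdown_calls;

/* Stand-in for iommu_sva_device_init() on the parent device. */
static int sketch_sva_init(void)
{
	sketch_init_calls++;
	return 0;
}

/* Stand-in for iommu_sva_device_shutdown(); this is where we would wait. */
static int sketch_sva_shutdown(void)
{
	sketch_shutdown_calls++;
	return 0;
}

/* Per-parent state kept by the (hypothetical) mediating driver. */
struct sketch_parent {
	unsigned int users;	/* number of mdevs currently using SVA */
};

/* Called when an mdev starts using SVA: init only for the first user. */
static int sketch_mdev_get(struct sketch_parent *parent)
{
	int ret;

	if (parent->users == 0) {
		ret = sketch_sva_init();
		if (ret)
			return ret;
	}
	parent->users++;
	return 0;
}

/* Called when an mdev is freed: shut down only when the last one goes. */
static int sketch_mdev_put(struct sketch_parent *parent)
{
	if (parent->users == 0)
		return -1;	/* unbalanced put */
	if (--parent->users == 0)
		return sketch_sva_shutdown();
	return 0;
}
```

This way sva_device_shutdown() keeps its meaning (everything is cleaned up
once it returns) and the decision of when to call it stays with the driver
that knows how many mdevs exist.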
Thanks,
Jean
* [PATCH v10 07/18] arm64: fpsimd: Eliminate task->mm checks
From: Dave Martin @ 2018-05-25 9:45 UTC
To: linux-arm-kernel
In-Reply-To: <20180525090020.GA54529@C02W217FHV2R.local>
On Fri, May 25, 2018 at 11:00:20AM +0200, Christoffer Dall wrote:
> On Thu, May 24, 2018 at 03:37:15PM +0100, Dave Martin wrote:
> > On Thu, May 24, 2018 at 12:06:59PM +0200, Christoffer Dall wrote:
> > > On Thu, May 24, 2018 at 10:50:56AM +0100, Dave Martin wrote:
[...]
> > > I'm not sure what the reader is to make of that. Do you not mean the
> > > TIF_FOREIGN_FPSTATE is always true for kernel threads?
> >
> > Again, this is probably a red herring. TIF_FOREIGN_FPSTATE is always
> > true for kernel threads prior to the patch, except (randomly) for the
> > init task.
>
> That was really what my initial question was about, and what I thought
> the commit message should make abundantly clear, because that ties the
> message together with the code.
>
> >
> > This change is not really about TIF_FOREIGN_FPSTATE at all, rather
> > that there is nothing to justify handling kernel threads differently,
> > or even distinguishing kernel threads from user threads at all in this
> > code.
>
> Understood.
And my bad was that I hadn't gone to the effort of understanding my own
argument -- I'm glad to be called out on that.
> > Part of the confusion (and I had confused myself) comes from the fact
> > that TIF_FOREIGN_FPSTATE is really a per-cpu property and doesn't make
> > sense as a per-task property -- i.e., the flag is meaningless for
> > scheduled-out tasks and we must explicitly "repair" it when scheduling
> > a task in anyway. I think it's a thread flag primarily so that it's
> > convenient to check alongside other thread flags in the ret_to_user
> > work loop. This is somewhat less of a justification now that loop was
> > ported to C.
> >
> > > >
> > > > The context switch logic is already deliberately optimised to defer
> > > > reloads of the regs until ret_to_user (or sigreturn as a special
> > > > case), and save them only if they have been previously loaded.
> >
> > Does it help to insert the following here?
> >
> > "These paths are the only places where the wrong_task and wrong_cpu
> > conditions can be made false, by calling fpsimd_bind_task_to_cpu()."
> >
>
> yes it does.
>
> > > > Kernel threads by definition never reach these paths. As a result,
> > >
> > > I'm struggling with the "As a result," here. Is this because reloads of
> > > regs in ret_to_user (or sigreturn) are the only places that can make
> > > wrong_cpu or wrong_task be false?
> >
> > See the proposed clarification above. Is that sufficient?
> >
>
> yes.
>
> > > (I'm actually wanting to understand this, not just bikeshedding the
> > > commit message, as new corner cases keep coming up on this logic.)
> >
> > That's a good thing, and I would really like to explain it in a
> > concise manner. See [*] below for the "concise" explanation -- it may
> > demonstrate why I've been evasive...
> >
>
> I don't think you've been evasive at all, I just think we reason about
> this in slightly different ways, and I was trying to convince myself why
> this change is safe and summarize that concisely. I think we've
> accomplished both :)
OK, good. I reposted speculatively on this basis :)
The commit message is in better shape now, and I very much appreciate
you kicking the tyres on my reasoning!
[...]
> > As an aside, the big wall of text before the definition of struct
> > fpsimd_last_state_struct is looking out of date and could use an
> > update to cover at least some of what is explained in [*] better.
> >
> > I'm currently considering that out of scope for this series, but I will
> > keep it in mind to refresh it in the not too distant future.
> >
>
> Fine with me.
OK, good.
[...]
> > [*] The bigger picture:
> >
> > * Consider a relation (C,T) between cpus C and tasks T, such that
[...]
> > but by assuming that the code is already well-optimised, "unnecessary"
> > save/restore work will not be added. If this were not the case, it
> > could in any case be fixed independently.
> >
> > The observation of this _series_ is that we don't need to do very
> > much in order to be able to generalise the logic to accept KVM vcpus
> > in place of T.
> >
>
> Thanks for the explanation.
> -Christoffer
Was this reasonably understandable? If so I could use it as a basis for
improving the comment block in fpsimd.c, but I'd want to squash it down
to the essentials. It's pretty verbose as it stands.
(What I'd really like to do is take an axe to the logic so that we
end up with something that doesn't require anything like this amount
of explanation ... but that's more of an aspiration right now.)
Cheers
---Dave
* [PATCH v3 3/8] drm/mediatek: add connection from OD1 to RDMA1
From: Stu Hsieh @ 2018-05-25 9:36 UTC
To: linux-arm-kernel
In-Reply-To: <1527218776.27165.5.camel@mtksdaap41>
Hi, CK:
For this patch, I would move it after "add ddp component OD1",
and add the line "#define OD1_MOUT_EN_RDMA1 BIT(16)" from
the patch "Add support for mediatek SOC MT2712" to this patch.
Regards,
Stu
On Fri, 2018-05-25 at 11:26 +0800, CK Hu wrote:
> Hi, Stu:
>
> On Fri, 2018-05-25 at 10:34 +0800, stu.hsieh at mediatek.com wrote:
> > From: Stu Hsieh <stu.hsieh@mediatek.com>
> >
> > This patch add the connection from OD1 to RDMA1 for ext path.
> >
>
> Reviewed-by: CK Hu <ck.hu@mediatek.com>
>
> > Signed-off-by: Stu Hsieh <stu.hsieh@mediatek.com>
> > ---
> > drivers/gpu/drm/mediatek/mtk_drm_ddp.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp.c b/drivers/gpu/drm/mediatek/mtk_drm_ddp.c
> > index 47ffa240bd25..0f568dd853d8 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp.c
> > @@ -151,6 +151,9 @@ static unsigned int mtk_ddp_mout_en(enum mtk_ddp_comp_id cur,
> > } else if (cur == DDP_COMPONENT_GAMMA && next == DDP_COMPONENT_RDMA1) {
> > *addr = DISP_REG_CONFIG_DISP_GAMMA_MOUT_EN;
> > value = GAMMA_MOUT_EN_RDMA1;
> > + } else if (cur == DDP_COMPONENT_OD1 && next == DDP_COMPONENT_RDMA1) {
> > + *addr = DISP_REG_CONFIG_DISP_OD_MOUT_EN;
> > + value = OD1_MOUT_EN_RDMA1;
> > } else if (cur == DDP_COMPONENT_RDMA1 && next == DDP_COMPONENT_DPI0) {
> > *addr = DISP_REG_CONFIG_DISP_RDMA1_MOUT_EN;
> > value = RDMA1_MOUT_DPI0;
>
>
* [PATCH v3 7/8] drm/mediatek: Add support for mediatek SOC MT2712
From: Stu Hsieh @ 2018-05-25 9:33 UTC
To: linux-arm-kernel
In-Reply-To: <1527223874.27165.21.camel@mtksdaap41>
Hi, CK:
On Fri, 2018-05-25 at 12:51 +0800, CK Hu wrote:
> Hi, Stu:
>
> I've some inline comment.
>
> On Fri, 2018-05-25 at 10:34 +0800, stu.hsieh at mediatek.com wrote:
> > From: Stu Hsieh <stu.hsieh@mediatek.com>
> >
> > This patch add support for the Mediatek MT2712 DISP subsystem.
> > There are two OVL engine and three disp output in MT2712.
> >
> > Signed-off-by: Stu Hsieh <stu.hsieh@mediatek.com>
> > ---
> > drivers/gpu/drm/mediatek/mtk_drm_ddp.c | 46 +++++++++++++++++++++++++++--
> > drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 8 +++--
> > drivers/gpu/drm/mediatek/mtk_drm_drv.c | 42 ++++++++++++++++++++++++--
> > drivers/gpu/drm/mediatek/mtk_drm_drv.h | 7 +++--
> > 4 files changed, 94 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp.c b/drivers/gpu/drm/mediatek/mtk_drm_ddp.c
> > index 0f568dd853d8..676726249ae0 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp.c
> > @@ -61,6 +61,24 @@
> > #define MT8173_MUTEX_MOD_DISP_PWM1 24
> > #define MT8173_MUTEX_MOD_DISP_OD 25
> >
> > +#define MT2712_MUTEX_MOD_DISP_OVL0 11
> > +#define MT2712_MUTEX_MOD_DISP_OVL1 12
> > +#define MT2712_MUTEX_MOD_DISP_RDMA0 13
> > +#define MT2712_MUTEX_MOD_DISP_RDMA1 14
> > +#define MT2712_MUTEX_MOD_DISP_RDMA2 15
> > +#define MT2712_MUTEX_MOD_DISP_WDMA0 16
> > +#define MT2712_MUTEX_MOD_DISP_WDMA1 17
> > +#define MT2712_MUTEX_MOD_DISP_COLOR0 18
> > +#define MT2712_MUTEX_MOD_DISP_COLOR1 19
> > +#define MT2712_MUTEX_MOD_DISP_AAL0 20
> > +#define MT2712_MUTEX_MOD_DISP_UFOE 22
> > +#define MT2712_MUTEX_MOD_DISP_PWM0 23
> > +#define MT2712_MUTEX_MOD_DISP_PWM1 24
> > +#define MT2712_MUTEX_MOD_DISP_PWM2 10
> > +#define MT2712_MUTEX_MOD_DISP_OD0 25
> > +#define MT2712_MUTEX_MOD2_DISP_AAL1 33
> > +#define MT2712_MUTEX_MOD2_DISP_OD1 34
>
> I would like this to be in the order by index.
OK
>
> > +
> > #define MT2701_MUTEX_MOD_DISP_OVL 3
> > #define MT2701_MUTEX_MOD_DISP_WDMA 6
> > #define MT2701_MUTEX_MOD_DISP_COLOR 7
> > @@ -75,6 +93,7 @@
> >
> > #define OVL0_MOUT_EN_COLOR0 0x1
> > #define OD_MOUT_EN_RDMA0 0x1
> > +#define OD1_MOUT_EN_RDMA1 BIT(16)
> > #define UFOE_MOUT_EN_DSI0 0x1
> > #define COLOR0_SEL_IN_OVL0 0x1
> > #define OVL1_MOUT_EN_COLOR1 0x1
> > @@ -109,12 +128,32 @@ static const unsigned int mt2701_mutex_mod[DDP_COMPONENT_ID_MAX] = {
> > [DDP_COMPONENT_WDMA0] = MT2701_MUTEX_MOD_DISP_WDMA,
> > };
> >
> > +static const unsigned int mt2712_mutex_mod[DDP_COMPONENT_ID_MAX] = {
> > + [DDP_COMPONENT_AAL0] = MT2712_MUTEX_MOD_DISP_AAL0,
> > + [DDP_COMPONENT_AAL1] = MT2712_MUTEX_MOD2_DISP_AAL1,
> > + [DDP_COMPONENT_COLOR0] = MT2712_MUTEX_MOD_DISP_COLOR0,
> > + [DDP_COMPONENT_COLOR1] = MT2712_MUTEX_MOD_DISP_COLOR1,
> > + [DDP_COMPONENT_OD0] = MT2712_MUTEX_MOD_DISP_OD0,
> > + [DDP_COMPONENT_OD1] = MT2712_MUTEX_MOD2_DISP_OD1,
> > + [DDP_COMPONENT_OVL0] = MT2712_MUTEX_MOD_DISP_OVL0,
> > + [DDP_COMPONENT_OVL1] = MT2712_MUTEX_MOD_DISP_OVL1,
> > + [DDP_COMPONENT_PWM0] = MT2712_MUTEX_MOD_DISP_PWM0,
> > + [DDP_COMPONENT_PWM1] = MT2712_MUTEX_MOD_DISP_PWM1,
> > + [DDP_COMPONENT_PWM2] = MT2712_MUTEX_MOD_DISP_PWM2,
> > + [DDP_COMPONENT_RDMA0] = MT2712_MUTEX_MOD_DISP_RDMA0,
> > + [DDP_COMPONENT_RDMA1] = MT2712_MUTEX_MOD_DISP_RDMA1,
> > + [DDP_COMPONENT_RDMA2] = MT2712_MUTEX_MOD_DISP_RDMA2,
> > + [DDP_COMPONENT_UFOE] = MT2712_MUTEX_MOD_DISP_UFOE,
> > + [DDP_COMPONENT_WDMA0] = MT2712_MUTEX_MOD_DISP_WDMA0,
> > + [DDP_COMPONENT_WDMA1] = MT2712_MUTEX_MOD_DISP_WDMA1,
> > +};
> > +
> > static const unsigned int mt8173_mutex_mod[DDP_COMPONENT_ID_MAX] = {
> > - [DDP_COMPONENT_AAL] = MT8173_MUTEX_MOD_DISP_AAL,
> > + [DDP_COMPONENT_AAL0] = MT8173_MUTEX_MOD_DISP_AAL,
>
> Move this to the patch 'add ddp component AAL1'.
OK
>
> > [DDP_COMPONENT_COLOR0] = MT8173_MUTEX_MOD_DISP_COLOR0,
> > [DDP_COMPONENT_COLOR1] = MT8173_MUTEX_MOD_DISP_COLOR1,
> > [DDP_COMPONENT_GAMMA] = MT8173_MUTEX_MOD_DISP_GAMMA,
> > - [DDP_COMPONENT_OD] = MT8173_MUTEX_MOD_DISP_OD,
> > + [DDP_COMPONENT_OD0] = MT8173_MUTEX_MOD_DISP_OD,
>
> Move this to the patch 'add ddp component OD1'.
OK
>
> > [DDP_COMPONENT_OVL0] = MT8173_MUTEX_MOD_DISP_OVL0,
> > [DDP_COMPONENT_OVL1] = MT8173_MUTEX_MOD_DISP_OVL1,
> > [DDP_COMPONENT_PWM0] = MT8173_MUTEX_MOD_DISP_PWM0,
> > @@ -139,7 +178,7 @@ static unsigned int mtk_ddp_mout_en(enum mtk_ddp_comp_id cur,
> > } else if (cur == DDP_COMPONENT_OVL0 && next == DDP_COMPONENT_RDMA0) {
> > *addr = DISP_REG_CONFIG_DISP_OVL_MOUT_EN;
> > value = OVL_MOUT_EN_RDMA;
> > - } else if (cur == DDP_COMPONENT_OD && next == DDP_COMPONENT_RDMA0) {
> > + } else if (cur == DDP_COMPONENT_OD0 && next == DDP_COMPONENT_RDMA0) {
>
> Move this to the patch 'add ddp component OD1'.
OK
>
> > *addr = DISP_REG_CONFIG_DISP_OD_MOUT_EN;
> > value = OD_MOUT_EN_RDMA0;
> > } else if (cur == DDP_COMPONENT_UFOE && next == DDP_COMPONENT_DSI0) {
> > @@ -429,6 +468,7 @@ static int mtk_ddp_remove(struct platform_device *pdev)
> >
> > static const struct of_device_id ddp_driver_dt_match[] = {
> > { .compatible = "mediatek,mt2701-disp-mutex", .data = mt2701_mutex_mod},
> > + { .compatible = "mediatek,mt2712-disp-mutex", .data = mt2712_mutex_mod},
> > { .compatible = "mediatek,mt8173-disp-mutex", .data = mt8173_mutex_mod},
> > {},
> > };
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > index 4672317e3ad1..86e8c9e5df41 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
> > @@ -218,7 +218,8 @@ struct mtk_ddp_comp_match {
> > };
> >
> > static const struct mtk_ddp_comp_match mtk_ddp_matches[DDP_COMPONENT_ID_MAX] = {
> > - [DDP_COMPONENT_AAL] = { MTK_DISP_AAL, 0, &ddp_aal },
> > + [DDP_COMPONENT_AAL0] = { MTK_DISP_AAL, 0, &ddp_aal },
> > + [DDP_COMPONENT_AAL1] = { MTK_DISP_AAL, 1, &ddp_aal },
>
> Move this to the patch 'add ddp component AAL1'.
ok
>
> > [DDP_COMPONENT_BLS] = { MTK_DISP_BLS, 0, NULL },
> > [DDP_COMPONENT_COLOR0] = { MTK_DISP_COLOR, 0, NULL },
> > [DDP_COMPONENT_COLOR1] = { MTK_DISP_COLOR, 1, NULL },
> > @@ -226,10 +227,13 @@ static const struct mtk_ddp_comp_match mtk_ddp_matches[DDP_COMPONENT_ID_MAX] = {
> > [DDP_COMPONENT_DSI0] = { MTK_DSI, 0, NULL },
> > [DDP_COMPONENT_DSI1] = { MTK_DSI, 1, NULL },
> > [DDP_COMPONENT_GAMMA] = { MTK_DISP_GAMMA, 0, &ddp_gamma },
> > - [DDP_COMPONENT_OD] = { MTK_DISP_OD, 0, &ddp_od },
> > + [DDP_COMPONENT_OD0] = { MTK_DISP_OD, 0, &ddp_od },
> > + [DDP_COMPONENT_OD1] = { MTK_DISP_OD, 1, &ddp_od },
>
> Move this to the patch 'add ddp component OD1'
ok
>
> > [DDP_COMPONENT_OVL0] = { MTK_DISP_OVL, 0, NULL },
> > [DDP_COMPONENT_OVL1] = { MTK_DISP_OVL, 1, NULL },
> > [DDP_COMPONENT_PWM0] = { MTK_DISP_PWM, 0, NULL },
> > + [DDP_COMPONENT_PWM1] = { MTK_DISP_PWM, 1, NULL },
>
> Move this to the patch 'add ddp component PWM1'
OK, I will create a new patch for 'add ddp component PWM1'.
>
> > + [DDP_COMPONENT_PWM2] = { MTK_DISP_PWM, 2, NULL },
>
> Move this to the patch 'add ddp component PWM2'
ok
>
> > [DDP_COMPONENT_RDMA0] = { MTK_DISP_RDMA, 0, NULL },
> > [DDP_COMPONENT_RDMA1] = { MTK_DISP_RDMA, 1, NULL },
> > [DDP_COMPONENT_RDMA2] = { MTK_DISP_RDMA, 2, NULL },
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> > index a2ca90fc403c..b32c4cc8d051 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> > @@ -146,11 +146,37 @@ static const enum mtk_ddp_comp_id mt2701_mtk_ddp_ext[] = {
> > DDP_COMPONENT_DPI0,
> > };
> >
> > +static const enum mtk_ddp_comp_id mt2712_mtk_ddp_main[] = {
> > + DDP_COMPONENT_OVL0,
> > + DDP_COMPONENT_COLOR0,
> > + DDP_COMPONENT_AAL0,
> > + DDP_COMPONENT_OD0,
> > + DDP_COMPONENT_RDMA0,
> > + DDP_COMPONENT_DPI0,
> > + DDP_COMPONENT_PWM0,
> > +};
> > +
> > +static const enum mtk_ddp_comp_id mt2712_mtk_ddp_ext[] = {
> > + DDP_COMPONENT_OVL1,
> > + DDP_COMPONENT_COLOR1,
> > + DDP_COMPONENT_AAL1,
> > + DDP_COMPONENT_OD1,
> > + DDP_COMPONENT_RDMA1,
> > + DDP_COMPONENT_DPI1,
> > + DDP_COMPONENT_PWM1,
> > +};
> > +
> > +static const enum mtk_ddp_comp_id mt2712_mtk_ddp_third[] = {
> > + DDP_COMPONENT_RDMA2,
> > + DDP_COMPONENT_DSI2,
> > + DDP_COMPONENT_PWM2,
> > +};
> > +
> > static const enum mtk_ddp_comp_id mt8173_mtk_ddp_main[] = {
> > DDP_COMPONENT_OVL0,
> > DDP_COMPONENT_COLOR0,
> > - DDP_COMPONENT_AAL,
> > - DDP_COMPONENT_OD,
> > + DDP_COMPONENT_AAL0,
>
> Move this to the patch 'add ddp component AAL1'.
ok
>
> > + DDP_COMPONENT_OD0,
>
> Move this to the patch 'add ddp component OD1'
ok
>
> > DDP_COMPONENT_RDMA0,
> > DDP_COMPONENT_UFOE,
> > DDP_COMPONENT_DSI0,
> > @@ -173,6 +199,15 @@ static const struct mtk_mmsys_driver_data mt2701_mmsys_driver_data = {
> > .shadow_register = true,
> > };
> >
> > +static const struct mtk_mmsys_driver_data mt2712_mmsys_driver_data = {
> > + .main_path = mt2712_mtk_ddp_main,
> > + .main_len = ARRAY_SIZE(mt2712_mtk_ddp_main),
> > + .ext_path = mt2712_mtk_ddp_ext,
> > + .ext_len = ARRAY_SIZE(mt2712_mtk_ddp_ext),
> > + .third_path = mt2712_mtk_ddp_third,
> > + .third_len = ARRAY_SIZE(mt2712_mtk_ddp_third),
> > +};
> > +
> > static const struct mtk_mmsys_driver_data mt8173_mmsys_driver_data = {
> > .main_path = mt8173_mtk_ddp_main,
> > .main_len = ARRAY_SIZE(mt8173_mtk_ddp_main),
> > @@ -374,6 +409,7 @@ static const struct of_device_id mtk_ddp_comp_dt_ids[] = {
> > { .compatible = "mediatek,mt8173-dsi", .data = (void *)MTK_DSI },
> > { .compatible = "mediatek,mt8173-dpi", .data = (void *)MTK_DPI },
> > { .compatible = "mediatek,mt2701-disp-mutex", .data = (void *)MTK_DISP_MUTEX },
> > + { .compatible = "mediatek,mt2712-disp-mutex", .data = (void *)MTK_DISP_MUTEX },
> > { .compatible = "mediatek,mt8173-disp-mutex", .data = (void *)MTK_DISP_MUTEX },
> > { .compatible = "mediatek,mt2701-disp-pwm", .data = (void *)MTK_DISP_BLS },
> > { .compatible = "mediatek,mt8173-disp-pwm", .data = (void *)MTK_DISP_PWM },
> > @@ -552,6 +588,8 @@ static SIMPLE_DEV_PM_OPS(mtk_drm_pm_ops, mtk_drm_sys_suspend,
> > static const struct of_device_id mtk_drm_of_ids[] = {
> > { .compatible = "mediatek,mt2701-mmsys",
> > .data = &mt2701_mmsys_driver_data},
> > + { .compatible = "mediatek,mt2712-mmsys",
> > + .data = &mt2712_mmsys_driver_data},
> > { .compatible = "mediatek,mt8173-mmsys",
> > .data = &mt8173_mmsys_driver_data},
> > { }
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.h b/drivers/gpu/drm/mediatek/mtk_drm_drv.h
> > index c3378c452c0a..e821342bc2d3 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.h
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.h
> > @@ -17,8 +17,8 @@
> > #include <linux/io.h>
> > #include "mtk_drm_ddp_comp.h"
> >
> > -#define MAX_CRTC 2
> > -#define MAX_CONNECTOR 2
> > +#define MAX_CRTC 3
> > +#define MAX_CONNECTOR 3
> >
> > struct device;
> > struct device_node;
> > @@ -33,6 +33,9 @@ struct mtk_mmsys_driver_data {
> > unsigned int main_len;
> > const enum mtk_ddp_comp_id *ext_path;
> > unsigned int ext_len;
> > + enum mtk_ddp_comp_id *third_path;
> > + unsigned int third_len;
> > +
>
> Move this to the patch 'add third ddp path'.
ok
>
> > bool shadow_register;
> > };
> >
>
> Regards,
> CK
>
>
^ permalink raw reply
* [PATCH v6 3/9] docs: Add Generic Counter interface documentation
From: Fabrice Gasnier @ 2018-05-25 9:26 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180522180805.2b61f0ed@archlinux>
On 05/22/2018 07:08 PM, Jonathan Cameron wrote:
>>>> +* Quadrature x2 Rising:
>>>> + Rising edges on either quadrature pair signals updates the respective
>>>> + count. Quadrature encoding determines the direction.
>>> This one I've never met. Really? There are devices who do this form
>>> of crazy? It gives really uneven counting and I'm failing to see when
>>> it would ever make sense... References for these odd corner cases
>>> would be good.
>>>
>>>
>>> __|---|____|-----|____
>>> ____|----|____|-----|____
>>>
>>> 001122222223334444444
>> That's the same reaction I had when I discovered this -- in fact the
>> STM32 LP Timer is the first time I've come across such a quadrature
>> mode. I'm not sure of the use case for this mode, because positioning
>> wouldn't be precise as you've pointed out. Perhaps Fabrice or Benjamin
>> can probe the ST guys responsible for this design choice to figure out
>> the rationale.
> Hmm. My inclination would be to not support it unless someone can come up
> with a meaningful use. We are adding ABI (be it not much) for a case
> that to us makes no sense.
Hi Jonathan, William,
Sorry for the late reply. Following your advice, we can probably drop
this for now. I think the simple counter and quadrature x4 modes will be
the ones mostly used for now. As you pointed out, there's not much ABI
for the x2 rising/falling cases, so it will not be a big deal to add them
later if needed. I can help to update (remove & test) this in the LP-Timer
counter driver if you wish.
Please let me know,
Thanks,
Fabrice
>
> Looks rather like the sort of thing that is a side effect of the
> implementation rather than deliberate.
>
>> I'm leaving in these modes for now, as they do exist in the STM32 LP
>> Timer, but it does make me curious what the intentions for them were
>> (perhaps use cases outside of traditional quadrature encoder
>> positioning).
>>
> Sure if there is a usecase then fair enough.
>
> Jonathan
>
>
^ permalink raw reply
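For reference, the uneven counting discussed in this thread can be reproduced with a small userspace model. This is only a sketch under stated assumptions: struct quad_sample and quad_x2_rising_count() are hypothetical names, not part of the Generic Counter interface or the STM32 LP Timer driver, and the direction rule (signal A leading signal B counts up) is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sample of the two quadrature signals at one instant. */
struct quad_sample {
	bool a;
	bool b;
};

/*
 * Model of the "Quadrature x2 Rising" mode: only rising edges on A or B
 * update the count, and the level of the other signal gives the direction.
 * Assumption: A leading B means counting up.
 */
long quad_x2_rising_count(const struct quad_sample *s, size_t n)
{
	long count = 0;
	size_t i;

	for (i = 1; i < n; i++) {
		bool a_rise = !s[i - 1].a && s[i].a;
		bool b_rise = !s[i - 1].b && s[i].b;

		if (a_rise)
			count += s[i].b ? -1 : 1;
		if (b_rise)
			count += s[i].a ? 1 : -1;
		/* Falling edges are ignored, hence the uneven spacing. */
	}

	return count;
}
```

Feeding it one full forward quadrature cycle yields two counts that sit close together (A rising, then B rising), followed by no updates for the falling half of the cycle, which matches the count line in the diagram quoted above.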
* [PATCH v3 4/8] drm/mediatek: add ddp component AAL1
From: Stu Hsieh @ 2018-05-25 9:25 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1527222238.27165.12.camel@mtksdaap41>
Hi, CK:
On Fri, 2018-05-25 at 12:23 +0800, CK Hu wrote:
> Hi, Stu:
>
> On Fri, 2018-05-25 at 10:34 +0800, stu.hsieh@mediatek.com wrote:
> > From: Stu Hsieh <stu.hsieh@mediatek.com>
> >
> > This patch adds component AAL1 and
> > renames AAL to AAL0
> >
> > Signed-off-by: Stu Hsieh <stu.hsieh@mediatek.com>
> > ---
> > drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
> > index 0828cf8bf85c..eee3c0cc2632 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
> > @@ -41,7 +41,8 @@ enum mtk_ddp_comp_type {
> > };
> >
> > enum mtk_ddp_comp_id {
> > - DDP_COMPONENT_AAL,
> > + DDP_COMPONENT_AAL0,
> > + DDP_COMPONENT_AAL1,
>
> Make sure the build succeeds after applying each patch of the series. I
> think applying the series up to this patch would cause a compile error,
> because some related modifications are in the patch 'Add support for
> mediatek SOC MT2712'. So move those modifications from that patch to this
> patch.
>
> Regards,
> CK
I will move the modifications related to each component from the patch
'Add support for mediatek SOC MT2712' into the associated component patch.
Regards,
Stu
>
> > DDP_COMPONENT_BLS,
> > DDP_COMPONENT_COLOR0,
> > DDP_COMPONENT_COLOR1,
>
>
^ permalink raw reply
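A minimal sketch of why the enum rename and its users have to land in the same patch. The names below (comp_id, comp_match, COMP_*) are hypothetical stand-ins for the driver's component ID enum and match table, not the driver's code: a designated initializer that still refers to the old enumerator stops compiling as soon as the enumerator is renamed.

```c
/* Hypothetical stand-in for the component ID enum after the rename. */
enum comp_id {
	COMP_AAL0,	/* renamed from COMP_AAL */
	COMP_AAL1,	/* newly added */
	COMP_ID_MAX,
};

struct comp_match {
	int type;
	int alias_id;
};

/*
 * Every user of the old name has to be updated in the same patch:
 * an entry still written as [COMP_AAL] would no longer compile.
 */
static const struct comp_match comp_matches[COMP_ID_MAX] = {
	[COMP_AAL0] = { 1, 0 },
	[COMP_AAL1] = { 1, 1 },
};

int comp_alias(enum comp_id id)
{
	return comp_matches[id].alias_id;
}
```

Keeping the rename and the table update together means each intermediate commit of the series still builds, which is what keeps the series bisectable.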
* [PATCH] usb: gadget: composite: fix delayed_status race condition when set_interface
From: Chunfeng Yun @ 2018-05-25 9:24 UTC (permalink / raw)
To: linux-arm-kernel
The race shows up when debug logging is enabled: set_alt() returns
USB_GADGET_DELAYED_STATUS, and usb_composite_setup_continue() is called
before the count of @delayed_status has been increased. Fix it by
holding the spinlock @cdev->lock around set_alt() and the increment.
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Tested-by: Jay Hsu <shih-chieh.hsu@mediatek.com>
---
drivers/usb/gadget/composite.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c
index f242c2b..d2fa071 100644
--- a/drivers/usb/gadget/composite.c
+++ b/drivers/usb/gadget/composite.c
@@ -1719,6 +1719,8 @@ static int fill_ext_prop(struct usb_configuration *c, int interface, u8 *buf)
*/
if (w_value && !f->get_alt)
break;
+
+ spin_lock(&cdev->lock);
value = f->set_alt(f, w_index, w_value);
if (value == USB_GADGET_DELAYED_STATUS) {
DBG(cdev,
@@ -1728,6 +1730,7 @@ static int fill_ext_prop(struct usb_configuration *c, int interface, u8 *buf)
DBG(cdev, "delayed_status count %d\n",
cdev->delayed_status);
}
+ spin_unlock(&cdev->lock);
break;
case USB_REQ_GET_INTERFACE:
if (ctrl->bRequestType != (USB_DIR_IN|USB_RECIP_INTERFACE))
--
1.9.1
^ permalink raw reply related
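The ordering problem described in the commit message can be illustrated with a single-threaded userspace model. This is a sketch, not the gadget code: struct model_cdev and both functions are hypothetical, and the behaviour of the continue step (flag an unexpected call when the count is zero, otherwise decrement and queue the status stage when it reaches zero) is an assumption about how usb_composite_setup_continue() treats @delayed_status.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the composite device state. */
struct model_cdev {
	int delayed_status;	/* pending delayed status stages */
	bool status_queued;	/* status stage was sent to the host */
	bool unexpected_call;	/* continue ran while the count was 0 */
};

/* Models the setup path after set_alt() returned DELAYED_STATUS. */
void model_setup_account(struct model_cdev *cdev)
{
	cdev->delayed_status++;
}

/* Models the function driver asking to complete the status stage. */
void model_setup_continue(struct model_cdev *cdev)
{
	if (cdev->delayed_status == 0)
		cdev->unexpected_call = true;
	else if (--cdev->delayed_status == 0)
		cdev->status_queued = true;
}
```

Calling model_setup_account() before model_setup_continue() queues the status stage. Calling them in the opposite order leaves the count at 1 with no status stage queued, which is the window the patch closes by taking @cdev->lock before set_alt() and releasing it after the increment.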
* [PATCH v3 1/8] drm/mediatek: update dt-bindings for mt2712
From: Stu Hsieh @ 2018-05-25 9:20 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1527218292.27165.2.camel@mtksdaap41>
Hi, CK:
On Fri, 2018-05-25 at 11:18 +0800, CK Hu wrote:
> Hi, Stu:
>
> On Fri, 2018-05-25 at 10:34 +0800, stu.hsieh@mediatek.com wrote:
> > From: Stu Hsieh <stu.hsieh@mediatek.com>
> >
> > Update device tree binding documentation for the display subsystem for
> > Mediatek MT2712 SoCs.
> >
>
> I've acked v2 of this patch and v3 is the same as v2, so you should keep
> my ack in commit message.
>
> Regards,
> CK
OK
Regards,
Stu
>
> > Signed-off-by: Stu Hsieh <stu.hsieh@mediatek.com>
> > ---
> > Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
> > index 383183a89164..8469de510001 100644
> > --- a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
> > +++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
> > @@ -40,7 +40,7 @@ Required properties (all function blocks):
> > "mediatek,<chip>-dpi" - DPI controller, see mediatek,dpi.txt
> > "mediatek,<chip>-disp-mutex" - display mutex
> > "mediatek,<chip>-disp-od" - overdrive
> > - the supported chips are mt2701 and mt8173.
> > + the supported chips are mt2701, mt2712 and mt8173.
> > - reg: Physical base address and length of the function block register space
> > - interrupts: The interrupt signal from the function block (required, except for
> > merge and split function blocks).
>
>
^ permalink raw reply
* [PATCH v10 5/5] arm64: defconfig: enable f2fs and squashfs
From: Li Wei @ 2018-05-25 9:17 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180525091712.37227-1-liwei213@huawei.com>
Partitions in HiKey960 are formatted as f2fs and squashfs.
f2fs is for userdata; squashfs is for system. Both partitions are required
by Android.
Signed-off-by: Li Wei <liwei213@huawei.com>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Guodong Xu <guodong.xu@linaro.org>
---
arch/arm64/configs/defconfig | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index d42b1ecaf490..e8036cddb272 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -613,6 +613,7 @@ CONFIG_ACPI_APEI_MEMORY_FAILURE=y
CONFIG_ACPI_APEI_EINJ=y
CONFIG_EXT2_FS=y
CONFIG_EXT3_FS=y
+CONFIG_F2FS_FS=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_BTRFS_FS=m
CONFIG_BTRFS_FS_POSIX_ACL=y
@@ -628,6 +629,13 @@ CONFIG_HUGETLBFS=y
CONFIG_CONFIGFS_FS=y
CONFIG_EFIVAR_FS=y
CONFIG_SQUASHFS=y
+CONFIG_SQUASHFS_FILE_DIRECT=y
+CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU=y
+CONFIG_SQUASHFS_XATTR=y
+CONFIG_SQUASHFS_LZ4=y
+CONFIG_SQUASHFS_LZO=y
+CONFIG_SQUASHFS_XZ=y
+CONFIG_SQUASHFS_4K_DEVBLK_SIZE=y
CONFIG_NFS_FS=y
CONFIG_NFS_V4=y
CONFIG_NFS_V4_1=y
--
2.15.0
^ permalink raw reply related
* [PATCH v10 4/5] arm64: defconfig: enable configs for Hisilicon ufs
From: Li Wei @ 2018-05-25 9:17 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180525091712.37227-1-liwei213@huawei.com>
This enables the configs for the Hisilicon Hixxxx UFS driver.
Signed-off-by: Li Wei <liwei213@huawei.com>
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Guodong Xu <guodong.xu@linaro.org>
---
arch/arm64/configs/defconfig | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index ecf613761e78..d42b1ecaf490 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -187,6 +187,9 @@ CONFIG_BLK_DEV_SD=y
CONFIG_SCSI_SAS_ATA=y
CONFIG_SCSI_HISI_SAS=y
CONFIG_SCSI_HISI_SAS_PCI=y
+CONFIG_SCSI_UFSHCD=y
+CONFIG_SCSI_UFSHCD_PLATFORM=y
+CONFIG_SCSI_UFS_HISI=y
CONFIG_ATA=y
CONFIG_SATA_AHCI=y
CONFIG_SATA_AHCI_PLATFORM=y
--
2.15.0
^ permalink raw reply related
* [PATCH v10 3/5] arm64: dts: add ufs dts node
From: Li Wei @ 2018-05-25 9:17 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180525091712.37227-1-liwei213@huawei.com>
Add the ufs dts node for Hisilicon.
Signed-off-by: Li Wei <liwei213@huawei.com>
---
arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
index ec3eb8e33a3a..04438621c6c3 100644
--- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
+++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi
@@ -892,6 +892,24 @@
reset-gpios = <&gpio11 1 0 >;
};
+ /* UFS */
+ ufs: ufs@ff3b0000 {
+ compatible = "hisilicon,hi3660-ufs", "jedec,ufs-1.1";
+ /* 0: HCI standard */
+ /* 1: UFS SYS CTRL */
+ reg = <0x0 0xff3b0000 0x0 0x1000>,
+ <0x0 0xff3b1000 0x0 0x1000>;
+ interrupt-parent = <&gic>;
+ interrupts = <GIC_SPI 278 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&crg_ctrl HI3660_CLK_GATE_UFSIO_REF>,
+ <&crg_ctrl HI3660_CLK_GATE_UFSPHY_CFG>;
+ clock-names = "ref_clk", "phy_clk";
+ freq-table-hz = <0 0>, <0 0>;
+ /* offset: 0x84; bit: 12 */
+ resets = <&crg_rst 0x84 12>;
+ reset-names = "rst";
+ };
+
/* SD */
dwmmc1: dwmmc1@ff37f000 {
#address-cells = <1>;
--
2.15.0
^ permalink raw reply related
* [PATCH v10 2/5] dt-bindings: scsi: ufs: add document for hisi-ufs
From: Li Wei @ 2018-05-25 9:17 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180525091712.37227-1-liwei213@huawei.com>
Add ufs node documentation for Hisilicon.
Signed-off-by: Li Wei <liwei213@huawei.com>
---
Documentation/devicetree/bindings/ufs/ufs-hisi.txt | 41 ++++++++++++++++++++++
.../devicetree/bindings/ufs/ufshcd-pltfrm.txt | 10 ++++--
2 files changed, 48 insertions(+), 3 deletions(-)
create mode 100644 Documentation/devicetree/bindings/ufs/ufs-hisi.txt
diff --git a/Documentation/devicetree/bindings/ufs/ufs-hisi.txt b/Documentation/devicetree/bindings/ufs/ufs-hisi.txt
new file mode 100644
index 000000000000..a48c44817367
--- /dev/null
+++ b/Documentation/devicetree/bindings/ufs/ufs-hisi.txt
@@ -0,0 +1,41 @@
+* Hisilicon Universal Flash Storage (UFS) Host Controller
+
+UFS nodes are defined to describe on-chip UFS hardware macro.
+Each UFS Host Controller should have its own node.
+
+Required properties:
+- compatible : compatible list, contains one of the following -
+ "hisilicon,hi3660-ufs", "jedec,ufs-1.1" for hisi ufs
+ host controller present on Hi36xx chipset.
+- reg : should contain UFS register address space & UFS SYS CTRL register address,
+- interrupt-parent : interrupt device
+- interrupts : interrupt number
+- clocks : List of phandle and clock specifier pairs
+- clock-names : List of clock input name strings sorted in the same
+ order as the clocks property. "ref_clk", "phy_clk" is optional
+- freq-table-hz : Array of <min max> operating frequencies stored in the same
+ order as the clocks property. If this property is not
+ defined or a value in the array is "0" then it is assumed
+ that the frequency is set by the parent clock or a
+ fixed rate clock source.
+- resets : describe reset node register
+- reset-names : reset node register, the "rst" corresponds to reset the whole UFS IP.
+
+Example:
+
+ ufs: ufs@ff3b0000 {
+ compatible = "hisilicon,hi3660-ufs", "jedec,ufs-1.1";
+ /* 0: HCI standard */
+ /* 1: UFS SYS CTRL */
+ reg = <0x0 0xff3b0000 0x0 0x1000>,
+ <0x0 0xff3b1000 0x0 0x1000>;
+ interrupt-parent = <&gic>;
+ interrupts = <GIC_SPI 278 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&crg_ctrl HI3660_CLK_GATE_UFSIO_REF>,
+ <&crg_ctrl HI3660_CLK_GATE_UFSPHY_CFG>;
+ clock-names = "ref_clk", "phy_clk";
+ freq-table-hz = <0 0>, <0 0>;
+ /* offset: 0x84; bit: 12 */
+ resets = <&crg_rst 0x84 12>;
+ reset-names = "rst";
+ };
diff --git a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
index c39dfef76a18..2df00524bd21 100644
--- a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
+++ b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
@@ -41,6 +41,8 @@ Optional properties:
-lanes-per-direction : number of lanes available per direction - either 1 or 2.
Note that it is assume same number of lanes is used both
directions at once. If not specified, default is 2 lanes per direction.
+- resets : reset node register
+- reset-names : describe reset node register, the "rst" corresponds to reset the whole UFS IP.
Note: If above properties are not defined it can be assumed that the supply
regulators or clocks are always on.
@@ -61,9 +63,11 @@ Example:
vccq-max-microamp = 200000;
vccq2-max-microamp = 200000;
- clocks = <&core 0>, <&ref 0>, <&iface 0>;
- clock-names = "core_clk", "ref_clk", "iface_clk";
- freq-table-hz = <100000000 200000000>, <0 0>, <0 0>;
+ clocks = <&core 0>, <&ref 0>, <&phy 0>, <&iface 0>;
+ clock-names = "core_clk", "ref_clk", "phy_clk", "iface_clk";
+ freq-table-hz = <100000000 200000000>, <0 0>, <0 0>, <0 0>;
+ resets = <&reset 0 1>;
+ reset-names = "rst";
phys = <&ufsphy1>;
phy-names = "ufsphy";
};
--
2.15.0
^ permalink raw reply related
* [PATCH v10 1/5] scsi: ufs: add Hisilicon ufs driver code
From: Li Wei @ 2018-05-25 9:17 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180525091712.37227-1-liwei213@huawei.com>
Add Hisilicon ufs driver code.
Signed-off-by: Li Wei <liwei213@huawei.com>
Signed-off-by: Geng Jianfeng <gengjianfeng@hisilicon.com>
Signed-off-by: Zang Leigang <zangleigang@hisilicon.com>
Signed-off-by: Yu Jianfeng <steven.yujianfeng@hisilicon.com>
---
drivers/scsi/ufs/Kconfig | 9 +
drivers/scsi/ufs/Makefile | 1 +
drivers/scsi/ufs/ufs-hisi.c | 619 ++++++++++++++++++++++++++++++++++++++++++++
drivers/scsi/ufs/ufs-hisi.h | 115 ++++++++
4 files changed, 744 insertions(+)
create mode 100644 drivers/scsi/ufs/ufs-hisi.c
create mode 100644 drivers/scsi/ufs/ufs-hisi.h
diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig
index e27b4d4e6ae2..e09fe6ab3572 100644
--- a/drivers/scsi/ufs/Kconfig
+++ b/drivers/scsi/ufs/Kconfig
@@ -100,3 +100,12 @@ config SCSI_UFS_QCOM
Select this if you have UFS controller on QCOM chipset.
If unsure, say N.
+
+config SCSI_UFS_HISI
+ tristate "Hisilicon specific hooks to UFS controller platform driver"
+ depends on (ARCH_HISI || COMPILE_TEST) && SCSI_UFSHCD_PLATFORM
+ ---help---
+ This selects the Hisilicon specific additions to UFSHCD platform driver.
+
+ Select this if you have UFS controller on Hisilicon chipset.
+ If unsure, say N.
diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile
index 918f5791202d..2c50f03d8c4a 100644
--- a/drivers/scsi/ufs/Makefile
+++ b/drivers/scsi/ufs/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_SCSI_UFSHCD) += ufshcd-core.o
ufshcd-core-objs := ufshcd.o ufs-sysfs.o
obj-$(CONFIG_SCSI_UFSHCD_PCI) += ufshcd-pci.o
obj-$(CONFIG_SCSI_UFSHCD_PLATFORM) += ufshcd-pltfrm.o
+obj-$(CONFIG_SCSI_UFS_HISI) += ufs-hisi.o
diff --git a/drivers/scsi/ufs/ufs-hisi.c b/drivers/scsi/ufs/ufs-hisi.c
new file mode 100644
index 000000000000..524861cd0ffd
--- /dev/null
+++ b/drivers/scsi/ufs/ufs-hisi.c
@@ -0,0 +1,619 @@
+/*
+ * HiSilicon Hixxxx UFS Driver
+ *
+ * Copyright (c) 2016-2017 Linaro Ltd.
+ * Copyright (c) 2016-2017 HiSilicon Technologies Co., Ltd.
+ *
+ * Released under the GPLv2 only.
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+#include <linux/time.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/dma-mapping.h>
+#include <linux/platform_device.h>
+#include <linux/reset.h>
+
+#include "ufshcd.h"
+#include "ufshcd-pltfrm.h"
+#include "unipro.h"
+#include "ufs-hisi.h"
+#include "ufshci.h"
+
+static int ufs_hisi_check_hibern8(struct ufs_hba *hba)
+{
+ int err = 0;
+ u32 tx_fsm_val_0 = 0;
+ u32 tx_fsm_val_1 = 0;
+ unsigned long timeout = jiffies + msecs_to_jiffies(HBRN8_POLL_TOUT_MS);
+
+ do {
+ err = ufshcd_dme_get(hba, UIC_ARG_MIB_SEL(MPHY_TX_FSM_STATE, 0),
+ &tx_fsm_val_0);
+ err |= ufshcd_dme_get(hba,
+ UIC_ARG_MIB_SEL(MPHY_TX_FSM_STATE, 1), &tx_fsm_val_1);
+ if (err || (tx_fsm_val_0 == TX_FSM_HIBERN8 &&
+ tx_fsm_val_1 == TX_FSM_HIBERN8))
+ break;
+
+ /* sleep for max. 200us */
+ usleep_range(100, 200);
+ } while (time_before(jiffies, timeout));
+
+ /*
+ * we might have scheduled out for long during polling so
+ * check the state again.
+ */
+ if (time_after(jiffies, timeout)) {
+ err = ufshcd_dme_get(hba, UIC_ARG_MIB_SEL(MPHY_TX_FSM_STATE, 0),
+ &tx_fsm_val_0);
+ err |= ufshcd_dme_get(hba,
+ UIC_ARG_MIB_SEL(MPHY_TX_FSM_STATE, 1), &tx_fsm_val_1);
+ }
+
+ if (err) {
+ dev_err(hba->dev, "%s: unable to get TX_FSM_STATE, err %d\n",
+ __func__, err);
+ } else if (tx_fsm_val_0 != TX_FSM_HIBERN8 ||
+ tx_fsm_val_1 != TX_FSM_HIBERN8) {
+ err = -1;
+ dev_err(hba->dev, "%s: invalid TX_FSM_STATE, lane0 = %d, lane1 = %d\n",
+ __func__, tx_fsm_val_0, tx_fsm_val_1);
+ }
+
+ return err;
+}
+
+static void ufs_hi3660_clk_init(struct ufs_hba *hba)
+{
+ struct ufs_hisi_host *host = ufshcd_get_variant(hba);
+
+ ufs_sys_ctrl_clr_bits(host, BIT_SYSCTRL_REF_CLOCK_EN, PHY_CLK_CTRL);
+ if (ufs_sys_ctrl_readl(host, PHY_CLK_CTRL) & BIT_SYSCTRL_REF_CLOCK_EN)
+ mdelay(1);
+ /* use abb clk */
+ ufs_sys_ctrl_clr_bits(host, BIT_UFS_REFCLK_SRC_SEl, UFS_SYSCTRL);
+ ufs_sys_ctrl_clr_bits(host, BIT_UFS_REFCLK_ISO_EN, PHY_ISO_EN);
+ /* open mphy ref clk */
+ ufs_sys_ctrl_set_bits(host, BIT_SYSCTRL_REF_CLOCK_EN, PHY_CLK_CTRL);
+}
+
+static void ufs_hi3660_soc_init(struct ufs_hba *hba)
+{
+ struct ufs_hisi_host *host = ufshcd_get_variant(hba);
+ u32 reg;
+
+ if (!IS_ERR(host->rst))
+ reset_control_assert(host->rst);
+
+ /* HC_PSW powerup */
+ ufs_sys_ctrl_set_bits(host, BIT_UFS_PSW_MTCMOS_EN, PSW_POWER_CTRL);
+ udelay(10);
+ /* notify PWR ready */
+ ufs_sys_ctrl_set_bits(host, BIT_SYSCTRL_PWR_READY, HC_LP_CTRL);
+ ufs_sys_ctrl_writel(host, MASK_UFS_DEVICE_RESET | 0,
+ UFS_DEVICE_RESET_CTRL);
+
+ reg = ufs_sys_ctrl_readl(host, PHY_CLK_CTRL);
+ reg = (reg & ~MASK_SYSCTRL_CFG_CLOCK_FREQ) | UFS_FREQ_CFG_CLK;
+ /* set cfg clk freq */
+ ufs_sys_ctrl_writel(host, reg, PHY_CLK_CTRL);
+ /* set ref clk freq */
+ ufs_sys_ctrl_clr_bits(host, MASK_SYSCTRL_REF_CLOCK_SEL, PHY_CLK_CTRL);
+ /* bypass ufs clk gate */
+ ufs_sys_ctrl_set_bits(host, MASK_UFS_CLK_GATE_BYPASS,
+ CLOCK_GATE_BYPASS);
+ ufs_sys_ctrl_set_bits(host, MASK_UFS_SYSCRTL_BYPASS, UFS_SYSCTRL);
+
+ /* open psw clk */
+ ufs_sys_ctrl_set_bits(host, BIT_SYSCTRL_PSW_CLK_EN, PSW_CLK_CTRL);
+ /* disable ufshc iso */
+ ufs_sys_ctrl_clr_bits(host, BIT_UFS_PSW_ISO_CTRL, PSW_POWER_CTRL);
+ /* disable phy iso */
+ ufs_sys_ctrl_clr_bits(host, BIT_UFS_PHY_ISO_CTRL, PHY_ISO_EN);
+ /* notice iso disable */
+ ufs_sys_ctrl_clr_bits(host, BIT_SYSCTRL_LP_ISOL_EN, HC_LP_CTRL);
+
+ /* disable lp_reset_n */
+ ufs_sys_ctrl_set_bits(host, BIT_SYSCTRL_LP_RESET_N, RESET_CTRL_EN);
+ mdelay(1);
+
+ ufs_sys_ctrl_writel(host, MASK_UFS_DEVICE_RESET | BIT_UFS_DEVICE_RESET,
+ UFS_DEVICE_RESET_CTRL);
+
+ msleep(20);
+
+ /*
+ * enable the fix of linereset recovery,
+ * and enable rx_reset/tx_rest beat
+ * enable ref_clk_en override(bit5) &
+ * override value = 1(bit4), with mask
+ */
+ ufs_sys_ctrl_writel(host, 0x03300330, UFS_DEVICE_RESET_CTRL);
+
+ if (!IS_ERR(host->rst))
+ reset_control_deassert(host->rst);
+}
+
+static int ufs_hisi_link_startup_pre_change(struct ufs_hba *hba)
+{
+ int err;
+ uint32_t value;
+ uint32_t reg;
+
+ /* Unipro VS_mphy_disable */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0xD0C1, 0x0), 0x1);
+ /* PA_HSSeries */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x156A, 0x0), 0x2);
+ /* MPHY CBRATESEL */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x8114, 0x0), 0x1);
+ /* MPHY CBOVRCTRL2 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x8121, 0x0), 0x2D);
+ /* MPHY CBOVRCTRL3 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x8122, 0x0), 0x1);
+ /* Unipro VS_MphyCfgUpdt */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0xD085, 0x0), 0x1);
+ /* MPHY RXOVRCTRL4 rx0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x800D, 0x4), 0x58);
+ /* MPHY RXOVRCTRL4 rx1 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x800D, 0x5), 0x58);
+ /* MPHY RXOVRCTRL5 rx0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x800E, 0x4), 0xB);
+ /* MPHY RXOVRCTRL5 rx1 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x800E, 0x5), 0xB);
+ /* MPHY RXSQCONTROL rx0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x8009, 0x4), 0x1);
+ /* MPHY RXSQCONTROL rx1 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x8009, 0x5), 0x1);
+ /* Unipro VS_MphyCfgUpdt */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0xD085, 0x0), 0x1);
+
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x8113, 0x0), 0x1);
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0xD085, 0x0), 0x1);
+
+ /* Tactive RX */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x008F, 0x4), 0x7);
+ /* Tactive RX */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x008F, 0x5), 0x7);
+
+ /* Gear3 Synclength */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x0095, 0x4), 0x4F);
+ /* Gear3 Synclength */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x0095, 0x5), 0x4F);
+ /* Gear2 Synclength */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x0094, 0x4), 0x4F);
+ /* Gear2 Synclength */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x0094, 0x5), 0x4F);
+ /* Gear1 Synclength */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x008B, 0x4), 0x4F);
+ /* Gear1 Synclength */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x008B, 0x5), 0x4F);
+ /* Thibernate Tx */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x000F, 0x0), 0x5);
+ /* Thibernate Tx */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x000F, 0x1), 0x5);
+
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0xD085, 0x0), 0x1);
+ /* Unipro VS_mphy_disable */
+ ufshcd_dme_get(hba, UIC_ARG_MIB_SEL(0xD0C1, 0x0), &value);
+ if (value != 0x1)
+ dev_info(hba->dev,
+ "Warning!!! Unipro VS_mphy_disable is 0x%x\n", value);
+
+ /* Unipro VS_mphy_disable */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0xD0C1, 0x0), 0x0);
+ err = ufs_hisi_check_hibern8(hba);
+ if (err)
+ dev_err(hba->dev, "ufs_hisi_check_hibern8 error\n");
+
+ ufshcd_writel(hba, UFS_HCLKDIV_NORMAL_VALUE, UFS_REG_HCLKDIV);
+
+ /* disable auto H8 */
+ reg = ufshcd_readl(hba, REG_AUTO_HIBERNATE_IDLE_TIMER);
+ reg = reg & (~UFS_AHIT_AH8ITV_MASK);
+ ufshcd_writel(hba, reg, REG_AUTO_HIBERNATE_IDLE_TIMER);
+
+ /* Unipro PA_Local_TX_LCC_Enable */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0x155E, 0x0), 0x0);
+ /* close Unipro VS_Mk2ExtnSupport */
+ ufshcd_dme_set(hba, UIC_ARG_MIB_SEL(0xD0AB, 0x0), 0x0);
+ ufshcd_dme_get(hba, UIC_ARG_MIB_SEL(0xD0AB, 0x0), &value);
+ if (value != 0) {
+ /* Ensure close success */
+ dev_info(hba->dev, "WARN: close VS_Mk2ExtnSupport failed\n");
+ }
+
+ return err;
+}
+
+static int ufs_hisi_link_startup_post_change(struct ufs_hba *hba)
+{
+ struct ufs_hisi_host *host = ufshcd_get_variant(hba);
+
+ /* Unipro DL_AFC0CreditThreshold */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0x2044), 0x0);
+ /* Unipro DL_TC0OutAckThreshold */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0x2045), 0x0);
+ /* Unipro DL_TC0TXFCThreshold */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0x2040), 0x9);
+
+ /* not bypass ufs clk gate */
+ ufs_sys_ctrl_clr_bits(host, MASK_UFS_CLK_GATE_BYPASS,
+ CLOCK_GATE_BYPASS);
+ ufs_sys_ctrl_clr_bits(host, MASK_UFS_SYSCRTL_BYPASS,
+ UFS_SYSCTRL);
+
+ /* select received symbol cnt */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0xd09a), 0x80000000);
+ /* reset counter0 and enable */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0xd09c), 0x00000005);
+
+ return 0;
+}
+
+static int ufs_hi3660_link_startup_notify(struct ufs_hba *hba,
+ enum ufs_notify_change_status status)
+{
+ int err = 0;
+
+ switch (status) {
+ case PRE_CHANGE:
+ err = ufs_hisi_link_startup_pre_change(hba);
+ break;
+ case POST_CHANGE:
+ err = ufs_hisi_link_startup_post_change(hba);
+ break;
+ default:
+ break;
+ }
+
+ return err;
+}
+
+struct ufs_hisi_dev_params {
+ u32 pwm_rx_gear; /* pwm rx gear to work in */
+ u32 pwm_tx_gear; /* pwm tx gear to work in */
+ u32 hs_rx_gear; /* hs rx gear to work in */
+ u32 hs_tx_gear; /* hs tx gear to work in */
+ u32 rx_lanes; /* number of rx lanes */
+ u32 tx_lanes; /* number of tx lanes */
+ u32 rx_pwr_pwm; /* rx pwm working pwr */
+ u32 tx_pwr_pwm; /* tx pwm working pwr */
+ u32 rx_pwr_hs; /* rx hs working pwr */
+ u32 tx_pwr_hs; /* tx hs working pwr */
+ u32 hs_rate; /* rate A/B to work in HS */
+ u32 desired_working_mode;
+};
+
+static int ufs_hisi_get_pwr_dev_param(
+ struct ufs_hisi_dev_params *hisi_param,
+ struct ufs_pa_layer_attr *dev_max,
+ struct ufs_pa_layer_attr *agreed_pwr)
+{
+ int min_hisi_gear;
+ int min_dev_gear;
+ bool is_dev_sup_hs = false;
+ bool is_hisi_max_hs = false;
+
+ if (dev_max->pwr_rx == FASTAUTO_MODE || dev_max->pwr_rx == FAST_MODE)
+ is_dev_sup_hs = true;
+
+ if (hisi_param->desired_working_mode == FAST) {
+ is_hisi_max_hs = true;
+ min_hisi_gear = min_t(u32, hisi_param->hs_rx_gear,
+ hisi_param->hs_tx_gear);
+ } else {
+ min_hisi_gear = min_t(u32, hisi_param->pwm_rx_gear,
+ hisi_param->pwm_tx_gear);
+ }
+
+ /*
+ * device doesn't support HS but
+ * hisi_param->desired_working_mode is HS,
+ * thus device and hisi_param don't agree
+ */
+ if (!is_dev_sup_hs && is_hisi_max_hs) {
+ pr_err("%s: device doesn't support HS\n", __func__);
+ return -ENOTSUPP;
+ } else if (is_dev_sup_hs && is_hisi_max_hs) {
+ /*
+ * since device supports HS, it supports FAST_MODE.
+ * since hisi_param->desired_working_mode is also HS
+ * then final decision (FAST/FASTAUTO) is done according
+ * to hisi_params as it is the restricting factor
+ */
+ agreed_pwr->pwr_rx = agreed_pwr->pwr_tx =
+ hisi_param->rx_pwr_hs;
+ } else {
+ /*
+ * here hisi_param->desired_working_mode is PWM.
+ * it doesn't matter whether device supports HS or PWM,
+ * in both cases hisi_param->desired_working_mode will
+ * determine the mode
+ */
+ agreed_pwr->pwr_rx = agreed_pwr->pwr_tx =
+ hisi_param->rx_pwr_pwm;
+ }
+
+ /*
+ * we would like tx to work in the minimum number of lanes
+ * between device capability and vendor preferences.
+ * the same decision will be made for rx
+ */
+ agreed_pwr->lane_tx =
+ min_t(u32, dev_max->lane_tx, hisi_param->tx_lanes);
+ agreed_pwr->lane_rx =
+ min_t(u32, dev_max->lane_rx, hisi_param->rx_lanes);
+
+ /* device maximum gear is the minimum between device rx and tx gears */
+ min_dev_gear = min_t(u32, dev_max->gear_rx, dev_max->gear_tx);
+
+ /*
+ * if both device capabilities and vendor pre-defined preferences are
+ * both HS or both PWM then set the minimum gear to be the chosen
+ * working gear.
+ * if one is PWM and one is HS then the one that is PWM gets to decide
+ * what is the gear, as it is the one that also decided previously what
+ * pwr the device will be configured to.
+ */
+ if ((is_dev_sup_hs && is_hisi_max_hs) ||
+ (!is_dev_sup_hs && !is_hisi_max_hs))
+ agreed_pwr->gear_rx = agreed_pwr->gear_tx =
+ min_t(u32, min_dev_gear, min_hisi_gear);
+ else
+ agreed_pwr->gear_rx = agreed_pwr->gear_tx = min_hisi_gear;
+
+ agreed_pwr->hs_rate = hisi_param->hs_rate;
+
+ pr_info("ufs final power mode: gear = %d, lane = %d, pwr = %d, rate = %d\n",
+ agreed_pwr->gear_rx, agreed_pwr->lane_rx, agreed_pwr->pwr_rx,
+ agreed_pwr->hs_rate);
+ return 0;
+}
+
+static void ufs_hisi_set_dev_cap(struct ufs_hisi_dev_params *hisi_param)
+{
+ hisi_param->rx_lanes = UFS_HISI_LIMIT_NUM_LANES_RX;
+ hisi_param->tx_lanes = UFS_HISI_LIMIT_NUM_LANES_TX;
+ hisi_param->hs_rx_gear = UFS_HISI_LIMIT_HSGEAR_RX;
+ hisi_param->hs_tx_gear = UFS_HISI_LIMIT_HSGEAR_TX;
+ hisi_param->pwm_rx_gear = UFS_HISI_LIMIT_PWMGEAR_RX;
+ hisi_param->pwm_tx_gear = UFS_HISI_LIMIT_PWMGEAR_TX;
+ hisi_param->rx_pwr_pwm = UFS_HISI_LIMIT_RX_PWR_PWM;
+ hisi_param->tx_pwr_pwm = UFS_HISI_LIMIT_TX_PWR_PWM;
+ hisi_param->rx_pwr_hs = UFS_HISI_LIMIT_RX_PWR_HS;
+ hisi_param->tx_pwr_hs = UFS_HISI_LIMIT_TX_PWR_HS;
+ hisi_param->hs_rate = UFS_HISI_LIMIT_HS_RATE;
+ hisi_param->desired_working_mode = UFS_HISI_LIMIT_DESIRED_MODE;
+}
+
+static void ufs_hisi_pwr_change_pre_change(struct ufs_hba *hba)
+{
+ /* update */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0x15A8), 0x1);
+ /* PA_TxSkip */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0x155c), 0x0);
+ /* PA_PWRModeUserData0 = 8191, default is 0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0x15b0), 8191);
+ /* PA_PWRModeUserData1 = 65535, default is 0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0x15b1), 65535);
+ /* PA_PWRModeUserData2 = 32767, default is 0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0x15b2), 32767);
+ /* DME_FC0ProtectionTimeOutVal = 8191, default is 0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0xd041), 8191);
+ /* DME_TC0ReplayTimeOutVal = 65535, default is 0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0xd042), 65535);
+ /* DME_AFC0ReqTimeOutVal = 32767, default is 0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0xd043), 32767);
+ /* PA_PWRModeUserData3 = 8191, default is 0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0x15b3), 8191);
+ /* PA_PWRModeUserData4 = 65535, default is 0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0x15b4), 65535);
+ /* PA_PWRModeUserData5 = 32767, default is 0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0x15b5), 32767);
+ /* DME_FC1ProtectionTimeOutVal = 8191, default is 0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0xd044), 8191);
+ /* DME_TC1ReplayTimeOutVal = 65535, default is 0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0xd045), 65535);
+ /* DME_AFC1ReqTimeOutVal = 32767, default is 0 */
+ ufshcd_dme_set(hba, UIC_ARG_MIB(0xd046), 32767);
+}
+
+static int ufs_hi3660_pwr_change_notify(struct ufs_hba *hba,
+ enum ufs_notify_change_status status,
+ struct ufs_pa_layer_attr *dev_max_params,
+ struct ufs_pa_layer_attr *dev_req_params)
+{
+ struct ufs_hisi_dev_params ufs_hisi_cap;
+ int ret = 0;
+
+ if (!dev_req_params) {
+ dev_err(hba->dev,
+ "%s: incoming dev_req_params is NULL\n", __func__);
+ ret = -EINVAL;
+ goto out;
+ }
+
+ switch (status) {
+ case PRE_CHANGE:
+ ufs_hisi_set_dev_cap(&ufs_hisi_cap);
+ ret = ufs_hisi_get_pwr_dev_param(
+ &ufs_hisi_cap, dev_max_params, dev_req_params);
+ if (ret) {
+ dev_err(hba->dev,
+ "%s: failed to determine capabilities\n", __func__);
+ goto out;
+ }
+
+ ufs_hisi_pwr_change_pre_change(hba);
+ break;
+ case POST_CHANGE:
+ break;
+ default:
+ ret = -EINVAL;
+ break;
+ }
+out:
+ return ret;
+}
+
+static int ufs_hisi_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op)
+{
+ struct ufs_hisi_host *host = ufshcd_get_variant(hba);
+
+ if (ufshcd_is_runtime_pm(pm_op))
+ return 0;
+
+ if (host->in_suspend) {
+ WARN_ON(1);
+ return 0;
+ }
+
+ ufs_sys_ctrl_clr_bits(host, BIT_SYSCTRL_REF_CLOCK_EN, PHY_CLK_CTRL);
+ udelay(10);
+ /* set ref_dig_clk override of PHY PCS to 0 */
+ ufs_sys_ctrl_writel(host, 0x00100000, UFS_DEVICE_RESET_CTRL);
+
+ host->in_suspend = true;
+
+ return 0;
+}
+
+static int ufs_hisi_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op)
+{
+ struct ufs_hisi_host *host = ufshcd_get_variant(hba);
+
+ if (!host->in_suspend)
+ return 0;
+
+ /* set ref_dig_clk override of PHY PCS to 1 */
+ ufs_sys_ctrl_writel(host, 0x00100010, UFS_DEVICE_RESET_CTRL);
+ udelay(10);
+ ufs_sys_ctrl_set_bits(host, BIT_SYSCTRL_REF_CLOCK_EN, PHY_CLK_CTRL);
+
+ host->in_suspend = false;
+ return 0;
+}
+
+static int ufs_hisi_get_resource(struct ufs_hisi_host *host)
+{
+ struct resource *mem_res;
+ struct device *dev = host->hba->dev;
+ struct platform_device *pdev = to_platform_device(dev);
+
+ /* get resource of ufs sys ctrl */
+ mem_res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+ host->ufs_sys_ctrl = devm_ioremap_resource(dev, mem_res);
+ if (IS_ERR(host->ufs_sys_ctrl))
+ return PTR_ERR(host->ufs_sys_ctrl);
+
+ return 0;
+}
+
+static void ufs_hisi_set_pm_lvl(struct ufs_hba *hba)
+{
+ hba->rpm_lvl = UFS_PM_LVL_1;
+ hba->spm_lvl = UFS_PM_LVL_3;
+}
+
+/**
+ * ufs_hisi_init_common
+ * @hba: host controller instance
+ */
+static int ufs_hisi_init_common(struct ufs_hba *hba)
+{
+ int err = 0;
+ struct device *dev = hba->dev;
+ struct ufs_hisi_host *host;
+
+ host = devm_kzalloc(dev, sizeof(*host), GFP_KERNEL);
+ if (!host)
+ return -ENOMEM;
+
+ host->hba = hba;
+ ufshcd_set_variant(hba, host);
+
+ host->rst = devm_reset_control_get(dev, "rst");
+
+ ufs_hisi_set_pm_lvl(hba);
+
+ err = ufs_hisi_get_resource(host);
+ if (err) {
+ ufshcd_set_variant(hba, NULL);
+ return err;
+ }
+
+ return 0;
+}
+
+static int ufs_hi3660_init(struct ufs_hba *hba)
+{
+ int ret = 0;
+ struct device *dev = hba->dev;
+
+ ret = ufs_hisi_init_common(hba);
+ if (ret) {
+ dev_err(dev, "%s: ufs common init fail\n", __func__);
+ return ret;
+ }
+
+ ufs_hi3660_clk_init(hba);
+
+ ufs_hi3660_soc_init(hba);
+
+ return 0;
+}
+
+static struct ufs_hba_variant_ops ufs_hba_hisi_vops = {
+ .name = "hi3660",
+ .init = ufs_hi3660_init,
+ .link_startup_notify = ufs_hi3660_link_startup_notify,
+ .pwr_change_notify = ufs_hi3660_pwr_change_notify,
+ .suspend = ufs_hisi_suspend,
+ .resume = ufs_hisi_resume,
+};
+
+static int ufs_hisi_probe(struct platform_device *pdev)
+{
+ return ufshcd_pltfrm_init(pdev, &ufs_hba_hisi_vops);
+}
+
+static int ufs_hisi_remove(struct platform_device *pdev)
+{
+ struct ufs_hba *hba = platform_get_drvdata(pdev);
+
+ ufshcd_remove(hba);
+ return 0;
+}
+
+static const struct of_device_id ufs_hisi_of_match[] = {
+ { .compatible = "hisilicon,hi3660-ufs" },
+ {},
+};
+
+MODULE_DEVICE_TABLE(of, ufs_hisi_of_match);
+
+static const struct dev_pm_ops ufs_hisi_pm_ops = {
+ .suspend = ufshcd_pltfrm_suspend,
+ .resume = ufshcd_pltfrm_resume,
+ .runtime_suspend = ufshcd_pltfrm_runtime_suspend,
+ .runtime_resume = ufshcd_pltfrm_runtime_resume,
+ .runtime_idle = ufshcd_pltfrm_runtime_idle,
+};
+
+static struct platform_driver ufs_hisi_pltform = {
+ .probe = ufs_hisi_probe,
+ .remove = ufs_hisi_remove,
+ .shutdown = ufshcd_pltfrm_shutdown,
+ .driver = {
+ .name = "ufshcd-hisi",
+ .pm = &ufs_hisi_pm_ops,
+ .of_match_table = of_match_ptr(ufs_hisi_of_match),
+ },
+};
+module_platform_driver(ufs_hisi_pltform);
+
+MODULE_LICENSE("GPL");
+MODULE_ALIAS("platform:ufshcd-hisi");
+MODULE_DESCRIPTION("HiSilicon Hixxxx UFS Driver");
diff --git a/drivers/scsi/ufs/ufs-hisi.h b/drivers/scsi/ufs/ufs-hisi.h
new file mode 100644
index 000000000000..3df9cd7acc29
--- /dev/null
+++ b/drivers/scsi/ufs/ufs-hisi.h
@@ -0,0 +1,115 @@
+/*
+ * Copyright (c) 2017, HiSilicon. All rights reserved.
+ *
+ * Released under the GPLv2 only.
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+#ifndef UFS_HISI_H_
+#define UFS_HISI_H_
+
+#define HBRN8_POLL_TOUT_MS 1000
+
+/*
+ * ufs sysctrl specific define
+ */
+#define PSW_POWER_CTRL (0x04)
+#define PHY_ISO_EN (0x08)
+#define HC_LP_CTRL (0x0C)
+#define PHY_CLK_CTRL (0x10)
+#define PSW_CLK_CTRL (0x14)
+#define CLOCK_GATE_BYPASS (0x18)
+#define RESET_CTRL_EN (0x1C)
+#define UFS_SYSCTRL (0x5C)
+#define UFS_DEVICE_RESET_CTRL (0x60)
+
+#define BIT_UFS_PSW_ISO_CTRL (1 << 16)
+#define BIT_UFS_PSW_MTCMOS_EN (1 << 0)
+#define BIT_UFS_REFCLK_ISO_EN (1 << 16)
+#define BIT_UFS_PHY_ISO_CTRL (1 << 0)
+#define BIT_SYSCTRL_LP_ISOL_EN (1 << 16)
+#define BIT_SYSCTRL_PWR_READY (1 << 8)
+#define BIT_SYSCTRL_REF_CLOCK_EN (1 << 24)
+#define MASK_SYSCTRL_REF_CLOCK_SEL (0x3 << 8)
+#define MASK_SYSCTRL_CFG_CLOCK_FREQ (0xFF)
+#define UFS_FREQ_CFG_CLK (0x39)
+#define BIT_SYSCTRL_PSW_CLK_EN (1 << 4)
+#define MASK_UFS_CLK_GATE_BYPASS (0x3F)
+#define BIT_SYSCTRL_LP_RESET_N (1 << 0)
+#define BIT_UFS_REFCLK_SRC_SEl (1 << 0)
+#define MASK_UFS_SYSCRTL_BYPASS (0x3F << 16)
+#define MASK_UFS_DEVICE_RESET (0x1 << 16)
+#define BIT_UFS_DEVICE_RESET (0x1)
+
+/*
+ * M-TX Configuration Attributes for Hixxxx
+ */
+#define MPHY_TX_FSM_STATE 0x41
+#define TX_FSM_HIBERN8 0x1
+
+/*
+ * Hixxxx UFS HC specific Registers
+ */
+enum {
+ UFS_REG_OCPTHRTL = 0xc0,
+ UFS_REG_OOCPR = 0xc4,
+
+ UFS_REG_CDACFG = 0xd0,
+ UFS_REG_CDATX1 = 0xd4,
+ UFS_REG_CDATX2 = 0xd8,
+ UFS_REG_CDARX1 = 0xdc,
+ UFS_REG_CDARX2 = 0xe0,
+ UFS_REG_CDASTA = 0xe4,
+
+ UFS_REG_LBMCFG = 0xf0,
+ UFS_REG_LBMSTA = 0xf4,
+ UFS_REG_UFSMODE = 0xf8,
+
+ UFS_REG_HCLKDIV = 0xfc,
+};
+
+/* AHIT - Auto-Hibernate Idle Timer */
+#define UFS_AHIT_AH8ITV_MASK 0x3FF
+
+/* REG UFS_REG_HCLKDIV definition */
+#define UFS_HCLKDIV_NORMAL_VALUE 0xE4
+
+/* vendor specific pre-defined parameters */
+#define SLOW 1
+#define FAST 2
+
+#define UFS_HISI_LIMIT_NUM_LANES_RX 2
+#define UFS_HISI_LIMIT_NUM_LANES_TX 2
+#define UFS_HISI_LIMIT_HSGEAR_RX UFS_HS_G3
+#define UFS_HISI_LIMIT_HSGEAR_TX UFS_HS_G3
+#define UFS_HISI_LIMIT_PWMGEAR_RX UFS_PWM_G4
+#define UFS_HISI_LIMIT_PWMGEAR_TX UFS_PWM_G4
+#define UFS_HISI_LIMIT_RX_PWR_PWM SLOW_MODE
+#define UFS_HISI_LIMIT_TX_PWR_PWM SLOW_MODE
+#define UFS_HISI_LIMIT_RX_PWR_HS FAST_MODE
+#define UFS_HISI_LIMIT_TX_PWR_HS FAST_MODE
+#define UFS_HISI_LIMIT_HS_RATE PA_HS_MODE_B
+#define UFS_HISI_LIMIT_DESIRED_MODE FAST
+
+struct ufs_hisi_host {
+ struct ufs_hba *hba;
+ void __iomem *ufs_sys_ctrl;
+
+ struct reset_control *rst;
+
+ uint64_t caps;
+
+ bool in_suspend;
+};
+
+#define ufs_sys_ctrl_writel(host, val, reg) \
+ writel((val), (host)->ufs_sys_ctrl + (reg))
+#define ufs_sys_ctrl_readl(host, reg) readl((host)->ufs_sys_ctrl + (reg))
+#define ufs_sys_ctrl_set_bits(host, mask, reg) \
+ ufs_sys_ctrl_writel( \
+ (host), ((mask) | (ufs_sys_ctrl_readl((host), (reg)))), (reg))
+#define ufs_sys_ctrl_clr_bits(host, mask, reg) \
+ ufs_sys_ctrl_writel((host), \
+ ((~(mask)) & (ufs_sys_ctrl_readl((host), (reg)))), \
+ (reg))
+#endif /* UFS_HISI_H_ */
--
2.15.0
^ permalink raw reply related
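For readers following the negotiation done by ufs_hisi_get_pwr_dev_param()
in the patch above, the decision rules can be sketched as a small standalone
C model. This is only an illustration in userspace C, not the driver code:
the names pa_attr, host_limits, agree_power_mode and the MODE_*/WORK_*
constants are hypothetical stand-ins, and the host gears are assumed to be
already reduced to min(rx, tx).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the power-mode and working-mode constants. */
enum pwr_mode { MODE_FAST = 1, MODE_SLOW = 2, MODE_FASTAUTO = 4 };
enum work_mode { WORK_SLOW = 1, WORK_FAST = 2 };

/* Simplified view of the negotiated link attributes. */
struct pa_attr {
	uint32_t gear_rx, gear_tx;
	uint32_t lane_rx, lane_tx;
	uint32_t pwr_rx, pwr_tx;
};

/* Simplified view of the host-side (vendor) limits. */
struct host_limits {
	uint32_t pwm_gear, hs_gear;	/* assumed already min(rx, tx) */
	uint32_t lanes_rx, lanes_tx;
	uint32_t pwr_pwm, pwr_hs;
	enum work_mode desired;
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Returns 0 on success, -1 if the host wants HS but the device cannot do it. */
int agree_power_mode(const struct host_limits *host,
		     const struct pa_attr *dev_max,
		     struct pa_attr *agreed)
{
	bool dev_hs, host_hs;
	uint32_t host_gear, dev_gear;

	assert(host && dev_max && agreed);

	dev_hs = dev_max->pwr_rx == MODE_FAST || dev_max->pwr_rx == MODE_FASTAUTO;
	host_hs = host->desired == WORK_FAST;
	host_gear = host_hs ? host->hs_gear : host->pwm_gear;
	dev_gear = min_u32(dev_max->gear_rx, dev_max->gear_tx);

	if (host_hs && !dev_hs)
		return -1;

	/* The host preference decides the power mode in every remaining case. */
	agreed->pwr_rx = agreed->pwr_tx = host_hs ? host->pwr_hs : host->pwr_pwm;

	/* Lanes: minimum of device capability and host limit, per direction. */
	agreed->lane_rx = min_u32(dev_max->lane_rx, host->lanes_rx);
	agreed->lane_tx = min_u32(dev_max->lane_tx, host->lanes_tx);

	/* Same class (both HS or both PWM): lowest gear wins; else host decides. */
	if (dev_hs == host_hs)
		agreed->gear_rx = agreed->gear_tx = min_u32(dev_gear, host_gear);
	else
		agreed->gear_rx = agreed->gear_tx = host_gear;

	return 0;
}
```

The only failing combination is an HS-only host preference against a device
that reports a PWM-only power mode, which matches the -ENOTSUPP branch in the
patch.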
* [PATCH v10 0/5] scsi: ufs: add ufs driver code for Hisilicon Hi3660 SoC
From: Li Wei @ 2018-05-25 9:17 UTC (permalink / raw)
To: linux-arm-kernel
This patchset adds driver support for UFS for Hi3660 SoC. It is verified on HiKey960 board.
Li Wei (5):
scsi: ufs: add Hisilicon ufs driver code
dt-bindings: scsi: ufs: add document for hisi-ufs
arm64: dts: add ufs dts node
arm64: defconfig: enable configs for Hisilicon ufs
arm64: defconfig: enable f2fs and squashfs
Documentation/devicetree/bindings/ufs/ufs-hisi.txt | 41 ++
.../devicetree/bindings/ufs/ufshcd-pltfrm.txt | 10 +-
arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 18 +
arch/arm64/configs/defconfig | 11 +
drivers/scsi/ufs/Kconfig | 9 +
drivers/scsi/ufs/Makefile | 1 +
drivers/scsi/ufs/ufs-hisi.c | 619 +++++++++++++++++++++
drivers/scsi/ufs/ufs-hisi.h | 115 ++++
8 files changed, 821 insertions(+), 3 deletions(-)
create mode 100644 Documentation/devicetree/bindings/ufs/ufs-hisi.txt
create mode 100644 drivers/scsi/ufs/ufs-hisi.c
create mode 100644 drivers/scsi/ufs/ufs-hisi.h
Major changes in v10:
- solve review comments from Rob Herring.
*Modify the "reset-names" description in ufs-hisi.txt binding file.
*List clocks in ufs-hisi.txt binding file.
*remove the "arst" and keep only "rst" in the binding files.
*remove the "arst" member from both dts and c code.
Major changes in v9:
- solve review comments from Rob Herring.
*remove freq-table-hz in ufs-hisi.txt binding file.
*Move the rst to the ufshcd_pltfm.txt common binding file.
*Modify the member "assert" of UFS host structure to "arst".
Major changes in v8:
- solve review comments from zhangfei.
*Add Version history.
- solve review comments from Rob Herring.
*remove freq-table-hz.
- solve review comments from Riku Voipio.
*Add MODULE_DEVICE_TABLE for ufs driver.
--
Major changes in v7:
- solve review comments from Philippe Ombredanne.
*use the new SPDX license ids instead of the GNU General Public License.
--
2.15.0
^ permalink raw reply
* [PATCH] media: v4l2-ctrl: Add control for VP9 profile
From: Hans Verkuil @ 2018-05-25 9:09 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180517095349.203865-1-keiichiw@chromium.org>
On 17/05/18 11:53, Keiichi Watanabe wrote:
> Add a new control V4L2_CID_MPEG_VIDEO_VP9_PROFILE for selecting desired
> profile for VP9 encoder and querying for supported profiles by VP9 encoder
> or decoder.
>
> An existing control V4L2_CID_MPEG_VIDEO_VPX_PROFILE cannot be
> used for querying since it is not a menu control but an integer
> control, which cannot return an arbitrary set of supported profiles.
>
> The new control V4L2_CID_MPEG_VIDEO_VP9_PROFILE is a menu control as
> with controls for other codec profiles. (e.g. H264)
I don't mind adding this control (although I would like to have an Ack from
Sylwester), but we also need this to be used in an actual kernel driver.
Otherwise we're adding a control that nobody uses.
Regards,
Hans
>
> Signed-off-by: Keiichi Watanabe <keiichiw@chromium.org>
> ---
>
> .../media/uapi/v4l/extended-controls.rst | 26 +++++++++++++++++++
> drivers/media/v4l2-core/v4l2-ctrls.c | 12 +++++++++
> include/uapi/linux/v4l2-controls.h | 8 ++++++
> 3 files changed, 46 insertions(+)
>
> diff --git a/Documentation/media/uapi/v4l/extended-controls.rst b/Documentation/media/uapi/v4l/extended-controls.rst
> index 03931f9b1285..4f7f128a4998 100644
> --- a/Documentation/media/uapi/v4l/extended-controls.rst
> +++ b/Documentation/media/uapi/v4l/extended-controls.rst
> @@ -1959,6 +1959,32 @@ enum v4l2_vp8_golden_frame_sel -
> Select the desired profile for VPx encoder. Acceptable values are 0,
> 1, 2 and 3 corresponding to encoder profiles 0, 1, 2 and 3.
>
> +.. _v4l2-mpeg-video-vp9-profile:
> +
> +``V4L2_CID_MPEG_VIDEO_VP9_PROFILE``
> + (enum)
> +
> +enum v4l2_mpeg_video_vp9_profile -
> + This control allows to select the profile for VP9 encoder.
> + This is also used to enumerate supported profiles by VP9 encoder or decoder.
> + Possible values are:
> +
> +
> +
> +.. flat-table::
> + :header-rows: 0
> + :stub-columns: 0
> +
> + * - ``V4L2_MPEG_VIDEO_VP9_PROFILE_0``
> + - Profile 0
> + * - ``V4L2_MPEG_VIDEO_VP9_PROFILE_1``
> + - Profile 1
> + * - ``V4L2_MPEG_VIDEO_VP9_PROFILE_2``
> + - Profile 2
> + * - ``V4L2_MPEG_VIDEO_VP9_PROFILE_3``
> + - Profile 3
> +
> +
>
> High Efficiency Video Coding (HEVC/H.265) Control Reference
> -----------------------------------------------------------
> diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
> index d29e45516eb7..401ce21c2e63 100644
> --- a/drivers/media/v4l2-core/v4l2-ctrls.c
> +++ b/drivers/media/v4l2-core/v4l2-ctrls.c
> @@ -431,6 +431,13 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
> "Use Previous Specific Frame",
> NULL,
> };
> + static const char * const vp9_profile[] = {
> + "0",
> + "1",
> + "2",
> + "3",
> + NULL,
> + };
>
> static const char * const flash_led_mode[] = {
> "Off",
> @@ -614,6 +621,8 @@ const char * const *v4l2_ctrl_get_menu(u32 id)
> return mpeg4_profile;
> case V4L2_CID_MPEG_VIDEO_VPX_GOLDEN_FRAME_SEL:
> return vpx_golden_frame_sel;
> + case V4L2_CID_MPEG_VIDEO_VP9_PROFILE:
> + return vp9_profile;
> case V4L2_CID_JPEG_CHROMA_SUBSAMPLING:
> return jpeg_chroma_subsampling;
> case V4L2_CID_DV_TX_MODE:
> @@ -841,6 +850,8 @@ const char *v4l2_ctrl_get_name(u32 id)
> case V4L2_CID_MPEG_VIDEO_VPX_P_FRAME_QP: return "VPX P-Frame QP Value";
> case V4L2_CID_MPEG_VIDEO_VPX_PROFILE: return "VPX Profile";
>
> + case V4L2_CID_MPEG_VIDEO_VP9_PROFILE: return "VP9 Profile";
> +
> /* HEVC controls */
> case V4L2_CID_MPEG_VIDEO_HEVC_I_FRAME_QP: return "HEVC I-Frame QP Value";
> case V4L2_CID_MPEG_VIDEO_HEVC_P_FRAME_QP: return "HEVC P-Frame QP Value";
> @@ -1180,6 +1191,7 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
> case V4L2_CID_DEINTERLACING_MODE:
> case V4L2_CID_TUNE_DEEMPHASIS:
> case V4L2_CID_MPEG_VIDEO_VPX_GOLDEN_FRAME_SEL:
> + case V4L2_CID_MPEG_VIDEO_VP9_PROFILE:
> case V4L2_CID_DETECT_MD_MODE:
> case V4L2_CID_MPEG_VIDEO_HEVC_PROFILE:
> case V4L2_CID_MPEG_VIDEO_HEVC_LEVEL:
> diff --git a/include/uapi/linux/v4l2-controls.h b/include/uapi/linux/v4l2-controls.h
> index 8d473c979b61..56203b7b715c 100644
> --- a/include/uapi/linux/v4l2-controls.h
> +++ b/include/uapi/linux/v4l2-controls.h
> @@ -589,6 +589,14 @@ enum v4l2_vp8_golden_frame_sel {
> #define V4L2_CID_MPEG_VIDEO_VPX_P_FRAME_QP (V4L2_CID_MPEG_BASE+510)
> #define V4L2_CID_MPEG_VIDEO_VPX_PROFILE (V4L2_CID_MPEG_BASE+511)
>
> +#define V4L2_CID_MPEG_VIDEO_VP9_PROFILE (V4L2_CID_MPEG_BASE+512)
> +enum v4l2_mpeg_video_vp9_profile {
> + V4L2_MPEG_VIDEO_VP9_PROFILE_0 = 0,
> + V4L2_MPEG_VIDEO_VP9_PROFILE_1 = 1,
> + V4L2_MPEG_VIDEO_VP9_PROFILE_2 = 2,
> + V4L2_MPEG_VIDEO_VP9_PROFILE_3 = 3,
> +};
> +
> /* CIDs for HEVC encoding. */
>
> #define V4L2_CID_MPEG_VIDEO_HEVC_MIN_QP (V4L2_CID_MPEG_BASE + 600)
>
^ permalink raw reply
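The quoted commit message argues that a menu control can report an arbitrary
subset of profiles while an integer control can only report a range. A rough
standalone model of that difference, assuming a menu control is described by a
minimum, a maximum and a bitmask of skipped items, might look like the sketch
below. The names menu_ctrl, menu_item_supported and enumerate_supported are
hypothetical and are not the V4L2 API; a real application would probe each
index with the VIDIOC_QUERYMENU ioctl instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a menu control: valid range plus skipped items. */
struct menu_ctrl {
	int min;
	int max;
	uint64_t skip_mask;	/* bit N set means menu item N is not offered */
};

/* Returns 1 if the menu item at 'index' is offered, 0 otherwise. */
int menu_item_supported(const struct menu_ctrl *c, int index)
{
	assert(c);

	if (index < 0 || index > 63)
		return 0;
	if (index < c->min || index > c->max)
		return 0;
	return !(c->skip_mask & (UINT64_C(1) << index));
}

/* Walks min..max the way an application would, collecting offered items. */
uint64_t enumerate_supported(const struct menu_ctrl *c)
{
	uint64_t found = 0;
	int i;

	assert(c);

	for (i = c->min; i <= c->max; i++)
		if (menu_item_supported(c, i))
			found |= UINT64_C(1) << i;

	return found;
}
```

With min = 0, max = 3 and items 1 and 3 skipped, the walk yields profiles 0
and 2 only, a set that a plain integer range could not express.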
* [PATCH] mmc: sdhci-*: Don't emit error msg if sdhci_add_host() fails
From: Adrian Hunter @ 2018-05-25 9:05 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180525151509.0270dbe1@xhacker.debian>
On 25/05/18 10:15, Jisheng Zhang wrote:
> I noticed below error msg with sdhci-pxav3 on some berlin platforms:
>
> [.....] sdhci-pxav3 f7ab0000.sdhci failed to add host
>
> It is due to getting related vmmc or vqmmc regulator returns
> -EPROBE_DEFER. It doesn't matter at all but it's confusing.
>
> From another side, if driver probing fails and the error number isn't
> -EPROBE_DEFER, the core will tell us something as below:
>
> [.....] sdhci-pxav3: probe of f7ab0000.sdhci failed with error -EXX
>
> So it's not necessary to emit error msg if sdhci_add_host() fails. And
> some other sdhci host drivers also have this issue, let's fix them
> together.
>
> Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
> drivers/mmc/host/sdhci-bcm-kona.c | 4 +---
> drivers/mmc/host/sdhci-pic32.c | 4 +---
> drivers/mmc/host/sdhci-pxav2.c | 4 +---
> drivers/mmc/host/sdhci-pxav3.c | 4 +---
> drivers/mmc/host/sdhci-s3c.c | 4 +---
> drivers/mmc/host/sdhci-spear.c | 4 +---
> drivers/mmc/host/sdhci-st.c | 4 +---
> 7 files changed, 7 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/mmc/host/sdhci-bcm-kona.c b/drivers/mmc/host/sdhci-bcm-kona.c
> index 11ca95c60bcf..bdbd4897c0f7 100644
> --- a/drivers/mmc/host/sdhci-bcm-kona.c
> +++ b/drivers/mmc/host/sdhci-bcm-kona.c
> @@ -284,10 +284,8 @@ static int sdhci_bcm_kona_probe(struct platform_device *pdev)
> sdhci_bcm_kona_sd_init(host);
>
> ret = sdhci_add_host(host);
> - if (ret) {
> - dev_err(dev, "Failed sdhci_add_host\n");
> + if (ret)
> goto err_reset;
> - }
>
> /* if device is eMMC, emulate card insert right here */
> if (!mmc_card_is_removable(host->mmc)) {
> diff --git a/drivers/mmc/host/sdhci-pic32.c b/drivers/mmc/host/sdhci-pic32.c
> index a6caa49ca25a..a11e6397d4ff 100644
> --- a/drivers/mmc/host/sdhci-pic32.c
> +++ b/drivers/mmc/host/sdhci-pic32.c
> @@ -200,10 +200,8 @@ static int pic32_sdhci_probe(struct platform_device *pdev)
> }
>
> ret = sdhci_add_host(host);
> - if (ret) {
> - dev_err(&pdev->dev, "error adding host\n");
> + if (ret)
> goto err_base_clk;
> - }
>
> dev_info(&pdev->dev, "Successfully added sdhci host\n");
> return 0;
> diff --git a/drivers/mmc/host/sdhci-pxav2.c b/drivers/mmc/host/sdhci-pxav2.c
> index 8986f9d9cf98..2c3827f54927 100644
> --- a/drivers/mmc/host/sdhci-pxav2.c
> +++ b/drivers/mmc/host/sdhci-pxav2.c
> @@ -221,10 +221,8 @@ static int sdhci_pxav2_probe(struct platform_device *pdev)
> host->ops = &pxav2_sdhci_ops;
>
> ret = sdhci_add_host(host);
> - if (ret) {
> - dev_err(&pdev->dev, "failed to add host\n");
> + if (ret)
> goto disable_clk;
> - }
>
> return 0;
>
> diff --git a/drivers/mmc/host/sdhci-pxav3.c b/drivers/mmc/host/sdhci-pxav3.c
> index a34434166ca7..b8e96f392428 100644
> --- a/drivers/mmc/host/sdhci-pxav3.c
> +++ b/drivers/mmc/host/sdhci-pxav3.c
> @@ -472,10 +472,8 @@ static int sdhci_pxav3_probe(struct platform_device *pdev)
> pm_suspend_ignore_children(&pdev->dev, 1);
>
> ret = sdhci_add_host(host);
> - if (ret) {
> - dev_err(&pdev->dev, "failed to add host\n");
> + if (ret)
> goto err_add_host;
> - }
>
> if (host->mmc->pm_caps & MMC_PM_WAKE_SDIO_IRQ)
> device_init_wakeup(&pdev->dev, 1);
> diff --git a/drivers/mmc/host/sdhci-s3c.c b/drivers/mmc/host/sdhci-s3c.c
> index cda83ccb2702..9ef89d00970e 100644
> --- a/drivers/mmc/host/sdhci-s3c.c
> +++ b/drivers/mmc/host/sdhci-s3c.c
> @@ -655,10 +655,8 @@ static int sdhci_s3c_probe(struct platform_device *pdev)
> goto err_req_regs;
>
> ret = sdhci_add_host(host);
> - if (ret) {
> - dev_err(dev, "sdhci_add_host() failed\n");
> + if (ret)
> goto err_req_regs;
> - }
>
> #ifdef CONFIG_PM
> if (pdata->cd_type != S3C_SDHCI_CD_INTERNAL)
> diff --git a/drivers/mmc/host/sdhci-spear.c b/drivers/mmc/host/sdhci-spear.c
> index 14511526a3a8..9247d51f2eed 100644
> --- a/drivers/mmc/host/sdhci-spear.c
> +++ b/drivers/mmc/host/sdhci-spear.c
> @@ -126,10 +126,8 @@ static int sdhci_probe(struct platform_device *pdev)
> }
>
> ret = sdhci_add_host(host);
> - if (ret) {
> - dev_dbg(&pdev->dev, "error adding host\n");
> + if (ret)
> goto disable_clk;
> - }
>
> platform_set_drvdata(pdev, host);
>
> diff --git a/drivers/mmc/host/sdhci-st.c b/drivers/mmc/host/sdhci-st.c
> index c32daed0d418..8f95647195d9 100644
> --- a/drivers/mmc/host/sdhci-st.c
> +++ b/drivers/mmc/host/sdhci-st.c
> @@ -422,10 +422,8 @@ static int sdhci_st_probe(struct platform_device *pdev)
> st_mmcss_cconfig(np, host);
>
> ret = sdhci_add_host(host);
> - if (ret) {
> - dev_err(&pdev->dev, "Failed sdhci_add_host\n");
> + if (ret)
> goto err_out;
> - }
>
> host_version = readw_relaxed((host->ioaddr + SDHCI_HOST_VERSION));
>
>
^ permalink raw reply
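The reasoning in the commit message above (the core already reports real
probe failures, and a deferral is not a failure worth reporting) can be
sketched as a standalone C model. This follows only what the message
describes; core_report_probe is a hypothetical stand-in and not the driver
core's actual code, and EPROBE_DEFER is defined locally so that the sketch
builds in userspace.

```c
#include <assert.h>
#include <stdio.h>

/* Defined locally for this userspace sketch. */
#define EPROBE_DEFER 517

/*
 * Hypothetical model of the core's reporting as described in the message:
 * stay silent on success and on deferral, otherwise format one line.
 * Returns the number of characters formatted, or 0 when nothing is logged.
 */
int core_report_probe(char *buf, size_t len, const char *drv,
		      const char *dev, int ret)
{
	assert(buf && len > 0);

	if (ret == 0 || ret == -EPROBE_DEFER) {
		buf[0] = '\0';
		return 0;
	}

	return snprintf(buf, len, "%s: probe of %s failed with error %d",
			drv, dev, ret);
}
```

Under this model a driver-side "failed to add host" message adds nothing: it
is either noise on a deferral or a duplicate of the line the core prints.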
* [PATCH v11 08/19] arm64: fpsimd: Eliminate task->mm checks
From: Christoffer Dall @ 2018-05-25 9:02 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <1527181008-13549-9-git-send-email-Dave.Martin@arm.com>
On Thu, May 24, 2018 at 05:56:37PM +0100, Dave Martin wrote:
> Currently the FPSIMD handling code uses the condition task->mm ==
> NULL as a hint that task has no FPSIMD register context.
>
> The ->mm check is only there to filter out tasks that cannot
> possibly have FPSIMD context loaded, for optimisation purposes.
> Also, TIF_FOREIGN_FPSTATE must always be checked anyway before
> saving FPSIMD context back to memory. For these reasons, the ->mm
> checks are not useful, providing that TIF_FOREIGN_FPSTATE is
> maintained in a consistent way for all threads.
>
> The context switch logic is already deliberately optimised to defer
> reloads of the regs until ret_to_user (or sigreturn as a special
> case), and save them only if they have been previously loaded.
> These paths are the only places where the wrong_task and wrong_cpu
> conditions can be made false, by calling fpsimd_bind_task_to_cpu().
> Kernel threads by definition never reach these paths. As a result,
> the wrong_task and wrong_cpu tests in fpsimd_thread_switch() will
> always yield true for kernel threads.
>
> This patch removes the redundant checks and special-case code, ensuring that TIF_FOREIGN_FPSTATE is set whenever a kernel thread is scheduled in, and ensures that this flag is set for the init
> task. The fpsimd_flush_task_state() call already present in
> copy_thread() ensures the same for any new task.
nit: formatting still funny, but you shouldn't respin just for that.
>
> With TIF_FOREIGN_FPSTATE always set for kernel threads, this patch
> ensures that no extra context save work is added for kernel
> threads, and eliminates the redundant context saving that may
> currently occur for kernel threads that have acquired an mm via
> use_mm().
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>
> ---
>
> Changes since v10:
>
> * The INIT_THREAD flag change is split out into the prior
> patch, since it is in principle a fix rather than simply a
> tidy-up.
>
> Requested by Christoffer Dall / Catalin Marinas:
>
> * Reworded commit message to explain the change more clearly,
> and remove confusing claims about things being true by
> construction.
>
> * Added a comment to the code explaining that wrong_cpu and
> wrong_task will always be true for kernel threads.
>
> * Ensure .fpsimd_cpu = NR_CPUS for the init task.
>
> This does not seem to be a bug, because the wrong_task check in
> fpsimd_thread_switch() should still always be true for the init
> task; but it is nonetheless an inconsistency compared with what
> copy_thread() does.
>
> So fix it to avoid future surprises.
> ---
> arch/arm64/include/asm/processor.h | 4 +++-
> arch/arm64/kernel/fpsimd.c | 40 +++++++++++++++-----------------------
> 2 files changed, 19 insertions(+), 25 deletions(-)
>
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 7675989..36d64f8 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -156,7 +156,9 @@ static inline void arch_thread_struct_whitelist(unsigned long *offset,
> /* Sync TPIDR_EL0 back to thread_struct for current */
> void tls_preserve_current_state(void);
>
> -#define INIT_THREAD { }
> +#define INIT_THREAD { \
> + .fpsimd_cpu = NR_CPUS, \
> +}
>
> static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
> {
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 2d9a9e8..d736b6c 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -892,31 +892,25 @@ asmlinkage void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs)
>
> void fpsimd_thread_switch(struct task_struct *next)
> {
> + bool wrong_task, wrong_cpu;
> +
> if (!system_supports_fpsimd())
> return;
> +
> + /* Save unsaved fpsimd state, if any: */
> + fpsimd_save();
> +
> /*
> - * Save the current FPSIMD state to memory, but only if whatever is in
> - * the registers is in fact the most recent userland FPSIMD state of
> - * 'current'.
> + * Fix up TIF_FOREIGN_FPSTATE to correctly describe next's
> + * state. For kernel threads, FPSIMD registers are never loaded
> + * and wrong_task and wrong_cpu will always be true.
> */
> - if (current->mm)
> - fpsimd_save();
> -
> - if (next->mm) {
> - /*
> - * If we are switching to a task whose most recent userland
> - * FPSIMD state is already in the registers of *this* cpu,
> - * we can skip loading the state from memory. Otherwise, set
> - * the TIF_FOREIGN_FPSTATE flag so the state will be loaded
> - * upon the next return to userland.
> - */
> - bool wrong_task = __this_cpu_read(fpsimd_last_state.st) !=
> + wrong_task = __this_cpu_read(fpsimd_last_state.st) !=
> &next->thread.uw.fpsimd_state;
> - bool wrong_cpu = next->thread.fpsimd_cpu != smp_processor_id();
> + wrong_cpu = next->thread.fpsimd_cpu != smp_processor_id();
>
> - update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE,
> - wrong_task || wrong_cpu);
> - }
> + update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE,
> + wrong_task || wrong_cpu);
> }
>
> void fpsimd_flush_thread(void)
> @@ -1121,9 +1115,8 @@ void kernel_neon_begin(void)
>
> __this_cpu_write(kernel_neon_busy, true);
>
> - /* Save unsaved task fpsimd state, if any: */
> - if (current->mm)
> - fpsimd_save();
> + /* Save unsaved fpsimd state, if any: */
> + fpsimd_save();
>
> /* Invalidate any task state remaining in the fpsimd regs: */
> fpsimd_flush_cpu_state();
> @@ -1245,8 +1238,7 @@ static int fpsimd_cpu_pm_notifier(struct notifier_block *self,
> {
> switch (cmd) {
> case CPU_PM_ENTER:
> - if (current->mm)
> - fpsimd_save();
> + fpsimd_save();
> fpsimd_flush_cpu_state();
> break;
> case CPU_PM_EXIT:
> --
> 2.1.4
>
>
^ permalink raw reply
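The claim in the commit message above, that wrong_task and wrong_cpu are
always true for a thread that never reaches ret_to_user, can be checked
against a small standalone C model. This is a sketch under simplifying
assumptions, not the arm64 code: struct task, the fpsimd_last_state array,
thread_switch and restore_and_bind are hypothetical, NR_CPUS is fixed at 4,
and restore_and_bind folds the register reload and the binding into one step.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* arbitrary value for this sketch */

struct fpsimd_state {
	int dummy;
};

struct task {
	struct fpsimd_state fpsimd_state;
	unsigned int fpsimd_cpu;	/* NR_CPUS means "loaded on no CPU" */
	bool foreign_fpstate;		/* models TIF_FOREIGN_FPSTATE */
};

/* Per-CPU record of whose state currently sits in the registers. */
static const struct fpsimd_state *fpsimd_last_state[NR_CPUS];

/* Models the flag fix-up done when 'next' is scheduled in on 'cpu'. */
void thread_switch(struct task *next, unsigned int cpu)
{
	bool wrong_task, wrong_cpu;

	assert(next && cpu < NR_CPUS);

	wrong_task = fpsimd_last_state[cpu] != &next->fpsimd_state;
	wrong_cpu = next->fpsimd_cpu != cpu;
	next->foreign_fpstate = wrong_task || wrong_cpu;
}

/* Models the return-to-user path: reload the registers and bind to 'cpu'. */
void restore_and_bind(struct task *t, unsigned int cpu)
{
	assert(t && cpu < NR_CPUS);

	fpsimd_last_state[cpu] = &t->fpsimd_state;
	t->fpsimd_cpu = cpu;
	t->foreign_fpstate = false;
}
```

A task that starts with fpsimd_cpu = NR_CPUS and never calls
restore_and_bind() always has the flag set after thread_switch(), without any
->mm test, which is the property the patch relies on for kernel threads.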
* [PATCH v10 07/18] arm64: fpsimd: Eliminate task->mm checks
From: Christoffer Dall @ 2018-05-25 9:00 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180524143714.GV13470@e103592.cambridge.arm.com>
On Thu, May 24, 2018 at 03:37:15PM +0100, Dave Martin wrote:
> On Thu, May 24, 2018 at 12:06:59PM +0200, Christoffer Dall wrote:
> > On Thu, May 24, 2018 at 10:50:56AM +0100, Dave Martin wrote:
> > > On Thu, May 24, 2018 at 10:33:50AM +0200, Christoffer Dall wrote:
>
> [...]
>
> > > > ...with a risk of being a bit over-pedantic and annoying, may I suggest
> > > > the following complete commit text:
> > > >
> > > > ------8<------
> > > > Currently the FPSIMD handling code uses the condition task->mm ==
> > > > NULL as a hint that task has no FPSIMD register context.
> > > >
> > > > The ->mm check is only there to filter out tasks that cannot
> > > > possibly have FPSIMD context loaded, for optimisation purposes.
> > > > However, TIF_FOREIGN_FPSTATE must always be checked anyway before
> > > > saving FPSIMD context back to memory. For this reason, the ->mm
> > > > checks are not useful, providing that TIF_FOREIGN_FPSTATE is
> > > > maintained properly for kernel threads.
> > > >
> > > > FPSIMD context is never preserved for kernel threads across a context
> > > > switch and therefore TIF_FOREIGN_FPSTATE should always be true for
> > >
> > > (This refactoring opens up the interesting possibility of making
> > > kernel-mode NEON in task context preemptible for kernel threads so
> > > that we actually do preserve state... but that's a discussion for
> > > another day. There may be code around that relies on
> > > kernel_neon_begin() disabling preemption for real.)
> > >
> > > > kernel threads. This is indeed the case, as the wrong_task and
> > >
> > > This suggests that TIF_FOREIGN_FPSTATE is always true for kernel
> > > threads today. This is not quite true, because use_mm() can make mm
> > > non-NULL.
> > >
> >
> > I was suggesting that it's always true after this patch.
>
> I tend to read the present tense as describing the situation before the
> patch, but this convention isn't followed universally.
>
> This was part of the problem with my "true by construction" weasel
> words: the described property wasn't true by construction prior to the
> patch, and there wasn't sufficient explanation to convince people it's
> true afterwards. If people are being rigorous, it takes a _lot_ of
> explanation...
>
> >
> > > > wrong_cpu tests in fpsimd_thread_switch() will always yield false for
> > > > kernel threads.
> > >
> > > ("false" -> "true". My bad.)
> > >
> > > > Further, the context switch logic is already deliberately optimised to
> > > > defer reloads of the FPSIMD context until ret_to_user (or sigreturn as a
> > > > special case), which kernel threads by definition never reach, and
> > > > therefore this change introduces no additional work in the critical
> > > > path.
> > > >
> > > > This patch removes the redundant checks and special-case code.
> > > > ------8<------
> > >
> > > Looking at my existing text, I rather reworded it like this.
> > > Does this work any better for you?
> > >
> > > --8<--
> > >
> > > Currently the FPSIMD handling code uses the condition task->mm ==
> > > NULL as a hint that task has no FPSIMD register context.
> > >
> > > The ->mm check is only there to filter out tasks that cannot
> > > possibly have FPSIMD context loaded, for optimisation purposes.
> > > Also, TIF_FOREIGN_FPSTATE must always be checked anyway before
> > > saving FPSIMD context back to memory. For these reasons, the ->mm
> > > checks are not useful, providing that TIF_FOREIGN_FPSTATE is
> > > maintained in a consistent way for kernel threads.
> >
> > Consistent with what? Without more context or explanation,
>
> Consistent with the handling of user threads (though I admit it's not
> explicit in the text.)
>
> > I'm not sure what the reader is to make of that. Do you not mean the
> > TIF_FOREIGN_FPSTATE is always true for kernel threads?
>
> Again, this is probably a red herring. TIF_FOREIGN_FPSTATE is always
> true for kernel threads prior to the patch, except (randomly) for the
> init task.
That was really what my initial question was about, and what I thought
the commit message should make abundantly clear, because that ties the
message together with the code.
>
> This change is not really about TIF_FOREIGN_FPSTATE at all, rather
> that there is nothing to justify handling kernel threads differently,
> or even distinguishing kernel threads from user threads at all in this
> code.
Understood.
>
> Part of the confusion (and I had confused myself) comes from the fact
> that TIF_FOREIGN_FPSTATE is really a per-cpu property and doesn't make
> sense as a per-task property -- i.e., the flag is meaningless for
> scheduled-out tasks and we must explicitly "repair" it when scheduling
> a task in anyway. I think it's a thread flag primarily so that it's
> convenient to check alongside other thread flags in the ret_to_user
> work loop. This is somewhat less of a justification now that loop was
> ported to C.
>
> > >
> > > The context switch logic is already deliberately optimised to defer
> > > reloads of the regs until ret_to_user (or sigreturn as a special
> > > case), and save them only if they have been previously loaded.
>
> Does it help to insert the following here?
>
> "These paths are the only places where the wrong_task and wrong_cpu
> conditions can be made false, by calling fpsimd_bind_task_to_cpu()."
>
yes it does.
> > > Kernel threads by definition never reach these paths. As a result,
> >
> > I'm struggling with the "As a result," here. Is this because reloads of
> > regs in ret_to_user (or sigreturn) are the only places that can make
> > wrong_cpu or wrong_task be false?
>
> See the proposed clarification above. Is that sufficient?
>
yes.
> > (I'm actually wanting to understand this, not just bikeshedding the
> > commit message, as new corner cases keep coming up on this logic.)
>
> That's a good thing, and I would really like to explain it in a
> concise manner. See [*] below for the "concise" explanation -- it may
> demonstrate why I've been evasive...
>
I don't think you've been evasive at all, I just think we reason about
this in slightly different ways, and I was trying to convince myself why
this change is safe and summarize that concisely. I think we've
accomplished both :)
> > > the wrong_task and wrong_cpu tests in fpsimd_thread_switch() will
> > > always yield true for kernel threads.
> > >
> > > This patch removes the redundant checks and special-case code, ensuring that TIF_FOREIGN_FPSTATE is set whenever a kernel thread is scheduled in, and ensures that this flag is set for the init
> > > task. The fpsimd_flush_task_state() call already present in copy_thread() ensures the same for any new task.
> >
> > nit: funny formatting
>
> Dang, I was repeatedly pasting between Mutt and git commit terminals,
> which doesn't always work as I'd like...
>
> > nit: ensuring that TIF_FOREIGN_FPSTATE *remains* set whenever a kernel
> > thread is scheduled in?
>
> Er, yes.
>
> > > With TIF_FOREIGN_FPSTATE always set for kernel threads, this patch
> > > ensures that no extra context save work is added for kernel
> > > threads, and eliminates the redundant context saving that may
> > > currently occur for kernel threads that have acquired an mm via
> > > use_mm().
> > >
> > > -->8--
> >
> > If you can slightly connect the dots with the "As a result" above, I'm
> > fine with your version of the text.
>
>
> As an aside, the big wall of text before the definition of struct
> fpsimd_last_state_struct is looking out of date and could use an
> update to cover at least some of what is explained in [*] better.
>
> I'm currently considering that out of scope for this series, but I will
> keep it in mind to refresh it in the not too distant future.
>
Fine with me.
>
> Cheers
> ---Dave
>
> --8<--
>
> [*] The bigger picture:
>
> * Consider a relation (C,T) between cpus C and tasks T, such that
> (C,T) means "T's FPSIMD regs are loaded on cpu C".
>
> At a given point of execution of some cpu C, there is at most one task
> T for which (C,T) holds.
>
> At a given point of execution of some task T, there is at most one
> cpu C for which (C,T) holds.
>
> * (C,T) becomes true whenever T's registers are loaded into cpu C.
>
> * At sched-out, we must ensure that the registers of current are
> loaded before writing them to current's thread_struct. Thus, we
> must save the registers if and only if (smp_processor_id(), current)
> holds at this time.
>
> * Before entering userspace, we must ensure that current's regs
> are loaded, and we must only load the regs if they are not loaded
> already (since if so, they might have been dirtied by current in
> userspace since last loaded).
>
> Thus, when entering userspace, we must load the regs from memory
> if and only if (smp_processor_id(), current) does not hold.
>
> * Checking this relation involves per-CPU access and inspection of
> current->thread, and was presumably considered too cumbersome for
> implementation in entry.S, particularly in the ret_to_user work
> pending loop (which is where the FPSIMD regs are finally loaded
> before entering userspace, if they weren't loaded already).
>
> To mitigate this, the status of the check is cached in a thread flag
> TIF_FOREIGN_FPSTATE: with softirqs disabled, (smp_processor_id(),
> current) holds if and only if TIF_FOREIGN_FPSTATE is false.
> TIF_FOREIGN_FPSTATE is corrected on sched-in by the code in
> fpsimd_thread_switch().
>
> [2] Anything that changes the state of the relation for current
> requires its TIF_FOREIGN_FPSTATE to be changed to match.
>
> * (smp_processor_id(), current) is established in
> fpsimd_bind_task_to_cpu(). This is the only way the relation can be
> made to hold between a task and a CPU.
>
> * (C,T) is broken whenever
>
> [1] T is created;
>
> * T's regs are loaded onto a different cpu C2, so (C2,T) becomes
> true and (C,T) necessarily becomes false;
>
> * another task's regs are loaded into C, so (C,T2) becomes true
> and (C,T) necessarily becomes false;
>
> * the kernel clobbers the regs on C for its own purposes, so
> (C,T) becomes false but there is no T2 for which (C,T2) becomes
> true as a result. Examples are kernel-mode NEON and loading
> the regs for a KVM vcpu;
>
> * T's register context changes via a thread_struct update instead
> of running instructions in userspace, requiring the contents of
> the hardware regs to be thrown away. Examples are exec() (which
> requires the registers to be zeroed), sigreturn (which populates the
> regs from the user signal frame) and modification of the registers
> via PTRACE_SETREGSET;
>
> As a (probably unnecessary) optimisation, sigreturn immediately
> loads the registers and reestablishes (smp_processor_id(), current)
> in anticipation of the return to userspace which is likely to
> occur soon. This allows the relation breaking logic to be omitted
> in fpsimd_update_current_state() which does the work.
>
> * In general, these relation breakings involve an unknown: knowing
> either C or T but *not* both, we want to break (C,T). If the
> relation were recorded in task_struct only, we would need to scan all
> tasks in the "T unknown" case. If the relation were recorded in a
> percpu variable only, we would need to scan all CPUs in the "C
> unknown" case. As well as having gnarly synchronisation
> requirements, these would get expensive in many-tasks or many-cpus
> situations.
>
> This is why the relation is recorded in both places, and is only
> deemed to hold if the two records match up. This is what
> fpsimd_thread_switch() is checking for the task being scheduled in.
>
> The invalidation (breaking) operations are now factored as
>
> fpsimd_flush_task_state(): falsify (C,current) for every cpu C.
> This is done by zapping current->thread.fpsimd_cpu with NR_CPUS
> (chosen because it cannot match smp_processor_id()).
>
> fpsimd_flush_cpu_state(): falsify (smp_processor_id(),T) for every
> task T. This is done by zapping this_cpu(fpsimd_last_state.st)
> with NULL (chosen because it cannot match &T->thread.uw.fpsimd_state
> for any task).
>
> By [2] above, it is necessary to ensure that TIF_FOREIGN_FPSTATE is
> set after calling either of the above functions. Of the two,
> fpsimd_flush_cpu_state() now does this implicitly but
> fpsimd_flush_task_state() does not: the caller must do it
> instead. I have a vague memory of some refactoring obstacle that
> dissuaded me from pulling the set_thread_flag in, but I can't
> remember it now. I may review this later.
>
> * Because the (C,T) relation may need to be manipulated by
> kernel_neon_{begin,end}() in softirq context, examining or
> manipulating it for current or the running CPU must be done under
> local_bh_disable(). The same goes for TIF_FOREIGN_FPSTATE which is
> supposed to represent the same condition but may spontaneously become
> stale if softirqs are not masked. (The rule is not quite as strict
> as this, but in order to make the code easier to reason about, I skip
> the local_bh_disable() only where absolutely necessary --
> restore_sve_fpsimd_context() is the only example today.)
>
> Now, imagine that T is a kernel thread, and consider what needs to
> be done differently. The observation of this patch is that nothing
> needs to be done differently at all.
>
> There is a single anomaly relating to [1] above, in the form of a task
> that can run without ever being scheduled in: the init task. Beyond
> that, kernel_neon_begin() before the first reschedule would spuriously
> save the FPSIMD regs into the init_task's thread struct, even though it
> is pointless to do so. This patch fixes those anomalies by updating
> INIT_THREAD and INIT_THREAD_INFO to set up the init task so that it
> looks the same as some other kernel thread that has been scheduled in.
>
> There is a strong design motivation to avoid unnecessary loads and
> saves of the state, so if removing the special-casing of kernel threads
> were to add cost it would imply that the code were _already_ suboptimal
> for user tasks. This patch does not attempt to address that at all,
> but by assuming that the code is already well-optimised, "unnecessary"
> save/restore work will not be added. If this were not the case, it
> could in any case be fixed independently.
>
> The observation of this _series_ is that we don't need to do very
> much in order to be able to generalise the logic to accept KVM vcpus
> in place of T.
>
Thanks for the explanation.
-Christoffer
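
For readers following the [*] explanation quoted above, the two-record
scheme can be modelled in a few lines of userspace C. This is only an
illustrative sketch under stated assumptions, not the kernel code: the
struct layouts, the NR_CPUS value, the bool standing in for
TIF_FOREIGN_FPSTATE and the shortened function names are all
hypothetical, and locking, softirq masking and the actual register
save/restore are left out. It shows why (C,T) is recorded on both the
task side and the cpu side: either record can be invalidated on its own
without scanning all tasks or all cpus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4 /* illustrative value, not taken from the thread */

/* Hypothetical stand-ins for the kernel structures. */
struct fpsimd_state { unsigned long long vregs[2]; };

struct task {
	struct fpsimd_state fpsimd_state;
	unsigned int fpsimd_cpu; /* task-side record; NR_CPUS means "no cpu" */
	bool foreign_fpstate;    /* models TIF_FOREIGN_FPSTATE */
};

/* cpu-side record: whose state was last loaded on each cpu */
static struct fpsimd_state *last_state[NR_CPUS];

/* (C,T) is deemed to hold only if both records agree. */
bool relation_holds(unsigned int cpu, const struct task *t)
{
	return last_state[cpu] == &t->fpsimd_state && t->fpsimd_cpu == cpu;
}

/* A newly created task has its state loaded nowhere. */
void task_init(struct task *t)
{
	t->fpsimd_cpu = NR_CPUS;
	t->foreign_fpstate = true;
}

/* The only operation that establishes (cpu, t). */
void bind_task_to_cpu(unsigned int cpu, struct task *t)
{
	assert(cpu < NR_CPUS);
	last_state[cpu] = &t->fpsimd_state;
	t->fpsimd_cpu = cpu;
	t->foreign_fpstate = false;
}

/* Falsify (C, t) for every cpu C: NR_CPUS can never match a real cpu. */
void flush_task_state(struct task *t)
{
	t->fpsimd_cpu = NR_CPUS;
}

/* Falsify (cpu, T) for every task T: NULL can never match a task's state. */
void flush_cpu_state(unsigned int cpu)
{
	last_state[cpu] = NULL;
}

/* On sched-in, repair the cached flag from the two records. */
void thread_switch_in(unsigned int cpu, struct task *next)
{
	next->foreign_fpstate = !relation_holds(cpu, next);
}
```

In this model, updating the flag after either flush is left to the
caller for simplicity. Binding a second task to a cpu breaks the
relation for the first one without touching the first task's struct,
which is the "T unknown" case described above; flush_task_state()
covers the "C unknown" case.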
^ permalink raw reply
* [PATCH v2 03/40] iommu/sva: Manage process address spaces
From: Jonathan Cameron @ 2018-05-25 8:39 UTC (permalink / raw)
To: linux-arm-kernel
In-Reply-To: <20180525063311.GA11605@apalos>
+CC Kenneth Lee
On Fri, 25 May 2018 09:33:11 +0300
Ilias Apalodimas <ilias.apalodimas@linaro.org> wrote:
> On Thu, May 24, 2018 at 04:04:39PM +0100, Jean-Philippe Brucker wrote:
> > On 24/05/18 12:50, Ilias Apalodimas wrote:
> > >> Interesting, I hadn't thought about this use-case before. At first I
> > >> thought you were talking about mdev devices assigned to VMs, but I think
> > >> you're referring to mdevs assigned to userspace drivers instead? Out of
> > >> curiosity, is it only theoretical or does someone actually need this?
> > >
> > > There have been some non-upstreamed efforts to have mdev and produce userspace
> > > drivers. Huawei is using it on what they call "wrapdrive" for crypto devices and
> > > we did a proof of concept for ethernet interfaces. At the time we chose not to
> > > involve the IOMMU for the reason you mentioned, but having it there would be
> > > good.
> >
> > I'm guessing there were good reasons to do it that way but I wonder, is
> > it not simpler to just have the kernel driver create a /dev/foo, with a
> > standard ioctl/mmap/poll interface? Here VFIO adds a layer of
> > indirection, and since the mediating driver has to implement these
> > operations already, what is gained?
> The best reason I can come up with is "common code". You already have one API
> doing that for you, so why replicate it in a /dev file?
> The mdev approach still needs extensions to support what we tried to do (i.e.
> the mdev bus might need to have access to iommu_ops), but as far as I understand it's
> a possible case.
> >
> > Thanks,
> > Jean