* Re: [PATCH v2 2/4] KVM: PPC: Book3S HV: track the state GFNs associated with secure VMs
From: Ram Pai @ 2020-06-19 1:16 UTC
To: Laurent Dufour
Cc: cclaudio, kvm-ppc, bharata, sathnaga, aneesh.kumar, sukadev,
linuxppc-dev, bauerman, david
In-Reply-To: <7f5aea68-0cc5-6ae5-c30e-eee60eff5a92@linux.ibm.com>
On Thu, Jun 18, 2020 at 03:31:06PM +0200, Laurent Dufour wrote:
> On 18/06/2020 at 11:19, Ram Pai wrote:
> >
...snip...
> >************************************************************************
> > 1. States of a GFN
> > ---------------
> > The GFN can be in one of the following states.
> >diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
...snip...
> >index 803940d..3448459 100644
> >--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> >+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> >@@ -1100,7 +1100,7 @@ void kvmppc_radix_flush_memslot(struct kvm *kvm,
> > unsigned int shift;
> > if (kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START)
> >- kvmppc_uvmem_drop_pages(memslot, kvm, true);
> >+ kvmppc_uvmem_drop_pages(memslot, kvm, true, false);
>
> When reviewing the v1 of this series, I asked you the question about
> the fact that the call here is made with purge_gfn = false. Your
> answer was:
>
> >This function does not know, under what context it is called. Since
> >its job is to just flush the memslot, it cannot assume anything
> >about purging the pages in the memslot.
>
> Indeed in the case of the memory hotplug operation, this function is
> called to wipe the page from the secure device in the case the pages
> are secured. In that case the purge is required. Indeed, I checked
> the other call to kvmppc_radix_flush_memslot() in
> kvmppc_core_flush_memslot_hv() and I cannot see why in that case too
> purge_gfn should be false, especially when the memslot is reused as
> detailed in __kvm_set_memory_region() around the call to
> kvm_arch_flush_shadow_memslot().
>
> I'm sorry not to have asked this earlier, but could you please elaborate on this?
You are right. kvmppc_radix_flush_memslot() is called every time with
the intention of disassociating the memslot from that VM, which implies
that the memslot is intended to be deleted and possibly reused.
I should be calling kvmppc_uvmem_drop_pages() with purge_gfn=true here
as well.
I expect some form of problem to show up in the memory hotplug/unplug path.
RP
* Re: [PATCH V3 (RESEND) 0/3] arm64: Enable vmemmap mapping from device memory
From: Anshuman Khandual @ 2020-06-19 1:34 UTC
To: Mike Rapoport
Cc: Mark Rutland, Michal Hocko, linux-ia64, David Hildenbrand,
Peter Zijlstra, Dave Hansen, linux-mm, Paul Mackerras,
linux-riscv, Will Deacon, Thomas Gleixner, x86,
Matthew Wilcox (Oracle), Ingo Molnar, Catalin Marinas, Fenghua Yu,
Pavel Tatashin, Andy Lutomirski, Paul Walmsley, Dan Williams,
linux-arm-kernel, Tony Luck, linux-kernel, Palmer Dabbelt,
Andrew Morton, linuxppc-dev, Kirill A. Shutemov
In-Reply-To: <20200618085641.GE6493@linux.ibm.com>
On 06/18/2020 02:26 PM, Mike Rapoport wrote:
> On Thu, Jun 18, 2020 at 06:45:27AM +0530, Anshuman Khandual wrote:
>> This series enables vmemmap backing memory allocation from device memory
>> ranges on arm64. But before that, it enables vmemmap_populate_basepages()
>> and vmemmap_alloc_block_buf() to accommodate struct vmem_altmap based
>> allocation requests.
>>
>> This series applies on 5.8-rc1.
>>
>> Pending Question:
>>
>> altmap_alloc_block_buf() does not have any other remaining users in
>> the tree after this change. Should it be converted into a static
>> function and its declaration be dropped from the header
>> (include/linux/mm.h)? I avoided doing so because I was not sure whether
>> there are any off-tree users.
>
> Well, off-tree users probably have an active fork anyway so they could
> switch to vmemmap_alloc_block_buf()...
Sure, will make the function static and remove its declaration
from the header.
>
> Regardless, can you please update Documentation/vm/memory-model.rst to
> keep it in sync with the code?
Sure, will do.
* [PATCHv2] tpm: ibmvtpm: Wait for ready buffer before probing for TPM2 attributes
From: David Gibson @ 2020-06-19 3:30 UTC
To: jarkko.sakkinen, stefanb
Cc: nayna, linux-kernel, jgg, paulus, peterhuewe, linuxppc-dev,
linux-integrity, David Gibson
The tpm2_get_cc_attrs_tbl() call will result in TPM commands being issued,
which will need the use of the internal command/response buffer. But,
we're issuing this *before* we've waited to make sure that buffer is
allocated.
This can result in intermittent failures to probe if the hypervisor / TPM
implementation doesn't respond quickly enough. I find it fails almost
every time with an 8 vcpu guest under KVM with software emulated TPM.
To fix it, just move the tpm2_get_cc_attrs_tbl() call after the
existing code to wait for initialization, which will ensure the buffer
is allocated.
Fixes: 18b3670d79ae9 ("tpm: ibmvtpm: Add support for TPM2")
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
Changes from v1:
* Fixed a formatting error in the commit message
* Added some more detail to the commit message
drivers/char/tpm/tpm_ibmvtpm.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/char/tpm/tpm_ibmvtpm.c b/drivers/char/tpm/tpm_ibmvtpm.c
index 09fe45246b8cc..994385bf37c0c 100644
--- a/drivers/char/tpm/tpm_ibmvtpm.c
+++ b/drivers/char/tpm/tpm_ibmvtpm.c
@@ -683,13 +683,6 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
if (rc)
goto init_irq_cleanup;
- if (!strcmp(id->compat, "IBM,vtpm20")) {
- chip->flags |= TPM_CHIP_FLAG_TPM2;
- rc = tpm2_get_cc_attrs_tbl(chip);
- if (rc)
- goto init_irq_cleanup;
- }
-
if (!wait_event_timeout(ibmvtpm->crq_queue.wq,
ibmvtpm->rtce_buf != NULL,
HZ)) {
@@ -697,6 +690,13 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
goto init_irq_cleanup;
}
+ if (!strcmp(id->compat, "IBM,vtpm20")) {
+ chip->flags |= TPM_CHIP_FLAG_TPM2;
+ rc = tpm2_get_cc_attrs_tbl(chip);
+ if (rc)
+ goto init_irq_cleanup;
+ }
+
return tpm_chip_register(chip);
init_irq_cleanup:
do {
--
2.26.2
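The ordering problem described in the commit message can be modelled in user space. The following is only a sketch under stated assumptions: `struct fake_vtpm`, `fake_poll()`, `fake_send_command()` and the two `probe_*` functions are hypothetical stand-ins, not the driver's real structures or the TPM core API. It shows why issuing a command before the buffer is ready fails, while waiting first succeeds.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; not the real ibmvtpm struct. */
struct fake_vtpm {
	void *rtce_buf;        /* NULL until the "hypervisor" has replied */
	int ticks_until_ready; /* polls remaining before the buffer appears */
	char storage[16];
};

/* One polling step; stands in for the handler that fills in rtce_buf. */
static void fake_poll(struct fake_vtpm *t)
{
	if (t->rtce_buf != NULL)
		return;
	if (t->ticks_until_ready > 0)
		t->ticks_until_ready--;
	if (t->ticks_until_ready == 0)
		t->rtce_buf = t->storage;
}

/* Any command needs the buffer, so fail while it is still missing. */
static int fake_send_command(const struct fake_vtpm *t)
{
	return t->rtce_buf ? 0 : -1;
}

/* Rough analogue of a bounded wait: poll at most max_ticks times. */
static bool fake_wait_ready(struct fake_vtpm *t, int max_ticks)
{
	for (int i = 0; i < max_ticks && t->rtce_buf == NULL; i++)
		fake_poll(t);
	return t->rtce_buf != NULL;
}

/* Old ordering: command first, wait second. */
int probe_command_first(struct fake_vtpm *t, int max_ticks)
{
	int rc = fake_send_command(t);

	if (rc)
		return rc;
	return fake_wait_ready(t, max_ticks) ? 0 : -1;
}

/* New ordering: wait first, command second. */
int probe_wait_first(struct fake_vtpm *t, int max_ticks)
{
	if (!fake_wait_ready(t, max_ticks))
		return -1;
	return fake_send_command(t);
}
```

With a buffer that needs three polls to appear, `probe_command_first()` fails immediately, whereas `probe_wait_first()` succeeds as long as the wait budget is large enough.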
* Re: [PATCHv2] tpm: ibmvtpm: Wait for ready buffer before probing for TPM2 attributes
From: Jerry Snitselaar @ 2020-06-19 3:42 UTC
To: David Gibson
Cc: nayna, linux-kernel, jarkko.sakkinen, jgg, paulus, peterhuewe,
linuxppc-dev, linux-integrity, stefanb
In-Reply-To: <20200619033040.121412-1-david@gibson.dropbear.id.au>
On Fri Jun 19 20, David Gibson wrote:
>The tpm2_get_cc_attrs_tbl() call will result in TPM commands being issued,
>which will need the use of the internal command/response buffer. But,
>we're issuing this *before* we've waited to make sure that buffer is
>allocated.
>
>This can result in intermittent failures to probe if the hypervisor / TPM
>implementation doesn't respond quickly enough. I find it fails almost
>every time with an 8 vcpu guest under KVM with software emulated TPM.
>
>To fix it, just move the tpm2_get_cc_attrs_tbl() call after the
>existing code to wait for initialization, which will ensure the buffer
>is allocated.
>
>Fixes: 18b3670d79ae9 ("tpm: ibmvtpm: Add support for TPM2")
>Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>---
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
>
>Changes from v1:
> * Fixed a formatting error in the commit message
> * Added some more detail to the commit message
>
>drivers/char/tpm/tpm_ibmvtpm.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
>diff --git a/drivers/char/tpm/tpm_ibmvtpm.c b/drivers/char/tpm/tpm_ibmvtpm.c
>index 09fe45246b8cc..994385bf37c0c 100644
>--- a/drivers/char/tpm/tpm_ibmvtpm.c
>+++ b/drivers/char/tpm/tpm_ibmvtpm.c
>@@ -683,13 +683,6 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
> if (rc)
> goto init_irq_cleanup;
>
>- if (!strcmp(id->compat, "IBM,vtpm20")) {
>- chip->flags |= TPM_CHIP_FLAG_TPM2;
>- rc = tpm2_get_cc_attrs_tbl(chip);
>- if (rc)
>- goto init_irq_cleanup;
>- }
>-
> if (!wait_event_timeout(ibmvtpm->crq_queue.wq,
> ibmvtpm->rtce_buf != NULL,
> HZ)) {
>@@ -697,6 +690,13 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
> goto init_irq_cleanup;
> }
>
>+ if (!strcmp(id->compat, "IBM,vtpm20")) {
>+ chip->flags |= TPM_CHIP_FLAG_TPM2;
>+ rc = tpm2_get_cc_attrs_tbl(chip);
>+ if (rc)
>+ goto init_irq_cleanup;
>+ }
>+
> return tpm_chip_register(chip);
> init_irq_cleanup:
> do {
>--
>2.26.2
>
* Re: [PATCH 1/2] powerpc/perf/hv-24x7: Add cpu hotplug support
From: Gautham R Shenoy @ 2020-06-19 4:58 UTC
To: Kajol Jain; +Cc: nathanl, maddy, suka, anju, linuxppc-dev
In-Reply-To: <20200618122713.9030-2-kjain@linux.ibm.com>
Hello Kajol,
On Thu, Jun 18, 2020 at 05:57:12PM +0530, Kajol Jain wrote:
> Patch here adds cpu hotplug functions to hv_24x7 pmu.
> A new cpuhp_state "CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE" enum
> is added.
>
> The online function updates the cpumask only if it is empty,
> as the primary intention for adding hotplug support
> is to designate a CPU to make the HCALL that collects the
> count data.
>
> The offline function tests and clears the corresponding cpu in the cpumask
> and updates the cpumask to any other active cpu.
>
> With this patchset, perf tool side does not need "-C <cpu>"
> to be added.
>
> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> ---
> arch/powerpc/perf/hv-24x7.c | 45 +++++++++++++++++++++++++++++++++++++
> include/linux/cpuhotplug.h | 1 +
> 2 files changed, 46 insertions(+)
>
> diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
> index db213eb7cb02..fdc4ae155d60 100644
> --- a/arch/powerpc/perf/hv-24x7.c
> +++ b/arch/powerpc/perf/hv-24x7.c
> @@ -31,6 +31,8 @@ static int interface_version;
> /* Whether we have to aggregate result data for some domains. */
> static bool aggregate_result_elements;
>
> +static cpumask_t hv_24x7_cpumask;
> +
> static bool domain_is_valid(unsigned domain)
> {
> switch (domain) {
> @@ -1641,6 +1643,44 @@ static struct pmu h_24x7_pmu = {
> .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
> };
>
> +static int ppc_hv_24x7_cpu_online(unsigned int cpu)
> +{
> + /* Make this CPU the designated target for counter collection */
> + if (cpumask_empty(&hv_24x7_cpumask))
> + cpumask_set_cpu(cpu, &hv_24x7_cpumask);
> +
> + return 0;
> +}
> +
> +static int ppc_hv_24x7_cpu_offline(unsigned int cpu)
> +{
> + int target = -1;
> +
> + /* Check if exiting cpu is used for collecting 24x7 events */
> + if (!cpumask_test_and_clear_cpu(cpu, &hv_24x7_cpumask))
> + return 0;
> +
> + /* Find a new cpu to collect 24x7 events */
> + target = cpumask_any_but(cpu_active_mask, cpu);
cpumask_any_but() typically picks the first CPU in cpu_active_mask
that is not @cpu.
> +
> + if (target < 0 || target >= nr_cpu_ids)
> + return -1;
> +
> + /* Migrate 24x7 events to the new target */
> + cpumask_set_cpu(target, &hv_24x7_cpumask);
> + perf_pmu_migrate_context(&h_24x7_pmu, cpu, target);
On a system with N CPUs numbered [0..N-1], can you please verify whether
the time required to sequentially offline CPUs [0..N-2], in that
order, increases with this patch?
I am asking this because we have encountered this problem once before
at a customer site and the commit 9c9f8fb71fee ("powerpc/perf: Use
cpumask_last() to determine the designated cpu for nest/core units.")
was introduced to fix that problem.
> +
> + return 0;
> +}
> +
> +static int hv_24x7_cpu_hotplug_init(void)
> +{
> + return cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
> + "perf/powerpc/hv_24x7:online",
> + ppc_hv_24x7_cpu_online,
> + ppc_hv_24x7_cpu_offline);
> +}
> +
> static int hv_24x7_init(void)
> {
> int r;
> @@ -1685,6 +1725,11 @@ static int hv_24x7_init(void)
> if (r)
> return r;
>
> + /* init cpuhotplug */
> + r = hv_24x7_cpu_hotplug_init();
> + if (r)
> + pr_err("hv_24x7: CPU hotplug init failed\n");
> +
> r = perf_pmu_register(&h_24x7_pmu, h_24x7_pmu.name, -1);
> if (r)
> return r;
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index 8377afef8806..16ed8f6f8774 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -180,6 +180,7 @@ enum cpuhp_state {
> CPUHP_AP_PERF_POWERPC_CORE_IMC_ONLINE,
> CPUHP_AP_PERF_POWERPC_THREAD_IMC_ONLINE,
> CPUHP_AP_PERF_POWERPC_TRACE_IMC_ONLINE,
> + CPUHP_AP_PERF_POWERPC_HV_24x7_ONLINE,
> CPUHP_AP_WATCHDOG_ONLINE,
> CPUHP_AP_WORKQUEUE_ONLINE,
> CPUHP_AP_RCUTREE_ONLINE,
> --
> 2.18.2
>
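The concern raised above about cpumask_any_but() can be illustrated with a small user-space model. This is a sketch, not kernel code: `count_migrations()` and the two `pick_*` helpers are hypothetical, CPU 0 is assumed to hold the designated role initially, and only the number of role changes is counted, not the real cost of perf_pmu_migrate_context().

```c
#include <stdbool.h>

#define MAX_CPUS 64

/* Lowest-numbered online CPU other than 'cpu', or -1 if there is none. */
static int pick_first_but(const bool *online, int ncpus, int cpu)
{
	for (int i = 0; i < ncpus; i++)
		if (online[i] && i != cpu)
			return i;
	return -1;
}

/* Highest-numbered online CPU other than 'cpu', or -1 if there is none. */
static int pick_last_but(const bool *online, int ncpus, int cpu)
{
	for (int i = ncpus - 1; i >= 0; i--)
		if (online[i] && i != cpu)
			return i;
	return -1;
}

/*
 * Offline CPUs 0..ncpus-2 in order and count how often the designated
 * CPU has to change. Each change stands in for one context migration.
 * Returns -1 if ncpus is out of range.
 */
int count_migrations(int ncpus, bool use_last)
{
	bool online[MAX_CPUS];
	int designated = 0; /* assumption: CPU 0 owns the role at the start */
	int migrations = 0;

	if (ncpus < 1 || ncpus > MAX_CPUS)
		return -1;
	for (int i = 0; i < ncpus; i++)
		online[i] = true;

	for (int cpu = 0; cpu <= ncpus - 2; cpu++) {
		if (cpu == designated) {
			designated = use_last
				? pick_last_but(online, ncpus, cpu)
				: pick_first_but(online, ncpus, cpu);
			migrations++;
		}
		online[cpu] = false;
	}
	return migrations;
}
```

In this model the "first other CPU" policy hands the role to the very CPU that goes offline next, so every offline step triggers a migration, while the "last CPU" policy moves the role once and then leaves it alone.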
* Re: [PATCH 2/2] powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show cpumask
From: Gautham R Shenoy @ 2020-06-19 5:05 UTC
To: Kajol Jain; +Cc: nathanl, maddy, suka, anju, linuxppc-dev
In-Reply-To: <20200618122713.9030-3-kjain@linux.ibm.com>
On Thu, Jun 18, 2020 at 05:57:13PM +0530, Kajol Jain wrote:
> Patch here adds a cpumask attr to hv_24x7 pmu along with ABI documentation.
>
> command:# cat /sys/devices/hv_24x7/cpumask
> 0
>
> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> ---
> .../sysfs-bus-event_source-devices-hv_24x7 | 6 ++++
> arch/powerpc/perf/hv-24x7.c | 31 ++++++++++++++++++-
> 2 files changed, 36 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
> index e8698afcd952..281e7b367733 100644
> --- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
> +++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
> @@ -43,6 +43,12 @@ Description: read only
> This sysfs interface exposes the number of cores per chip
> present in the system.
>
> +What: /sys/devices/hv_24x7/cpumask
> +Date: June 2020
> +Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>
> +Description: read only
> + This sysfs file exposes cpumask.
Could you please describe in a little more detail what the
cpumask is?
> +
> What: /sys/bus/event_source/devices/hv_24x7/event_descs/<event-name>
> Date: February 2014
> Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>
> diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
> index fdc4ae155d60..03d870a9fc36 100644
> --- a/arch/powerpc/perf/hv-24x7.c
> +++ b/arch/powerpc/perf/hv-24x7.c
> @@ -448,6 +448,12 @@ static ssize_t device_show_string(struct device *dev,
> return sprintf(buf, "%s\n", (char *)d->var);
> }
>
> +static ssize_t cpumask_get_attr(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + return cpumap_print_to_pagebuf(true, buf, &hv_24x7_cpumask);
> +}
> +
> static ssize_t sockets_show(struct device *dev,
> struct device_attribute *attr, char *buf)
> {
> @@ -1116,6 +1122,17 @@ static DEVICE_ATTR_RO(sockets);
> static DEVICE_ATTR_RO(chipspersocket);
> static DEVICE_ATTR_RO(coresperchip);
>
> +static DEVICE_ATTR(cpumask, S_IRUGO, cpumask_get_attr, NULL);
> +
> +static struct attribute *cpumask_attrs[] = {
> + &dev_attr_cpumask.attr,
> + NULL,
> +};
> +
> +static struct attribute_group cpumask_attr_group = {
> + .attrs = cpumask_attrs,
> +};
> +
> static struct bin_attribute *if_bin_attrs[] = {
> &bin_attr_catalog,
> NULL,
> @@ -1143,6 +1160,11 @@ static const struct attribute_group *attr_groups[] = {
> &event_desc_group,
> &event_long_desc_group,
> &if_group,
> + /*
> + * This NULL is a placeholder for the cpumask attr which will update
> + * only if cpuhotplug registration is successful
> + */
> + NULL,
> NULL,
> };
>
> @@ -1727,8 +1749,15 @@ static int hv_24x7_init(void)
>
> /* init cpuhotplug */
> r = hv_24x7_cpu_hotplug_init();
> - if (r)
> + if (r) {
> pr_err("hv_24x7: CPU hotplug init failed\n");
> + } else {
> + /*
> + * Cpu hotplug init is successful, add the
> + * cpumask file as part of pmu attr group
> + */
> + attr_groups[5] = &cpumask_attr_group;
Since this is only a one-time initialization, wouldn't it be safer to
iterate through attr_groups[] and assign cpumask_attr_group to the
first NULL location?
> + }
>
> r = perf_pmu_register(&h_24x7_pmu, h_24x7_pmu.name, -1);
> if (r)
> --
> 2.18.2
>
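The suggestion above, filling the first NULL slot instead of hard-coding index 5, could look roughly like the following. This is a generic sketch, assuming a NULL-terminated pointer array with one spare NULL reserved before the terminator; `fill_first_free_slot()` is a hypothetical helper, not an existing kernel function.

```c
#include <stddef.h>

/*
 * Store 'item' in the first NULL slot of 'slots', but only if a further
 * NULL follows it, so the array stays NULL-terminated.
 * 'nslots' is the full array length, including the terminator.
 * Returns the index used, or -1 if there is no spare slot.
 */
int fill_first_free_slot(const void **slots, size_t nslots, const void *item)
{
	for (size_t i = 0; i + 1 < nslots; i++) {
		if (slots[i] != NULL)
			continue;
		if (slots[i + 1] != NULL)
			return -1; /* unexpected layout: refuse to touch it */
		slots[i] = item;
		return (int)i;
	}
	return -1; /* only the terminator is left */
}
```

The point of the design is that the code no longer has to be kept in sync with the array layout: adding another group ahead of the placeholder would not silently overwrite the wrong entry.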
* [PATCH 0/4] Remove default DMA window before creating DDW
From: Leonardo Bras @ 2020-06-19 5:06 UTC
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Alexey Kardashevskiy, Leonardo Bras, Thiago Jung Bauermann,
Ram Pai
Cc: linuxppc-dev, linux-kernel
There are some devices that only allow 1 DMA window to exist at a time,
and in those cases, a DDW is never created for them, since the default DMA
window keeps using this resource.
LoPAR recommends this procedure:
1. Remove the default DMA window,
2. Query for which configs the DDW can be created,
3. Create a DDW.
Patch #1:
- After LoPAR level 2.8, there is an extension that can make
ibm,query-pe-dma-windows have 6 outputs instead of 5. This changes the
order of the outputs, and that can cause some trouble.
- query_ddw() was updated to check how many outputs
ibm,query-pe-dma-windows is supposed to have, update the rtas_call() and
deal correctly with the outputs in both cases.
- This patch looks somewhat unrelated to the series, but it can avoid future
problems on DDW creation.
Patch #2 implements a new rtas call to recover the default DMA window,
in case anything fails after it was removed, and a DDW couldn't be created.
Patch #3 moves the window-removing code from remove_ddw() to
remove_dma_window(), creating a way to delete any DMA window, so it can be
used to delete the default DMA window.
Patch #4 makes use of the remove_dma_window() from patch #3 to remove the
default DMA window before query_ddw() and the rtas call from patch #2
to recover it if something goes wrong.
All patches were tested in an LPAR with an Ethernet VF:
4005:01:00.0 Ethernet controller: Mellanox Technologies MT27700 Family
[ConnectX-4 Virtual Function]
Leonardo Bras (4):
powerpc/pseries/iommu: Update call to ibm,query-pe-dma-windows
powerpc/pseries/iommu: Implement ibm,reset-pe-dma-windows rtas call
powerpc/pseries/iommu: Move window-removing part of remove_ddw into
remove_dma_window
powerpc/pseries/iommu: Remove default DMA window before creating DDW
arch/powerpc/platforms/pseries/iommu.c | 163 +++++++++++++++++++------
1 file changed, 127 insertions(+), 36 deletions(-)
--
2.25.4
* [PATCH 1/4] powerpc/pseries/iommu: Update call to ibm,query-pe-dma-windows
From: Leonardo Bras @ 2020-06-19 5:06 UTC
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Alexey Kardashevskiy, Leonardo Bras, Thiago Jung Bauermann,
Ram Pai
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20200619050619.266888-1-leobras.c@gmail.com>
From LoPAR level 2.8, "ibm,ddw-extensions" index 3 can make the number of
outputs from "ibm,query-pe-dma-windows" go from 5 to 6.
This change of output size is meant to expand the address size of
largest_available_block PE TCE from 32-bit to 64-bit, which ends up
shifting page_size and migration_capable.
This ends up requiring the update of
ddw_query_response->largest_available_block from u32 to u64, and manually
assigning the values from the buffer into this struct, according to
output size.
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
---
arch/powerpc/platforms/pseries/iommu.c | 57 +++++++++++++++++++++-----
1 file changed, 46 insertions(+), 11 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 6d47b4a3ce39..e5a617738c8b 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -334,7 +334,7 @@ struct direct_window {
/* Dynamic DMA Window support */
struct ddw_query_response {
u32 windows_available;
- u32 largest_available_block;
+ u64 largest_available_block;
u32 page_size;
u32 migration_capable;
};
@@ -869,14 +869,32 @@ static int find_existing_ddw_windows(void)
}
machine_arch_initcall(pseries, find_existing_ddw_windows);
+/*
+ * From LoPAR level 2.8, "ibm,ddw-extensions" index 3 can rule how many output
+ * parameters ibm,query-pe-dma-windows will have, ranging from 5 to 6.
+ */
+
+static int query_ddw_out_sz(struct device_node *par_dn)
+{
+ int ret;
+ u32 ddw_ext[3];
+
+ ret = of_property_read_u32_array(par_dn, "ibm,ddw-extensions",
+ &ddw_ext[0], 3);
+ if (ret || ddw_ext[0] < 2 || ddw_ext[2] != 1)
+ return 5;
+ return 6;
+}
+
static int query_ddw(struct pci_dev *dev, const u32 *ddw_avail,
- struct ddw_query_response *query)
+ struct ddw_query_response *query,
+ struct device_node *par_dn)
{
struct device_node *dn;
struct pci_dn *pdn;
- u32 cfg_addr;
+ u32 cfg_addr, query_out[5];
u64 buid;
- int ret;
+ int ret, out_sz;
/*
* Get the config address and phb buid of the PE window.
@@ -888,12 +906,29 @@ static int query_ddw(struct pci_dev *dev, const u32 *ddw_avail,
pdn = PCI_DN(dn);
buid = pdn->phb->buid;
cfg_addr = ((pdn->busno << 16) | (pdn->devfn << 8));
+ out_sz = query_ddw_out_sz(par_dn);
+
+ ret = rtas_call(ddw_avail[0], 3, out_sz, query_out,
+ cfg_addr, BUID_HI(buid), BUID_LO(buid));
+ dev_info(&dev->dev, "ibm,query-pe-dma-windows(%x) %x %x %x returned %d\n",
+ ddw_avail[0], cfg_addr, BUID_HI(buid), BUID_LO(buid), ret);
+
+ switch (out_sz) {
+ case 5:
+ query->windows_available = query_out[0];
+ query->largest_available_block = query_out[1];
+ query->page_size = query_out[2];
+ query->migration_capable = query_out[3];
+ break;
+ case 6:
+ query->windows_available = query_out[0];
+ query->largest_available_block = ((u64)query_out[1] << 32) |
+ query_out[2];
+ query->page_size = query_out[3];
+ query->migration_capable = query_out[4];
+ break;
+ }
- ret = rtas_call(ddw_avail[0], 3, 5, (u32 *)query,
- cfg_addr, BUID_HI(buid), BUID_LO(buid));
- dev_info(&dev->dev, "ibm,query-pe-dma-windows(%x) %x %x %x"
- " returned %d\n", ddw_avail[0], cfg_addr, BUID_HI(buid),
- BUID_LO(buid), ret);
return ret;
}
@@ -1040,7 +1075,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn)
* of page sizes: supported and supported for migrate-dma.
*/
dn = pci_device_to_OF_node(dev);
- ret = query_ddw(dev, ddw_avail, &query);
+ ret = query_ddw(dev, ddw_avail, &query, pdn);
if (ret != 0)
goto out_failed;
@@ -1068,7 +1103,7 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn)
/* check largest block * page size > max memory hotplug addr */
max_addr = ddw_memory_hotplug_max();
if (query.largest_available_block < (max_addr >> page_shift)) {
- dev_dbg(&dev->dev, "can't map partition max 0x%llx with %u "
+ dev_dbg(&dev->dev, "can't map partition max 0x%llx with %llu "
"%llu-sized pages\n", max_addr, query.largest_available_block,
1ULL << page_shift);
goto out_failed;
--
2.25.4
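The decoding step in query_ddw() above can be tried out in isolation. The sketch below mirrors the patch's switch statement in plain user-space C; `struct query_response` and `decode_query()` are hypothetical names chosen for illustration, and the default case returning -1 is an addition that the patch itself does not have.

```c
#include <stdint.h>

/* Mirrors the fields of struct ddw_query_response after the patch. */
struct query_response {
	uint32_t windows_available;
	uint64_t largest_available_block;
	uint32_t page_size;
	uint32_t migration_capable;
};

/*
 * Decode the raw 32-bit output words. As in the patch, out_sz == 5 uses
 * four words and out_sz == 6 uses five, with the block size split into a
 * high and a low word. Returns 0 on success, -1 for any other out_sz.
 */
int decode_query(int out_sz, const uint32_t *out, struct query_response *q)
{
	switch (out_sz) {
	case 5:
		q->windows_available = out[0];
		q->largest_available_block = out[1];
		q->page_size = out[2];
		q->migration_capable = out[3];
		return 0;
	case 6:
		q->windows_available = out[0];
		q->largest_available_block = ((uint64_t)out[1] << 32) | out[2];
		q->page_size = out[3];
		q->migration_capable = out[4];
		return 0;
	default:
		return -1;
	}
}
```

The cast to a 64-bit type before the shift matters: shifting a 32-bit value left by 32 would be undefined behaviour in C and would lose the high word.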
* [PATCH 2/4] powerpc/pseries/iommu: Implement ibm,reset-pe-dma-windows rtas call
From: Leonardo Bras @ 2020-06-19 5:06 UTC
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Alexey Kardashevskiy, Leonardo Bras, Thiago Jung Bauermann,
Ram Pai
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20200619050619.266888-1-leobras.c@gmail.com>
Platforms supporting the DDW option starting with LoPAR level 2.7 implement
ibm,ddw-extensions. The first extension available (index 2) carries the
token for ibm,reset-pe-dma-windows rtas call, which is used to restore
the default DMA window for a device, if it has been deleted.
It does so by resetting the TCE table allocation for the PE to its
boot time value, available in "ibm,dma-window" device tree node.
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
---
arch/powerpc/platforms/pseries/iommu.c | 33 ++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index e5a617738c8b..5e1fbc176a37 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1012,6 +1012,39 @@ static phys_addr_t ddw_memory_hotplug_max(void)
return max_addr;
}
+/*
+ * Platforms supporting the DDW option starting with LoPAR level 2.7 implement
+ * ibm,ddw-extensions, which carries the rtas token for
+ * ibm,reset-pe-dma-windows.
+ * That rtas-call can be used to restore the default DMA window for the device.
+ */
+static void reset_dma_window(struct pci_dev *dev, struct device_node *par_dn)
+{
+ int ret;
+ u32 cfg_addr, ddw_ext[3];
+ u64 buid;
+ struct device_node *dn;
+ struct pci_dn *pdn;
+
+ ret = of_property_read_u32_array(par_dn, "ibm,ddw-extensions",
+ &ddw_ext[0], 3);
+ if (ret)
+ return;
+
+ dn = pci_device_to_OF_node(dev);
+ pdn = PCI_DN(dn);
+ buid = pdn->phb->buid;
+ cfg_addr = ((pdn->busno << 16) | (pdn->devfn << 8));
+
+ ret = rtas_call(ddw_ext[1], 3, 1, NULL, cfg_addr,
+ BUID_HI(buid), BUID_LO(buid));
+ if (ret)
+ dev_info(&dev->dev,
+ "ibm,reset-pe-dma-windows(%x) %x %x %x returned %d ",
+ ddw_ext[1], cfg_addr, BUID_HI(buid), BUID_LO(buid),
+ ret);
+}
+
/*
* If the PE supports dynamic dma windows, and there is space for a table
* that can map all pages in a linear offset, then setup such a table,
--
2.25.4
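The three arguments passed to the rtas call above are built from the bus number, devfn and the PHB's 64-bit BUID. A minimal sketch of that packing follows. The helper names are hypothetical, and it assumes that BUID_HI() and BUID_LO() yield the upper and lower 32 bits of the BUID, which is how the macros are used in the patch.

```c
#include <stdint.h>

/* Config address as built in the patch: bus in bits 16-23, devfn in bits 8-15. */
uint32_t make_cfg_addr(uint8_t busno, uint8_t devfn)
{
	return ((uint32_t)busno << 16) | ((uint32_t)devfn << 8);
}

/* Assumed equivalent of BUID_HI(): upper 32 bits of the 64-bit BUID. */
uint32_t buid_hi(uint64_t buid)
{
	return (uint32_t)(buid >> 32);
}

/* Assumed equivalent of BUID_LO(): lower 32 bits of the 64-bit BUID. */
uint32_t buid_lo(uint64_t buid)
{
	return (uint32_t)(buid & 0xffffffffu);
}
```

For example, bus 0x01 with devfn 0x08 gives a config address of 0x00010800.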
* [PATCH 3/4] powerpc/pseries/iommu: Move window-removing part of remove_ddw into remove_dma_window
From: Leonardo Bras @ 2020-06-19 5:06 UTC
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Alexey Kardashevskiy, Leonardo Bras, Thiago Jung Bauermann,
Ram Pai
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20200619050619.266888-1-leobras.c@gmail.com>
Move the window-removing part of remove_ddw into a new function
(remove_dma_window), so it can be used to remove other DMA windows.
It's useful for removing DMA windows that don't create DIRECT64_PROPNAME
property, like the default DMA window from the device, which uses
"ibm,dma-window".
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
---
arch/powerpc/platforms/pseries/iommu.c | 53 +++++++++++++++-----------
1 file changed, 31 insertions(+), 22 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 5e1fbc176a37..de633f6ae093 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -767,25 +767,14 @@ static int __init disable_ddw_setup(char *str)
early_param("disable_ddw", disable_ddw_setup);
-static void remove_ddw(struct device_node *np, bool remove_prop)
+static void remove_dma_window(struct device_node *pdn, u32 *ddw_avail,
+ struct property *win)
{
struct dynamic_dma_window_prop *dwp;
- struct property *win64;
- u32 ddw_avail[3];
u64 liobn;
- int ret = 0;
-
- ret = of_property_read_u32_array(np, "ibm,ddw-applicable",
- &ddw_avail[0], 3);
-
- win64 = of_find_property(np, DIRECT64_PROPNAME, NULL);
- if (!win64)
- return;
-
- if (ret || win64->length < sizeof(*dwp))
- goto delprop;
+ int ret;
- dwp = win64->value;
+ dwp = win->value;
liobn = (u64)be32_to_cpu(dwp->liobn);
/* clear the whole window, note the arg is in kernel pages */
@@ -793,24 +782,44 @@ static void remove_ddw(struct device_node *np, bool remove_prop)
1ULL << (be32_to_cpu(dwp->window_shift) - PAGE_SHIFT), dwp);
if (ret)
pr_warn("%pOF failed to clear tces in window.\n",
- np);
+ pdn);
else
pr_debug("%pOF successfully cleared tces in window.\n",
- np);
+ pdn);
ret = rtas_call(ddw_avail[2], 1, 1, NULL, liobn);
if (ret)
pr_warn("%pOF: failed to remove direct window: rtas returned "
"%d to ibm,remove-pe-dma-window(%x) %llx\n",
- np, ret, ddw_avail[2], liobn);
+ pdn, ret, ddw_avail[2], liobn);
else
pr_debug("%pOF: successfully removed direct window: rtas returned "
"%d to ibm,remove-pe-dma-window(%x) %llx\n",
- np, ret, ddw_avail[2], liobn);
+ pdn, ret, ddw_avail[2], liobn);
+}
+
+static void remove_ddw(struct device_node *np, bool remove_prop)
+{
+ struct property *win;
+ u32 ddw_avail[3];
+ int ret = 0;
+
+ ret = of_property_read_u32_array(np, "ibm,ddw-applicable",
+ &ddw_avail[0], 3);
+ if (ret)
+ return;
+
+ win = of_find_property(np, DIRECT64_PROPNAME, NULL);
+ if (!win)
+ return;
+
+ if (win->length >= sizeof(struct dynamic_dma_window_prop))
+ remove_dma_window(np, ddw_avail, win);
+
+ if (!remove_prop)
+ return;
-delprop:
- if (remove_prop)
- ret = of_remove_property(np, win64);
+ ret = of_remove_property(np, win);
if (ret)
pr_warn("%pOF: failed to remove direct window property: %d\n",
np, ret);
--
2.25.4
* [PATCH 4/4] powerpc/pseries/iommu: Remove default DMA window before creating DDW
From: Leonardo Bras @ 2020-06-19 5:06 UTC
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Alexey Kardashevskiy, Leonardo Bras, Thiago Jung Bauermann,
Ram Pai
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20200619050619.266888-1-leobras.c@gmail.com>
In the LoPAR section "DMA Window Manipulation Calls", it is recommended to remove the
default DMA window for the device, before attempting to configure a DDW,
in order to make the maximum resources available for the next DDW to be
created.
This is a requirement for some devices to use DDW, given they only
allow one DMA window.
If setting up a new DDW fails anywhere after the removal of this
default DMA window, restore it using reset_dma_window.
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
---
arch/powerpc/platforms/pseries/iommu.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index de633f6ae093..68d1ea957ac7 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1074,8 +1074,9 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn)
u64 dma_addr, max_addr;
struct device_node *dn;
u32 ddw_avail[3];
+
struct direct_window *window;
- struct property *win64;
+ struct property *win64, *dfl_win;
struct dynamic_dma_window_prop *ddwprop;
struct failed_ddw_pdn *fpdn;
@@ -1110,8 +1111,19 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn)
if (ret)
goto out_failed;
- /*
- * Query if there is a second window of size to map the
+ /*
+ * First step of setting up DDW is removing the default DMA window,
+ * if it's present. It will make all the resources available to the
+ * new DDW window.
+ * If anything fails after this, we need to restore it.
+ */
+
+ dfl_win = of_find_property(pdn, "ibm,dma-window", NULL);
+ if (dfl_win)
+ remove_dma_window(pdn, ddw_avail, dfl_win);
+
+ /*
+ * Query if there is a window of size to map the
* whole partition. Query returns number of windows, largest
* block assigned to PE (partition endpoint), and two bitmasks
* of page sizes: supported and supported for migrate-dma.
@@ -1219,6 +1231,8 @@ static u64 enable_ddw(struct pci_dev *dev, struct device_node *pdn)
kfree(win64);
out_failed:
+ if (dfl_win)
+ reset_dma_window(dev, pdn);
fpdn = kzalloc(sizeof(*fpdn), GFP_KERNEL);
if (!fpdn)
--
2.25.4
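The remove, query, create and restore-on-failure sequence that this patch adds to enable_ddw() can be sketched as a toy state machine. Everything here is hypothetical (`struct fake_pe`, `enable_ddw_model()` and the helpers), and real firmware behaviour is far richer. The sketch uses an explicit flag initialised to false so that early failure paths never try to restore a window that was not removed.

```c
#include <stdbool.h>

/* Hypothetical model of a PE whose firmware allows a single DMA window. */
struct fake_pe {
	bool has_default_window;
	bool has_ddw;
	int windows_free;  /* window slots the "firmware" can still hand out */
	bool create_fails; /* force the create step to fail */
};

static void remove_default_window(struct fake_pe *pe)
{
	pe->has_default_window = false;
	pe->windows_free++;
}

static void restore_default_window(struct fake_pe *pe)
{
	pe->has_default_window = true;
	pe->windows_free--;
}

/* Returns true if a DDW ended up configured. */
bool enable_ddw_model(struct fake_pe *pe)
{
	bool removed = false;

	/* Step 1: free the slot held by the default window, if present. */
	if (pe->has_default_window) {
		remove_default_window(pe);
		removed = true;
	}

	/* Step 2: query what is available now. */
	if (pe->windows_free < 1)
		goto out_failed;

	/* Step 3: try to create the DDW. */
	if (pe->create_fails)
		goto out_failed;
	pe->windows_free--;
	pe->has_ddw = true;
	return true;

out_failed:
	/* Put the default window back so the device keeps a usable window. */
	if (removed)
		restore_default_window(pe);
	return false;
}
```

On the failure path the model ends in exactly the state it started in, which is the property the restore step is meant to provide.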
* Re: [PATCH] ASoC: fsl_spdif: Add pm runtime function
From: Nicolin Chen @ 2020-06-19 5:49 UTC
To: Shengjiu Wang
Cc: alsa-devel, timur, Xiubo.Lee, linuxppc-dev, tiwai, perex, broonie,
festevam, linux-kernel
In-Reply-To: <1592481334-3680-1-git-send-email-shengjiu.wang@nxp.com>
On Thu, Jun 18, 2020 at 07:55:34PM +0800, Shengjiu Wang wrote:
> Add pm runtime support and move clock handling there.
> Close the clocks at suspend to reduce the power consumption.
>
> fsl_spdif_suspend is replaced by pm_runtime_force_suspend.
> fsl_spdif_resume is replaced by pm_runtime_force_resume.
>
> Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
LGTM, yet some nits, please add my ack after fixing:
Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
> @@ -495,25 +496,10 @@ static int fsl_spdif_startup(struct snd_pcm_substream *substream,
>
> -disable_txclk:
> - for (i--; i >= 0; i--)
> - clk_disable_unprepare(spdif_priv->txclk[i]);
> err:
> - if (!IS_ERR(spdif_priv->spbaclk))
> - clk_disable_unprepare(spdif_priv->spbaclk);
> -err_spbaclk:
> - clk_disable_unprepare(spdif_priv->coreclk);
> -
> return ret;
Only "return ret;" remains now. We could clean the goto away.
> -static int fsl_spdif_resume(struct device *dev)
> +static int fsl_spdif_runtime_resume(struct device *dev)
> +disable_rx_clk:
> + clk_disable_unprepare(spdif_priv->rxclk);
> +disable_tx_clk:
> +disable_spba_clk:
Why have two duplicated ones? Could probably drop the 2nd one.
^ permalink raw reply
* Re: powerpc/pci: [PATCH 1/1 V3] PCIE PHB reset
From: Oliver O'Halloran @ 2020-06-19 6:09 UTC (permalink / raw)
To: Michael Ellerman; +Cc: Brian King, Wen Xiong, linuxppc-dev, wenxiong
In-Reply-To: <87ftaudx1x.fsf@mpe.ellerman.id.au>
On Wed, Jun 17, 2020 at 4:29 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> "Oliver O'Halloran" <oohall@gmail.com> writes:
> > On Tue, Jun 16, 2020 at 9:55 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
> >> wenxiong@linux.vnet.ibm.com writes:
> >> > From: Wen Xiong <wenxiong@linux.vnet.ibm.com>
> >> >
> >> > Several device drivers hit EEH(Extended Error handling) when triggering
> >> > kdump on Pseries PowerVM. This patch implemented a reset of the PHBs
> >> > in pci general code when triggering kdump.
> >>
> >> Actually it's in pseries specific PCI code, and the reset is done in the
> >> 2nd kernel as it boots, not when triggering the kdump.
> >>
> >> You're doing it as a:
> >>
> >> machine_postcore_initcall(pseries, pseries_phb_reset);
> >>
> >> But we do the EEH initialisation in:
> >>
> >> core_initcall_sync(eeh_init);
> >>
> >> Which happens first.
> >>
> >> So it seems to me that this should be called from pseries_eeh_init().
> >
> > This happens to use some of the same RTAS calls as EEH, but it's
> > entirely orthogonal to it.
>
> I don't agree. I mean it's literally calling EEH_RESET_FUNDAMENTAL etc.
> Those RTAS calls are all documented in the EEH section of PAPR.
>
> I guess you're saying it's orthogonal to the kernel handling an EEH and
> doing the recovery process etc, which I can kind of see.
>
> > Wedging the two together doesn't make any real sense IMO since this
> > should be usable even with !CONFIG_EEH.
>
> You can't turn CONFIG_EEH off for pseries or powernv.
Not yet :)
> And if you could this patch wouldn't compile because it uses EEH
> constants that are behind #ifdef CONFIG_EEH.
That's fixable.
> If you could turn CONFIG_EEH off it would presumably be because you were
> on a platform that didn't support EEH, in which case you wouldn't need
> this code.
I think there's an argument to be made for disabling EEH in some
situations. A lot of drivers do a pretty poor job of recovering in the
first place so it's conceivable that someone might want to disable it
in say, a kdump kernel. That said, the real reason is mostly for the
sake of code organisation. EEH is an optional platform feature but you
wouldn't know it looking at the implementation and I'd like to stop it
bleeding into odd places. Making it buildable with !CONFIG_EEH
would probably help.
> So IMO this is EEH code, and should be with the other EEH code and
> should be behind CONFIG_EEH.
*shrug*
I wanted it to follow the model of the powernv implementation of the
same feature which is done immediately after initialising the
pci_controller and independent of all of the EEH setup. Although,
looking at it again I see it calls pnv_eeh_phb_reset() which is in
eeh_powernv.c so I guess that's pretty similar to what you're
suggesting.
> That sounds like a good cleanup. I'm not concerned about conflicts
> within arch/powerpc, I can fix them up.
>
> >> > + list_for_each_entry(phb, &hose_list, list_node) {
> >> > + config_addr = pseries_get_pdn_addr(phb);
> >> > + if (config_addr == -1)
> >> > + continue;
> >> > +
> >> > + ret = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
> >> > + config_addr, BUID_HI(phb->buid),
> >> > + BUID_LO(phb->buid), EEH_RESET_FUNDAMENTAL);
> >> > +
> >> > + /* If fundamental-reset not supported, try hot-reset */
> >> > + if (ret == -8)
> >>
> >> Where does -8 come from?
> >
> > There's a comment right there.
>
> Yeah I guess. I was expecting it would map to some RTAS_ERROR_FOO value,
> but it's just literally -8 in PAPR.
Yeah, as far as I can tell the meaning of the return codes is
specific to each RTAS call, which is a bit bad.
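For the archives, the fallback discussed above can be sketched in plain userspace C. This is only a model under stated assumptions, not the pseries code: model_phb_reset(), the model_fw_* stubs and the MODEL_* constants are hypothetical names invented here, and the stubs stand in for the real rtas_call(). The only detail taken from the thread is that a return of -8 from the fundamental reset is treated as "not supported" and triggers a hot reset.

```c
/* Hypothetical constants, defined for this sketch only. */
#define MODEL_RESET_HOT          1
#define MODEL_RESET_FUNDAMENTAL  3
/* -8 is the value quoted in the thread for "reset type not supported". */
#define MODEL_RESET_UNSUPPORTED  (-8)

/* Last reset type requested from the stub firmware (0 = none yet). */
static int model_last_reset;

/* Stub firmware that accepts every reset type. */
static int model_fw_full(int reset_type)
{
	model_last_reset = reset_type;
	return 0;
}

/* Stub firmware that rejects the fundamental reset. */
static int model_fw_hot_only(int reset_type)
{
	model_last_reset = reset_type;
	return reset_type == MODEL_RESET_FUNDAMENTAL ?
		MODEL_RESET_UNSUPPORTED : 0;
}

typedef int (*model_reset_fn)(int reset_type);

/* Try a fundamental reset first, fall back to a hot reset on -8. */
static int model_phb_reset(model_reset_fn do_reset)
{
	int ret = do_reset(MODEL_RESET_FUNDAMENTAL);

	if (ret == MODEL_RESET_UNSUPPORTED)
		ret = do_reset(MODEL_RESET_HOT);

	return ret;
}
```

Giving -8 a name local to this one call is one possible way to answer the "where does -8 come from?" question without implying a global RTAS error code.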
^ permalink raw reply
* Re: rename probe_kernel_* and probe_user_*
From: Michael Ellerman @ 2020-06-19 6:21 UTC (permalink / raw)
To: Linus Torvalds, Christoph Hellwig, Russell King, Tony Luck,
Helge Deller
Cc: linux-arch, linux-ia64, linux-parisc, the arch/x86 maintainers,
Linux Kernel Mailing List, Andrew Morton, linuxppc-dev
In-Reply-To: <CAHk-=wjpnu=882iD9ck9Ywt6R1LYX_Hv-oS7dBMsWZwDRGZ5jA@mail.gmail.com>
Linus Torvalds <torvalds@linux-foundation.org> writes:
> [ Explicitly added architecture lists and developers to the cc to make
> this more visible ]
>
> On Wed, Jun 17, 2020 at 12:38 AM Christoph Hellwig <hch@lst.de> wrote:
>>
>> Andrew and I decided to drop the patches implementing your suggested
>> rename of the probe_kernel_* and probe_user_* helpers from -mm as there
>> were way too many conflicts. After -rc1 might be a good time for this as
>> all the conflicts are resolved now.
>
> So I've merged this renaming now, together with my changes to make
> 'get_kernel_nofault()' look and act a lot more like 'get_user()'.
>
> It just felt wrong (and potentially dangerous) to me to have a
> 'get_kernel_nofault()' naming that implied semantics that we're all
> familiar with from 'get_user()', but acting very differently.
>
> But part of the fixups I made for the type checking are for
> architectures where I didn't even compile-test the end result. I
> looked at every case individually, and the patch looks sane, but I
> could have screwed something up.
>
> Basically, 'get_kernel_nofault()' doesn't do the same automagic type
> munging from the pointer to the target that 'get_user()' does, but at
> least now it checks that the types are superficially compatible.
> There should be build failures if they aren't, but I hopefully fixed
> everything up properly for all architectures.
>
> This email is partly to ask people to double-check, but partly just as
> a heads-up so that _if_ I screwed something up, you'll have the
> background and it won't take you by surprise.
The powerpc changes look right, compile cleanly and seem to work
correctly.
cheers
^ permalink raw reply
* [PATCH V2] powerpc/pseries/svm: Remove unwanted check for shared_lppaca_size
From: Satheesh Rajendran @ 2020-06-19 7:01 UTC (permalink / raw)
To: linuxppc-dev
Cc: Laurent Dufour, Ram Pai, linux-kernel, Satheesh Rajendran,
Sukadev Bhattiprolu, Thiago Jung Bauermann
Early secure guest boot hits the crash below when the number of vcpus
makes the LPPACA area end exactly on a page boundary (for a PAGE size
of 64k and an LPPACA size of 1k, i.e. 64, 128 etc.), because the BUG_ON
assert also fires when shared_lppaca_size is equal to
shared_lppaca_total_size:
[ 0.000000] Partition configured for 64 cpus.
[ 0.000000] CPU maps initialized for 1 thread per core
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] kernel BUG at arch/powerpc/kernel/paca.c:89!
[ 0.000000] Oops: Exception in kernel mode, sig: 5 [#1]
[ 0.000000] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
That equality check is not necessary, so let's remove it.
Fixes: bd104e6db6f0 ("powerpc/pseries/svm: Use shared memory for LPPACA structures")
Cc: linux-kernel@vger.kernel.org
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
---
V2:
Added Reviewed-by tags from Thiago and Laurent.
Added Fixes tag as per Thiago's suggestion.
V1: https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200609105731.14032-1-sathnaga@linux.vnet.ibm.com/
---
arch/powerpc/kernel/paca.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 2168372b792d..74da65aacbc9 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -87,7 +87,7 @@ static void *__init alloc_shared_lppaca(unsigned long size, unsigned long align,
* This is very early in boot, so no harm done if the kernel crashes at
* this point.
*/
- BUG_ON(shared_lppaca_size >= shared_lppaca_total_size);
+ BUG_ON(shared_lppaca_size > shared_lppaca_total_size);
return ptr;
}
--
2.26.2
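A minimal userspace sketch of the arithmetic, for anyone checking the boundary case. It is not the kernel code: model_check_fires() and the MODEL_* constants are hypothetical, and the allocator is reduced to a running byte counter. The sizes are the ones from the commit message (64k pages, 1k per LPPACA).

```c
#include <stdbool.h>

/* Sizes taken from the commit message; the names are hypothetical. */
#define MODEL_PAGE_SIZE    (64UL * 1024)
#define MODEL_LPPACA_SIZE  1024UL

/* Round n up to the next multiple of the model page size. */
static unsigned long model_page_align(unsigned long n)
{
	return (n + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
}

/*
 * Hand out one LPPACA per cpu from a page-aligned area and report
 * whether the bounds check would fire. strict selects the old ">="
 * comparison, otherwise the relaxed ">" comparison is used.
 */
static bool model_check_fires(unsigned long nr_cpus, bool strict)
{
	unsigned long total = model_page_align(nr_cpus * MODEL_LPPACA_SIZE);
	unsigned long used = 0;
	unsigned long cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		used += MODEL_LPPACA_SIZE;
		if (strict ? used >= total : used > total)
			return true;
	}

	return false;
}
```

With 64 vcpus the counter reaches exactly 64k after the last allocation, so the ">=" comparison fires on an area that is fully used but still valid. 63 vcpus round up to the same 64k and never reach it, which would explain why only the page-aligned counts crash.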
^ permalink raw reply related
* Re: [PATCH] ASoC: fsl_spdif: Add pm runtime function
From: Shengjiu Wang @ 2020-06-19 7:15 UTC (permalink / raw)
To: Nicolin Chen
Cc: Linux-ALSA, Timur Tabi, Xiubo Li, Fabio Estevam, Shengjiu Wang,
Takashi Iwai, linux-kernel, Mark Brown, linuxppc-dev
In-Reply-To: <20200619054942.GA25856@Asurada-Nvidia>
On Fri, Jun 19, 2020 at 1:51 PM Nicolin Chen <nicoleotsuka@gmail.com> wrote:
>
> On Thu, Jun 18, 2020 at 07:55:34PM +0800, Shengjiu Wang wrote:
> > Add pm runtime support and move clock handling there.
> > Close the clocks at suspend to reduce the power consumption.
> >
> > fsl_spdif_suspend is replaced by pm_runtime_force_suspend.
> > fsl_spdif_resume is replaced by pm_runtime_force_resume.
> >
> > Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
>
> LGTM, yet some nits, please add my ack after fixing:
>
> Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
>
> > @@ -495,25 +496,10 @@ static int fsl_spdif_startup(struct snd_pcm_substream *substream,
>
> >
> > -disable_txclk:
> > - for (i--; i >= 0; i--)
> > - clk_disable_unprepare(spdif_priv->txclk[i]);
> > err:
> > - if (!IS_ERR(spdif_priv->spbaclk))
> > - clk_disable_unprepare(spdif_priv->spbaclk);
> > -err_spbaclk:
> > - clk_disable_unprepare(spdif_priv->coreclk);
> > -
> > return ret;
>
> Only "return ret;" remains now. We could clean the goto away.
>
> > -static int fsl_spdif_resume(struct device *dev)
> > +static int fsl_spdif_runtime_resume(struct device *dev)
>
> > +disable_rx_clk:
> > + clk_disable_unprepare(spdif_priv->rxclk);
> > +disable_tx_clk:
> > +disable_spba_clk:
>
> Why have two duplicated ones? Could probably drop the 2nd one.
Seems we can drop one; will send an update.
best regards
wang shengjiu
^ permalink raw reply
* Re: [PATCH] mm/debug_vm_pgtable: Fix build failure with powerpc 8xx
From: Will Deacon @ 2020-06-19 8:00 UTC (permalink / raw)
To: Christophe Leroy
Cc: Anshuman Khandual, Peter Zijlstra (Intel), linux-kernel, linux-mm,
Paul Mackerras, Andrew Morton, linuxppc-dev
In-Reply-To: <6ca8c972e6c920dc4ae0d4affbed9703afa4d010.1592490570.git.christophe.leroy@csgroup.eu>
On Thu, Jun 18, 2020 at 02:31:29PM +0000, Christophe Leroy wrote:
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index e45623016aea..61ab16fb2e36 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -246,13 +246,13 @@ static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
> static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
> unsigned long vaddr)
> {
> - pte_t pte = READ_ONCE(*ptep);
> + pte_t pte = ptep_get(ptep);
>
> pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> set_pte_at(mm, vaddr, ptep, pte);
> barrier();
> pte_clear(mm, vaddr, ptep);
> - pte = READ_ONCE(*ptep);
> + pte = ptep_get(ptep);
> WARN_ON(!pte_none(pte));
> }
Acked-by: Will Deacon <will@kernel.org>
I wonder if there's a way to do this with coccinelle in one big go (but the
resulting diff would obviously need manual inspection)?
Will
^ permalink raw reply
* [PATCH v2] ASoC: fsl_spdif: Add pm runtime function
From: Shengjiu Wang @ 2020-06-19 7:54 UTC (permalink / raw)
To: timur, nicoleotsuka, Xiubo.Lee, festevam, broonie, perex, tiwai,
alsa-devel
Cc: linuxppc-dev, linux-kernel
Add pm runtime support and move clock handling there.
Close the clocks at suspend to reduce the power consumption.
fsl_spdif_suspend is replaced by pm_runtime_force_suspend.
fsl_spdif_resume is replaced by pm_runtime_force_resume.
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
---
changes in v2
- remove goto in startup()
- remove goto disable_spba_clk
- Add Acked-by: Nicolin Chen
sound/soc/fsl/fsl_spdif.c | 117 ++++++++++++++++++++++----------------
1 file changed, 67 insertions(+), 50 deletions(-)
diff --git a/sound/soc/fsl/fsl_spdif.c b/sound/soc/fsl/fsl_spdif.c
index 5bc0e4729341..5b2689ae63d4 100644
--- a/sound/soc/fsl/fsl_spdif.c
+++ b/sound/soc/fsl/fsl_spdif.c
@@ -16,6 +16,7 @@
#include <linux/of_device.h>
#include <linux/of_irq.h>
#include <linux/regmap.h>
+#include <linux/pm_runtime.h>
#include <sound/asoundef.h>
#include <sound/dmaengine_pcm.h>
@@ -495,29 +496,14 @@ static int fsl_spdif_startup(struct snd_pcm_substream *substream,
struct platform_device *pdev = spdif_priv->pdev;
struct regmap *regmap = spdif_priv->regmap;
u32 scr, mask;
- int i;
int ret;
/* Reset module and interrupts only for first initialization */
if (!snd_soc_dai_active(cpu_dai)) {
- ret = clk_prepare_enable(spdif_priv->coreclk);
- if (ret) {
- dev_err(&pdev->dev, "failed to enable core clock\n");
- return ret;
- }
-
- if (!IS_ERR(spdif_priv->spbaclk)) {
- ret = clk_prepare_enable(spdif_priv->spbaclk);
- if (ret) {
- dev_err(&pdev->dev, "failed to enable spba clock\n");
- goto err_spbaclk;
- }
- }
-
ret = spdif_softreset(spdif_priv);
if (ret) {
dev_err(&pdev->dev, "failed to soft reset\n");
- goto err;
+ return ret;
}
/* Disable all the interrupts */
@@ -531,18 +517,10 @@ static int fsl_spdif_startup(struct snd_pcm_substream *substream,
mask = SCR_TXFIFO_AUTOSYNC_MASK | SCR_TXFIFO_CTRL_MASK |
SCR_TXSEL_MASK | SCR_USRC_SEL_MASK |
SCR_TXFIFO_FSEL_MASK;
- for (i = 0; i < SPDIF_TXRATE_MAX; i++) {
- ret = clk_prepare_enable(spdif_priv->txclk[i]);
- if (ret)
- goto disable_txclk;
- }
} else {
scr = SCR_RXFIFO_FSEL_IF8 | SCR_RXFIFO_AUTOSYNC;
mask = SCR_RXFIFO_FSEL_MASK | SCR_RXFIFO_AUTOSYNC_MASK|
SCR_RXFIFO_CTL_MASK | SCR_RXFIFO_OFF_MASK;
- ret = clk_prepare_enable(spdif_priv->rxclk);
- if (ret)
- goto err;
}
regmap_update_bits(regmap, REG_SPDIF_SCR, mask, scr);
@@ -550,17 +528,6 @@ static int fsl_spdif_startup(struct snd_pcm_substream *substream,
regmap_update_bits(regmap, REG_SPDIF_SCR, SCR_LOW_POWER, 0);
return 0;
-
-disable_txclk:
- for (i--; i >= 0; i--)
- clk_disable_unprepare(spdif_priv->txclk[i]);
-err:
- if (!IS_ERR(spdif_priv->spbaclk))
- clk_disable_unprepare(spdif_priv->spbaclk);
-err_spbaclk:
- clk_disable_unprepare(spdif_priv->coreclk);
-
- return ret;
}
static void fsl_spdif_shutdown(struct snd_pcm_substream *substream,
@@ -569,20 +536,17 @@ static void fsl_spdif_shutdown(struct snd_pcm_substream *substream,
struct snd_soc_pcm_runtime *rtd = substream->private_data;
struct fsl_spdif_priv *spdif_priv = snd_soc_dai_get_drvdata(asoc_rtd_to_cpu(rtd, 0));
struct regmap *regmap = spdif_priv->regmap;
- u32 scr, mask, i;
+ u32 scr, mask;
if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
scr = 0;
mask = SCR_TXFIFO_AUTOSYNC_MASK | SCR_TXFIFO_CTRL_MASK |
SCR_TXSEL_MASK | SCR_USRC_SEL_MASK |
SCR_TXFIFO_FSEL_MASK;
- for (i = 0; i < SPDIF_TXRATE_MAX; i++)
- clk_disable_unprepare(spdif_priv->txclk[i]);
} else {
scr = SCR_RXFIFO_OFF | SCR_RXFIFO_CTL_ZERO;
mask = SCR_RXFIFO_FSEL_MASK | SCR_RXFIFO_AUTOSYNC_MASK|
SCR_RXFIFO_CTL_MASK | SCR_RXFIFO_OFF_MASK;
- clk_disable_unprepare(spdif_priv->rxclk);
}
regmap_update_bits(regmap, REG_SPDIF_SCR, mask, scr);
@@ -591,9 +555,6 @@ static void fsl_spdif_shutdown(struct snd_pcm_substream *substream,
spdif_intr_status_clear(spdif_priv);
regmap_update_bits(regmap, REG_SPDIF_SCR,
SCR_LOW_POWER, SCR_LOW_POWER);
- if (!IS_ERR(spdif_priv->spbaclk))
- clk_disable_unprepare(spdif_priv->spbaclk);
- clk_disable_unprepare(spdif_priv->coreclk);
}
}
@@ -1350,6 +1311,8 @@ static int fsl_spdif_probe(struct platform_device *pdev)
/* Register with ASoC */
dev_set_drvdata(&pdev->dev, spdif_priv);
+ pm_runtime_enable(&pdev->dev);
+ regcache_cache_only(spdif_priv->regmap, true);
ret = devm_snd_soc_register_component(&pdev->dev, &fsl_spdif_component,
&spdif_priv->cpu_dai_drv, 1);
@@ -1365,36 +1328,90 @@ static int fsl_spdif_probe(struct platform_device *pdev)
return ret;
}
-#ifdef CONFIG_PM_SLEEP
-static int fsl_spdif_suspend(struct device *dev)
+#ifdef CONFIG_PM
+static int fsl_spdif_runtime_suspend(struct device *dev)
{
struct fsl_spdif_priv *spdif_priv = dev_get_drvdata(dev);
+ int i;
regmap_read(spdif_priv->regmap, REG_SPDIF_SRPC,
&spdif_priv->regcache_srpc);
-
regcache_cache_only(spdif_priv->regmap, true);
- regcache_mark_dirty(spdif_priv->regmap);
+
+ clk_disable_unprepare(spdif_priv->rxclk);
+
+ for (i = 0; i < SPDIF_TXRATE_MAX; i++)
+ clk_disable_unprepare(spdif_priv->txclk[i]);
+
+ if (!IS_ERR(spdif_priv->spbaclk))
+ clk_disable_unprepare(spdif_priv->spbaclk);
+ clk_disable_unprepare(spdif_priv->coreclk);
return 0;
}
-static int fsl_spdif_resume(struct device *dev)
+static int fsl_spdif_runtime_resume(struct device *dev)
{
struct fsl_spdif_priv *spdif_priv = dev_get_drvdata(dev);
+ int ret;
+ int i;
+
+ ret = clk_prepare_enable(spdif_priv->coreclk);
+ if (ret) {
+ dev_err(dev, "failed to enable core clock\n");
+ return ret;
+ }
+
+ if (!IS_ERR(spdif_priv->spbaclk)) {
+ ret = clk_prepare_enable(spdif_priv->spbaclk);
+ if (ret) {
+ dev_err(dev, "failed to enable spba clock\n");
+ goto disable_core_clk;
+ }
+ }
+
+ for (i = 0; i < SPDIF_TXRATE_MAX; i++) {
+ ret = clk_prepare_enable(spdif_priv->txclk[i]);
+ if (ret)
+ goto disable_tx_clk;
+ }
+
+ ret = clk_prepare_enable(spdif_priv->rxclk);
+ if (ret)
+ goto disable_tx_clk;
regcache_cache_only(spdif_priv->regmap, false);
+ regcache_mark_dirty(spdif_priv->regmap);
regmap_update_bits(spdif_priv->regmap, REG_SPDIF_SRPC,
SRPC_CLKSRC_SEL_MASK | SRPC_GAINSEL_MASK,
spdif_priv->regcache_srpc);
- return regcache_sync(spdif_priv->regmap);
+ ret = regcache_sync(spdif_priv->regmap);
+ if (ret)
+ goto disable_rx_clk;
+
+ return 0;
+
+disable_rx_clk:
+ clk_disable_unprepare(spdif_priv->rxclk);
+disable_tx_clk:
+ for (i--; i >= 0; i--)
+ clk_disable_unprepare(spdif_priv->txclk[i]);
+ if (!IS_ERR(spdif_priv->spbaclk))
+ clk_disable_unprepare(spdif_priv->spbaclk);
+disable_core_clk:
+ clk_disable_unprepare(spdif_priv->coreclk);
+
+ return ret;
}
-#endif /* CONFIG_PM_SLEEP */
+#endif /* CONFIG_PM */
static const struct dev_pm_ops fsl_spdif_pm = {
- SET_SYSTEM_SLEEP_PM_OPS(fsl_spdif_suspend, fsl_spdif_resume)
+ SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+ pm_runtime_force_resume)
+ SET_RUNTIME_PM_OPS(fsl_spdif_runtime_suspend, fsl_spdif_runtime_resume,
+ NULL)
};
static const struct of_device_id fsl_spdif_dt_ids[] = {
--
2.21.0
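The unwind in fsl_spdif_runtime_resume() relies on the loop index still pointing one past the last clock that was enabled. A reduced userspace sketch of just that pattern follows; struct model_clk and the model_* helpers are hypothetical stand-ins for the clk API, not the driver code.

```c
/* Hypothetical stand-in for a clock: a use count and a forced failure. */
struct model_clk {
	int enable_count;
	int fail;	/* non-zero makes model_clk_enable() fail */
};

static int model_clk_enable(struct model_clk *clk)
{
	if (clk->fail)
		return -1;
	clk->enable_count++;
	return 0;
}

static void model_clk_disable(struct model_clk *clk)
{
	clk->enable_count--;
}

/*
 * Enable n clocks in order. On failure, disable only the ones that
 * were already enabled, in reverse order, and return the error.
 */
static int model_enable_all(struct model_clk *clks, int n)
{
	int ret = 0;
	int i;

	for (i = 0; i < n; i++) {
		ret = model_clk_enable(&clks[i]);
		if (ret)
			goto unwind;
	}

	return 0;

unwind:
	/* i indexes the clock that failed, so start one below it. */
	for (i--; i >= 0; i--)
		model_clk_disable(&clks[i]);

	return ret;
}
```

The same loop also covers a failure after all clocks are enabled: i then equals n, and the unwind walks the whole array.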
^ permalink raw reply related
* Re: [PATCH 3/6] exec: cleanup the count() function
From: Sergei Shtylyov @ 2020-06-19 8:28 UTC (permalink / raw)
To: Christoph Hellwig, Al Viro
Cc: linux-arch, linux-s390, linux-parisc, Arnd Bergmann, Brian Gerst,
x86, linux-mips, linux-kernel, linux-fsdevel, Luis Chamberlain,
sparclinux, linuxppc-dev, linux-arm-kernel
In-Reply-To: <20200618144627.114057-4-hch@lst.de>
Hello!
On 18.06.2020 17:46, Christoph Hellwig wrote:
> Remove the max argument as it is hard wired to MAX_ARG_STRINGS, and
Technically, an argument is what's actually passed to a function; you're
removing a function parameter.
> give the function a slightly less generic name.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
[...]
MBR, Sergei
^ permalink raw reply
* Re: [PATCH 2/2] powerpc/syscalls: Split SPU-ness out of ABI
From: Michael Ellerman @ 2020-06-19 10:26 UTC (permalink / raw)
To: linuxppc-dev; +Cc: linux-arch, linux-kernel, arnd
In-Reply-To: <20200616135617.2937252-2-mpe@ellerman.id.au>
Michael Ellerman <mpe@ellerman.id.au> writes:
> Using the ABI field to encode whether a syscall is usable by SPU
> programs or not is a bit of a kludge.
>
> The ABI of the syscall doesn't change depending on the SPU-ness, but
> in order to make the syscall generation work we have to pretend that
> it does.
>
> It also means we have more duplicated syscall lines than we need to,
> and the SPU logic is not well contained, instead all of the syscall
> generation targets need to know if they are spu or nospu.
>
> So instead add a separate file which contains the information on which
> syscalls are available for SPU programs. It's just a list of syscall
> numbers with a single "spu" field. If the field has the value "spu"
> then the syscall is available to SPU programs, any other value or no
> entry entirely means the syscall is not available to SPU programs.
>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
> arch/powerpc/kernel/syscalls/Makefile | 16 +-
> arch/powerpc/kernel/syscalls/spu.tbl | 430 +++++++++++++++++++++
> arch/powerpc/kernel/syscalls/syscall.tbl | 195 ++++------
> arch/powerpc/kernel/syscalls/syscalltbl.sh | 10 +-
> 4 files changed, 523 insertions(+), 128 deletions(-)
> create mode 100644 arch/powerpc/kernel/syscalls/spu.tbl
For the archives, the changes to the syscall table & the generation of
the spu.tbl can be more-or-less generated with the script below
(ignoring whitespace & comments).
cheers
#!/bin/bash
git checkout v5.8-rc1
table=arch/powerpc/kernel/syscalls/syscall.tbl
for number in {0..439}
do
    line=$(grep -E "^$number\s+(common|spu)" $table)
    if [[ -n "$line" ]]; then
        read number abi name syscall compat <<< "$line"
        if [[ "$syscall" != "sys_ni_syscall" ]]; then
            if [[ "$name" == "utimesat" ]]; then # fix typo
                name="futimesat"
            fi
            echo -e "$number\t$name\tspu"
            continue
        fi
    fi
    line=$(grep -m 1 -E "^$number\s+" $table)
    read number abi name syscall compat <<< "$line"
    if [[ -n "$name" ]]; then
        echo -e "$number\t$name\t-"
    fi
done > spu-generated.tbl
cat $table | while read line
do
    read number abi name syscall compat <<< "$line"
    if [[ "$number" == "#" ]]; then
        echo $line
        continue
    fi
    case "$abi" in
    "nospu") ;&
    "common") ;&
    "32") ;&
    "64") echo "$line" | sed -e "s/nospu/common/" ;;
    esac
done > syscall-generated.tbl
git cat-file -p 35e32a6cb5f6:$table | diff -w -u - syscall-generated.tbl
git cat-file -p 35e32a6cb5f6:arch/powerpc/kernel/syscalls/spu.tbl | diff -w -u - spu-generated.tbl
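For readers who prefer C to shell, the rule for the third column of spu.tbl can be sketched as below. This is only an illustration of the format described in the patch ("spu" means available, any other value or a missing entry means not available). model_line_is_spu() is a hypothetical helper and is not part of the kernel build, which does this in syscalltbl.sh.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Decide whether one line of a "<number> <name> <spu>" table marks the
 * syscall as usable by SPU programs. Comment lines, malformed lines
 * and any flag other than "spu" all count as "not available".
 */
static bool model_line_is_spu(const char *line)
{
	int number;
	char name[64];
	char flag[16];

	if (line[0] == '#')
		return false;

	/* Field widths keep sscanf() inside the local buffers. */
	if (sscanf(line, "%d %63s %15s", &number, name, flag) != 3)
		return false;

	return strcmp(flag, "spu") == 0;
}
```

Treating everything that is not exactly "spu" as unavailable mirrors the wording of the changelog, so a "-" placeholder and a missing third column behave the same way.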
^ permalink raw reply
* Re: [PATCH v5 01/13] powerpc: Remove Xilinx PPC405/PPC440 support
From: Michael Ellerman @ 2020-06-19 11:02 UTC (permalink / raw)
To: Nathan Chancellor
Cc: Arnd Bergmann, Nick Desaulniers, Michal Simek, LKML,
clang-built-linux, Paul Mackerras, linuxppc-dev
In-Reply-To: <20200618031622.GA195@Ryzen-9-3900X.localdomain>
Nathan Chancellor <natechancellor@gmail.com> writes:
> On Thu, Jun 18, 2020 at 10:48:21AM +1000, Michael Ellerman wrote:
>> Nick Desaulniers <ndesaulniers@google.com> writes:
>> > On Wed, Jun 17, 2020 at 3:20 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>> >> Michael Ellerman <mpe@ellerman.id.au> writes:
>> >> > Michal Simek <michal.simek@xilinx.com> writes:
>> >> <snip>
>> >>
>> >> >> Or if bamboo requires uImage to be built by default you can do it via
>> >> >> Kconfig.
>> >> >>
>> >> >> diff --git a/arch/powerpc/platforms/44x/Kconfig
>> >> >> b/arch/powerpc/platforms/44x/Kconfig
>> >> >> index 39e93d23fb38..300864d7b8c9 100644
>> >> >> --- a/arch/powerpc/platforms/44x/Kconfig
>> >> >> +++ b/arch/powerpc/platforms/44x/Kconfig
>> >> >> @@ -13,6 +13,7 @@ config BAMBOO
>> >> >> select PPC44x_SIMPLE
>> >> >> select 440EP
>> >> >> select FORCE_PCI
>> >> >> + select DEFAULT_UIMAGE
>> >> >> help
>> >> >> This option enables support for the IBM PPC440EP evaluation board.
>> >> >
>> >> > Who knows what the actual bamboo board used. But I'd be happy to take a
>> >> > SOB'ed patch to do the above, because these days the qemu emulation is
>> >> > much more likely to be used than the actual board.
>> >>
>> >> I just went to see why my CI boot of 44x didn't catch this, and it's
>> >> because I don't use the uImage, I just boot the vmlinux directly:
>> >>
>> >> $ qemu-system-ppc -M bamboo -m 128m -display none -kernel build~/vmlinux -append "console=ttyS0" -display none -nodefaults -serial mon:stdio
>> >> Linux version 5.8.0-rc1-00118-g69119673bd50 (michael@alpine1-p1) (gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #4 Wed Jun 17 20:19:22 AEST 2020
>> >> Using PowerPC 44x Platform machine description
>> >> ioremap() called early from find_legacy_serial_ports+0x690/0x770. Use early_ioremap() instead
>> >> printk: bootconsole [udbg0] enabled
>> >>
>> >>
>> >> So that's probably the simplest solution?
>> >
>> > If the uImage or zImage self decompresses, I would prefer to test that as well.
>>
>> The uImage is decompressed by qemu AIUI.
>>
>> >> That means previously arch/powerpc/boot/zImage was just a hardlink to
>> >> the uImage:
>> >
>> > It sounds like we can just boot the zImage, or is that no longer
>> > created with the uImage?
>>
>> The zImage won't boot on bamboo.
>>
>> Because of the vagaries of the arch/powerpc/boot/Makefile the zImage
>> ends up pointing to treeImage.ebony, which is for a different board.
>>
>> The zImage link is made to the first item in $(image-y):
>>
>> $(obj)/zImage: $(addprefix $(obj)/, $(image-y))
>> $(Q)rm -f $@; ln $< $@
>> ^
>> first prerequisite
>>
>> Which for this defconfig happens to be:
>>
>> image-$(CONFIG_EBONY) += treeImage.ebony cuImage.ebony
>>
>> If you turned off CONFIG_EBONY then the zImage will be a link to
>> treeImage.bamboo, but qemu can't boot that either.
>>
>> It's kind of nuts that the zImage points to some arbitrary image
>> depending on what's configured and the order of things in the Makefile.
>> But I'm not sure how we make it less nuts without risking breaking
>> people's existing setups.
>
> Hi Michael,
>
> For what it's worth, we squared this away in terms of our CI by
> just building and booting the uImage directly, rather than implicitly
> using the zImage:
>
> https://github.com/ClangBuiltLinux/continuous-integration/pull/282
> https://github.com/ClangBuiltLinux/boot-utils/pull/22
Great.
> We were only using the zImage because that is what Joel Stanley initially
> set us up with when PowerPC 32-bit was added to our CI:
>
> https://github.com/ClangBuiltLinux/continuous-integration/pull/100
Ah, so Joel owes us all beers then ;)
> Admittedly, we really do not have many PowerPC experts in our
> organization so we are supporting it on a "best effort" basis, which
> often involves using whatever knowledge is floating around or can be
> gained from interactions such as this :) so thank you for that!
No worries. I definitely don't expect you folks to invest much effort in
powerpc, especially the old 32-bit stuff, so always happy to help debug
things, and really appreciate the testing you do.
cheers
^ permalink raw reply
* Re: [PATCH] mm/debug_vm_pgtable: Fix build failure with powerpc 8xx
From: Anshuman Khandual @ 2020-06-19 11:15 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman, Will Deacon, Andrew Morton,
Peter Zijlstra (Intel)
Cc: linux-mm, linuxppc-dev, linux-kernel
In-Reply-To: <6ca8c972e6c920dc4ae0d4affbed9703afa4d010.1592490570.git.christophe.leroy@csgroup.eu>
On 06/18/2020 08:01 PM, Christophe Leroy wrote:
> Fix it by using the recently added ptep_get() helper.
>
> Fixes: 9e343b467c70 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
> mm/debug_vm_pgtable.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index e45623016aea..61ab16fb2e36 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -246,13 +246,13 @@ static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
> static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
> unsigned long vaddr)
> {
> - pte_t pte = READ_ONCE(*ptep);
> + pte_t pte = ptep_get(ptep);
>
> pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> set_pte_at(mm, vaddr, ptep, pte);
> barrier();
> pte_clear(mm, vaddr, ptep);
> - pte = READ_ONCE(*ptep);
> + pte = ptep_get(ptep);
> WARN_ON(!pte_none(pte));
> }
Tested this on arm64 and x86 platforms after applying the previous
series which adds ptep_get() and a follow up patch.
https://patchwork.kernel.org/project/linux-mm/list/?series=302949
https://patchwork.kernel.org/patch/11611929/
Build tested on s390 and arc platforms as well.
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
^ permalink raw reply
* Re: linux-next: manual merge of the pidfd tree with the powerpc-fixes tree
From: Michael Ellerman @ 2020-06-19 11:17 UTC (permalink / raw)
To: Stephen Rothwell, Christian Brauner, PowerPC
Cc: Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20200618121131.4ad29150@canb.auug.org.au>
Stephen Rothwell <sfr@canb.auug.org.au> writes:
> Hi all,
>
> Today's linux-next merge of the pidfd tree got a conflict in:
>
> arch/powerpc/kernel/syscalls/syscall.tbl
>
> between commit:
>
> 35e32a6cb5f6 ("powerpc/syscalls: Split SPU-ness out of ABI")
>
> from the powerpc-fixes tree and commit:
>
> 9b4feb630e8e ("arch: wire-up close_range()")
>
> from the pidfd tree.
>
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging. You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
Thanks.
I thought the week between rc1 and rc2 would be a safe time to do that
conversion of the syscall table, but I guess I was wrong :)
I'm planning to send those changes to Linus for rc2, so the conflict
will then be vs mainline. But I guess it's pretty trivial so it doesn't
really matter.
cheers
> diff --cc arch/powerpc/kernel/syscalls/syscall.tbl
> index c0cdaacd770e,dd87a782d80e..000000000000
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@@ -480,6 -524,8 +480,7 @@@
> 434 common pidfd_open sys_pidfd_open
> 435 32 clone3 ppc_clone3 sys_clone3
> 435 64 clone3 sys_clone3
> -435 spu clone3 sys_ni_syscall
> + 436 common close_range sys_close_range
> 437 common openat2 sys_openat2
> 438 common pidfd_getfd sys_pidfd_getfd
> 439 common faccessat2 sys_faccessat2
^ permalink raw reply
* Re: [PATCH 2/2] powerpc/syscalls: Split SPU-ness out of ABI
From: Arnd Bergmann @ 2020-06-19 12:07 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev, linux-arch, linux-kernel@vger.kernel.org
In-Reply-To: <20200616135617.2937252-2-mpe@ellerman.id.au>
On Tue, Jun 16, 2020 at 3:56 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Using the ABI field to encode whether a syscall is usable by SPU
> programs or not is a bit of a kludge.
>
> The ABI of the syscall doesn't change depending on the SPU-ness, but
> in order to make the syscall generation work we have to pretend that
> it does.
The idea of the ABI field is not to identify which ABI a syscall follows
but which ABIs do or do not implement it. This is the same with e.g.
the x32 ABI on x86.
> It also means we have more duplicated syscall lines than we need to,
> and the SPU logic is not well contained, instead all of the syscall
> generation targets need to know if they are spu or nospu.
>
> So instead add a separate file which contains the information on which
> syscalls are available for SPU programs. It's just a list of syscall
> numbers with a single "spu" field. If the field has the value "spu"
> then the syscall is available to SPU programs, any other value or no
> entry entirely means the syscall is not available to SPU programs.
>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
I have a patch series originally from Firoz that was never quite finished
to unify the scripts across all architectures. I think making the table
format more powerpc specific like you do here takes it a step backwards
and makes it harder to do that eventually.
> 4 files changed, 523 insertions(+), 128 deletions(-)
> create mode 100644 arch/powerpc/kernel/syscalls/spu.tbl
>
>
> I'm inclined to put this in next and ask Linus to pull it before rc2, that seems
> like the least disruptive way to get this in, unless anyone objects?
I still hope we can get a better solution.
> diff --git a/arch/powerpc/kernel/syscalls/spu.tbl b/arch/powerpc/kernel/syscalls/spu.tbl
> new file mode 100644
> index 000000000000..5eac04919303
> --- /dev/null
> +++ b/arch/powerpc/kernel/syscalls/spu.tbl
> @@ -0,0 +1,430 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# The format is:
> +# <number> <name> <spu>
> +#
> +# To indicate a syscall can be used by SPU programs use "spu" for the spu column.
> +#
> +# Syscalls that are not to be used by SPU programs can be left out of the file
> +# entirely, or an entry with a value other than "spu" can be added.
> +0 restart_syscall -
> +1 exit -
> +2 fork -
> +3 read spu
> +4 write spu
> +5 open spu
Having a new table format here also makes it harder for others to add
a new system call, both because it doesn't follow the syscall*.tbl naming
and because one has to first understand what the format is.
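For reference, the rule stated in the header comment of the quoted spu.tbl can be sketched in a few lines of Python. This is only an illustration of the described format, not the patch's script: the names `parse_spu_table` and `spu_allowed` are hypothetical, and treating a line with fewer than three fields as "no entry" is an assumption.

```python
SAMPLE = """\
# <number> <name> <spu>
0 restart_syscall -
1 exit -
2 fork -
3 read spu
4 write spu
5 open spu
"""


def parse_spu_table(text):
    """Map syscall number -> (name, usable_by_spu) for each table line."""
    table = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 3:
            continue  # assumption: an incomplete line counts as no entry
        number, name, spu = fields[0], fields[1], fields[2]
        # Only the exact value "spu" marks the syscall as available.
        table[int(number)] = (name, spu == "spu")
    return table


def spu_allowed(table, number):
    """A missing entry means the syscall is not available to SPU programs."""
    entry = table.get(number)
    return entry is not None and entry[1]


table = parse_spu_table(SAMPLE)
```

With the six sample lines above, `spu_allowed(table, 3)` is true for read, while fork (2) and any number absent from the table are rejected.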
If you absolutely want to split it out, could you at least make the format
compatible with the existing scripts and avoid the change to
the syscalltbl.sh file?
Arnd
* Re: [PATCH v5 00/10] Support new pmem flush and sync instructions for POWER
From: Aneesh Kumar K.V @ 2020-06-19 13:10 UTC (permalink / raw)
To: linuxppc-dev, mpe, linux-nvdimm, dan.j.williams
Cc: Jeff Moyer, msuchanek, Jan Kara
In-Reply-To: <20200610062343.492293-1-aneesh.kumar@linux.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> This patch series enables the use of the new pmem flush and sync instructions on the POWER
> architecture. POWER10 introduces two new variants of dcbf instructions (dcbstps and dcbfps)
> that can be used to write modified locations back to persistent storage. Additionally,
> POWER10 also introduces phwsync and plwsync, which can be used to establish the order of these
> writes to persistent storage.
>
> This series exposes these instructions to the rest of the kernel. The existing
> dcbf and hwsync instructions in P8 and P9 are adequate to enable appropriate
> synchronization with OpenCAPI-hosted persistent storage. Hence the new instructions
> are added as a variant of the old ones that old hardware won't differentiate.
>
> On POWER10, pmem devices will be represented by a different device tree compat
> string. This ensures that older kernels won't initialize pmem devices on POWER10.
>
> W.r.t userspace we want to make sure applications are enabled to use MAP_SYNC only
> if they are using the new instructions. To avoid the wrong usage of MAP_SYNC on
> newer hardware, we disable MAP_SYNC by default on newer hardware. The namespace specific
> attribute /sys/block/pmem0/dax/sync_fault can be used to enable MAP_SYNC later.
>
> With this:
> 1) vPMEM continues to work since it is a volatile region. That
> doesn't need any flush instructions.
>
> 2) pmdk and other user applications get updated to use new instructions
> and updated packages are made available to all distributions
>
> 3) On newer hardware, the device will appear with a new compat string.
> Hence older distributions won't initialize pmem on newer hardware.
>
> 4) If we have a newer kernel with an older distro, we use the per
> namespace sysfs knob that prevents the usage of MAP_SYNC.
>
> 5) Sometime in the future, we set CONFIG_ARCH_MAP_SYNC_DISABLE=n
> on ppc64 when we are confident that everybody is using the new flush
> instruction.
>
> Changes from V4:
> * Add namespace specific synchronous fault control.
>
> Changes from V3:
> * Add new compat string to be used for the device.
> * Use arch_pmem_flush_barrier() in dm-writecache.
>
> Aneesh Kumar K.V (10):
> powerpc/pmem: Restrict papr_scm to P8 and above.
> powerpc/pmem: Add new instructions for persistent storage and sync
> powerpc/pmem: Add flush routines using new pmem store and sync
> instruction
> libnvdimm/nvdimm/flush: Allow architecture to override the flush
> barrier
> powerpc/pmem/of_pmem: Update of_pmem to use the new barrier
> instruction.
> powerpc/pmem: Avoid the barrier in flush routines
> powerpc/book3s/pmem: Add WARN_ONCE to catch the wrong usage of pmem
> flush functions.
> libnvdimm/dax: Add a dax flag to control synchronous fault support
> powerpc/pmem: Disable synchronous fault by default
> powerpc/pmem: Initialize pmem device on newer hardware
>
> arch/powerpc/include/asm/cacheflush.h | 10 ++++
> arch/powerpc/include/asm/ppc-opcode.h | 12 ++++
> arch/powerpc/lib/pmem.c | 46 ++++++++++++--
> arch/powerpc/platforms/Kconfig.cputype | 9 +++
> arch/powerpc/platforms/pseries/papr_scm.c | 31 +++++++++-
> arch/powerpc/platforms/pseries/pmem.c | 6 ++
> drivers/dax/bus.c | 2 +-
> drivers/dax/super.c | 73 +++++++++++++++++++++++
> drivers/md/dm-writecache.c | 2 +-
> drivers/nvdimm/of_pmem.c | 8 +++
> drivers/nvdimm/pmem.c | 4 ++
> drivers/nvdimm/region_devs.c | 24 ++++++--
> include/linux/dax.h | 16 +++++
> include/linux/libnvdimm.h | 8 +++
> mm/Kconfig | 3 +
> 15 files changed, 243 insertions(+), 11 deletions(-)
Ping.
Are we good with the approach here?
-aneesh