Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH 10/13] ARM: dts: exynos: replace to "max-frequecy" instead of "clock-freq-min-max"
From: Krzysztof Kozlowski @ 2016-11-09 20:10 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <72612112-3b79-8fd3-8be4-a8f60ab3b68a@samsung.com>

On Mon, Nov 07, 2016 at 09:38:15AM +0900, Jaehoon Chung wrote:
> On 11/05/2016 12:04 AM, Krzysztof Kozlowski wrote:
> > On Fri, Nov 04, 2016 at 12:19:49PM +0100, Heiko Stuebner wrote:
> >> Hi Jaehoon,
> >>
> >> Am Freitag, 4. November 2016, 19:21:30 CET schrieb Jaehoon Chung:
> >>> On 11/04/2016 03:41 AM, Krzysztof Kozlowski wrote:
> >>>> On Thu, Nov 03, 2016 at 03:21:32PM +0900, Jaehoon Chung wrote:
> >>>>> In drivers/mmc/core/host.c, there is a "max-frequency" property.
> >>>>> It should have the same behavior, so use "max-frequency" instead of
> >>>>> "clock-freq-min-max".
> >>>>>
> >>>>> Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
> >>>>> ---
> >>>>>
> >>>>>  arch/arm/boot/dts/exynos3250-artik5-eval.dts | 2 +-
> >>>>>  arch/arm/boot/dts/exynos3250-artik5.dtsi     | 2 +-
> >>>>>  arch/arm/boot/dts/exynos3250-monk.dts        | 2 +-
> >>>>>  arch/arm/boot/dts/exynos3250-rinato.dts      | 2 +-
> >>>>>  4 files changed, 4 insertions(+), 4 deletions(-)
> >>>>
> >>>> This looks totally independent to rest of patches so it can be applied
> >>>> separately without any functional impact (except lack of minimum
> >>>> frequency). Is that correct?
> >>>
> >>> You're right. I will split the patches. And will resend.
> >>> Thanks!
> >>
> >> I think what Krzysztof was asking was just if he can simply pick up this patch 
> >> alone, as it does not require any of the previous changes.
> >>
> >> Same is true for the Rockchip patches I guess, so we could just take them 
> >> individually into samsung/rockchip dts branches.
> > 
> > Yes, I wanted to get exactly this information. I couldn't find it in
> > cover letter.
> 
> In drivers/mmc/core/host.c, there is already a "max-frequency" property.
> It provides the same functionality as "clock-freq-min-max".
> The minimum clock value can be fixed at 100K, because the MMC core will check clock values from 400K down to 100K.
> But max-frequency can differ.
> If we can use the "max-frequency" property, we don't need the "clock-freq-min-max" property anymore.
> I will resend with "clock-freq-min-max" marked as deprecated instead of removing it.
> 
> If you want to pick this, it's possible to pick. Then I will resend the patches without the dt patches.

Thanks, applied.

Best regards,
Krzysztof
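
The point quoted above, that only the maximum needs to come from the device tree once the minimum is fixed, can be modelled with a small standalone sketch. This is a simplified userspace model, not the kernel code: the names sketch_host, sketch_host_from_max and sketch_init_clock are hypothetical, and the probe list of 400 kHz down to 100 kHz is an assumption taken from the "from 400K to 100K" remark.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed fixed floor, mirroring the "100K" minimum mentioned in the thread. */
#define SKETCH_F_MIN 100000u

/* Assumed initialization frequencies, tried from 400 kHz down to 100 kHz. */
static const unsigned int sketch_init_freqs[] = {
    400000u, 300000u, 200000u, 100000u
};

struct sketch_host {
    unsigned int f_min; /* lowest clock the host may use, in Hz */
    unsigned int f_max; /* highest clock, as given by "max-frequency" */
};

/* Build a host from a single "max-frequency" value; the minimum is fixed. */
static struct sketch_host sketch_host_from_max(unsigned int max_frequency)
{
    struct sketch_host h = { SKETCH_F_MIN, max_frequency };

    assert(max_frequency >= SKETCH_F_MIN);
    return h;
}

/*
 * Return the n-th initialization clock the host would be asked to try,
 * clamped to [f_min, f_max], or 0 once the list is exhausted.
 */
static unsigned int sketch_init_clock(const struct sketch_host *h, size_t n)
{
    size_t count = sizeof(sketch_init_freqs) / sizeof(sketch_init_freqs[0]);
    unsigned int f;

    if (n >= count)
        return 0;

    f = sketch_init_freqs[n];
    if (f < h->f_min)
        f = h->f_min;
    if (f > h->f_max)
        f = h->f_max;
    return f;
}
```

With a fixed floor, no per-board minimum is needed to drive the initialization sequence in this model; only f_max differs between boards.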


* [PATCHv2] PCI: QDF2432 32 bit config space accessors
From: Bjorn Helgaas @ 2016-11-09 20:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <aa37ca67-9909-e8f2-1a3e-b724cdb46c7b@codeaurora.org>

On Wed, Nov 09, 2016 at 02:25:56PM -0500, Christopher Covington wrote:
> Hi Bjorn,
> 
> On 11/02/2016 12:08 PM, Bjorn Helgaas wrote:
> > On Tue, Nov 01, 2016 at 07:06:31AM -0600, cov at codeaurora.org wrote:
> >> Hi Bjorn,
> >>
> >> On 2016-10-31 15:48, Bjorn Helgaas wrote:
> >>> On Wed, Sep 21, 2016 at 06:38:05PM -0400, Christopher Covington wrote:
> >>>> The Qualcomm Technologies QDF2432 SoC does not support accesses
> >>>> smaller
> >>>> than 32 bits to the PCI configuration space. Register the appropriate
> >>>> quirk.
> >>>>
> >>>> Signed-off-by: Christopher Covington <cov@codeaurora.org>
> >>>
> >>> Hi Christopher,
> >>>
> >>> Can you rebase this against v4.9-rc1?  It no longer applies to my tree.
> >>
> >> I apologize for not being clearer. This patch depends on:
> >>
> >> PCI/ACPI: Extend pci_mcfg_lookup() responsibilities
> >> PCI/ACPI: Check platform-specific ECAM quirks
> >>
> >> These patches from Tomasz Nowicki were previously in your pci/ecam-v6
> >> branch, but that seems to have come and gone. How would you like to
> >> proceed?
> > 
> > Oh yes, that's right, I forgot that connection.  I'm afraid I kind of
> > dropped the ball on that thread, so I went back and read through it
> > again.
> > 
> > I *think* the current state is:
> > 
> >   - I'm OK with the first two patches that add the quirk
> >     infrastructure.
> > 
> >   - My issue with the last three patches that add ThunderX quirks is
> >     that there's no generic description of the ECAM address space.
> > 
> > So if I understand correctly, your Qualcomm patch depends only on the
> > first two patches.
> > 
> > Then the question is how the Qualcomm ECAM address space is described.
> > Your quirk overrides the default pci_generic_ecam_ops with the
> > &pci_32b_ops, but it doesn't touch the address space part, so I assume
> > the bus ranges and corresponding address space in your MCFG is
> > correct.  So far, so good.
> > 
> > Is there also an ACPI device that contains that space in _CRS?  I
> > think we concluded that the standard solution is to describe this with
> > a PNP0C02 device.
> > 
> > Would you mind opening a bugzilla at bugzilla.kernel.org and attaching
> > the dmesg log, /proc/iomem, and maybe a DSDT dump?  I'd like to have
> > something to point at to say "if you need an MCFG quirk, you need the
> > MCFG bit and *also* these other related ACPI device bits, and here's
> > how it should be done."
> 
> We're working to add the PNP0C02 resource to future firmware, but it's
> not in the current firmware. Are dmesg and /proc/iomem from the
> current firmware interesting or should we wait for the update to file?

Note that the ECAM space is not the only thing that should be
described via these PNP0C02 devices.  *All* non-enumerable resources
should be described by the _CRS method of some ACPI device.  Here's a
sample from my laptop:

  PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xf8000000-0xfbffffff] (base 0xf8000000)
  system 00:01: [io  0x1800-0x189f] could not be reserved
  system 00:01: [io  0x0800-0x087f] has been reserved
  system 00:01: [io  0x0880-0x08ff] has been reserved
  system 00:01: [io  0x0900-0x097f] has been reserved
  system 00:01: [io  0x0980-0x09ff] has been reserved
  system 00:01: [io  0x0a00-0x0a7f] has been reserved
  system 00:01: [io  0x0a80-0x0aff] has been reserved
  system 00:01: [io  0x0b00-0x0b7f] has been reserved
  system 00:01: [io  0x0b80-0x0bff] has been reserved
  system 00:01: [io  0x15e0-0x15ef] has been reserved
  system 00:01: [io  0x1600-0x167f] has been reserved
  system 00:01: [io  0x1640-0x165f] has been reserved
  system 00:01: [mem 0xf8000000-0xfbffffff] could not be reserved
  system 00:01: [mem 0xfed10000-0xfed13fff] has been reserved
  system 00:01: [mem 0xfed18000-0xfed18fff] has been reserved
  system 00:01: [mem 0xfed19000-0xfed19fff] has been reserved
  system 00:01: [mem 0xfeb00000-0xfebfffff] has been reserved
  system 00:01: [mem 0xfed20000-0xfed3ffff] has been reserved
  system 00:01: [mem 0xfed90000-0xfed93fff] has been reserved
  system 00:01: [mem 0xf7fe0000-0xf7ffffff] has been reserved
  system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)
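
The MMCONFIG line in the sample is consistent with the usual ECAM layout of 1 MiB per bus and 4 KiB per function: 0x40 buses starting at 0xf8000000 end at 0xfbffffff. A minimal standalone sketch of that arithmetic follows; the helper names ecam_offset and ecam_window_end are hypothetical, and the window base is assumed to be the address at which the first bus of the window is mapped.

```c
#include <assert.h>
#include <stdint.h>

#define ECAM_BUS_SHIFT   20 /* 1 MiB of config space per bus */
#define ECAM_DEVFN_SHIFT 12 /* 4 KiB of config space per function */

/*
 * Byte offset of a config register inside an ECAM window, measured from
 * the start of the window (i.e. from the window's first bus).
 */
static uint64_t ecam_offset(uint8_t bus_start, uint8_t bus,
                            uint8_t devfn, uint16_t reg)
{
    assert(bus >= bus_start && reg < 4096);
    return ((uint64_t)(bus - bus_start) << ECAM_BUS_SHIFT) |
           ((uint64_t)devfn << ECAM_DEVFN_SHIFT) | reg;
}

/* Last byte address of a window covering buses bus_start..bus_end. */
static uint64_t ecam_window_end(uint64_t window_base,
                                uint8_t bus_start, uint8_t bus_end)
{
    assert(bus_end >= bus_start);
    return window_base +
           ((uint64_t)(bus_end - bus_start + 1) << ECAM_BUS_SHIFT) - 1;
}
```

For the sample above, ecam_window_end(0xf8000000, 0x00, 0x3f) evaluates to 0xfbffffff, matching the [mem 0xf8000000-0xfbffffff] range that the PNP0c02 device also tries to reserve.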

Do you have firmware in the field that may not get updated?  If so,
I'd like to see the whole solution for that firmware, including the
MCFG quirk (which tells the PCI core where the ECAM region is) and
whatever PNP0C02 quirk you figure out to actually reserve the region.

I proposed a PNP0C02 quirk to Duc along the lines of the below.  I
don't actually know if it's feasible, but it didn't look as bad as I
expected, so I'd kind of like somebody to try it out.  I think you
would have to call this via a DMI hook (do you have DMI on arm64?),
maybe from pnp_init() or similar.

  struct pnp_protocol pnpquirk_protocol = {
    .name = "Plug and Play Quirks",
  };

  void quirk(void)
  {
    struct pnp_dev *dev;
    struct resource res = {};
    int ret;

    ret = pnp_register_protocol(&pnpquirk_protocol);
    if (ret)
      return;

    dev = pnp_alloc_dev(&pnpquirk_protocol, 0, "PNP0C02");
    if (!dev)
      return;

    res.start = XX;          /* ECAM start */
    res.end = YY;            /* ECAM end */
    res.flags = IORESOURCE_MEM;
    pnp_add_resource(dev, &res);

    dev->active = 1;
    pnp_add_device(dev);

    dev_info(&dev->dev, "fabricated device to reserve ECAM space %pR\n", &res);
  }

Bjorn
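
For context on the restriction the quoted patch works around (no config accesses smaller than 32 bits), a sub-word access can be emulated with aligned 32-bit accesses plus shifting and masking. The following is a standalone userspace sketch under that assumption, not the kernel's accessor code; fake_cfg_space and the cfg_* helpers are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for hardware that only accepts aligned 32-bit accesses. */
struct fake_cfg_space {
    uint32_t dword[64]; /* 256 bytes of config space */
};

static uint32_t cfg_read32(const struct fake_cfg_space *c, unsigned int where)
{
    assert((where & 3) == 0 && where < sizeof(c->dword));
    return c->dword[where / 4];
}

static void cfg_write32(struct fake_cfg_space *c, unsigned int where,
                        uint32_t val)
{
    assert((where & 3) == 0 && where < sizeof(c->dword));
    c->dword[where / 4] = val;
}

/* Read 1, 2 or 4 naturally aligned bytes using only an aligned 32-bit read. */
static uint32_t cfg_read(const struct fake_cfg_space *c, unsigned int where,
                         int size)
{
    uint32_t v;

    assert((size == 1 || size == 2 || size == 4) && where % size == 0);
    v = cfg_read32(c, where & ~3u);
    if (size == 4)
        return v;
    return (v >> (8 * (where & 3))) & ((1u << (8 * size)) - 1);
}

/* Write 1, 2 or 4 bytes as a 32-bit read-modify-write. */
static void cfg_write(struct fake_cfg_space *c, unsigned int where, int size,
                      uint32_t val)
{
    unsigned int aligned = where & ~3u;
    unsigned int shift = 8 * (where & 3);
    uint32_t mask, tmp;

    assert((size == 1 || size == 2 || size == 4) && where % size == 0);
    if (size == 4) {
        cfg_write32(c, aligned, val);
        return;
    }
    mask = ((1u << (8 * size)) - 1) << shift;
    tmp = cfg_read32(c, aligned) & ~mask; /* keep the neighbouring bytes */
    tmp |= (val << shift) & mask;
    cfg_write32(c, aligned, tmp);
}
```

One caveat of this approach is that the read-modify-write also writes back the neighbouring bytes of the same dword, which can matter for registers with write-to-clear semantics.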


* [PATCH v3 2/2] ARM: EXYNOS: Remove unused soc_is_exynos{4,5}
From: Krzysztof Kozlowski @ 2016-11-09 20:04 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478693755-11953-3-git-send-email-pankaj.dubey@samsung.com>

On Wed, Nov 09, 2016 at 05:45:55PM +0530, Pankaj Dubey wrote:
> As there are no more users of soc_is_exynos{4,5}, we can safely remove them.
> 
> Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com>
> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
> ---
>  arch/arm/mach-exynos/common.h | 5 -----
>  1 file changed, 5 deletions(-)
>

Thanks, applied.

Best regards,
Krzysztof


* [PATCH v3 1/2] ARM: EXYNOS: Remove static mapping of SCU SFR
From: Krzysztof Kozlowski @ 2016-11-09 20:03 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478693755-11953-2-git-send-email-pankaj.dubey@samsung.com>

On Wed, Nov 09, 2016 at 05:45:54PM +0530, Pankaj Dubey wrote:
> Let's remove the static mapping of the SCU SFR, mainly used in CORTEX-A9 SoC based
> boards. Instead, use the mapping from the device tree node of the SCU.
> 
> NOTE: This patch has dependency on DT file of any such CORTEX-A9 SoC
> based boards, in the absence of SCU device node in DTS file, only single
> CPU will boot. So if you are using OUT-OF-TREE DTS file of CORTEX-A9 based
> Exynos SoC make sure to add SCU device node to DTS file for SMP boot.
> 
> Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com>
> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
> ---
>  arch/arm/mach-exynos/common.h                |  1 +
>  arch/arm/mach-exynos/exynos.c                | 22 ------------------
>  arch/arm/mach-exynos/include/mach/map.h      |  2 --
>  arch/arm/mach-exynos/platsmp.c               | 34 +++++++++++++++++++++-------
>  arch/arm/mach-exynos/pm.c                    |  4 +---
>  arch/arm/mach-exynos/suspend.c               |  4 +---
>  arch/arm/plat-samsung/include/plat/map-s5p.h |  4 ----
>  7 files changed, 29 insertions(+), 42 deletions(-)
>

Applied on a next/soc branch after merging next/dt to preserve
bisectability.

Best regards,
Krzysztof


* Summary of LPC guest MSI discussion in Santa Fe
From: Alex Williamson @ 2016-11-09 20:01 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161109192303.GD15676@cbox>

On Wed, 9 Nov 2016 20:23:03 +0100
Christoffer Dall <christoffer.dall@linaro.org> wrote:

> On Wed, Nov 09, 2016 at 01:59:07PM -0500, Don Dutile wrote:
> > On 11/09/2016 12:03 PM, Will Deacon wrote:  
> > >On Tue, Nov 08, 2016 at 09:52:33PM -0500, Don Dutile wrote:  
> > >>On 11/08/2016 06:35 PM, Alex Williamson wrote:  
> > >>>On Tue, 8 Nov 2016 21:29:22 +0100
> > >>>Christoffer Dall <christoffer.dall@linaro.org> wrote:  
> > >>>>Is my understanding correct, that you need to tell userspace about the
> > >>>>location of the doorbell (in the IOVA space) in case (2), because even
> > >>>>though the configuration of the device is handled by the (host) kernel
> > >>>>through trapping of the BARs, we have to avoid the VFIO user programming
> > >>>>the device to create other DMA transactions to this particular address,
> > >>>>since that will obviously conflict and either not produce the desired
> > >>>>DMA transactions or result in unintended weird interrupts?  
> > >
> > >Yes, that's the crux of the issue.
> > >  
> > >>>Correct, if the MSI doorbell IOVA range overlaps RAM in the VM, then
> > >>>it's potentially a DMA target and we'll get bogus data on DMA read from
> > >>>the device, and lose data and potentially trigger spurious interrupts on
> > >>>DMA write from the device.  Thanks,
> > >>>  
> > >>That's b/c the MSI doorbells are not positioned *above* the SMMU, i.e.,
> > >>they address match before the SMMU checks are done.  if
> > >>all DMA addrs had to go through SMMU first, then the DMA access could
> > >>be ignored/rejected.  
> > >
> > >That's actually not true :( The SMMU can't generally distinguish between MSI
> > >writes and DMA writes, so it would just see a write transaction to the
> > >doorbell address, regardless of how it was generated by the endpoint.
> > >
> > >Will
> > >  
> > So, we have real systems where MSI doorbells are placed at the same IOVA
> > that could have memory for a guest  
> 
> I don't think this is a property of a hardware system.  The problem is
> userspace not knowing where in the IOVA space the kernel is going to
> place the doorbell, so you can end up (basically by chance) with some
> IPA range of guest memory overlapping the IOVA space for the doorbell.
> 
> 
> >, but not at the same IOVA as memory on real hw ?  
> 
> On real hardware without an IOMMU the system designer would have to
> separate the IOVA and RAM in the physical address space.  With an IOMMU,
> the SMMU driver just makes sure to allocate separate regions in the IOVA
> space.
> 
> The challenge, as I understand it, happens with the VM, because the VM
> doesn't allocate the IOVA for the MSI doorbell itself, but the host
> kernel does this, independently from the attributes (e.g. memory map) of
> the VM.
> 
> Because the IOVA is a single resource, but with two independent entities
> allocating chunks of it (the host kernel for the MSI doorbell IOVA, and
> the VFIO user for other DMA operations), you have to provide some
> coordination between those two entities to avoid conflicts.  In the case
> of KVM, the two entities are the host kernel and the VFIO user (QEMU/the
> VM), and the host kernel informs the VFIO user to never attempt to use
> the doorbell IOVA already reserved by the host kernel for DMA.
> 
> One way to do that is to ensure that the IPA space of the VFIO user
> corresponding to the doorbell IOVA is simply not valid, i.e. the reserved
> regions prevent, for example, QEMU from allocating RAM there.
> 
> (I suppose it's technically possible to get around this issue by letting
> QEMU place RAM wherever it wants but tell the guest to never use a
> particular subset of its RAM for DMA, because that would conflict with
> the doorbell IOVA or be seen as p2p transactions.  But I think we all
> probably agree that it's a disgusting idea.)

Well, it's not like QEMU or libvirt stumbling through sysfs to figure
out where holes could be in order to instantiate a VM with matching
holes, just in case someone might decide to hot-add a device into the
VM, at some point, and hopefully they don't migrate the VM to another
host with a different layout first, is all that much less disgusting or
foolproof. It's just that in order to dynamically remove a page as a
possible DMA target we require a paravirt channel, such as a balloon
driver that's able to pluck a specific page.  In some ways it's
actually less disgusting, but it puts some prerequisites on
enlightening the guest OS.  Thanks,

Alex
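
The coordination problem discussed in this thread comes down to an interval check: whoever lays out guest RAM needs to know the reserved doorbell window so that no RAM region overlaps it. A minimal standalone sketch, with hypothetical names (addr_range, ranges_overlap, first_conflict) and no claim about how the kernel or QEMU actually expose or consume this information:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range, e.g. a guest RAM region or a reserved IOVA window. */
struct addr_range {
    uint64_t start;
    uint64_t end;
};

/* Two inclusive ranges overlap when each starts no later than the other ends. */
static bool ranges_overlap(const struct addr_range *a,
                           const struct addr_range *b)
{
    assert(a->start <= a->end && b->start <= b->end);
    return a->start <= b->end && b->start <= a->end;
}

/*
 * Return the index of the first guest RAM region that collides with the
 * reserved window, or -1 if the layout is safe.
 */
static int first_conflict(const struct addr_range *ram, size_t n,
                          const struct addr_range *reserved)
{
    for (size_t i = 0; i < n; i++)
        if (ranges_overlap(&ram[i], reserved))
            return (int)i;
    return -1;
}
```

Any addresses fed to such a check are placeholders here; the difficulty described above is that the real doorbell IOVA is chosen by the host kernel and may differ between hosts.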


* [PATCH 2/2] KVM: ARM64: Fix the issues when guest PMCCFILTR is configured
From: Wei Huang @ 2016-11-09 19:58 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478721480-24852-1-git-send-email-wei@redhat.com>

KVM calls kvm_pmu_set_counter_event_type() when PMCCFILTR is configured.
But this function can't deal with PMCCFILTR correctly because the evtCount
bits of PMCCFILTR, which are reserved as 0, conflict with the SW_INCR event
type of other PMXEVTYPER<n> registers. To fix it, when eventsel == 0, this
function shouldn't return immediately; instead it needs to check further
if select_idx is ARMV8_PMU_CYCLE_IDX.

Another issue is that KVM shouldn't copy the eventsel bits of PMCCFILTR
blindly to attr.config. Instead it ought to convert the request to the
"cpu cycle" event type (i.e. 0x11).

Signed-off-by: Wei Huang <wei@redhat.com>
---
 virt/kvm/arm/pmu.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index 6e9c40e..69ccce3 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -305,7 +305,7 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val)
 			continue;
 		type = vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + i)
 		       & ARMV8_PMU_EVTYPE_EVENT;
-		if ((type == ARMV8_PMU_EVTYPE_EVENT_SW_INCR)
+		if ((type == ARMV8_PMUV3_PERFCTR_SW_INCR)
 		    && (enable & BIT(i))) {
 			reg = vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i) + 1;
 			reg = lower_32_bits(reg);
@@ -379,7 +379,8 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
 	eventsel = data & ARMV8_PMU_EVTYPE_EVENT;
 
 	/* Software increment event does't need to be backed by a perf event */
-	if (eventsel == ARMV8_PMU_EVTYPE_EVENT_SW_INCR)
+	if (eventsel == ARMV8_PMUV3_PERFCTR_SW_INCR &&
+	    select_idx != ARMV8_PMU_CYCLE_IDX)
 		return;
 
 	memset(&attr, 0, sizeof(struct perf_event_attr));
@@ -391,7 +392,8 @@ void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
 	attr.exclude_kernel = data & ARMV8_PMU_EXCLUDE_EL1 ? 1 : 0;
 	attr.exclude_hv = 1; /* Don't count EL2 events */
 	attr.exclude_host = 1; /* Don't count host events */
-	attr.config = eventsel;
+	attr.config = (select_idx == ARMV8_PMU_CYCLE_IDX) ?
+		ARMV8_PMUV3_PERFCTR_CPU_CYCLES : eventsel;
 
 	counter = kvm_pmu_get_counter_value(vcpu, select_idx);
 	/* The initial sample period (overflow count) of an event. */
-- 
2.7.4
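
The two conditions changed by the patch above can be modelled in isolation. This is a standalone sketch rather than the KVM code: the event numbers and the EVENT mask are taken from the patches in this thread, while sketch_event_config and the cycle counter index of 31 (SKETCH_CYCLE_IDX) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARMV8_PMU_EVTYPE_EVENT         0xffff /* mask for EVENT bits */
#define ARMV8_PMUV3_PERFCTR_SW_INCR    0x00
#define ARMV8_PMUV3_PERFCTR_CPU_CYCLES 0x11
#define SKETCH_CYCLE_IDX               31     /* assumed cycle counter index */

/*
 * Decide whether a backing perf event is needed for a guest write of
 * 'data' to the event type register of counter 'select_idx'. Returns
 * false for a genuine SW_INCR event; otherwise stores the event number
 * to request in *config and returns true.
 */
static bool sketch_event_config(uint64_t data, unsigned int select_idx,
                                uint64_t *config)
{
    uint64_t eventsel = data & ARMV8_PMU_EVTYPE_EVENT;

    assert(config != NULL && select_idx <= SKETCH_CYCLE_IDX);

    /* A zero event field only means SW_INCR on the event counters. */
    if (eventsel == ARMV8_PMUV3_PERFCTR_SW_INCR &&
        select_idx != SKETCH_CYCLE_IDX)
        return false;

    /* The cycle counter always counts CPU cycles, whatever the low bits say. */
    *config = (select_idx == SKETCH_CYCLE_IDX) ?
              ARMV8_PMUV3_PERFCTR_CPU_CYCLES : eventsel;
    return true;
}
```

In this model a PMCCFILTR write with a zero event field still yields a CPU cycles event (0x11), whereas the same value written for an ordinary counter is treated as SW_INCR and needs no perf event.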


* [PATCH 1/2] arm64: perf: Move ARMv8 PMU perf event definitions to asm/perf_event.h
From: Wei Huang @ 2016-11-09 19:57 UTC (permalink / raw)
  To: linux-arm-kernel

This patch moves ARMv8-related perf event definitions from perf_event.c
to asm/perf_event.h, so KVM code can use them directly. This also helps
remove a duplicated definition of SW_INCR in perf_event.h.

Signed-off-by: Wei Huang <wei@redhat.com>
---
 arch/arm64/include/asm/perf_event.h | 161 +++++++++++++++++++++++++++++++++++-
 arch/arm64/kernel/perf_event.c      | 161 ------------------------------------
 2 files changed, 160 insertions(+), 162 deletions(-)

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 2065f46..6c7b18b 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -46,7 +46,166 @@
 #define	ARMV8_PMU_EVTYPE_MASK	0xc800ffff	/* Mask for writable bits */
 #define	ARMV8_PMU_EVTYPE_EVENT	0xffff		/* Mask for EVENT bits */
 
-#define ARMV8_PMU_EVTYPE_EVENT_SW_INCR	0	/* Software increment event */
+/*
+ * ARMv8 PMUv3 Performance Events handling code.
+ * Common event types.
+ */
+
+/* Required events. */
+#define ARMV8_PMUV3_PERFCTR_SW_INCR				0x00
+#define ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL			0x03
+#define ARMV8_PMUV3_PERFCTR_L1D_CACHE				0x04
+#define ARMV8_PMUV3_PERFCTR_BR_MIS_PRED				0x10
+#define ARMV8_PMUV3_PERFCTR_CPU_CYCLES				0x11
+#define ARMV8_PMUV3_PERFCTR_BR_PRED				0x12
+
+/* At least one of the following is required. */
+#define ARMV8_PMUV3_PERFCTR_INST_RETIRED			0x08
+#define ARMV8_PMUV3_PERFCTR_INST_SPEC				0x1B
+
+/* Common architectural events. */
+#define ARMV8_PMUV3_PERFCTR_LD_RETIRED				0x06
+#define ARMV8_PMUV3_PERFCTR_ST_RETIRED				0x07
+#define ARMV8_PMUV3_PERFCTR_EXC_TAKEN				0x09
+#define ARMV8_PMUV3_PERFCTR_EXC_RETURN				0x0A
+#define ARMV8_PMUV3_PERFCTR_CID_WRITE_RETIRED			0x0B
+#define ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED			0x0C
+#define ARMV8_PMUV3_PERFCTR_BR_IMMED_RETIRED			0x0D
+#define ARMV8_PMUV3_PERFCTR_BR_RETURN_RETIRED			0x0E
+#define ARMV8_PMUV3_PERFCTR_UNALIGNED_LDST_RETIRED		0x0F
+#define ARMV8_PMUV3_PERFCTR_TTBR_WRITE_RETIRED			0x1C
+#define ARMV8_PMUV3_PERFCTR_CHAIN				0x1E
+#define ARMV8_PMUV3_PERFCTR_BR_RETIRED				0x21
+
+/* Common microarchitectural events. */
+#define ARMV8_PMUV3_PERFCTR_L1I_CACHE_REFILL			0x01
+#define ARMV8_PMUV3_PERFCTR_L1I_TLB_REFILL			0x02
+#define ARMV8_PMUV3_PERFCTR_L1D_TLB_REFILL			0x05
+#define ARMV8_PMUV3_PERFCTR_MEM_ACCESS				0x13
+#define ARMV8_PMUV3_PERFCTR_L1I_CACHE				0x14
+#define ARMV8_PMUV3_PERFCTR_L1D_CACHE_WB			0x15
+#define ARMV8_PMUV3_PERFCTR_L2D_CACHE				0x16
+#define ARMV8_PMUV3_PERFCTR_L2D_CACHE_REFILL			0x17
+#define ARMV8_PMUV3_PERFCTR_L2D_CACHE_WB			0x18
+#define ARMV8_PMUV3_PERFCTR_BUS_ACCESS				0x19
+#define ARMV8_PMUV3_PERFCTR_MEMORY_ERROR			0x1A
+#define ARMV8_PMUV3_PERFCTR_BUS_CYCLES				0x1D
+#define ARMV8_PMUV3_PERFCTR_L1D_CACHE_ALLOCATE			0x1F
+#define ARMV8_PMUV3_PERFCTR_L2D_CACHE_ALLOCATE			0x20
+#define ARMV8_PMUV3_PERFCTR_BR_MIS_PRED_RETIRED			0x22
+#define ARMV8_PMUV3_PERFCTR_STALL_FRONTEND			0x23
+#define ARMV8_PMUV3_PERFCTR_STALL_BACKEND			0x24
+#define ARMV8_PMUV3_PERFCTR_L1D_TLB				0x25
+#define ARMV8_PMUV3_PERFCTR_L1I_TLB				0x26
+#define ARMV8_PMUV3_PERFCTR_L2I_CACHE				0x27
+#define ARMV8_PMUV3_PERFCTR_L2I_CACHE_REFILL			0x28
+#define ARMV8_PMUV3_PERFCTR_L3D_CACHE_ALLOCATE			0x29
+#define ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL			0x2A
+#define ARMV8_PMUV3_PERFCTR_L3D_CACHE				0x2B
+#define ARMV8_PMUV3_PERFCTR_L3D_CACHE_WB			0x2C
+#define ARMV8_PMUV3_PERFCTR_L2D_TLB_REFILL			0x2D
+#define ARMV8_PMUV3_PERFCTR_L2I_TLB_REFILL			0x2E
+#define ARMV8_PMUV3_PERFCTR_L2D_TLB				0x2F
+#define ARMV8_PMUV3_PERFCTR_L2I_TLB				0x30
+
+/* ARMv8 recommended implementation defined event types */
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_RD			0x40
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_WR			0x41
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_REFILL_RD		0x42
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_REFILL_WR		0x43
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_REFILL_INNER		0x44
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_REFILL_OUTER		0x45
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_WB_VICTIM		0x46
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_WB_CLEAN			0x47
+#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_INVAL			0x48
+
+#define ARMV8_IMPDEF_PERFCTR_L1D_TLB_REFILL_RD			0x4C
+#define ARMV8_IMPDEF_PERFCTR_L1D_TLB_REFILL_WR			0x4D
+#define ARMV8_IMPDEF_PERFCTR_L1D_TLB_RD				0x4E
+#define ARMV8_IMPDEF_PERFCTR_L1D_TLB_WR				0x4F
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_RD			0x50
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_WR			0x51
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_REFILL_RD		0x52
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_REFILL_WR		0x53
+
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_WB_VICTIM		0x56
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_WB_CLEAN			0x57
+#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_INVAL			0x58
+
+#define ARMV8_IMPDEF_PERFCTR_L2D_TLB_REFILL_RD			0x5C
+#define ARMV8_IMPDEF_PERFCTR_L2D_TLB_REFILL_WR			0x5D
+#define ARMV8_IMPDEF_PERFCTR_L2D_TLB_RD				0x5E
+#define ARMV8_IMPDEF_PERFCTR_L2D_TLB_WR				0x5F
+
+#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_RD			0x60
+#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_WR			0x61
+#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_SHARED			0x62
+#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_NOT_SHARED		0x63
+#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_NORMAL			0x64
+#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_PERIPH			0x65
+
+#define ARMV8_IMPDEF_PERFCTR_MEM_ACCESS_RD			0x66
+#define ARMV8_IMPDEF_PERFCTR_MEM_ACCESS_WR			0x67
+#define ARMV8_IMPDEF_PERFCTR_UNALIGNED_LD_SPEC			0x68
+#define ARMV8_IMPDEF_PERFCTR_UNALIGNED_ST_SPEC			0x69
+#define ARMV8_IMPDEF_PERFCTR_UNALIGNED_LDST_SPEC		0x6A
+
+#define ARMV8_IMPDEF_PERFCTR_LDREX_SPEC				0x6C
+#define ARMV8_IMPDEF_PERFCTR_STREX_PASS_SPEC			0x6D
+#define ARMV8_IMPDEF_PERFCTR_STREX_FAIL_SPEC			0x6E
+#define ARMV8_IMPDEF_PERFCTR_STREX_SPEC				0x6F
+#define ARMV8_IMPDEF_PERFCTR_LD_SPEC				0x70
+#define ARMV8_IMPDEF_PERFCTR_ST_SPEC				0x71
+#define ARMV8_IMPDEF_PERFCTR_LDST_SPEC				0x72
+#define ARMV8_IMPDEF_PERFCTR_DP_SPEC				0x73
+#define ARMV8_IMPDEF_PERFCTR_ASE_SPEC				0x74
+#define ARMV8_IMPDEF_PERFCTR_VFP_SPEC				0x75
+#define ARMV8_IMPDEF_PERFCTR_PC_WRITE_SPEC			0x76
+#define ARMV8_IMPDEF_PERFCTR_CRYPTO_SPEC			0x77
+#define ARMV8_IMPDEF_PERFCTR_BR_IMMED_SPEC			0x78
+#define ARMV8_IMPDEF_PERFCTR_BR_RETURN_SPEC			0x79
+#define ARMV8_IMPDEF_PERFCTR_BR_INDIRECT_SPEC			0x7A
+
+#define ARMV8_IMPDEF_PERFCTR_ISB_SPEC				0x7C
+#define ARMV8_IMPDEF_PERFCTR_DSB_SPEC				0x7D
+#define ARMV8_IMPDEF_PERFCTR_DMB_SPEC				0x7E
+
+#define ARMV8_IMPDEF_PERFCTR_EXC_UNDEF				0x81
+#define ARMV8_IMPDEF_PERFCTR_EXC_SVC				0x82
+#define ARMV8_IMPDEF_PERFCTR_EXC_PABORT				0x83
+#define ARMV8_IMPDEF_PERFCTR_EXC_DABORT				0x84
+
+#define ARMV8_IMPDEF_PERFCTR_EXC_IRQ				0x86
+#define ARMV8_IMPDEF_PERFCTR_EXC_FIQ				0x87
+#define ARMV8_IMPDEF_PERFCTR_EXC_SMC				0x88
+
+#define ARMV8_IMPDEF_PERFCTR_EXC_HVC				0x8A
+#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_PABORT			0x8B
+#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_DABORT			0x8C
+#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_OTHER			0x8D
+#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_IRQ			0x8E
+#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_FIQ			0x8F
+#define ARMV8_IMPDEF_PERFCTR_RC_LD_SPEC				0x90
+#define ARMV8_IMPDEF_PERFCTR_RC_ST_SPEC				0x91
+
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_RD			0xA0
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_WR			0xA1
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_REFILL_RD		0xA2
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_REFILL_WR		0xA3
+
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_WB_VICTIM		0xA6
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_WB_CLEAN			0xA7
+#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_INVAL			0xA8
+
+/* ARMv8 Cortex-A53 specific event types. */
+#define ARMV8_A53_PERFCTR_PREF_LINEFILL				0xC2
+
+/* ARMv8 Cavium ThunderX specific event types. */
+#define ARMV8_THUNDER_PERFCTR_L1D_CACHE_MISS_ST			0xE9
+#define ARMV8_THUNDER_PERFCTR_L1D_CACHE_PREF_ACCESS		0xEA
+#define ARMV8_THUNDER_PERFCTR_L1D_CACHE_PREF_MISS		0xEB
+#define ARMV8_THUNDER_PERFCTR_L1I_CACHE_PREF_ACCESS		0xEC
+#define ARMV8_THUNDER_PERFCTR_L1I_CACHE_PREF_MISS		0xED
 
 /*
  * Event filters for PMUv3
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index a9310a6..108ba40 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -29,167 +29,6 @@
 #include <linux/perf/arm_pmu.h>
 #include <linux/platform_device.h>
 
-/*
- * ARMv8 PMUv3 Performance Events handling code.
- * Common event types.
- */
-
-/* Required events. */
-#define ARMV8_PMUV3_PERFCTR_SW_INCR				0x00
-#define ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL			0x03
-#define ARMV8_PMUV3_PERFCTR_L1D_CACHE				0x04
-#define ARMV8_PMUV3_PERFCTR_BR_MIS_PRED				0x10
-#define ARMV8_PMUV3_PERFCTR_CPU_CYCLES				0x11
-#define ARMV8_PMUV3_PERFCTR_BR_PRED				0x12
-
-/* At least one of the following is required. */
-#define ARMV8_PMUV3_PERFCTR_INST_RETIRED			0x08
-#define ARMV8_PMUV3_PERFCTR_INST_SPEC				0x1B
-
-/* Common architectural events. */
-#define ARMV8_PMUV3_PERFCTR_LD_RETIRED				0x06
-#define ARMV8_PMUV3_PERFCTR_ST_RETIRED				0x07
-#define ARMV8_PMUV3_PERFCTR_EXC_TAKEN				0x09
-#define ARMV8_PMUV3_PERFCTR_EXC_RETURN				0x0A
-#define ARMV8_PMUV3_PERFCTR_CID_WRITE_RETIRED			0x0B
-#define ARMV8_PMUV3_PERFCTR_PC_WRITE_RETIRED			0x0C
-#define ARMV8_PMUV3_PERFCTR_BR_IMMED_RETIRED			0x0D
-#define ARMV8_PMUV3_PERFCTR_BR_RETURN_RETIRED			0x0E
-#define ARMV8_PMUV3_PERFCTR_UNALIGNED_LDST_RETIRED		0x0F
-#define ARMV8_PMUV3_PERFCTR_TTBR_WRITE_RETIRED			0x1C
-#define ARMV8_PMUV3_PERFCTR_CHAIN				0x1E
-#define ARMV8_PMUV3_PERFCTR_BR_RETIRED				0x21
-
-/* Common microarchitectural events. */
-#define ARMV8_PMUV3_PERFCTR_L1I_CACHE_REFILL			0x01
-#define ARMV8_PMUV3_PERFCTR_L1I_TLB_REFILL			0x02
-#define ARMV8_PMUV3_PERFCTR_L1D_TLB_REFILL			0x05
-#define ARMV8_PMUV3_PERFCTR_MEM_ACCESS				0x13
-#define ARMV8_PMUV3_PERFCTR_L1I_CACHE				0x14
-#define ARMV8_PMUV3_PERFCTR_L1D_CACHE_WB			0x15
-#define ARMV8_PMUV3_PERFCTR_L2D_CACHE				0x16
-#define ARMV8_PMUV3_PERFCTR_L2D_CACHE_REFILL			0x17
-#define ARMV8_PMUV3_PERFCTR_L2D_CACHE_WB			0x18
-#define ARMV8_PMUV3_PERFCTR_BUS_ACCESS				0x19
-#define ARMV8_PMUV3_PERFCTR_MEMORY_ERROR			0x1A
-#define ARMV8_PMUV3_PERFCTR_BUS_CYCLES				0x1D
-#define ARMV8_PMUV3_PERFCTR_L1D_CACHE_ALLOCATE			0x1F
-#define ARMV8_PMUV3_PERFCTR_L2D_CACHE_ALLOCATE			0x20
-#define ARMV8_PMUV3_PERFCTR_BR_MIS_PRED_RETIRED			0x22
-#define ARMV8_PMUV3_PERFCTR_STALL_FRONTEND			0x23
-#define ARMV8_PMUV3_PERFCTR_STALL_BACKEND			0x24
-#define ARMV8_PMUV3_PERFCTR_L1D_TLB				0x25
-#define ARMV8_PMUV3_PERFCTR_L1I_TLB				0x26
-#define ARMV8_PMUV3_PERFCTR_L2I_CACHE				0x27
-#define ARMV8_PMUV3_PERFCTR_L2I_CACHE_REFILL			0x28
-#define ARMV8_PMUV3_PERFCTR_L3D_CACHE_ALLOCATE			0x29
-#define ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL			0x2A
-#define ARMV8_PMUV3_PERFCTR_L3D_CACHE				0x2B
-#define ARMV8_PMUV3_PERFCTR_L3D_CACHE_WB			0x2C
-#define ARMV8_PMUV3_PERFCTR_L2D_TLB_REFILL			0x2D
-#define ARMV8_PMUV3_PERFCTR_L2I_TLB_REFILL			0x2E
-#define ARMV8_PMUV3_PERFCTR_L2D_TLB				0x2F
-#define ARMV8_PMUV3_PERFCTR_L2I_TLB				0x30
-
-/* ARMv8 recommended implementation defined event types */
-#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_RD			0x40
-#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_WR			0x41
-#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_REFILL_RD		0x42
-#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_REFILL_WR		0x43
-#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_REFILL_INNER		0x44
-#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_REFILL_OUTER		0x45
-#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_WB_VICTIM		0x46
-#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_WB_CLEAN			0x47
-#define ARMV8_IMPDEF_PERFCTR_L1D_CACHE_INVAL			0x48
-
-#define ARMV8_IMPDEF_PERFCTR_L1D_TLB_REFILL_RD			0x4C
-#define ARMV8_IMPDEF_PERFCTR_L1D_TLB_REFILL_WR			0x4D
-#define ARMV8_IMPDEF_PERFCTR_L1D_TLB_RD				0x4E
-#define ARMV8_IMPDEF_PERFCTR_L1D_TLB_WR				0x4F
-#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_RD			0x50
-#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_WR			0x51
-#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_REFILL_RD		0x52
-#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_REFILL_WR		0x53
-
-#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_WB_VICTIM		0x56
-#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_WB_CLEAN			0x57
-#define ARMV8_IMPDEF_PERFCTR_L2D_CACHE_INVAL			0x58
-
-#define ARMV8_IMPDEF_PERFCTR_L2D_TLB_REFILL_RD			0x5C
-#define ARMV8_IMPDEF_PERFCTR_L2D_TLB_REFILL_WR			0x5D
-#define ARMV8_IMPDEF_PERFCTR_L2D_TLB_RD				0x5E
-#define ARMV8_IMPDEF_PERFCTR_L2D_TLB_WR				0x5F
-
-#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_RD			0x60
-#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_WR			0x61
-#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_SHARED			0x62
-#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_NOT_SHARED		0x63
-#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_NORMAL			0x64
-#define ARMV8_IMPDEF_PERFCTR_BUS_ACCESS_PERIPH			0x65
-
-#define ARMV8_IMPDEF_PERFCTR_MEM_ACCESS_RD			0x66
-#define ARMV8_IMPDEF_PERFCTR_MEM_ACCESS_WR			0x67
-#define ARMV8_IMPDEF_PERFCTR_UNALIGNED_LD_SPEC			0x68
-#define ARMV8_IMPDEF_PERFCTR_UNALIGNED_ST_SPEC			0x69
-#define ARMV8_IMPDEF_PERFCTR_UNALIGNED_LDST_SPEC		0x6A
-
-#define ARMV8_IMPDEF_PERFCTR_LDREX_SPEC				0x6C
-#define ARMV8_IMPDEF_PERFCTR_STREX_PASS_SPEC			0x6D
-#define ARMV8_IMPDEF_PERFCTR_STREX_FAIL_SPEC			0x6E
-#define ARMV8_IMPDEF_PERFCTR_STREX_SPEC				0x6F
-#define ARMV8_IMPDEF_PERFCTR_LD_SPEC				0x70
-#define ARMV8_IMPDEF_PERFCTR_ST_SPEC				0x71
-#define ARMV8_IMPDEF_PERFCTR_LDST_SPEC				0x72
-#define ARMV8_IMPDEF_PERFCTR_DP_SPEC				0x73
-#define ARMV8_IMPDEF_PERFCTR_ASE_SPEC				0x74
-#define ARMV8_IMPDEF_PERFCTR_VFP_SPEC				0x75
-#define ARMV8_IMPDEF_PERFCTR_PC_WRITE_SPEC			0x76
-#define ARMV8_IMPDEF_PERFCTR_CRYPTO_SPEC			0x77
-#define ARMV8_IMPDEF_PERFCTR_BR_IMMED_SPEC			0x78
-#define ARMV8_IMPDEF_PERFCTR_BR_RETURN_SPEC			0x79
-#define ARMV8_IMPDEF_PERFCTR_BR_INDIRECT_SPEC			0x7A
-
-#define ARMV8_IMPDEF_PERFCTR_ISB_SPEC				0x7C
-#define ARMV8_IMPDEF_PERFCTR_DSB_SPEC				0x7D
-#define ARMV8_IMPDEF_PERFCTR_DMB_SPEC				0x7E
-
-#define ARMV8_IMPDEF_PERFCTR_EXC_UNDEF				0x81
-#define ARMV8_IMPDEF_PERFCTR_EXC_SVC				0x82
-#define ARMV8_IMPDEF_PERFCTR_EXC_PABORT				0x83
-#define ARMV8_IMPDEF_PERFCTR_EXC_DABORT				0x84
-
-#define ARMV8_IMPDEF_PERFCTR_EXC_IRQ				0x86
-#define ARMV8_IMPDEF_PERFCTR_EXC_FIQ				0x87
-#define ARMV8_IMPDEF_PERFCTR_EXC_SMC				0x88
-
-#define ARMV8_IMPDEF_PERFCTR_EXC_HVC				0x8A
-#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_PABORT			0x8B
-#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_DABORT			0x8C
-#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_OTHER			0x8D
-#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_IRQ			0x8E
-#define ARMV8_IMPDEF_PERFCTR_EXC_TRAP_FIQ			0x8F
-#define ARMV8_IMPDEF_PERFCTR_RC_LD_SPEC				0x90
-#define ARMV8_IMPDEF_PERFCTR_RC_ST_SPEC				0x91
-
-#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_RD			0xA0
-#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_WR			0xA1
-#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_REFILL_RD		0xA2
-#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_REFILL_WR		0xA3
-
-#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_WB_VICTIM		0xA6
-#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_WB_CLEAN			0xA7
-#define ARMV8_IMPDEF_PERFCTR_L3D_CACHE_INVAL			0xA8
-
-/* ARMv8 Cortex-A53 specific event types. */
-#define ARMV8_A53_PERFCTR_PREF_LINEFILL				0xC2
-
-/* ARMv8 Cavium ThunderX specific event types. */
-#define ARMV8_THUNDER_PERFCTR_L1D_CACHE_MISS_ST			0xE9
-#define ARMV8_THUNDER_PERFCTR_L1D_CACHE_PREF_ACCESS		0xEA
-#define ARMV8_THUNDER_PERFCTR_L1D_CACHE_PREF_MISS		0xEB
-#define ARMV8_THUNDER_PERFCTR_L1I_CACHE_PREF_ACCESS		0xEC
-#define ARMV8_THUNDER_PERFCTR_L1I_CACHE_PREF_MISS		0xED
-
 /* PMUv3 HW events mapping. */
 
 /*
-- 
2.7.4

* [PATCH] arm64: mm: Fix memmap to be initialized for the entire section
From: Robert Richter @ 2016-11-09 19:51 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161107210514.GP20591@arm.com>

Will,

On 07.11.16 21:05:14, Will Deacon wrote:
> Just to reiterate here, but your patch as it stands will break other parts
> of the kernel. For example, acpi_os_ioremap relies on being able to ioremap
> these regions afaict.
> 
> I think any solution involving pfn_valid is just going to move the crash
> around.

Let me describe the crash in more detail.

The following ranges are marked nomap (full EFI map below):

[    0.000000] efi:   0x010ffffb9000-0x010ffffccfff [Runtime Code       |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000] efi:   0x010ffffcd000-0x010fffffefff [Runtime Data       |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC]*

The memory belongs to these nodes:

[    0.000000] NUMA: Adding memblock [0x1400000 - 0xfffffffff] on node 0
[    0.000000] NUMA: Adding memblock [0x10000400000 - 0x10fffffffff] on node 1

The following mem ranges are created:

[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000001400000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x0000010fffffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000001400000-0x00000000fffdffff]
[    0.000000]   node   0: [mem 0x00000000fffe0000-0x00000000ffffffff]
[    0.000000]   node   0: [mem 0x0000000100000000-0x0000000fffffffff]
[    0.000000]   node   1: [mem 0x0000010000400000-0x0000010ff9e7ffff]
[    0.000000]   node   1: [mem 0x0000010ff9e80000-0x0000010ff9f1ffff]
[    0.000000]   node   1: [mem 0x0000010ff9f20000-0x0000010ffaeaffff]
[    0.000000]   node   1: [mem 0x0000010ffaeb0000-0x0000010ffaffffff]
[    0.000000]   node   1: [mem 0x0000010ffb000000-0x0000010ffffaffff]
[    0.000000]   node   1: [mem 0x0000010ffffb0000-0x0000010fffffffff]

The last range 0x0000010ffffb0000-0x0000010fffffffff is correctly
marked as a nomap area.

Paging is then initialized in free_area_init_core() (mm/page_alloc.c)
which calls memmap_init_zone(). This initializes all pages (struct
page) of the zones for each node except for pfns from nomap memory. It
uses early_pfn_valid() which is bound to pfn_valid().

In 4.4 *all* pages of a zone were initialized. Now pages from nomap
ranges are skipped, and e.g. pfn 0x10ffffb has an uninitialized struct
page.  Note that nomap mem ranges are still part of the memblock list
and also part of the zone ranges within start_pfn and end_pfn, but
those don't have a valid struct page now. IMO this is a bug.

Later on all mappable memory is reserved and all pages of it are freed
by adding them to the free-pages list except for the reserved
memblocks.

Now the initrd is loaded. This reserves free memory by calling
move_freepages_block(). A block has the size of pageblock_nr_pages
which is 0x2000 in my configuration. The kernel chooses the block

 0x0000010fe0000000-0x0000010fffffffff

that also contains the nomap region around 10ffb000000.
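
The block boundaries quoted above can be re-derived with a few lines of
C. This is only a sketch for checking the arithmetic, not kernel code:
the sketch_* names are made up for illustration, and the 64 KiB page
size (shift of 16) is an assumption inferred from the 0x10000 alignment
of the memblock ranges in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64 KiB pages, inferred from the 0x10000 alignment in the log. */
#define SKETCH_PAGE_SHIFT      16
/* pageblock_nr_pages as quoted in the mail. */
#define SKETCH_PAGEBLOCK_PAGES 0x2000ULL

static uint64_t sketch_phys_to_pfn(uint64_t phys)
{
	return phys >> SKETCH_PAGE_SHIFT;
}

static uint64_t sketch_pfn_to_phys(uint64_t pfn)
{
	return pfn << SKETCH_PAGE_SHIFT;
}

/* First pfn of the pageblock containing pfn (block size is a power of two). */
static uint64_t sketch_block_start_pfn(uint64_t pfn)
{
	return pfn & ~(SKETCH_PAGEBLOCK_PAGES - 1);
}

/* Last pfn of the pageblock containing pfn. */
static uint64_t sketch_block_end_pfn(uint64_t pfn)
{
	return sketch_block_start_pfn(pfn) + SKETCH_PAGEBLOCK_PAGES - 1;
}

/* Checks the numbers quoted in the mail; aborts via assert() on mismatch. */
static void sketch_check_mail_numbers(void)
{
	uint64_t nomap_pfn = sketch_phys_to_pfn(0x10ffffb0000ULL);

	assert(nomap_pfn == 0x10ffffbULL);
	assert(sketch_block_start_pfn(nomap_pfn) == 0x10fe000ULL);
	assert(sketch_block_end_pfn(nomap_pfn) == 0x10fffffULL);
	assert(sketch_pfn_to_phys(0x10fe000ULL) == 0x10fe0000000ULL);
}
```

Under that page-size assumption, the first nomap pfn 0x10ffffb falls
into the pageblock spanning pfns 0x10fe000 to 0x10fffff, which matches
the physical range listed above.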

In move_freepages() the pages of the zone are freed. The code accesses
pfn #10fffff but the struct page is uninitialized and thus node is 0.
This is different to #10fe000 and the zone which is node 1. This
causes the BUG_ON() to trigger:

[    4.173521] Unpacking initramfs...
[    8.420710] ------------[ cut here ]------------
[    8.425344] kernel BUG at mm/page_alloc.c:1844!
[    8.429870] Internal error: Oops - BUG: 0 [#1] SMP
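
The failure mode described above can be modelled in user space. The
following is a toy model under stated assumptions, not the code in
mm/page_alloc.c: struct sketch_page and both functions are hypothetical,
and the check compares node ids as the description above does, while the
real check in move_freepages() may be expressed differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; a zeroed entry reads as node 0. */
struct sketch_page {
	int nid;
	bool initialized;
};

/*
 * Toy memmap init for npages pfns on node nid. Pfns in
 * [nomap_start, nomap_end) are nomap; with skip_nomap set they are left
 * untouched, mimicking the behaviour described for kernels after 4.4.
 */
static void sketch_memmap_init(struct sketch_page *map, size_t npages, int nid,
			       size_t nomap_start, size_t nomap_end,
			       bool skip_nomap)
{
	for (size_t i = 0; i < npages; i++) {
		bool nomap = i >= nomap_start && i < nomap_end;

		if (skip_nomap && nomap)
			continue;	/* entry stays zeroed: nid == 0 */
		map[i].nid = nid;
		map[i].initialized = true;
	}
}

/* True when the first and last page of a block agree on their node. */
static bool sketch_block_in_one_node(const struct sketch_page *map,
				     size_t start, size_t end)
{
	assert(start <= end);
	return map[start].nid == map[end].nid;
}
```

With a zero-filled array for node 1 and the tail of the block marked
nomap, sketch_block_in_one_node() fails when nomap pfns are skipped and
succeeds when every pfn is initialized, which is the difference between
the two behaviours discussed here.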

I believe this is not the only case where the mm code relies on a
valid struct page for all pfns of a zone, thus modifying mm code to
make this case work is not an option. (E.g. this could be done by
moving the early_pfn_valid() check at the beginning of the loop in
memmap_init_zone().)

So for a fix we need to change early_pfn_valid() which is mapped to
pfn_valid() for SPARSEMEM. Using a different function than pfn_valid()
for this does not look reasonable. The only option is to revert
pfn_valid() to its original behaviour.

My fix now uses memblock_is_memory() again for pfn_valid() instead of
memblock_is_map_memory(). So I needed to change some users of
pfn_valid() to use memblock_is_map_memory() where necessary. This is
only required for arch specific code, all other uses of pfn_valid()
are safe to use memblock_is_memory().
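
To make the difference between the two predicates concrete, here is a
minimal user-space sketch. It is not the memblock implementation: the
sketch_* names, the region table layout and the SKETCH_NOMAP flag are
assumptions standing in for the kernel's memblock regions and
MEMBLOCK_NOMAP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NOMAP 0x1u	/* stands in for MEMBLOCK_NOMAP */

struct sketch_region {
	uint64_t base;
	uint64_t size;
	unsigned int flags;
};

/* Returns the region containing addr, or NULL if addr is not memory. */
static const struct sketch_region *
sketch_find(const struct sketch_region *regs, size_t n, uint64_t addr)
{
	assert(regs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (addr >= regs[i].base && addr - regs[i].base < regs[i].size)
			return &regs[i];
	return NULL;
}

/* Analogue of memblock_is_memory(): any memory region, nomap or not. */
static bool sketch_is_memory(const struct sketch_region *regs, size_t n,
			     uint64_t addr)
{
	return sketch_find(regs, n, addr) != NULL;
}

/* Analogue of memblock_is_map_memory(): memory that is not flagged nomap. */
static bool sketch_is_map_memory(const struct sketch_region *regs, size_t n,
				 uint64_t addr)
{
	const struct sketch_region *reg = sketch_find(regs, n, addr);

	return reg != NULL && !(reg->flags & SKETCH_NOMAP);
}
```

For an address inside the nomap range at 0x10ffffb0000, the first
predicate would return true and the second false; for an MMIO address
that is not in the table at all, both return false.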

Thus, I don't see where my patch breaks code. Even acpi_os_ioremap()
keeps the same behaviour as before since it still uses
memblock_is_memory(). Could you describe in more detail why you think
this patch breaks the kernel and moves the problem somewhere else? I
believe it fixes the problem entirely.

Thanks,

-Robert



[    0.000000] efi: Processing EFI memory map:
[    0.000000] MEMBLOCK configuration:
[    0.000000]  memory size = 0x1ffe800000 reserved size = 0x10537
[    0.000000]  memory.cnt  = 0x2
[    0.000000]  memory[0x0]     [0x00000001400000-0x00000fffffffff], 0xffec00000 bytes flags: 0x0
[    0.000000]  memory[0x1]     [0x00010000400000-0x00010fffffffff], 0xfffc00000 bytes flags: 0x0
[    0.000000]  reserved.cnt  = 0x1
[    0.000000]  reserved[0x0]   [0x00000021200000-0x00000021210536], 0x10537 bytes flags: 0x0
[    0.000000] efi:   0x000001400000-0x00000147ffff [Conventional Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00000001400000-0x0000000147ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x000001480000-0x0000024affff [Loader Data        |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00000001480000-0x000000024affff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x0000024b0000-0x0000211fffff [Conventional Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x000000024b0000-0x000000211fffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x000021200000-0x00002121ffff [Loader Data        |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00000021200000-0x0000002121ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x000021220000-0x0000fffebfff [Conventional Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00000021220000-0x000000fffeffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x0000fffec000-0x0000ffff5fff [ACPI Reclaim Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000] memblock_add: [0x000000fffe0000-0x000000ffffffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x0000ffff6000-0x0000ffff6fff [ACPI Memory NVS    |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000] memblock_add: [0x000000ffff0000-0x000000ffffffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x0000ffff7000-0x0000ffffffff [ACPI Reclaim Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000] memblock_add: [0x000000ffff0000-0x000000ffffffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x000100000000-0x000ff7ffffff [Conventional Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00000100000000-0x00000ff7ffffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x000ff8000000-0x000ff801ffff [Boot Data          |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00000ff8000000-0x00000ff801ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x000ff8020000-0x000fffa9efff [Conventional Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00000ff8020000-0x00000fffa9ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x000fffa9f000-0x000fffffffff [Boot Data          |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00000fffa90000-0x00000fffffffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010000400000-0x010f812b3fff [Conventional Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010000400000-0x00010f812bffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010f812b4000-0x010f812b6fff [Loader Data        |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010f812b0000-0x00010f812bffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010f812b7000-0x010f822e6fff [Loader Code        |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010f812b0000-0x00010f822effff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010f822e7000-0x010f822f6fff [Loader Data        |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010f822e0000-0x00010f822fffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010f822f7000-0x010f82342fff [Boot Data          |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010f822f0000-0x00010f8234ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010f82343000-0x010f91e73fff [Loader Data        |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010f82340000-0x00010f91e7ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010f91e74000-0x010f91e74fff [Boot Data          |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010f91e70000-0x00010f91e7ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010f91e75000-0x010f92c98fff [Loader Data        |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010f91e70000-0x00010f92c9ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010f92c99000-0x010f93880fff [Boot Data          |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010f92c90000-0x00010f9388ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010f93881000-0x010ff7880fff [Loader Data        |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010f93880000-0x00010ff788ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ff7881000-0x010ff7886fff [Boot Data          |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010ff7880000-0x00010ff788ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ff7887000-0x010ff78a3fff [Loader Code        |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010ff7880000-0x00010ff78affff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ff78a4000-0x010ff9e8dfff [Boot Data          |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010ff78a0000-0x00010ff9e8ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ff9e8e000-0x010ff9f16fff [Runtime Data       |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000] memblock_add: [0x00010ff9e80000-0x00010ff9f1ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ff9f17000-0x010ffaeb5fff [Boot Data          |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010ff9f10000-0x00010ffaebffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ffaeb6000-0x010ffafc8fff [Runtime Data       |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000] memblock_add: [0x00010ffaeb0000-0x00010ffafcffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ffafc9000-0x010ffafccfff [Runtime Code       |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000] memblock_add: [0x00010ffafc0000-0x00010ffafcffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ffafcd000-0x010ffaff4fff [Runtime Data       |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000] memblock_add: [0x00010ffafc0000-0x00010ffaffffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ffaff5000-0x010ffb008fff [Conventional Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010ffaff0000-0x00010ffb00ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ffb009000-0x010fffe28fff [Boot Data          |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010ffb000000-0x00010fffe2ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010fffe29000-0x010fffe3ffff [Conventional Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010fffe20000-0x00010fffe3ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010fffe40000-0x010fffe53fff [Loader Data        |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010fffe40000-0x00010fffe5ffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010fffe54000-0x010ffffb8fff [Boot Code          |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010fffe50000-0x00010ffffbffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ffffb9000-0x010ffffccfff [Runtime Code       |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000] memblock_add: [0x00010ffffb0000-0x00010ffffcffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ffffcd000-0x010fffffefff [Runtime Data       |RUN|  |  |  |  |  |  |   |WB|WT|WC|UC]*
[    0.000000] memblock_add: [0x00010ffffc0000-0x00010fffffffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x010ffffff000-0x010fffffffff [Boot Data          |   |  |  |  |  |  |  |   |WB|WT|WC|UC]
[    0.000000] memblock_add: [0x00010fffff0000-0x00010fffffffff] flags 0x0 early_init_dt_add_memory_arch+0x54/0x5c
[    0.000000] efi:   0x804000001000-0x804000001fff [Memory Mapped I/O  |RUN|  |  |  |  |  |  |   |  |  |  |UC]
[    0.000000] efi:   0x87e0d0001000-0x87e0d0001fff [Memory Mapped I/O  |RUN|  |  |  |  |  |  |   |  |  |  |UC]
[    0.000000] memblock_reserve: [0x00010f812b0000-0x00010f812bffff] flags 0x0 efi_init+0x208/0x2c8
[    0.000000] memblock_add: [0x00010f82340000-0x00010f91e7ffff] flags 0x0 arm64_memblock_init+0x16c/0x248
[    0.000000] memblock_reserve: [0x00010f82340000-0x00010f91e7ffff] flags 0x0 arm64_memblock_init+0x178/0x248
[    0.000000] memblock_reserve: [0x00000001480000-0x000000024affff] flags 0x0 arm64_memblock_init+0x1b0/0x248
[    0.000000] memblock_reserve: [0x00010f82343000-0x00010f91e73613] flags 0x0 arm64_memblock_init+0x1cc/0x248
[    0.000000] memblock_reserve: [0x000000c0000000-0x000000dfffffff] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] cma: Reserved 512 MiB at 0x00000000c0000000
[    0.000000] memblock_reserve: [0x00010ffffa0000-0x00010ffffaffff] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] memblock_reserve: [0x00010ffff90000-0x00010ffff9ffff] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] memblock_reserve: [0x00010ffff80000-0x00010ffff8ffff] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] memblock_reserve: [0x00010ffff70000-0x00010ffff7ffff] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] memblock_reserve: [0x00010ffff60000-0x00010ffff6ffff] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] memblock_reserve: [0x00010ffff50000-0x00010ffff5ffff] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] memblock_reserve: [0x00010ffff40000-0x00010ffff4ffff] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] memblock_reserve: [0x00010ffff30000-0x00010ffff3ffff] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000]    memblock_free: [0x00010ffffa0000-0x00010ffffaffff] paging_init+0x5c0/0x610
[    0.000000]    memblock_free: [0x00000002490000-0x000000024affff] paging_init+0x5f4/0x610
[    0.000000] memblock_reserve: [0x00010fffefed80-0x00010ffff2fffb] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] memblock_reserve: [0x00010ffffaffd0-0x00010ffffafffe] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] memblock_reserve: [0x00010ffffaffa0-0x00010ffffaffce] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] memblock_reserve: [0x00010ffffa0000-0x00010ffffa000f] flags 0x0 numa_init+0x88/0x3f0
[    0.000000] NUMA: Adding memblock [0x1400000 - 0xfffffffff] on node 0
[    0.000000] NUMA: Adding memblock [0x10000400000 - 0x10fffffffff] on node 1
[    0.000000] NUMA: parsing numa-distance-map-v1
[    0.000000] NUMA: Initmem setup node 0 [mem 0x01400000-0xfffffffff]
[    0.000000] memblock_reserve: [0x00000fffffe500-0x00000fffffffff] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] NUMA: NODE_DATA [mem 0xfffffe500-0xfffffffff]
[    0.000000] NUMA: Initmem setup node 1 [mem 0x10000400000-0x10fffffffff]
[    0.000000] memblock_reserve: [0x00010ffffae480-0x00010ffffaff7f] flags 0x0 memblock_alloc_range_nid+0x30/0x58
[    0.000000] NUMA: NODE_DATA [mem 0x10ffffae480-0x10ffffaff7f]
...
[    8.420710] ------------[ cut here ]------------
[    8.425344] kernel BUG at mm/page_alloc.c:1844!
[    8.429870] Internal error: Oops - BUG: 0 [#1] SMP
[    8.434654] Modules linked in:
[    8.437712] CPU: 72 PID: 1 Comm: swapper/0 Tainted: G        W       4.8.1.4.vanilla10-00007-g9eb3a76b8b88 #111
[    8.447788] Hardware name: www.cavium.com ThunderX CRB-2S/ThunderX CRB-2S, BIOS 0.3 Sep 13 2016
[    8.456474] task: ffff800fee626800 task.stack: ffff800fec02c000
[    8.462404] PC is at move_freepages+0x198/0x1b0
[    8.466924] LR is at move_freepages_block+0xa8/0xb8
[    8.471791] pc : [<ffff0000081e5dd0>] lr : [<ffff0000081e5e90>] pstate: 000000c5
[    8.479174] sp : ffff800fec02f4d0
[    8.482478] x29: ffff800fec02f4d0 x28: 0000000000000001 
[    8.487786] x27: 0000000000000000 x26: 0000000000000000 
[    8.493093] x25: 000000000000000a x24: ffff7fe043fd0020 
[    8.498401] x23: ffff810ffffaec00 x22: 0000000000000000 
[    8.503708] x21: ffff7fe043ffffc0 x20: 0000000000000000 
[    8.509015] x19: ffff7fe043f80000 x18: ffff8100102f2548 
[    8.514322] x17: 0000000000000000 x16: 0000000100000000 
[    8.519630] x15: 0000000000000007 x14: 0000010200000000 
[    8.524938] x13: 0004a41400000000 x12: 00032c920000001a 
[    8.530245] x11: 0000010200000000 x10: 0004a40800000000 
[    8.535553] x9 : 0000000000000040 x8 : 0000000000000000 
[    8.540860] x7 : 0000000000000000 x6 : ffff810ffffaf120 
[    8.546168] x5 : 0000000000000001 x4 : ffff000008d70c48 
[    8.551475] x3 : ffff810ffffae480 x2 : 0000000000000000 
[    8.556782] x1 : 0000000000000001 x0 : ffff810ffffaec00 
[    8.562089] 
[    8.563571] Process swapper/0 (pid: 1, stack limit = 0xffff800fec02c020)
[    8.570261] Stack: (0xffff800fec02f4d0 to 0xffff800fec030000)
...
[    9.279652] Call trace:
[    9.282091] Exception stack(0xffff800fec02f300 to 0xffff800fec02f430)
[    9.288520] f300: ffff7fe043f80000 0001000000000000 ffff800fec02f4d0 ffff0000081e5dd0
[    9.296339] f320: 0000000000000001 ffff800fee626800 0000000000000000 00000000026012d0
[    9.304157] f340: 0000000100000000 ffffffffffffffff 0000000000000130 0000000000000040
[    9.311976] f360: 0000001200000000 ffff000008cf8198 0000000100000000 0000000000000001
[    9.319794] f380: 00000000026012d0 ffff810ff9a1e8e8 0000000000000000 0000000000000040
[    9.327613] f3a0: ffff810ffffaec00 0000000000000001 0000000000000000 ffff810ffffae480
[    9.335431] f3c0: ffff000008d70c48 0000000000000001 ffff810ffffaf120 0000000000000000
[    9.343250] f3e0: 0000000000000000 0000000000000040 0004a40800000000 0000010200000000
[    9.351068] f400: 00032c920000001a 0004a41400000000 0000010200000000 0000000000000007
[    9.358885] f420: 0000000100000000 0000000000000000
[    9.363756] [<ffff0000081e5dd0>] move_freepages+0x198/0x1b0
[    9.369318] [<ffff0000081e5e90>] move_freepages_block+0xa8/0xb8
[    9.375227] [<ffff0000081e657c>] __rmqueue+0x604/0x650
[    9.380355] [<ffff0000081e7898>] get_page_from_freelist+0x3f0/0xb88
[    9.386612] [<ffff0000081e860c>] __alloc_pages_nodemask+0x12c/0xce8
[    9.392876] [<ffff00000823dc1c>] alloc_page_interleave+0x64/0xc0
[    9.398873] [<ffff00000823e2f8>] alloc_pages_current+0x108/0x168
[    9.404870] [<ffff0000081de1cc>] __page_cache_alloc+0x104/0x140
[    9.410779] [<ffff0000081de31c>] pagecache_get_page+0x114/0x300
[    9.416688] [<ffff0000081de550>] grab_cache_page_write_begin+0x48/0x68
[    9.423207] [<ffff00000828e5f8>] simple_write_begin+0x40/0x150
[    9.429029] [<ffff0000081ddfe0>] generic_perform_write+0xb8/0x1a0
[    9.435112] [<ffff0000081df8c8>] __generic_file_write_iter+0x170/0x1b0
[    9.441628] [<ffff0000081df9d4>] generic_file_write_iter+0xcc/0x1c8
[    9.447893] [<ffff000008262eb4>] __vfs_write+0xcc/0x140
[    9.453108] [<ffff000008263b84>] vfs_write+0xa4/0x1c0
[    9.458150] [<ffff000008264bc4>] SyS_write+0x54/0xb0
[    9.463108] [<ffff000008be1fe4>] xwrite+0x34/0x7c
[    9.467801] [<ffff000008be20c8>] do_copy+0x9c/0xf4
[    9.472583] [<ffff000008be1da4>] write_buffer+0x34/0x50
[    9.477796] [<ffff000008be1e08>] flush_buffer+0x48/0xb4
[    9.483016] [<ffff000008c0fe08>] __gunzip+0x288/0x334
[    9.488057] [<ffff000008c0fecc>] gunzip+0x18/0x20
[    9.492750] [<ffff000008be26bc>] unpack_to_rootfs+0x168/0x284
[    9.498486] [<ffff000008be2848>] populate_rootfs+0x70/0x138
[    9.504051] [<ffff000008082ff4>] do_one_initcall+0x44/0x138
[    9.509613] [<ffff000008be0d04>] kernel_init_freeable+0x1ac/0x24c
[    9.515699] [<ffff000008847f20>] kernel_init+0x20/0xf8
[    9.520826] [<ffff000008082b80>] ret_from_fork+0x10/0x50
[    9.526129] Code: 910a0021 9400b47d d4210000 d503201f (d4210000) 
[    9.532233] ---[ end trace c3040dccdcf12d3a ]---
[    9.536889] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

* [PATCHv2] PCI: QDF2432 32 bit config space accessors
From: Christopher Covington @ 2016-11-09 19:25 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161102160820.GA6568@bhelgaas-glaptop.roam.corp.google.com>

Hi Bjorn,

On 11/02/2016 12:08 PM, Bjorn Helgaas wrote:
> On Tue, Nov 01, 2016 at 07:06:31AM -0600, cov at codeaurora.org wrote:
>> Hi Bjorn,
>>
>> On 2016-10-31 15:48, Bjorn Helgaas wrote:
>>> On Wed, Sep 21, 2016 at 06:38:05PM -0400, Christopher Covington wrote:
>>>> The Qualcomm Technologies QDF2432 SoC does not support accesses
>>>> smaller
>>>> than 32 bits to the PCI configuration space. Register the appropriate
>>>> quirk.
>>>>
>>>> Signed-off-by: Christopher Covington <cov@codeaurora.org>
>>>
>>> Hi Christopher,
>>>
>>> Can you rebase this against v4.9-rc1?  It no longer applies to my tree.
>>
>> I apologize for not being clearer. This patch depends on:
>>
>> PCI/ACPI: Extend pci_mcfg_lookup() responsibilities
>> PCI/ACPI: Check platform-specific ECAM quirks
>>
>> These patches from Tomasz Nowicki were previously in your pci/ecam-v6
>> branch, but that seems to have come and gone. How would you like to
>> proceed?
> 
> Oh yes, that's right, I forgot that connection.  I'm afraid I kind of
> dropped the ball on that thread, so I went back and read through it
> again.
> 
> I *think* the current state is:
> 
>   - I'm OK with the first two patches that add the quirk
>     infrastructure.
> 
>   - My issue with the last three patches that add ThunderX quirks is
>     that there's no generic description of the ECAM address space.
> 
> So if I understand correctly, your Qualcomm patch depends only on the
> first two patches.
> 
> Then the question is how the Qualcomm ECAM address space is described.
> Your quirk overrides the default pci_generic_ecam_ops with the
> &pci_32b_ops, but it doesn't touch the address space part, so I assume
> the bus ranges and corresponding address space in your MCFG is
> correct.  So far, so good.
> 
> Is there also an ACPI device that contains that space in _CRS?  I
> think we concluded that the standard solution is to describe this with
> a PNP0C02 device.
> 
> Would you mind opening a bugzilla at bugzilla.kernel.org and attaching
> the dmesg log, /proc/iomem, and maybe a DSDT dump?  I'd like to have
> something to point at to say "if you need an MCFG quirk, you need the
> MCFG bit and *also* these other related ACPI device bits, and here's
> how it should be done."

We're working to add the PNP0C02 resource to future firmware, but it's
not in the current firmware. Are dmesg and /proc/iomem from the
current firmware interesting or should we wait for the update to file?

Thanks,
Cov

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code
Aurora Forum, a Linux Foundation Collaborative Project.

* Summary of LPC guest MSI discussion in Santa Fe
From: Christoffer Dall @ 2016-11-09 19:23 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <582371FB.2040808@redhat.com>

On Wed, Nov 09, 2016 at 01:59:07PM -0500, Don Dutile wrote:
> On 11/09/2016 12:03 PM, Will Deacon wrote:
> >On Tue, Nov 08, 2016 at 09:52:33PM -0500, Don Dutile wrote:
> >>On 11/08/2016 06:35 PM, Alex Williamson wrote:
> >>>On Tue, 8 Nov 2016 21:29:22 +0100
> >>>Christoffer Dall <christoffer.dall@linaro.org> wrote:
> >>>>Is my understanding correct, that you need to tell userspace about the
> >>>>location of the doorbell (in the IOVA space) in case (2), because even
> >>>>though the configuration of the device is handled by the (host) kernel
> >>>>through trapping of the BARs, we have to avoid the VFIO user programming
> >>>>the device to create other DMA transactions to this particular address,
> >>>>since that will obviously conflict and either not produce the desired
> >>>>DMA transactions or result in unintended weird interrupts?
> >
> >Yes, that's the crux of the issue.
> >
> >>>Correct, if the MSI doorbell IOVA range overlaps RAM in the VM, then
> >>>it's potentially a DMA target and we'll get bogus data on DMA read from
> >>>the device, and lose data and potentially trigger spurious interrupts on
> >>>DMA write from the device.  Thanks,
> >>>
> >>That's b/c the MSI doorbells are not positioned *above* the SMMU, i.e.,
> >>they address match before the SMMU checks are done.  if
> >>all DMA addrs had to go through SMMU first, then the DMA access could
> >>be ignored/rejected.
> >
> >That's actually not true :( The SMMU can't generally distinguish between MSI
> >writes and DMA writes, so it would just see a write transaction to the
> >doorbell address, regardless of how it was generated by the endpoint.
> >
> >Will
> >
> So, we have real systems where MSI doorbells are placed at the same IOVA
> that could have memory for a guest

I don't think this is a property of a hardware system.  The problem is
userspace not knowing where in the IOVA space the kernel is going to
place the doorbell, so you can end up (basically by chance) with some
IPA range of guest memory overlapping the IOVA space for the doorbell.


>, but not at the same IOVA as memory on real hw ?

On real hardware without an IOMMU the system designer would have to
separate the IOVA and RAM in the physical address space.  With an IOMMU,
the SMMU driver just makes sure to allocate separate regions in the IOVA
space.

The challenge, as I understand it, happens with the VM, because the VM
doesn't allocate the IOVA for the MSI doorbell itself, but the host
kernel does this, independently from the attributes (e.g. memory map) of
the VM.

Because the IOVA is a single resource, but with two independent entities
allocating chunks of it (the host kernel for the MSI doorbell IOVA, and
the VFIO user for other DMA operations), you have to provide some
coordination between those two entities to avoid conflicts.  In the case
of KVM, the two entities are the host kernel and the VFIO user (QEMU/the
VM), and the host kernel informs the VFIO user to never attempt to use
the doorbell IOVA already reserved by the host kernel for DMA.

One way to do that is to ensure that the IPA space of the VFIO user
corresponding to the doorbell IOVA is simply not valid, i.e. reserved
regions that prevent, for example, QEMU from allocating RAM there.
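
As a rough illustration of the coordination described above, the check
the VFIO user would need boils down to an interval-overlap test between
its RAM slots and the doorbell window reported by the host. This is a
sketch only: struct sketch_range and both functions are hypothetical,
and no existing VFIO or QEMU interface is implied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_range {
	uint64_t start;	/* first address */
	uint64_t end;	/* last address, inclusive */
};

/* Two inclusive ranges overlap when each starts before the other ends. */
static bool sketch_overlaps(struct sketch_range a, struct sketch_range b)
{
	assert(a.start <= a.end && b.start <= b.end);
	return a.start <= b.end && b.start <= a.end;
}

/* True when none of the n guest RAM slots touches the doorbell window. */
static bool sketch_layout_avoids_doorbell(const struct sketch_range *ram,
					  size_t n,
					  struct sketch_range doorbell)
{
	for (size_t i = 0; i < n; i++)
		if (sketch_overlaps(ram[i], doorbell))
			return false;
	return true;
}
```

A guest memory map would be acceptable only if this returns true for
the reserved doorbell IOVA; any slot for which it fails would have to be
moved or split around the window.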

(I suppose it's technically possible to get around this issue by letting
QEMU place RAM wherever it wants but tell the guest to never use a
particular subset of its RAM for DMA, because that would conflict with
the doorbell IOVA or be seen as p2p transactions.  But I think we all
probably agree that it's a disgusting idea.)

> How are memory holes passed to SMMU so it doesn't have this issue for bare-metal
> (assign an IOVA that overlaps an MSI doorbell address)?
> 

As I understand it, the SMMU driver manages the whole IOVA space when
VFIO is *not* involved, so it simply allocates non-overlapping regions.

The problem occurs when you have two independent entities essentially
attempting to manage the same resource (and the problem is exacerbated by
the VM potentially allocating slots in the IOVA space which may have
other limitations it doesn't know about, for example the p2p regions,
because the VM doesn't know anything about the topology of the
underlying physical system).

Christoffer

* PM regression with LED changes in next-20161109
From: Tony Lindgren @ 2016-11-09 19:23 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

Looks like commit 883d32ce3385 ("leds: core: Add support for poll()ing
the sysfs brightness attr for changes.") breaks runtime PM for me.

On my omap dm3730 based test system, idle power consumption is over 70
times higher now with this patch! It goes from about 6mW for the core
system to over 440mW during idle meaning there's some busy timer now
active.

Reverting this patch fixes the issue. Any ideas?

Regards,

Tony

* [PATCH 2/3] ipmi/bt-bmc: maintain a request expiry list
From: Cédric Le Goater @ 2016-11-09 19:08 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1e0187c4-d503-ce4a-3d4c-cf21f0bffb96@acm.org>

On 11/09/2016 04:52 PM, Corey Minyard wrote:
> On 11/09/2016 08:30 AM, Cédric Le Goater wrote:
>> On 11/07/2016 08:04 PM, Corey Minyard wrote:
>>> On 11/02/2016 02:57 AM, Cédric Le Goater wrote:
>>>> Regarding the response expiration handling, the IPMI spec says :
>>>>
>>>>      The BMC must not return a given response once the corresponding
>>>>      Request-to-Response interval has passed. The BMC can ensure this
>>>>      by maintaining its own internal list of outstanding requests through
>>>>      the interface. The BMC could age and expire the entries in the list
>>>>      by expiring the entries at an interval that is somewhat shorter than
>>>>      the specified Request-to-Response interval....
>>>>
>>>> To handle such a case, we maintain a list of received requests using the
>>>> seq number of the BT message to identify them. The list is updated
>>>> each time a request is received and a response is sent. The expiration
>>>> of the responses is handled at each update but also with a timer.
>>> This looks correct, but it seems awfully complicated.
>>>
>>> Why can't you get the current time before the wait_event_interruptible()
>>> and then compare the time before you do the write?  That would seem to
>>> accomplish the same thing without any lists or extra locks.
>> Well, the expiry list needs a request identifier and it is currently using
>> the Seq byte for this purpose. So the BT message needs to be read to grab
>> that byte. The request is added to a list and that involves some locking.
>>
>> When the response is written, the first matching request is removed from
>> the list and a garbage collector loop is also run. Then, as we might not
>> get any responses to run that loop, we use a timer to empty the list from
>> any expired requests.
>>
>> The read/write ops of the driver are protected with a mutex, the list and
>> the timer add their share of locking. That could have been done with RCU
>> surely but we don't really need performance in this driver.
>>
>> Caveats :
>>
>> bt_bmc_remove_request() should not be done in the writing loop though.
>> It needs a fix.
>>
>> The request identifier is currently Seq but the spec says we should use
>> Seq, NetFn, and Command or an internal Seq value as a request identifier.
>> Google is also working on an OEM/Group extension (Brendan in CC: )
>> which has a different message format. I need to look closer at what
>> should be done in this case.
> 
> I'm still not sure why the list is necessary.  You have a separate
> thread of execution for each writer, why not just time it in that
> thread?

No, we don't in the current design. This is only a single process 
acting as a proxy and dispatching commands on dbus to other
processes doing whatever they need to do. So the request/responses 
can interlace. 

The current daemon already handles an expiry list but I thought it 
would be better to move it into the kernel to have a better response
time. The BMC can be quite slow when busy. It seems that keeping
the logic in user space is better. So let's have it that way. Not
a problem.
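
The user-space expiry list mentioned above could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the daemon's actual code: the names `expiry_list`, `expiry_add()`, `expiry_collect()`, `expiry_take()` and the capacity `MAX_PENDING` are all hypothetical, requests are identified by the Seq byte only, and time is a plain tick counter supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 16	/* assumed capacity, not taken from the thread */

struct pending_req {
	bool     in_use;
	uint8_t  seq;		/* BT Seq byte identifying the request */
	uint64_t deadline;	/* tick at which the request expires */
};

struct expiry_list {
	struct pending_req slot[MAX_PENDING];
};

/* Record a received request; returns false when the table is full. */
bool expiry_add(struct expiry_list *l, uint8_t seq, uint64_t now, uint64_t ttl)
{
	assert(l != NULL);
	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (!l->slot[i].in_use) {
			l->slot[i] = (struct pending_req){ true, seq, now + ttl };
			return true;
		}
	}
	return false;
}

/* Drop every entry whose deadline has passed; returns how many were dropped. */
size_t expiry_collect(struct expiry_list *l, uint64_t now)
{
	size_t dropped = 0;

	assert(l != NULL);
	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (l->slot[i].in_use && now >= l->slot[i].deadline) {
			l->slot[i].in_use = false;
			dropped++;
		}
	}
	return dropped;
}

/*
 * Called before a response is sent: expire stale entries, then remove the
 * first entry matching seq. Returns true if the response may still be sent.
 */
bool expiry_take(struct expiry_list *l, uint8_t seq, uint64_t now)
{
	expiry_collect(l, now);
	for (size_t i = 0; i < MAX_PENDING; i++) {
		if (l->slot[i].in_use && l->slot[i].seq == seq) {
			l->slot[i].in_use = false;
			return true;
		}
	}
	return false;
}
```

A periodic timer would call `expiry_collect()` so that entries are dropped even when no response ever arrives, which is the role the timer plays in the description quoted earlier.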

> What about the following, not even compile-tested, patch?  I'm
> sure my mailer will munge this up, I can send you a clean version
> if you like.

No it is ok. I will give your fix a try on our system and resend.

Thanks,

C.  

 
> From 1a73585a9c1c74ac1d59d82f22e05b30447619a6 Mon Sep 17 00:00:00 2001
> From: Corey Minyard <cminyard@mvista.com>
> Date: Wed, 9 Nov 2016 09:07:48 -0600
> Subject: [PATCH] ipmi:bt-bmc: Fix a multi-user race, time out responses
> 
> The IPMI spec says to time out responses after a given amount of
> time, so don't let a writer send something after an amount of time
> has elapsed.
> 
> Also, fix a race condition in the same area where if you have two
> writers at the same time, one can get an EIO return when it should
> still be waiting its turn to send.  A mutex_lock_interruptible_timeout()
> would be handy here, but it doesn't exist.
> 
> Signed-off-by: Corey Minyard <cminyard@mvista.com>
> ---
>  drivers/char/ipmi/bt-bmc.c | 39 ++++++++++++++++++++++++---------------
>  1 file changed, 24 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/char/ipmi/bt-bmc.c b/drivers/char/ipmi/bt-bmc.c
> index b49e613..5be94cf 100644
> --- a/drivers/char/ipmi/bt-bmc.c
> +++ b/drivers/char/ipmi/bt-bmc.c
> @@ -57,6 +57,8 @@
> 
>  #define BT_BMC_BUFFER_SIZE 256
> 
> +#define BT_BMC_RESPONSE_TIME_JIFFIES    (5 * HZ)
> +
>  struct bt_bmc {
>      struct device        dev;
>      struct miscdevice    miscdev;
> @@ -190,14 +192,12 @@ static ssize_t bt_bmc_read(struct file *file, char __user *buf,
> 
>      WARN_ON(*ppos);
> 
> -    if (wait_event_interruptible(bt_bmc->queue,
> -                     bt_inb(bt_bmc, BT_CTRL) & BT_CTRL_H2B_ATN))
> +    if (mutex_lock_interruptible(&bt_bmc->mutex))
>          return -ERESTARTSYS;
> 
> -    mutex_lock(&bt_bmc->mutex);
> -
> -    if (unlikely(!(bt_inb(bt_bmc, BT_CTRL) & BT_CTRL_H2B_ATN))) {
> -        ret = -EIO;
> +    if (wait_event_interruptible(bt_bmc->queue,
> +                bt_inb(bt_bmc, BT_CTRL) & BT_CTRL_H2B_ATN)) {
> +        ret = -ERESTARTSYS;
>          goto out_unlock;
>      }
> 
> @@ -251,6 +251,7 @@ static ssize_t bt_bmc_write(struct file *file, const char __user *buf,
>      u8 kbuffer[BT_BMC_BUFFER_SIZE];
>      ssize_t ret = 0;
>      ssize_t nwritten;
> +    unsigned long start_jiffies = jiffies, wait_time;
> 
>      /*
>       * send a minimum response size
> @@ -263,23 +264,31 @@ static ssize_t bt_bmc_write(struct file *file, const char __user *buf,
> 
>      WARN_ON(*ppos);
> 
> +    if (mutex_lock_interruptible(&bt_bmc->mutex))
> +        return -ERESTARTSYS;
> +
> +    wait_time = jiffies - start_jiffies;
> +    if (wait_time >= BT_BMC_RESPONSE_TIME_JIFFIES) {
> +        ret = -ETIMEDOUT;
> +        goto out_unlock;
> +    }
> +    wait_time = BT_BMC_RESPONSE_TIME_JIFFIES - wait_time;
> +
>      /*
>       * There's no interrupt for clearing bmc busy so we have to
>       * poll
>       */
> -    if (wait_event_interruptible(bt_bmc->queue,
> +    ret = wait_event_interruptible_timeout(bt_bmc->queue,
>                       !(bt_inb(bt_bmc, BT_CTRL) &
> -                       (BT_CTRL_H_BUSY | BT_CTRL_B2H_ATN))))
> -        return -ERESTARTSYS;
> -
> -    mutex_lock(&bt_bmc->mutex);
> -
> -    if (unlikely(bt_inb(bt_bmc, BT_CTRL) &
> -             (BT_CTRL_H_BUSY | BT_CTRL_B2H_ATN))) {
> -        ret = -EIO;
> +                       (BT_CTRL_H_BUSY | BT_CTRL_B2H_ATN)),
> +                     wait_time);
> +    if (ret <= 0) {
> +        if (ret == 0)
> +            ret = -ETIMEDOUT;
>          goto out_unlock;
>      }
> 
> +    ret = 0;
>      clr_wr_ptr(bt_bmc);
> 
>      while (count) {
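
The timing logic in the write path of the patch above can be sketched in user space as follows. This is a minimal illustration, not the driver code: `response_budget_left()` and `RESPONSE_BUDGET_TICKS` are hypothetical names, and a caller-supplied tick counter stands in for `jiffies`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed budget: 5 seconds at 1000 ticks per second. */
#define RESPONSE_BUDGET_TICKS 5000UL

/*
 * Returns 0 and stores the ticks still available in *remaining, or
 * -ETIMEDOUT when the budget is already used up. Unsigned subtraction
 * keeps the elapsed time correct even if the tick counter wrapped
 * between 'start' and 'now'.
 */
int response_budget_left(unsigned long start, unsigned long now,
			 unsigned long *remaining)
{
	unsigned long elapsed = now - start;

	assert(remaining != NULL);
	if (elapsed >= RESPONSE_BUDGET_TICKS)
		return -ETIMEDOUT;

	*remaining = RESPONSE_BUDGET_TICKS - elapsed;
	return 0;
}
```

The value left in `*remaining` corresponds to what the patch hands to `wait_event_interruptible_timeout()` after the mutex has been taken, so time spent waiting for the lock counts against the same budget.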

^ permalink raw reply

* Summary of LPC guest MSI discussion in Santa Fe
From: Don Dutile @ 2016-11-09 18:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161109170326.GG17771@arm.com>

On 11/09/2016 12:03 PM, Will Deacon wrote:
> On Tue, Nov 08, 2016 at 09:52:33PM -0500, Don Dutile wrote:
>> On 11/08/2016 06:35 PM, Alex Williamson wrote:
>>> On Tue, 8 Nov 2016 21:29:22 +0100
>>> Christoffer Dall <christoffer.dall@linaro.org> wrote:
>>>> Is my understanding correct, that you need to tell userspace about the
>>>> location of the doorbell (in the IOVA space) in case (2), because even
>>>> though the configuration of the device is handled by the (host) kernel
>>>> through trapping of the BARs, we have to avoid the VFIO user programming
>>>> the device to create other DMA transactions to this particular address,
>>>> since that will obviously conflict and either not produce the desired
>>>> DMA transactions or result in unintended weird interrupts?
>
> Yes, that's the crux of the issue.
>
>>> Correct, if the MSI doorbell IOVA range overlaps RAM in the VM, then
>>> it's potentially a DMA target and we'll get bogus data on DMA read from
>>> the device, and lose data and potentially trigger spurious interrupts on
>>> DMA write from the device.  Thanks,
>>>
>> That's b/c the MSI doorbells are not positioned *above* the SMMU, i.e.,
>> they address match before the SMMU checks are done.  if
>> all DMA addrs had to go through SMMU first, then the DMA access could
>> be ignored/rejected.
>
> That's actually not true :( The SMMU can't generally distinguish between MSI
> writes and DMA writes, so it would just see a write transaction to the
> doorbell address, regardless of how it was generated by the endpoint.
>
> Will
>
So, we have real systems where MSI doorbells are placed at the same IOVA
that could have memory for a guest, but not at the same IOVA as memory on real hw ?
How are memory holes passed to SMMU so it doesn't have this issue for bare-metal
(assign an IOVA that overlaps an MSI doorbell address)?
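
The conflict discussed above comes down to an interval-overlap check between the guest's RAM in IOVA space and the doorbell window. A minimal sketch, assuming half-open ranges that do not wrap; `struct iova_range` and `iova_ranges_overlap()` are hypothetical names, not an existing kernel or VFIO interface:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open IOVA range [base, base + size). */
struct iova_range {
	uint64_t base;
	uint64_t size;
};

bool iova_ranges_overlap(struct iova_range a, struct iova_range b)
{
	/* Sketch only: assumes non-empty ranges that do not wrap past 2^64. */
	assert(a.size > 0 && b.size > 0);
	return a.base < b.base + b.size && b.base < a.base + a.size;
}
```

User space would run such a check for each guest memory region against the doorbell range reported by the kernel, and avoid placing RAM where the check returns true.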

^ permalink raw reply

* [PATCH 00/13] mmc: dw_mmc: cleans the codes for dwmmc controller
From: Heiko Stuebner @ 2016-11-09 18:55 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161103062135.10697-1-jh80.chung@samsung.com>

On Thursday, 3 November 2016, 15:21:22 CET, Jaehoon Chung wrote:
> This patchset makes some minor fixes and cleans up the code.
> If the patches need to be split, I will re-send them.
> 
> * Major changes
> - Use the cookie enum values like sdhci controller.
> - Remove the unnecessary codes and use stop_abort() by default.
> - Remove the obsoleted property "supports-highspeed"
> - Remove the "clock-freq-min-max" property. Instead, use "max-frequency"
> - Minimum clock value is set to 100K by default.
> 
> Jaehoon Chung (13):
>   mmc: dw_mmc: display the real register value on debugfs
>   mmc: dw_mmc: fix the debug message for checking card's present
>   mmc: dw_mmc: change the DW_MCI_FREQ_MIN from 400K to 100K
>   mmc: dw_mmc: use the hold register when send stop command
>   mmc: dw_mmc: call the dw_mci_prep_stop_abort() by default
>   mmc: core: move the cookie's enum values from sdhci.h to mmc.h
>   mmc: dw_mmc: use the cookie's enum values for post/pre_req()
>   mmc: dw_mmc: remove the unnecessary mmc_data structure

patches 1-8 on rk3036, rk3288, rk3368 and rk3399 Rockchip platforms
Tested-by: Heiko Stuebner <heiko@sntech.de>

^ permalink raw reply

* [PATCH] ARM: dts: socfpga: add nand controller nodes
From: Dinh Nguyen @ 2016-11-09 18:48 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161109073501.2010-1-s.trumtrar@pengutronix.de>



On 11/09/2016 01:35 AM, Steffen Trumtrar wrote:
> Add the denali nand controller to the socfpga dtsi.
> 
> Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
> ---
>  arch/arm/boot/dts/socfpga.dtsi | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/socfpga.dtsi b/arch/arm/boot/dts/socfpga.dtsi
> index 9f48141270b8..6b0c23ca5e88 100644
> --- a/arch/arm/boot/dts/socfpga.dtsi
> +++ b/arch/arm/boot/dts/socfpga.dtsi
> @@ -700,6 +700,19 @@
>  			status = "disabled";
>  		};
>  
> +		nand0: nand@ff900000 {
> +			#address-cells = <0x1>;
> +			#size-cells = <0x1>;
> +			compatible = "denali,denali-nand-dt";
> +			reg = <0xff900000 0x100000>,
> +			      <0xffb80000 0x10000>;
> +			reg-names = "nand_data", "denali_reg";
> +			interrupts = <0x0 0x90 0x4>;
> +			dma-mask = <0xffffffff>;
> +			clocks = <&nand_clk>;
> +			status = "disabled";
> +		};
> +
>  		ocram: sram@ffff0000 {
>  			compatible = "mmio-sram";
>  			reg = <0xffff0000 0x10000>;
> 

Since there's only 1 NAND node, do we need to call the node "nand0"? No
need to resend a patch, I can change it locally if we agree that the
node should be just:

nand: nand@ff900000

Dinh

^ permalink raw reply

* [PATCH 01/30] usb: dwc2: Deprecate g-use-dma binding
From: John Youn @ 2016-11-09 18:47 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <87zil9gkcq.fsf@linux.intel.com>

On 11/8/2016 11:54 PM, Felipe Balbi wrote:
> 
> Hi,
> 
> John Youn <John.Youn@synopsys.com> writes:
>> On 11/8/2016 1:12 AM, Felipe Balbi wrote:
>>>
>>> Hi,
>>>
>>> John Youn <johnyoun@synopsys.com> writes:
>>>> Add a vendor prefix and make the name more consistent by renaming it to
>>>> "snps,gadget-dma-enable".
>>>>
>>>> Signed-off-by: John Youn <johnyoun@synopsys.com>
>>>> ---
>>>>  Documentation/devicetree/bindings/usb/dwc2.txt | 5 ++++-
>>>>  arch/arm/boot/dts/rk3036.dtsi                  | 2 +-
>>>>  arch/arm/boot/dts/rk3288.dtsi                  | 2 +-
>>>>  arch/arm/boot/dts/rk3xxx.dtsi                  | 2 +-
>>>>  arch/arm64/boot/dts/hisilicon/hi6220.dtsi      | 2 +-
>>>>  arch/arm64/boot/dts/rockchip/rk3368.dtsi       | 2 +-
>>>>  drivers/usb/dwc2/params.c                      | 9 ++++++++-
>>>>  drivers/usb/dwc2/pci.c                         | 2 +-
>>>>  8 files changed, 18 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/usb/dwc2.txt b/Documentation/devicetree/bindings/usb/dwc2.txt
>>>> index 9472111..389a461 100644
>>>> --- a/Documentation/devicetree/bindings/usb/dwc2.txt
>>>> +++ b/Documentation/devicetree/bindings/usb/dwc2.txt
>>>> @@ -26,11 +26,14 @@ Refer to phy/phy-bindings.txt for generic phy consumer properties
>>>>  - dr_mode: shall be one of "host", "peripheral" and "otg"
>>>>    Refer to usb/generic.txt
>>>>  - snps,host-dma-disable: disable host DMA mode.
>>>> -- g-use-dma: enable dma usage in gadget driver.
>>>> +- snps,gadget-dma-enable: enable gadget DMA mode.
>>>
>>> I don't see why you even have this binding. Looking through the code,
>>> you have:
>>>
>>> #define GHWCFG2_SLAVE_ONLY_ARCH			0
>>> #define GHWCFG2_EXT_DMA_ARCH			1
>>> #define GHWCFG2_INT_DMA_ARCH			2
>>>
>>> void dwc2_set_param_dma_enable(struct dwc2_hsotg *hsotg, int val)
>>> {
>>> 	int valid = 1;
>>>
>>> 	if (val > 0 && hsotg->hw_params.arch == GHWCFG2_SLAVE_ONLY_ARCH)
>>> 		valid = 0;
>>> 	if (val < 0)
>>> 		valid = 0;
>>>
>>> 	if (!valid) {
>>> 		if (val >= 0)
>>> 			dev_err(hsotg->dev,
>>> 				"%d invalid for dma_enable parameter. Check HW configuration.\n",
>>> 				val);
>>> 		val = hsotg->hw_params.arch != GHWCFG2_SLAVE_ONLY_ARCH;
>>> 		dev_dbg(hsotg->dev, "Setting dma_enable to %d\n", val);
>>> 	}
>>>
>>> 	hsotg->core_params->dma_enable = val;
>>> }
>>>
>>> which seems to hint that DMA support is discoverable. If there is DMA,
>>> why would you disable it?
>>>
>>
>> Yes that's the case and I would prefer to make it discoverable and
>> enabled by default.
>>
>> But the legacy behavior has always been like this because DMA was
>> never fully implemented in the gadget driver and it was an opt-in
>> feature. Periodic support was only added recently.
> 
> legacy behavior can be changed if another 'policy' makes more
> sense. IMHO, whatever can be discovered in runtime, should be enabled by
> default. That way, we force people to use it and find bugs in certain
> features.

Sounds good to me. I'll make the changes.

Regards,
John
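
A minimal sketch of the "enable whatever can be discovered" policy agreed above, using the `GHWCFG2_*` values quoted earlier in the thread. The function name `dma_enable_default()` is hypothetical and this is not the eventual dwc2 change:

```c
#include <assert.h>
#include <stdbool.h>

/* Architecture values as quoted from the driver above. */
#define GHWCFG2_SLAVE_ONLY_ARCH	0
#define GHWCFG2_EXT_DMA_ARCH	1
#define GHWCFG2_INT_DMA_ARCH	2

/* DMA is enabled by default whenever the hardware reports a DMA-capable arch. */
bool dma_enable_default(int hw_arch)
{
	assert(hw_arch >= GHWCFG2_SLAVE_ONLY_ARCH &&
	       hw_arch <= GHWCFG2_INT_DMA_ARCH);
	return hw_arch != GHWCFG2_SLAVE_ONLY_ARCH;
}
```

This mirrors the fallback already present in the quoted `dwc2_set_param_dma_enable()`, where an invalid request resolves to `hw_params.arch != GHWCFG2_SLAVE_ONLY_ARCH`.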

^ permalink raw reply

* [PATCH v2 2/2] mmc: sdhci-iproc: support standard byte register accesses
From: Scott Branden @ 2016-11-09 18:43 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <132c772b-d4b9-c4b6-2eca-0393e7c995f9@broadcom.com>

Hi Adrian/Ulf,

Please ignore my comments in last email I sent out.  The v2 patch 
documentation matches the code and is good.  I am confusing myself 
between internal versions and external upstream versions of this code.

On 16-11-09 10:38 AM, Scott Branden wrote:
> Hi Adrian/Ulf,
>
> On 16-11-08 01:55 AM, Adrian Hunter wrote:
>> On 01/11/16 18:37, Scott Branden wrote:
>>> Add bytewise register accesses support for newer versions of IPROC
>>> SDHCI controllers.
>>> Previous sdhci-iproc versions of SDIO controllers
>>> (such as Raspberry Pi and Cygnus) only allowed for 32-bit register
>>> accesses.
>>>
>>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>>> Signed-off-by: Scott Branden <scott.branden@broadcom.com>
>>
>> This is unchanged from V1 which I acked, so:
> I updated the binding name in the documentation but forgot to change it
> in this patch.  Now that Rob has ack'd the binding documentation I will
> send out an updated patch with binding string in the code matching the
> ack'd documentation.
Ignore this - PATCH v2 is good.
>
>>
>> Acked-by: Adrian Hunter <adrian.hunter@intel.com>
>>
> With the minor change I will add your ack to the next version I send out.
>
> Thanks,
>  Scott

^ permalink raw reply

* KASAN & the vmalloc area
From: Dmitry Vyukov @ 2016-11-09 18:42 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161109183017.GA837@leverpostej>

On Wed, Nov 9, 2016 at 10:30 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>> >> I've seen the same iteration slowness problem on x86 with
>> >> CONFIG_DEBUG_RODATA which walks all pages. The walk takes about 1 minute, but
>> >> it is enough to trigger rcu stall warning.
>> >
>> > Interesting; do you know where that happens? I can't spot any obvious
>> > case where we'd have to walk all the page tables for DEBUG_RODATA.
>>
>> As far as I remember it was this path:
>>
>> mark_readonly in main.c -> mark_rodata_ro -> debug_checkwx ->
>> ptdump_walk_pgd_level_checkwx -> ptdump_walk_pgd_level_core.
>
> Ah, that's x86's equivalent DEBUG_WX checks.
>
>> >> The zero pud and vmalloc-ed stacks look like different problems.
>> >> To overcome the slowness we could map zero shadow for vmalloc area lazily.
>> >> However for vmalloc-ed stacks we need to map actual memory, because
>> >> stack instrumentation will read/write into the shadow.
>> >
>> > Sure. The point I was trying to make is that there'd be fewer page tables
>> > to walk (unless the vmalloc area was exhausted), assuming we also lazily
>> > mapped the common zero shadow for the vmalloc area.
>> >
>> >> One downside here is that vmalloc shadow can be as large as 1:1 (if we
>> >> allocate 1 page in vmalloc area we need to allocate 1 page for
>> >> shadow).
>> >
>> > I thought per prior discussion we'd only need to allocate new pages for
>> > the stacks in the vmalloc region, and we could re-use the zero pages?
>>
>> We can't reuse zero ro pages for stacks, because stack instrumentation
>> writes to stack shadow.
>
> Sorry, I'd meant we'd use the zero pages for everything else but stacks.
> I understand we'd have to allocate real shadow for the stacks.
>
>> When we have a large continuous range of memory, shadow for it is
>> 1/8th. However, if we have a separate page, we will need to map whole
>> page of shadow for it, i.e. 1:1 shadow overhead.
>
> Sure, but for everything but stacks we can re-use the same zero pages,
> no?
>
> For everything else, the cost would be dominated by the page tables for
> the shadow.


Can we estimate the memory overhead?
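
As a starting point for such an estimate, the 1/8 versus 1:1 figures quoted above can be reproduced with a small calculation. This is a sketch under stated assumptions: 4 KiB pages, one shadow byte per 8 bytes of memory, and a page-aligned shadow base offset (which shifts every shadow address equally and so does not change the count). `shadow_pages_needed()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT		12	/* assumed 4 KiB pages */
#define PAGE_SIZE		(1ULL << PAGE_SHIFT)
#define SHADOW_SCALE_SHIFT	3	/* one shadow byte per 8 bytes */

/*
 * Number of shadow pages that must be backed by real memory to cover
 * the region [addr, addr + size).
 */
uint64_t shadow_pages_needed(uint64_t addr, uint64_t size)
{
	assert(size > 0);

	uint64_t first = addr >> SHADOW_SCALE_SHIFT;
	uint64_t last = (addr + size - 1) >> SHADOW_SCALE_SHIFT;

	return (last >> PAGE_SHIFT) - (first >> PAGE_SHIFT) + 1;
}
```

With these assumptions a single isolated page needs one whole shadow page (1:1), a contiguous 1 GiB range needs 32768 shadow pages (1/8), and a 16 KiB stack can need two shadow pages if its shadow straddles a page boundary.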

^ permalink raw reply

* [PATCH v2 2/2] mmc: sdhci-iproc: support standard byte register accesses
From: Scott Branden @ 2016-11-09 18:38 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <c94dc018-5a2d-d50b-5746-43ae7fc258ce@intel.com>

Hi Adrian/Ulf,

On 16-11-08 01:55 AM, Adrian Hunter wrote:
> On 01/11/16 18:37, Scott Branden wrote:
>> Add bytewise register accesses support for newer versions of IPROC
>> SDHCI controllers.
>> Previous sdhci-iproc versions of SDIO controllers
>> (such as Raspberry Pi and Cygnus) only allowed for 32-bit register
>> accesses.
>>
>> Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
>> Signed-off-by: Scott Branden <scott.branden@broadcom.com>
>
> This is unchanged from V1 which I acked, so:
I updated the binding name in the documentation but forgot to change it 
in this patch.  Now that Rob has ack'd the binding documentation I will 
send out an updated patch with binding string in the code matching the 
ack'd documentation.

>
> Acked-by: Adrian Hunter <adrian.hunter@intel.com>
>
With the minor change I will add your ack to the next version I send out.

Thanks,
  Scott

^ permalink raw reply

* KASAN & the vmalloc area
From: Mark Rutland @ 2016-11-09 18:30 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CACT4Y+Ye+nxj=bQ9q9V2nEUpO+3sSWN1E2d_0KZapYyxx0Y69Q@mail.gmail.com>

On Wed, Nov 09, 2016 at 10:16:03AM -0800, Dmitry Vyukov wrote:
> On Wed, Nov 9, 2016 at 2:56 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Tue, Nov 08, 2016 at 02:09:27PM -0800, Dmitry Vyukov wrote:
> >> On Tue, Nov 8, 2016 at 11:03 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> >> I've seen the same iteration slowness problem on x86 with
> >> CONFIG_DEBUG_RODATA which walks all pages. The walk takes about 1 minute, but
> >> it is enough to trigger rcu stall warning.
> >
> > Interesting; do you know where that happens? I can't spot any obvious
> > case where we'd have to walk all the page tables for DEBUG_RODATA.
> 
> As far as I remember it was this path:
> 
> mark_readonly in main.c -> mark_rodata_ro -> debug_checkwx ->
> ptdump_walk_pgd_level_checkwx -> ptdump_walk_pgd_level_core.

Ah, that's x86's equivalent DEBUG_WX checks.

> >> The zero pud and vmalloc-ed stacks look like different problems.
> >> To overcome the slowness we could map zero shadow for vmalloc area lazily.
> >> However for vmalloc-ed stacks we need to map actual memory, because
> >> stack instrumentation will read/write into the shadow.
> >
> > Sure. The point I was trying to make is that there'd be fewer page tables
> > to walk (unless the vmalloc area was exhausted), assuming we also lazily
> > mapped the common zero shadow for the vmalloc area.
> >
> >> One downside here is that vmalloc shadow can be as large as 1:1 (if we
> >> allocate 1 page in vmalloc area we need to allocate 1 page for
> >> shadow).
> >
> > I thought per prior discussion we'd only need to allocate new pages for
> > the stacks in the vmalloc region, and we could re-use the zero pages?
> 
> We can't reuse zero ro pages for stacks, because stack instrumentation
> writes to stack shadow.

Sorry, I'd meant we'd use the zero pages for everything else but stacks.
I understand we'd have to allocate real shadow for the stacks.

> When we have a large continuous range of memory, shadow for it is
> 1/8th. However, if we have a separate page, we will need to map whole
> page of shadow for it, i.e. 1:1 shadow overhead.

Sure, but for everything but stacks we can re-use the same zero pages,
no?

For everything else, the cost would be dominated by the page tables for
the shadow.

Thanks,
Mark.

^ permalink raw reply

* [v16, 0/7] Fix eSDHC host version register bug
From: Ulf Hansson @ 2016-11-09 18:27 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478661252-42439-1-git-send-email-yangbo.lu@nxp.com>

- i2c-list

On 9 November 2016 at 04:14, Yangbo Lu <yangbo.lu@nxp.com> wrote:
> This patchset is used to fix a host version register bug in the T4240-R1.0-R2.0
> eSDHC controller. To match the SoC version and revision, 15 previous version
> patchsets had tried many methods but all of them were rejected by reviewers.
> Such as
>         - dts compatible method
>         - syscon method
>         - ifdef PPC method
>         - GUTS driver getting SVR method
> Arnd suggested a soc_device_match method in v10, and this is the only available
> method left now. This v11 patchset introduces the soc_device_match interface in
> soc driver.
>
> The first four patches of Yangbo are to add the GUTS driver. This is used to
> register a soc device which contain soc version and revision information.
> The other three patches introduce the soc_device_match method in soc driver
> and apply it on esdhc driver to fix this bug.
>
> ---
> Changes for v15:
>         - Dropped patch 'dt: bindings: update Freescale DCFG compatible'
>           since the work had been done by below patch on ShawnGuo's linux tree.
>           'dt-bindings: fsl: add LS1043A/LS1046A/LS2080A compatible for SCFG
>            and DCFG'
>         - Fixed error code issue in guts driver
> Changes for v16:
>         - Dropped patch 'powerpc/fsl: move mpc85xx.h to include/linux/fsl'
>         - Added a bug-fix patch from Geert
> ---
>
> Arnd Bergmann (1):
>   base: soc: introduce soc_device_match() interface
>
> Geert Uytterhoeven (1):
>   base: soc: Check for NULL SoC device attributes
>
> Yangbo Lu (5):
>   ARM64: dts: ls2080a: add device configuration node
>   dt: bindings: move guts devicetree doc out of powerpc directory
>   soc: fsl: add GUTS driver for QorIQ platforms
>   MAINTAINERS: add entry for Freescale SoC drivers
>   mmc: sdhci-of-esdhc: fix host version for T4240-R1.0-R2.0
>
>  .../bindings/{powerpc => soc}/fsl/guts.txt         |   3 +
>  MAINTAINERS                                        |  11 +-
>  arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi     |   6 +
>  drivers/base/Kconfig                               |   1 +
>  drivers/base/soc.c                                 |  70 ++++++
>  drivers/mmc/host/Kconfig                           |   1 +
>  drivers/mmc/host/sdhci-of-esdhc.c                  |  20 ++
>  drivers/soc/Kconfig                                |   3 +-
>  drivers/soc/fsl/Kconfig                            |  18 ++
>  drivers/soc/fsl/Makefile                           |   1 +
>  drivers/soc/fsl/guts.c                             | 236 +++++++++++++++++++++
>  include/linux/fsl/guts.h                           | 125 ++++++-----
>  include/linux/sys_soc.h                            |   3 +
>  13 files changed, 447 insertions(+), 51 deletions(-)
>  rename Documentation/devicetree/bindings/{powerpc => soc}/fsl/guts.txt (91%)
>  create mode 100644 drivers/soc/fsl/Kconfig
>  create mode 100644 drivers/soc/fsl/guts.c
>
> --
> 2.1.0.27.g96db324
>

Thanks, applied on my mmc tree for next!

I noticed that some DT compatibles weren't documented, according to
checkpatch. Please fix that asap!

Kind regards
Ulf Hansson

^ permalink raw reply

* [PATCH v2 4/6] pinctrl: aspeed: Read and write bits in LPCHC and GFX controllers
From: Rob Herring @ 2016-11-09 18:26 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478097481-14895-5-git-send-email-andrew@aj.id.au>

On Thu, Nov 03, 2016 at 01:07:59AM +1030, Andrew Jeffery wrote:
> The System Control Unit IP block in the Aspeed SoCs is typically where
> the pinmux configuration is found, but not always. A number of pins
> depend on state in one of LPC Host Control (LPCHC) or SoC Display
> Controller (GFX) IP blocks, so the Aspeed pinmux drivers should have the
> means to adjust these as necessary.
> 
> We use syscon to cast a regmap over the GFX and LPCHCR blocks, which is
> used as an arbitration layer between the relevant driver and the pinctrl
> subsystem. The regmaps are then exposed to the SoC-specific pinctrl
> drivers by phandles in the devicetree, and are selected during a mux
> request by querying a new 'ip' member in struct aspeed_sig_desc.
> 
> Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
> ---
> Since v1:
> 
> The change is now proactive: instead of reporting that we need to flip bits in
> controllers we can't access, the patch provides access via regmaps for the
> relevant controllers. The implementation also splits out the IP block ID into
> its own variable rather than packing the value into the upper bits of the reg
> member of struct aspeed_sig_desc. This drives some churn in the diff, but I've
> tried to minimise it.
> 
>  .../devicetree/bindings/pinctrl/pinctrl-aspeed.txt | 50 +++++++++++++---
>  drivers/pinctrl/aspeed/pinctrl-aspeed-g4.c         | 18 +++---
>  drivers/pinctrl/aspeed/pinctrl-aspeed-g5.c         | 39 ++++++++++---
>  drivers/pinctrl/aspeed/pinctrl-aspeed.c            | 66 +++++++++++++---------
>  drivers/pinctrl/aspeed/pinctrl-aspeed.h            | 32 ++++++++---
>  5 files changed, 144 insertions(+), 61 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/pinctrl/pinctrl-aspeed.txt b/Documentation/devicetree/bindings/pinctrl/pinctrl-aspeed.txt
> index 2ad18c4ea55c..115b0cce6c1c 100644
> --- a/Documentation/devicetree/bindings/pinctrl/pinctrl-aspeed.txt
> +++ b/Documentation/devicetree/bindings/pinctrl/pinctrl-aspeed.txt
> @@ -4,12 +4,19 @@ Aspeed Pin Controllers
>  The Aspeed SoCs vary in functionality inside a generation but have a common mux
>  device register layout.
>  
> -Required properties:
> -- compatible : Should be any one of the following:
> -		"aspeed,ast2400-pinctrl"
> -		"aspeed,g4-pinctrl"
> -		"aspeed,ast2500-pinctrl"
> -		"aspeed,g5-pinctrl"
> +Required properties for g4:
> +- compatible : 			Should be any one of the following:
> +				"aspeed,ast2400-pinctrl"
> +				"aspeed,g4-pinctrl"
> +
> +Required properties for g5:
> +- compatible : 			Should be any one of the following:
> +				"aspeed,ast2500-pinctrl"
> +				"aspeed,g5-pinctrl"
> +
> +- aspeed,external-nodes:	A cell of phandles to external controller nodes:
> +				0: compatible with "aspeed,ast2500-gfx", "syscon"
> +				1: compatible with "aspeed,ast2500-lpchc", "syscon"
>  
>  The pin controller node should be a child of a syscon node with the required
>  property:
> @@ -47,7 +54,7 @@ RGMII1 RGMII2 RMII1 RMII2 SD1 SPI1 SPI1DEBUG SPI1PASSTHRU TIMER4 TIMER5 TIMER6
>  TIMER7 TIMER8 VGABIOSROM
>  
>  
> -Examples:
> +g4 Example:
>  
>  syscon: scu@1e6e2000 {
>  	compatible = "syscon", "simple-mfd";
> @@ -63,5 +70,34 @@ syscon: scu@1e6e2000 {
>  	};
>  };
>  
> +g5 Example:
> +
> +apb {
> +	gfx: display@1e6e6000 {
> +		compatible = "aspeed,ast2500-gfx", "syscon";
> +		reg = <0x1e6e6000 0x1000>;
> +	};
> +
> +	lpchc: lpchc@1e7890a0 {
> +		compatible = "aspeed,ast2500-lpchc", "syscon";
> +		reg = <0x1e7890a0 0xc4>;
> +	};
> +
> +	syscon: scu@1e6e2000 {
> +		compatible = "syscon", "simple-mfd";
> +		reg = <0x1e6e2000 0x1a8>;
> +
> +		pinctrl: pinctrl {

Why the single child node here? Doesn't look like any reason for it in 
the example. 

> +			compatible = "aspeed,g5-pinctrl";
> +			aspeed,external-nodes = <&gfx, &lpchc>;
> +
> +			pinctrl_i2c3_default: i2c3_default {
> +				function = "I2C3";
> +				groups = "I2C3";
> +			};
> +		};
> +	};
> +};
> +
>  Please refer to pinctrl-bindings.txt in this directory for details of the
>  common pinctrl bindings used by client devices.
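
The commit message above describes selecting a regmap per signal through an 'ip' member. A user-space sketch of that idea follows; it is illustrative only. The enum values, `struct reg_bank`, the simplified `struct sig_desc` and `sig_enable()` are hypothetical stand-ins, with a plain register array playing the role of a regmap and a masked read-modify-write playing the role of a call such as `regmap_update_bits()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IP block identifiers; the patch's actual names may differ. */
enum ip_block { IP_SCU, IP_GFX, IP_LPCHC, IP_COUNT };

#define BANK_REGS 64

/* Stand-in for a regmap: a small bank of 32-bit registers per IP block. */
struct reg_bank {
	uint32_t regs[BANK_REGS];
};

/* Simplified signal descriptor: which block, which register, which bits. */
struct sig_desc {
	enum ip_block ip;
	size_t reg;		/* register index within the bank */
	uint32_t mask;		/* bits owned by this signal */
	uint32_t enable;	/* value written under mask to enable it */
};

/* Pick the bank named by desc->ip and update only the masked bits. */
void sig_enable(struct reg_bank banks[IP_COUNT], const struct sig_desc *desc)
{
	assert(desc != NULL && desc->ip < IP_COUNT && desc->reg < BANK_REGS);

	uint32_t *r = &banks[desc->ip].regs[desc->reg];

	*r = (*r & ~desc->mask) | (desc->enable & desc->mask);
}
```

Keeping the block identifier in its own field, rather than packing it into the upper bits of the register offset, is the split the "Since v1" note above describes.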

^ permalink raw reply

* [PATCH v2 2/6] mfd: dt: Add bindings for the Aspeed SoC Display Controller (GFX)
From: Rob Herring @ 2016-11-09 18:26 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478097481-14895-3-git-send-email-andrew@aj.id.au>

On Thu, Nov 03, 2016 at 01:07:57AM +1030, Andrew Jeffery wrote:
> The Aspeed SoC Display Controller is presented as a syscon device to
> arbitrate access by display and pinmux drivers. Video pinmux
> configuration on fifth generation SoCs depends on bits in both the
> System Control Unit and the Display Controller.
> 
> Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
> ---
>  Documentation/devicetree/bindings/mfd/aspeed-gfx.txt | 17 +++++++++++++++++

The register space can't be split to 2 nodes? 

>  1 file changed, 17 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/mfd/aspeed-gfx.txt
> 
> diff --git a/Documentation/devicetree/bindings/mfd/aspeed-gfx.txt b/Documentation/devicetree/bindings/mfd/aspeed-gfx.txt
> new file mode 100644
> index 000000000000..aea5370efd97
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/mfd/aspeed-gfx.txt
> @@ -0,0 +1,17 @@
> +* Device tree bindings for Aspeed SoC Display Controller (GFX)
> +
> +The Aspeed SoC Display Controller primarily does as its name suggests, but also
> +participates in pinmux requests on the g5 SoCs. It is therefore considered a
> +syscon device.
> +
> +Required properties:
> +- compatible:		"aspeed,ast2500-gfx", "syscon"

I think perhaps we should drop the syscon here and the driver should 
just register as a syscon.

Rob

^ permalink raw reply

* [PATCHv3 1/4] dt-bindings: mfd: Add Altera Arria10 SR Monitor
From: Rob Herring @ 2016-11-09 18:26 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478097178-24341-2-git-send-email-tthayer@opensource.altera.com>

On Wed, Nov 02, 2016 at 09:32:55AM -0500, tthayer@opensource.altera.com wrote:
> From: Thor Thayer <tthayer@opensource.altera.com>
> 
> Add the Arria10 DevKit System Resource Chip register and state
> monitoring module to the MFD.
> 
> Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
> ---
> Note: This needs to be applied to the bindings document that
> was Acked & Applied but didn't reach the for-next branch.
> See https://patchwork.ozlabs.org/patch/629397/
> ---
> v2  Change compatible string -mon to -monitor for clarity
> v3  Replace node name a10sr_monitor with just monitor.
>     Replace node name a10sr_gpio with just gpio.
> ---
>  Documentation/devicetree/bindings/mfd/altera-a10sr.txt | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)

Acked-by: Rob Herring <robh@kernel.org>

^ permalink raw reply

* [PATCH v3 2/2] dt-bindings: net: Add OXNAS DWMAC Bindings
From: Rob Herring @ 2016-11-09 18:26 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161102140237.6955-3-narmstrong@baylibre.com>

On Wed, Nov 02, 2016 at 03:02:37PM +0100, Neil Armstrong wrote:
> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
> ---
>  .../devicetree/bindings/net/oxnas-dwmac.txt        | 39 ++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/net/oxnas-dwmac.txt

Acked-by: Rob Herring <robh@kernel.org>

^ permalink raw reply

