Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V3] ARM: shmobile: Rework the PMIC IRQ line quirk
From: Marek Vasut @ 2018-06-11 13:35 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CAMuHMdXw146fwm0tHS7CBAofDAgRgDD2tNqgEJwzx0Jm1xKCyw@mail.gmail.com>

On 06/11/2018 03:03 PM, Geert Uytterhoeven wrote:
> Hi Marek,

Hi,

> On Mon, Jun 11, 2018 at 2:15 PM Marek Vasut <marek.vasut@gmail.com> wrote:
>> On 06/11/2018 11:56 AM, Geert Uytterhoeven wrote:
>>> On Mon, Jun 4, 2018 at 7:59 PM Marek Vasut <marek.vasut@gmail.com> wrote:
>>>> Rather than hard-coding the quirk topology, which stopped scaling,
>>>> parse the information from DT. The code looks for all compatible
>>>> PMICs -- da9036 and da9210 -- and checks if their IRQ line is tied
>>>
>>> da9063
>>>
>>>> to the same pin. If so, the code sends a matching sequence to the
>>>> PMIC to deassert the IRQ.
> 
>>>> @@ -122,7 +143,13 @@ static struct notifier_block regulator_quirk_nb = {
>>>>
>>>>  static int __init rcar_gen2_regulator_quirk(void)
>>>>  {
>>>> -       u32 mon;
>>>> +       struct device_node *np;
>>>> +       const struct of_device_id *id;
>>>> +       struct regulator_quirk *quirk;
>>>> +       struct regulator_quirk *pos;
>>>> +       struct of_phandle_args *argsa, *argsb;
>>>> +       u32 mon, addr;
>>>> +       int ret;
>>>
>>> Some people prefer "Reverse Christmas Tree Ordering", i.e. longest line first.
>>>
>>>>
>>>>         if (!of_machine_is_compatible("renesas,koelsch") &&
>>>>             !of_machine_is_compatible("renesas,lager") &&
>>>> @@ -130,6 +157,45 @@ static int __init rcar_gen2_regulator_quirk(void)
>>>>             !of_machine_is_compatible("renesas,gose"))
>>>>                 return -ENODEV;
>>>
>>> I think the board checks above can be removed. That will auto-enable the
>>> fix on e.g. Porter (once its regulators have ended up in DTS, of course).
>>
>> Removing the check would also enable it on boards where we don't want
>> this enabled, so I'd prefer to keep the check to avoid strange surprises.
> 
> Like, Porter? ;-)

I'm adding Porter in a separate patch.

>>>> +               ret = of_property_read_u32(np, "reg", &addr);
>>>> +               if (ret)
>>>> +                       return ret;
>>>
>>> I think it's safer to skip this entry and continue, after calling
>>> kfree(quirk), of course.
>>>
>>>> +
>>>> +               quirk->id = id;
>>>> +               quirk->i2c_msg.addr = addr;
>>>> +               quirk->shared = false;
>>>> +
>>>> +               ret = of_irq_parse_one(np, 0, &quirk->irq_args);
>>>> +               if (ret)
>>>> +                       return ret;
>>>
>>> kfree(quirk) and continue...
>>
>> I wonder if it shouldn't rather free the entire list and abort ?
> 
> "Be strict when sending, be liberal when receiving."

Meaning ? I think "the language barrier is protecting me" (TM)

-- 
Best regards,
Marek Vasut

^ permalink raw reply

* [PATCH RESEND v4 2/2] arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTS
From: James Morse @ 2018-06-11 13:36 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1528487320-2873-3-git-send-email-gengdongjiu@huawei.com>

Hi Dongjiu Geng,

Please only put 'RESEND' in the subject if the patch content is identical.
This patch is not the same as v4.

On 08/06/18 20:48, Dongjiu Geng wrote:
> For the migrating VMs, user space may need to know the exception
> state. For example, in the machine A, KVM make an SError pending,
> when migrate to B, KVM also needs to pend an SError.
> 
> This new IOCTL exports user-invisible states related to SError.
> Together with appropriate user space changes, user space can get/set
> the SError exception state to do migrate/snapshot/suspend.

> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index fdac969..8896737 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -835,11 +835,13 @@ struct kvm_clock_data {
>  
>  Capability: KVM_CAP_VCPU_EVENTS
>  Extended by: KVM_CAP_INTR_SHADOW
> -Architectures: x86
> +Architectures: x86, arm, arm64
>  Type: vm ioctl

Isn't this actually a per-vcpu ioctl? Can we fix the documentation?


>  Parameters: struct kvm_vcpu_event (out)
>  Returns: 0 on success, -1 on error
>  
> +X86:
> +
>  Gets currently pending exceptions, interrupts, and NMIs as well as related
>  states of the vcpu.
>  
> @@ -881,15 +883,32 @@ Only two fields are defined in the flags field:
>  - KVM_VCPUEVENT_VALID_SMM may be set in the flags field to signal that
>    smi contains a valid state.
>  
> +ARM, ARM64:
> +
> +Gets currently pending SError exceptions as well as related states of the vcpu.
> +
> +struct kvm_vcpu_events {
> +	struct {
> +		__u8 serror_pending;
> +		__u8 serror_has_esr;
> +		/* Align it to 8 bytes */
> +		__u8 pad[6];
> +		__u64 serror_esr;
> +	} exception;
> +	__u32 reserved[12];
> +};
> +
>  4.32 KVM_SET_VCPU_EVENTS
>  
> -Capability: KVM_CAP_VCPU_EVENTS
> +Capebility: KVM_CAP_VCPU_EVENTS

(please fix this)


>  Extended by: KVM_CAP_INTR_SHADOW
> -Architectures: x86
> +Architectures: x86, arm, arm64
>  Type: vm ioctl

(this is also a vcpu ioctl)


>  Parameters: struct kvm_vcpu_event (in)
>  Returns: 0 on success, -1 on error
>  
> +X86:
> +
>  Set pending exceptions, interrupts, and NMIs as well as related states of the
>  vcpu.
>  
> @@ -910,6 +929,12 @@ shall be written into the VCPU.
>  
>  KVM_VCPUEVENT_VALID_SMM can only be set if KVM_CAP_X86_SMM is available.
>  
> +ARM, ARM64:
> +
> +Set pending SError exceptions as well as related states of the vcpu.

There are some deliberate choices here I think we should document:
| This API can't be used to clear a pending SError.

If there already was an SError pending, this API just overwrites it with the new
one. The architecture has some rules about merging multiple SError. (details in
2.5.3 Multiple SError interrupts of [0])

I don't think KVM needs to enforce these, as they are implementation-defined if
one of the ESR is implementation-defined... the part that matters is reporting
the 'most severe' RAS ESR if there are multiple pending. As only user-space ever
sets these, let's make it user-spaces problem to do.

I think we should recommend user-space always reads the pending values and
applies its merging-multiple-SError logic. (I assume your Qemu patches do this).

Something like:
| User-space should first use KVM_GET_VCPU_EVENTS in case KVM has made an SError
| pending as part of its device emulation. When both values are architected RAS
| SError ESR values, the new ESR should reflect the combined effect of both
| errors.


> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index caae484..c3e6975 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -124,6 +124,18 @@ struct kvm_sync_regs {
>  struct kvm_arch_memory_slot {
>  };
>  
> +/* for KVM_GET/SET_VCPU_EVENTS */
> +struct kvm_vcpu_events {
> +	struct {
> +		__u8 serror_pending;
> +		__u8 serror_has_esr;
> +		/* Align it to 8 bytes */
> +		__u8 pad[6];
> +		__u64 serror_esr;
> +	} exception;
> +	__u32 reserved[12];
> +};
> +

You haven't defined __KVM_HAVE_VCPU_EVENTS for 32bit, so presumably this struct
will never be used. Why is it here?

(I agree if we ever provide it on 32bit, the struct layout should be the same.
Is this only here to force that to happen?)

[...]


> +int kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
> +			struct kvm_vcpu_events *events)
> +{
> +	bool serror_pending = events->exception.serror_pending;
> +	bool has_esr = events->exception.serror_has_esr;
> +
> +	if (serror_pending && has_esr) {
> +		if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
> +			return -EINVAL;
> +
> +		kvm_set_sei_esr(vcpu, events->exception.serror_esr);

kvm_set_sei_esr() will silently discard the top 40 bits of serror_esr, (which is
correct, we shouldn't copy them into hardware without know what they do).

Could we please force user-space to zero these bits, we can advertise extra CAPs
if new features turn up in that space, instead of user-space passing <something>
and relying on the kernel to remove it.

(Background: VSESR is a 64bit register that holds the value to go in a 32bit
register. I suspect the top-half could get re-used for control values or
something we don't want to give user-space)


> diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
> index d8e7165..a55e91d 100644
> --- a/arch/arm64/kvm/inject_fault.c
> +++ b/arch/arm64/kvm/inject_fault.c
> @@ -164,9 +164,9 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
>  		inject_undef64(vcpu);
>  }
>  
> -static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
> +void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 esr)
>  {
> -	vcpu_set_vsesr(vcpu, esr);
> +	vcpu_set_vsesr(vcpu, esr & ESR_ELx_ISS_MASK);
>  	*vcpu_hcr(vcpu) |= HCR_VSE;
>  }
>  

> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index a4c1b76..79ecba9 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -1107,6 +1107,25 @@ long kvm_arch_vcpu_ioctl(struct file *filp,

> +	case KVM_SET_VCPU_EVENTS: {
> +		struct kvm_vcpu_events events;
> +
> +		if (copy_from_user(&events, argp, sizeof(events)))
> +			return -EFAULT;
> +
> +		return kvm_arm_vcpu_set_events(vcpu, &events);
> +	}

Please check the padding[] and reserved[] are zero, otherwise we can't re-use these.


Thanks,

James

[0]
https://static.docs.arm.com/ddi0587/a/RAS%20Extension-release%20candidate_march_29.pdf

^ permalink raw reply

* [PATCH RESEND v4 2/2] arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTS
From: James Morse @ 2018-06-11 13:36 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <86zi04875t.wl-marc.zyngier@arm.com>

Hi Dongjiu Geng,

On 09/06/18 13:40, Marc Zyngier wrote:
> On Fri, 08 Jun 2018 20:48:40 +0100, Dongjiu Geng wrote:
>> For the migrating VMs, user space may need to know the exception
>> state. For example, in the machine A, KVM make an SError pending,
>> when migrate to B, KVM also needs to pend an SError.
>>
>> This new IOCTL exports user-invisible states related to SError.
>> Together with appropriate user space changes, user space can get/set
>> the SError exception state to do migrate/snapshot/suspend.

>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>> index 04b3256..df4faee 100644
>> --- a/arch/arm64/include/uapi/asm/kvm.h
>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>> @@ -153,6 +154,18 @@ struct kvm_sync_regs {
>>  struct kvm_arch_memory_slot {
>>  };
>>  
>> +/* for KVM_GET/SET_VCPU_EVENTS */
>> +struct kvm_vcpu_events {
>> +	struct {
>> +		__u8 serror_pending;
>> +		__u8 serror_has_esr;
>> +		/* Align it to 8 bytes */
>> +		__u8 pad[6];
>> +		__u64 serror_esr;
>> +	} exception;
>> +	__u32 reserved[12];
>> +};

>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index 56a0260..4426915 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c

>> +int kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
>> +			struct kvm_vcpu_events *events)
>> +{
>> +	bool serror_pending = events->exception.serror_pending;
>> +	bool has_esr = events->exception.serror_has_esr;
>> +
>> +	if (serror_pending && has_esr) {
>> +		if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
>> +			return -EINVAL;
>> +
>> +		kvm_set_sei_esr(vcpu, events->exception.serror_esr);
>> +	} else if (serror_pending) {
>> +		kvm_inject_vabt(vcpu);
>> +	}
>> +
>> +	return 0;
> 
> There was an earlier request to check that all the padding is set to
> zero. I still think this makes sense.

I agree, not just the exception.padding[], but reserved[] too.


Thanks,

James

^ permalink raw reply

* [PATCH 1/2] arm64: avoid alloc memory on offline node
From: Bjorn Helgaas @ 2018-06-11 13:43 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <16c4db2f-bc70-d0f2-fb38-341d9117ff66@huawei.com>

On Mon, Jun 11, 2018 at 08:32:10PM +0800, Xie XiuQi wrote:
> Hi Michal,
> 
> On 2018/6/11 16:52, Michal Hocko wrote:
> > On Mon 11-06-18 11:23:18, Xie XiuQi wrote:
> >> Hi Michal,
> >>
> >> On 2018/6/7 20:21, Michal Hocko wrote:
> >>> On Thu 07-06-18 19:55:53, Hanjun Guo wrote:
> >>>> On 2018/6/7 18:55, Michal Hocko wrote:
> >>> [...]
> >>>>> I am not sure I have the full context but pci_acpi_scan_root calls
> >>>>> kzalloc_node(sizeof(*info), GFP_KERNEL, node)
> >>>>> and that should fall back to whatever node that is online. Offline node
> >>>>> shouldn't keep any pages behind. So there must be something else going
> >>>>> on here and the patch is not the right way to handle it. What does
> >>>>> faddr2line __alloc_pages_nodemask+0xf0 tells on this kernel?
> >>>>
> >>>> The whole context is:
> >>>>
> >>>> The system is booted with a NUMA node has no memory attaching to it
> >>>> (memory-less NUMA node), also with NR_CPUS less than CPUs presented
> >>>> in MADT, so CPUs on this memory-less node are not brought up, and
> >>>> this NUMA node will not be online (but SRAT presents this NUMA node);
> >>>>
> >>>> Devices attaching to this NUMA node such as PCI host bridge still
> >>>> return the valid NUMA node via _PXM, but actually that valid NUMA node
> >>>> is not online which lead to this issue.
> >>>
> >>> But we should have other numa nodes on the zonelists so the allocator
> >>> should fall back to other node. If the zonelist is not intiailized
> >>> properly, though, then this can indeed show up as a problem. Knowing
> >>> which exact place has blown up would help get a better picture...
> >>>
> >>
> >> I specific a non-exist node to allocate memory using kzalloc_node,
> >> and got this following error message.
> >>
> >> And I found out there is just a VM_WARN, but it does not prevent the memory
> >> allocation continue.
> >>
> >> This nid would be use to access NODE_DADA(nid), so if nid is invalid,
> >> it would cause oops here.
> >>
> >> 459 /*
> >> 460  * Allocate pages, preferring the node given as nid. The node must be valid and
> >> 461  * online. For more general interface, see alloc_pages_node().
> >> 462  */
> >> 463 static inline struct page *
> >> 464 __alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
> >> 465 {
> >> 466         VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
> >> 467         VM_WARN_ON(!node_online(nid));
> >> 468
> >> 469         return __alloc_pages(gfp_mask, order, nid);
> >> 470 }
> >> 471
> >>
> >> (I wrote a ko, to allocate memory on a non-exist node using kzalloc_node().)
> > 
> > OK, so this is an artificialy broken code, right. You shouldn't get a
> > non-existent node via standard APIs AFAICS. The original report was
> > about an existing node which is offline AFAIU. That would be a different
> > case. If I am missing something and there are legitimate users that try
> > to allocate from non-existing nodes then we should handle that in
> > node_zonelist.
> 
> I think hanjun's comments may help to understood this question:
>  - NUMA node will be built if CPUs and (or) memory are valid on this NUMA
>  node;
> 
>  - But if we boot the system with memory-less node and also with
>  CONFIG_NR_CPUS less than CPUs in SRAT, for example, 64 CPUs total with 4
>  NUMA nodes, 16 CPUs on each NUMA node, if we boot with
>  CONFIG_NR_CPUS=48, then we will not built numa node for node 3, but with
>  devices on that numa node, alloc memory will be panic because NUMA node
>  3 is not a valid node.
> 
> I triggered this BUG on arm64 platform, and I found a similar bug has
> been fixed on x86 platform. So I sent a similar patch for this bug.
> 
> Or, could we consider to fix it in the mm subsystem?

The patch below (b755de8dfdfe) seems like totally the wrong direction.
I don't think we want every caller of kzalloc_node() to have check for
node_online().

Why would memory on an off-line node even be in the allocation pool?
I wouldn't expect that memory to be put in the pool until the node
comes online and the memory is accessible, so this sounds like some
kind of setup issue.

But I'm definitely not an mm person.

> From b755de8dfdfef97effaa91379ffafcb81f4d62a1 Mon Sep 17 00:00:00 2001
> From: Yinghai Lu <Yinghai.Lu@Sun.COM>
> Date: Wed, 20 Feb 2008 12:41:52 -0800
> Subject: [PATCH] x86: make dev_to_node return online node
> 
> a numa system (with multi HT chains) may return node without ram. Aka it
> is not online. Try to get an online node, otherwise return -1.
> 
> Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/pci/acpi.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
> index d95de2f..ea8685f 100644
> --- a/arch/x86/pci/acpi.c
> +++ b/arch/x86/pci/acpi.c
> @@ -172,6 +172,9 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_device *device, int do
>  		set_mp_bus_to_node(busnum, node);
>  	else
>  		node = get_mp_bus_to_node(busnum);
> +
> +	if (node != -1 && !node_online(node))
> +		node = -1;
>  #endif
> 
>  	/* Allocate per-root-bus (not per bus) arch-specific data.

^ permalink raw reply

* [PATCH V3] ARM: shmobile: Rework the PMIC IRQ line quirk
From: Geert Uytterhoeven @ 2018-06-11 13:49 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <967b800e-c935-aa35-da5a-ca3672fd18c2@gmail.com>

Hi Marek,

On Mon, Jun 11, 2018 at 3:39 PM Marek Vasut <marek.vasut@gmail.com> wrote:
> On 06/11/2018 03:03 PM, Geert Uytterhoeven wrote:
> > On Mon, Jun 11, 2018 at 2:15 PM Marek Vasut <marek.vasut@gmail.com> wrote:
> >> On 06/11/2018 11:56 AM, Geert Uytterhoeven wrote:
> >>> On Mon, Jun 4, 2018 at 7:59 PM Marek Vasut <marek.vasut@gmail.com> wrote:
> >>>> Rather than hard-coding the quirk topology, which stopped scaling,
> >>>> parse the information from DT. The code looks for all compatible
> >>>> PMICs -- da9036 and da9210 -- and checks if their IRQ line is tied
> >>>> to the same pin. If so, the code sends a matching sequence to the
> >>>> PMIC to deassert the IRQ.

> >>>> +               ret = of_property_read_u32(np, "reg", &addr);
> >>>> +               if (ret)
> >>>> +                       return ret;
> >>>
> >>> I think it's safer to skip this entry and continue, after calling
> >>> kfree(quirk), of course.
> >>>
> >>>> +
> >>>> +               quirk->id = id;
> >>>> +               quirk->i2c_msg.addr = addr;
> >>>> +               quirk->shared = false;
> >>>> +
> >>>> +               ret = of_irq_parse_one(np, 0, &quirk->irq_args);
> >>>> +               if (ret)
> >>>> +                       return ret;
> >>>
> >>> kfree(quirk) and continue...
> >>
> >> I wonder if it shouldn't rather free the entire list and abort ?
> >
> > "Be strict when sending, be liberal when receiving."
>
> Meaning ? I think "the language barrier is protecting me" (TM)

Do the best you can, given the buggy DT you received.
I.e. don't fail completely, just ignore the bad device node, and continue.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* [PATCH v2 5/5] arm64: perf: Add support for chaining event counters
From: Suzuki K Poulose @ 2018-06-11 13:54 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <4a5b5e7f-fc0b-84e3-fc65-b9f860029207@arm.com>

On 08/06/18 15:46, Suzuki K Poulose wrote:
> Hi Mark,
> 
> On 06/06/2018 07:01 PM, Mark Rutland wrote:
>> On Tue, May 29, 2018 at 11:55:56AM +0100, Suzuki K Poulose wrote:

> 
>>> -??????? value |= 0xffffffff00000000ULL;
>>> +??????? if (!armv8pmu_event_is_64bit(event))
>>> +??????????? value |= 0xffffffff00000000ULL;
>>> ????????? write_sysreg(value, pmccntr_el0);
>>> -??? } else if (armv8pmu_select_counter(idx) == idx)
>>> -??????? write_sysreg(value, pmxevcntr_el0);
>>> +??? } else
>>> +??????? armv8pmu_write_hw_counter(event, value);
>>> ? }
>>
>>> +static inline void armv8pmu_write_event_type(struct perf_event *event)
>>> +{
>>> +??? struct hw_perf_event *hwc = &event->hw;
>>> +??? int idx = hwc->idx;
>>> +
>>> +??? /*
>>> +???? * For chained events, write the the low counter event type
>>> +???? * followed by the high counter. The high counter is programmed
>>> +???? * with CHAIN event code with filters set to count at all ELs.
>>> +???? */
>>> +??? if (armv8pmu_event_is_chained(event)) {
>>> +??????? u32 chain_evt = ARMV8_PMUV3_PERFCTR_CHAIN |
>>> +??????????????? ARMV8_PMU_INCLUDE_EL2;
>>> +
>>> +??????? armv8pmu_write_evtype(idx - 1, hwc->config_base);
>>> +??????? isb();
>>> +??????? armv8pmu_write_evtype(idx, chain_evt);
>>
>> The ISB isn't necessary here, AFAICT. We only do this while the PMU is
>> disabled; no?
> 
> You're right. I was just following the ARM ARM.

Taking another look, it is not clear about the semantics of "pmu->enable()"
and pmu->disable() callbacks. I don't see any reference to them in the perf core
driver anymore. The perf core uses add() / del () instead, with the PMU
turned off. Do you have any idea about the enable()/disable() callbacks ?
Am I missing something ?

Suzuki

^ permalink raw reply

* [PATCH V3] ARM: shmobile: Rework the PMIC IRQ line quirk
From: Marek Vasut @ 2018-06-11 14:04 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CAMuHMdVJ+ad-kr0iYuoaH3chJ49_g6yjEb8PHmTpwjFdYcG4=Q@mail.gmail.com>

On 06/11/2018 03:49 PM, Geert Uytterhoeven wrote:
> Hi Marek,

Hi,

> On Mon, Jun 11, 2018 at 3:39 PM Marek Vasut <marek.vasut@gmail.com> wrote:
>> On 06/11/2018 03:03 PM, Geert Uytterhoeven wrote:
>>> On Mon, Jun 11, 2018 at 2:15 PM Marek Vasut <marek.vasut@gmail.com> wrote:
>>>> On 06/11/2018 11:56 AM, Geert Uytterhoeven wrote:
>>>>> On Mon, Jun 4, 2018 at 7:59 PM Marek Vasut <marek.vasut@gmail.com> wrote:
>>>>>> Rather than hard-coding the quirk topology, which stopped scaling,
>>>>>> parse the information from DT. The code looks for all compatible
>>>>>> PMICs -- da9036 and da9210 -- and checks if their IRQ line is tied
>>>>>> to the same pin. If so, the code sends a matching sequence to the
>>>>>> PMIC to deassert the IRQ.
> 
>>>>>> +               ret = of_property_read_u32(np, "reg", &addr);
>>>>>> +               if (ret)
>>>>>> +                       return ret;
>>>>>
>>>>> I think it's safer to skip this entry and continue, after calling
>>>>> kfree(quirk), of course.
>>>>>
>>>>>> +
>>>>>> +               quirk->id = id;
>>>>>> +               quirk->i2c_msg.addr = addr;
>>>>>> +               quirk->shared = false;
>>>>>> +
>>>>>> +               ret = of_irq_parse_one(np, 0, &quirk->irq_args);
>>>>>> +               if (ret)
>>>>>> +                       return ret;
>>>>>
>>>>> kfree(quirk) and continue...
>>>>
>>>> I wonder if it shouldn't rather free the entire list and abort ?
>>>
>>> "Be strict when sending, be liberal when receiving."
>>
>> Meaning ? I think "the language barrier is protecting me" (TM)
> 
> Do the best you can, given the buggy DT you received.
> I.e. don't fail completely, just ignore the bad device node, and continue.

But if you ignore node, you might as well ignore one which is shared and
then the system crashes due to IRQ storm anyway. So hum, what can we do ?

-- 
Best regards,
Marek Vasut

^ permalink raw reply

* [PATCH V3] ARM: shmobile: Rework the PMIC IRQ line quirk
From: Geert Uytterhoeven @ 2018-06-11 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <47e87deb-a7a8-4780-53b5-4e4ed6e1bac3@gmail.com>

Hi Marek,

On Mon, Jun 11, 2018 at 4:04 PM Marek Vasut <marek.vasut@gmail.com> wrote:
> On 06/11/2018 03:49 PM, Geert Uytterhoeven wrote:
> > On Mon, Jun 11, 2018 at 3:39 PM Marek Vasut <marek.vasut@gmail.com> wrote:
> >> On 06/11/2018 03:03 PM, Geert Uytterhoeven wrote:
> >>> On Mon, Jun 11, 2018 at 2:15 PM Marek Vasut <marek.vasut@gmail.com> wrote:
> >>>> On 06/11/2018 11:56 AM, Geert Uytterhoeven wrote:
> >>>>> On Mon, Jun 4, 2018 at 7:59 PM Marek Vasut <marek.vasut@gmail.com> wrote:
> >>>>>> Rather than hard-coding the quirk topology, which stopped scaling,
> >>>>>> parse the information from DT. The code looks for all compatible
> >>>>>> PMICs -- da9036 and da9210 -- and checks if their IRQ line is tied
> >>>>>> to the same pin. If so, the code sends a matching sequence to the
> >>>>>> PMIC to deassert the IRQ.
> >
> >>>>>> +               ret = of_property_read_u32(np, "reg", &addr);
> >>>>>> +               if (ret)
> >>>>>> +                       return ret;
> >>>>>
> >>>>> I think it's safer to skip this entry and continue, after calling
> >>>>> kfree(quirk), of course.
> >>>>>
> >>>>>> +
> >>>>>> +               quirk->id = id;
> >>>>>> +               quirk->i2c_msg.addr = addr;
> >>>>>> +               quirk->shared = false;
> >>>>>> +
> >>>>>> +               ret = of_irq_parse_one(np, 0, &quirk->irq_args);
> >>>>>> +               if (ret)
> >>>>>> +                       return ret;
> >>>>>
> >>>>> kfree(quirk) and continue...
> >>>>
> >>>> I wonder if it shouldn't rather free the entire list and abort ?
> >>>
> >>> "Be strict when sending, be liberal when receiving."
> >>
> >> Meaning ? I think "the language barrier is protecting me" (TM)
> >
> > Do the best you can, given the buggy DT you received.
> > I.e. don't fail completely, just ignore the bad device node, and continue.
>
> But if you ignore node, you might as well ignore one which is shared and
> then the system crashes due to IRQ storm anyway. So hum, what can we do ?

Correct. If it's a critical node, it will crash regardless.
If it's a non-critical node, you have the choice between aborting and crashing,
or ignoring and keeping the system alive. Your call.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* [PATCH] clk: imx6: fix video_27m parent for IMX6QDL_CLK_CKO1_SEL
From: Philipp Puschmann @ 2018-06-11 14:12 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CAOMZO5C8CA02V+oiag5tGpKBWN+UP28KwQ98jQ9PpQ2_aCq_bA@mail.gmail.com>

> Hi Philipp,
> 
> On Fri, Jun 8, 2018 at 7:19 AM, Philipp Puschmann <pp@emlix.com> wrote:
>> q/dl datasheets list the 5th selection value for ck01_sel as
>> video_27M_clk_root.
>>
>> By replacing the dummy value we then can set IMX6QDL_CLK_VIDEO_27M
>> as parent for IMX6QDL_CLK_CKO1_SEL.
>>
>> Signed-off-by: Philipp Puschmann <pp@emlix.com>
> 
> You could still have added my Reviewed-by tag that I sent previously :-)
> 
> Here it goes:
> 
> Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
> 
> Also in the future, when posting a new version of a patch, please mark
> it as such:
> 
> [PATCH v2] clk: imx6: fix video_27m parent for IMX6QDL_CLK_CKO1_SEL
> 
> and write below the --- line what are the changes in this new version.
> 
> Thanks
> 

Hi Fabio,

thank you for your hints, next time i will do it that way

^ permalink raw reply

* [PATCH V3] ARM: shmobile: Rework the PMIC IRQ line quirk
From: Marek Vasut @ 2018-06-11 14:19 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CAMuHMdXJEC8HbuB5+YHMLjZy9GfCYyXmGTk0TXkoAxLhxTOJyA@mail.gmail.com>

On 06/11/2018 04:10 PM, Geert Uytterhoeven wrote:
> Hi Marek,

Hi,

> On Mon, Jun 11, 2018 at 4:04 PM Marek Vasut <marek.vasut@gmail.com> wrote:
>> On 06/11/2018 03:49 PM, Geert Uytterhoeven wrote:
>>> On Mon, Jun 11, 2018 at 3:39 PM Marek Vasut <marek.vasut@gmail.com> wrote:
>>>> On 06/11/2018 03:03 PM, Geert Uytterhoeven wrote:
>>>>> On Mon, Jun 11, 2018 at 2:15 PM Marek Vasut <marek.vasut@gmail.com> wrote:
>>>>>> On 06/11/2018 11:56 AM, Geert Uytterhoeven wrote:
>>>>>>> On Mon, Jun 4, 2018 at 7:59 PM Marek Vasut <marek.vasut@gmail.com> wrote:
>>>>>>>> Rather than hard-coding the quirk topology, which stopped scaling,
>>>>>>>> parse the information from DT. The code looks for all compatible
>>>>>>>> PMICs -- da9036 and da9210 -- and checks if their IRQ line is tied
>>>>>>>> to the same pin. If so, the code sends a matching sequence to the
>>>>>>>> PMIC to deassert the IRQ.
>>>
>>>>>>>> +               ret = of_property_read_u32(np, "reg", &addr);
>>>>>>>> +               if (ret)
>>>>>>>> +                       return ret;
>>>>>>>
>>>>>>> I think it's safer to skip this entry and continue, after calling
>>>>>>> kfree(quirk), of course.
>>>>>>>
>>>>>>>> +
>>>>>>>> +               quirk->id = id;
>>>>>>>> +               quirk->i2c_msg.addr = addr;
>>>>>>>> +               quirk->shared = false;
>>>>>>>> +
>>>>>>>> +               ret = of_irq_parse_one(np, 0, &quirk->irq_args);
>>>>>>>> +               if (ret)
>>>>>>>> +                       return ret;
>>>>>>>
>>>>>>> kfree(quirk) and continue...
>>>>>>
>>>>>> I wonder if it shouldn't rather free the entire list and abort ?
>>>>>
>>>>> "Be strict when sending, be liberal when receiving."
>>>>
>>>> Meaning ? I think "the language barrier is protecting me" (TM)
>>>
>>> Do the best you can, given the buggy DT you received.
>>> I.e. don't fail completely, just ignore the bad device node, and continue.
>>
>> But if you ignore node, you might as well ignore one which is shared and
>> then the system crashes due to IRQ storm anyway. So hum, what can we do ?
> 
> Correct. If it's a critical node, it will crash regardless.
> If it's a non-critical node, you have the choice between aborting and crashing,
> or ignoring and keeping the system alive. Your call.

But wait, since we control which machines this code runs on , can't we
assure they have valid DTs ? This situation with invalid DT starts to
look a bit hypothetical to me.

-- 
Best regards,
Marek Vasut

^ permalink raw reply

* [PATCH v2 5/5] arm64: perf: Add support for chaining event counters
From: Mark Rutland @ 2018-06-11 14:24 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <847dfe5c-665c-6398-f87f-3ca56e73f5aa@arm.com>

On Mon, Jun 11, 2018 at 02:54:16PM +0100, Suzuki K Poulose wrote:
> On 08/06/18 15:46, Suzuki K Poulose wrote:
> > On 06/06/2018 07:01 PM, Mark Rutland wrote:
> > > On Tue, May 29, 2018 at 11:55:56AM +0100, Suzuki K Poulose wrote:
> 
> > > > -??????? value |= 0xffffffff00000000ULL;
> > > > +??????? if (!armv8pmu_event_is_64bit(event))
> > > > +??????????? value |= 0xffffffff00000000ULL;
> > > > ????????? write_sysreg(value, pmccntr_el0);
> > > > -??? } else if (armv8pmu_select_counter(idx) == idx)
> > > > -??????? write_sysreg(value, pmxevcntr_el0);
> > > > +??? } else
> > > > +??????? armv8pmu_write_hw_counter(event, value);
> > > > ? }
> > > 
> > > > +static inline void armv8pmu_write_event_type(struct perf_event *event)
> > > > +{
> > > > +??? struct hw_perf_event *hwc = &event->hw;
> > > > +??? int idx = hwc->idx;
> > > > +
> > > > +??? /*
> > > > +???? * For chained events, write the the low counter event type
> > > > +???? * followed by the high counter. The high counter is programmed
> > > > +???? * with CHAIN event code with filters set to count at all ELs.
> > > > +???? */
> > > > +??? if (armv8pmu_event_is_chained(event)) {
> > > > +??????? u32 chain_evt = ARMV8_PMUV3_PERFCTR_CHAIN |
> > > > +??????????????? ARMV8_PMU_INCLUDE_EL2;
> > > > +
> > > > +??????? armv8pmu_write_evtype(idx - 1, hwc->config_base);
> > > > +??????? isb();
> > > > +??????? armv8pmu_write_evtype(idx, chain_evt);
> > > 
> > > The ISB isn't necessary here, AFAICT. We only do this while the PMU is
> > > disabled; no?
> > 
> > You're right. I was just following the ARM ARM.
> 
> Taking another look, it is not clear about the semantics of "pmu->enable()"
> and pmu->disable() callbacks. 

I was talking about pmu::{pmu_disable,pmu_enable}(), so I'm not sure I
follow how arm_pmu::{enable,disable}() are relevant here.

The arm_pmu::{enable,disable}() callbacks enable or disable individual
counters. For example, leaving unused counters disabled may save power,
even if the PMU as a whole is enabled.

> I don't see any reference to them in the perf core
> driver anymore. The perf core uses add() / del () instead, with the PMU
> turned off. Do you have any idea about the enable()/disable() callbacks ?

I'm not sure I understand what you're asking here.

Thanks,
Mark.

^ permalink raw reply

* [PATCH V3] ARM: shmobile: Rework the PMIC IRQ line quirk
From: Geert Uytterhoeven @ 2018-06-11 14:30 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <6864ed2c-39ac-615d-f38e-b28c3647e451@gmail.com>

Hi Marek,

On Mon, Jun 11, 2018 at 4:19 PM Marek Vasut <marek.vasut@gmail.com> wrote:
> On 06/11/2018 04:10 PM, Geert Uytterhoeven wrote:
> > On Mon, Jun 11, 2018 at 4:04 PM Marek Vasut <marek.vasut@gmail.com> wrote:
> >> On 06/11/2018 03:49 PM, Geert Uytterhoeven wrote:
> >>> On Mon, Jun 11, 2018 at 3:39 PM Marek Vasut <marek.vasut@gmail.com> wrote:
> >>>> On 06/11/2018 03:03 PM, Geert Uytterhoeven wrote:
> >>>>> On Mon, Jun 11, 2018 at 2:15 PM Marek Vasut <marek.vasut@gmail.com> wrote:
> >>>>>> On 06/11/2018 11:56 AM, Geert Uytterhoeven wrote:
> >>>>>>> On Mon, Jun 4, 2018 at 7:59 PM Marek Vasut <marek.vasut@gmail.com> wrote:
> >>>>>>>> Rather than hard-coding the quirk topology, which stopped scaling,
> >>>>>>>> parse the information from DT. The code looks for all compatible
> >>>>>>>> PMICs -- da9036 and da9210 -- and checks if their IRQ line is tied
> >>>>>>>> to the same pin. If so, the code sends a matching sequence to the
> >>>>>>>> PMIC to deassert the IRQ.
> >>>
> >>>>>>>> +               ret = of_property_read_u32(np, "reg", &addr);
> >>>>>>>> +               if (ret)
> >>>>>>>> +                       return ret;
> >>>>>>>
> >>>>>>> I think it's safer to skip this entry and continue, after calling
> >>>>>>> kfree(quirk), of course.
> >>>>>>>
> >>>>>>>> +
> >>>>>>>> +               quirk->id = id;
> >>>>>>>> +               quirk->i2c_msg.addr = addr;
> >>>>>>>> +               quirk->shared = false;
> >>>>>>>> +
> >>>>>>>> +               ret = of_irq_parse_one(np, 0, &quirk->irq_args);
> >>>>>>>> +               if (ret)
> >>>>>>>> +                       return ret;
> >>>>>>>
> >>>>>>> kfree(quirk) and continue...
> >>>>>>
> >>>>>> I wonder if it shouldn't rather free the entire list and abort ?
> >>>>>
> >>>>> "Be strict when sending, be liberal when receiving."
> >>>>
> >>>> Meaning ? I think "the language barrier is protecting me" (TM)
> >>>
> >>> Do the best you can, given the buggy DT you received.
> >>> I.e. don't fail completely, just ignore the bad device node, and continue.
> >>
> >> But if you ignore node, you might as well ignore one which is shared and
> >> then the system crashes due to IRQ storm anyway. So hum, what can we do ?
> >
> > Correct. If it's a critical node, it will crash regardless.
> > If it's a non-critical node, you have the choice between aborting and crashing,
> > or ignoring and keeping the system alive. Your call.
>
> But wait, since we control which machines this code runs on , can't we
> assure they have valid DTs ? This situation with invalid DT starts to
> look a bit hypothetical to me.

That assumes you keep the list of machines to check, and don't want to fix the
issue automatically when detected (on any R-Car Gen2 or RZ/G1 platform, so
you still need to check for r8a779[0-4] and r8a774[23457]).

Anyway, as we care about booting old DTBs on new kernels (for a while), we
have a few more release cycles to bikeshed ;-)

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* [PATCH v6 0/5] Reintroduce i.MX EPIT Timer
From: Vladimir Zapolskiy @ 2018-06-11 14:39 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180607140544.22268-1-peron.clem@gmail.com>

Hi Cl?ment,

On 06/07/2018 05:05 PM, Cl?ment P?ron wrote:
> From: Cl?ment Peron <clement.peron@devialet.com>
> 
> As suggested in the commit message we have added the device tree support,
> proper bindings and we moved the driver into the correct folder.
> 
> Moreover we made some changes like use of relaxed IO accesor,
> implement sched_clock, delay_timer and reduce the clockevents min_delta.
> 

I reviewed and tested the driver on i.MX31, as expected it works fine,
and I'll give my tags per a commit, please add them to v7 changes.

--
Best wishes,
Vladimir

^ permalink raw reply

* [PATCH v6 2/5] ARM: imx: remove inexistant EPIT timer init
From: Vladimir Zapolskiy @ 2018-06-11 14:39 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180607140544.22268-3-peron.clem@gmail.com>

On 06/07/2018 05:05 PM, Cl?ment P?ron wrote:
> From: Cl?ment Peron <clement.peron@devialet.com>
> 
> i.MX EPIT timer has been removed but not the init function declaration.
> 
> Signed-off-by: Cl?ment Peron <clement.peron@devialet.com>
> Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>

Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>
Tested-by: Vladimir Zapolskiy <vz@mleia.com>

--
With best wishes,
Vladimir

^ permalink raw reply

* [PATCH v6 3/5] dt-bindings: timer: add i.MX EPIT timer binding
From: Vladimir Zapolskiy @ 2018-06-11 14:39 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180607140544.22268-4-peron.clem@gmail.com>

On 06/07/2018 05:05 PM, Cl?ment P?ron wrote:
> From: Cl?ment Peron <clement.peron@devialet.com>
> 
> Add devicetree binding document for NXP's i.MX SoC specific
> EPIT timer driver.
> 
> Signed-off-by: Cl?ment Peron <clement.peron@devialet.com>

Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>

--
With best wishes,
Vladimir

^ permalink raw reply

* [PATCH v6 4/5] clocksource: add driver for i.MX EPIT timer
From: Vladimir Zapolskiy @ 2018-06-11 14:40 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180607140544.22268-5-peron.clem@gmail.com>

On 06/07/2018 05:05 PM, Cl?ment P?ron wrote:
> From: Colin Didier <colin.didier@devialet.com>
> 
> Add driver for NXP's EPIT timer used in i.MX SoC.
> 
> Signed-off-by: Colin Didier <colin.didier@devialet.com>
> Signed-off-by: Cl?ment Peron <clement.peron@devialet.com>

Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>
Tested-by: Vladimir Zapolskiy <vz@mleia.com>

I tested the driver on i.MX31 only, and I didn't find any problems.

Please fix the indentation issue found by Stefan in v7.

Regarding utilization of timer-of.c it can be postponed IMHO,
but it's up to clocksource maintainers and you to decide, and if
you do such an update for v7, then please don't add my tags,
I'll review and test it again.

--
With best wishes,
Vladimir

^ permalink raw reply

* [PATCH v6 5/5] ARM: dts: imx: add missing compatible and clock properties for EPIT
From: Vladimir Zapolskiy @ 2018-06-11 14:40 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180607140544.22268-6-peron.clem@gmail.com>

On 06/07/2018 05:05 PM, Cl?ment P?ron wrote:
> From: Colin Didier <colin.didier@devialet.com>
> 
> Add missing compatible and clock properties for EPIT node.
> 
> Signed-off-by: Colin Didier <colin.didier@devialet.com>
> Signed-off-by: Cl?ment Peron <clement.peron@devialet.com>

Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>

--
With best wishes,
Vladimir

^ permalink raw reply

* [PATCH v6 0/5] Reintroduce i.MX EPIT Timer
From: Clément Péron @ 2018-06-11 14:46 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <74959a6a-60bb-f854-6ef2-21c3c60d3451@mleia.com>

Hi Vladimir,
On Mon, 11 Jun 2018 at 16:39, Vladimir Zapolskiy <vz@mleia.com> wrote:
>
> Hi Cl?ment,
>
> On 06/07/2018 05:05 PM, Cl?ment P?ron wrote:
> > From: Cl?ment Peron <clement.peron@devialet.com>
> >
> > As suggested in the commit message we have added the device tree support,
> > proper bindings and we moved the driver into the correct folder.
> >
> > Moreover we made some changes like use of relaxed IO accesor,
> > implement sched_clock, delay_timer and reduce the clockevents min_delta.
> >
>
> I reviewed and tested the driver on i.MX31, as expected it works fine,
> and I'll give my tags per a commit, please add them to v7 changes.

Thanks, i will

>
> --
> Best wishes,
> Vladimir

^ permalink raw reply

* [PATCH 1/2] arm64: avoid alloc memory on offline node
From: Michal Hocko @ 2018-06-11 14:53 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180611134303.GC75679@bhelgaas-glaptop.roam.corp.google.com>

On Mon 11-06-18 08:43:03, Bjorn Helgaas wrote:
> On Mon, Jun 11, 2018 at 08:32:10PM +0800, Xie XiuQi wrote:
> > Hi Michal,
> > 
> > On 2018/6/11 16:52, Michal Hocko wrote:
> > > On Mon 11-06-18 11:23:18, Xie XiuQi wrote:
> > >> Hi Michal,
> > >>
> > >> On 2018/6/7 20:21, Michal Hocko wrote:
> > >>> On Thu 07-06-18 19:55:53, Hanjun Guo wrote:
> > >>>> On 2018/6/7 18:55, Michal Hocko wrote:
> > >>> [...]
> > >>>>> I am not sure I have the full context but pci_acpi_scan_root calls
> > >>>>> kzalloc_node(sizeof(*info), GFP_KERNEL, node)
> > >>>>> and that should fall back to whatever node that is online. Offline node
> > >>>>> shouldn't keep any pages behind. So there must be something else going
> > >>>>> on here and the patch is not the right way to handle it. What does
> > >>>>> faddr2line __alloc_pages_nodemask+0xf0 tells on this kernel?
> > >>>>
> > >>>> The whole context is:
> > >>>>
> > >>>> The system is booted with a NUMA node has no memory attaching to it
> > >>>> (memory-less NUMA node), also with NR_CPUS less than CPUs presented
> > >>>> in MADT, so CPUs on this memory-less node are not brought up, and
> > >>>> this NUMA node will not be online (but SRAT presents this NUMA node);
> > >>>>
> > >>>> Devices attaching to this NUMA node such as PCI host bridge still
> > >>>> return the valid NUMA node via _PXM, but actually that valid NUMA node
> > >>>> is not online which lead to this issue.
> > >>>
> > >>> But we should have other numa nodes on the zonelists so the allocator
> > >>> should fall back to other node. If the zonelist is not intiailized
> > >>> properly, though, then this can indeed show up as a problem. Knowing
> > >>> which exact place has blown up would help get a better picture...
> > >>>
> > >>
> > >> I specific a non-exist node to allocate memory using kzalloc_node,
> > >> and got this following error message.
> > >>
> > >> And I found out there is just a VM_WARN, but it does not prevent the memory
> > >> allocation continue.
> > >>
> > >> This nid would be use to access NODE_DADA(nid), so if nid is invalid,
> > >> it would cause oops here.
> > >>
> > >> 459 /*
> > >> 460  * Allocate pages, preferring the node given as nid. The node must be valid and
> > >> 461  * online. For more general interface, see alloc_pages_node().
> > >> 462  */
> > >> 463 static inline struct page *
> > >> 464 __alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
> > >> 465 {
> > >> 466         VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
> > >> 467         VM_WARN_ON(!node_online(nid));
> > >> 468
> > >> 469         return __alloc_pages(gfp_mask, order, nid);
> > >> 470 }
> > >> 471
> > >>
> > >> (I wrote a ko, to allocate memory on a non-exist node using kzalloc_node().)
> > > 
> > > OK, so this is an artificialy broken code, right. You shouldn't get a
> > > non-existent node via standard APIs AFAICS. The original report was
> > > about an existing node which is offline AFAIU. That would be a different
> > > case. If I am missing something and there are legitimate users that try
> > > to allocate from non-existing nodes then we should handle that in
> > > node_zonelist.
> > 
> > I think hanjun's comments may help to understood this question:
> >  - NUMA node will be built if CPUs and (or) memory are valid on this NUMA
> >  node;
> > 
> >  - But if we boot the system with memory-less node and also with
> >  CONFIG_NR_CPUS less than CPUs in SRAT, for example, 64 CPUs total with 4
> >  NUMA nodes, 16 CPUs on each NUMA node, if we boot with
> >  CONFIG_NR_CPUS=48, then we will not built numa node for node 3, but with
> >  devices on that numa node, alloc memory will be panic because NUMA node
> >  3 is not a valid node.

Hmm, but this is not a memory-less node. It sounds like a misconfigured
kernel to me or the broken initialization. Each CPU should have a
fallback numa node to be used.

> > I triggered this BUG on arm64 platform, and I found a similar bug has
> > been fixed on x86 platform. So I sent a similar patch for this bug.
> > 
> > Or, could we consider to fix it in the mm subsystem?
> 
> The patch below (b755de8dfdfe) seems like totally the wrong direction.
> I don't think we want every caller of kzalloc_node() to have check for
> node_online().

absolutely.

> Why would memory on an off-line node even be in the allocation pool?
> I wouldn't expect that memory to be put in the pool until the node
> comes online and the memory is accessible, so this sounds like some
> kind of setup issue.
> 
> But I'm definitely not an mm person.

Well, the standard way to handle memory less NUMA nodes is to simply
fallback to the closest NUMA node. We even have an API for that
(numa_mem_id).
 
> > From b755de8dfdfef97effaa91379ffafcb81f4d62a1 Mon Sep 17 00:00:00 2001
> > From: Yinghai Lu <Yinghai.Lu@Sun.COM>
> > Date: Wed, 20 Feb 2008 12:41:52 -0800
> > Subject: [PATCH] x86: make dev_to_node return online node
> > 
> > a numa system (with multi HT chains) may return node without ram. Aka it
> > is not online. Try to get an online node, otherwise return -1.
> > 
> > Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
> > Signed-off-by: Ingo Molnar <mingo@elte.hu>
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > ---
> >  arch/x86/pci/acpi.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
> > index d95de2f..ea8685f 100644
> > --- a/arch/x86/pci/acpi.c
> > +++ b/arch/x86/pci/acpi.c
> > @@ -172,6 +172,9 @@ struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_device *device, int do
> >  		set_mp_bus_to_node(busnum, node);
> >  	else
> >  		node = get_mp_bus_to_node(busnum);
> > +
> > +	if (node != -1 && !node_online(node))
> > +		node = -1;
> >  #endif
> > 
> >  	/* Allocate per-root-bus (not per bus) arch-specific data.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply

* [PATCH v5 3/6] mfd: at91-usart: added mfd driver for usart
From: Enric Balletbo Serra @ 2018-06-11 14:55 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180604165943.31381-4-radu.pirea@microchip.com>

Hi Radu,

Some few feedback in a hope to help you.

Missatge de Radu Pirea <radu.pirea@microchip.com> del dia dl., 4 de
juny 2018 a les 19:03:
>
> This mfd driver is just a wrapper over atmel_serial driver and
> spi-at91-usart driver. Selection of one of the drivers is based on a
> property from device tree. If the property is not specified, the default
> driver is atmel_serial.
>
> Signed-off-by: Radu Pirea <radu.pirea@microchip.com>
> Acked-by: Rob Herring <robh@kernel.org>
> ---
>  drivers/mfd/Kconfig      |  9 ++++++
>  drivers/mfd/Makefile     |  1 +
>  drivers/mfd/at91-usart.c | 67 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 77 insertions(+)
>  create mode 100644 drivers/mfd/at91-usart.c
>
> diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
> index b860eb5aa194..a886672b960d 100644
> --- a/drivers/mfd/Kconfig
> +++ b/drivers/mfd/Kconfig
> @@ -99,6 +99,15 @@ config MFD_AAT2870_CORE
>           additional drivers must be enabled in order to use the
>           functionality of the device.
>
> +config MFD_AT91_USART
> +       tristate "AT91 USART Driver"
> +       select MFD_CORE
> +       help
> +         Select this to get support for AT91 USART IP. This is a wrapper
> +         over at91-usart-serial driver and usart-spi-driver. Only one function
> +         can be used at a time. The choice is done at boot time by the probe
> +         function of this MFD driver according to a device tree property.
> +
>  config MFD_ATMEL_FLEXCOM
>         tristate "Atmel Flexcom (Flexible Serial Communication Unit)"
>         select MFD_CORE
> diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
> index d9d2cf0d32ef..db1332aa96db 100644
> --- a/drivers/mfd/Makefile
> +++ b/drivers/mfd/Makefile
> @@ -185,6 +185,7 @@ obj-$(CONFIG_MFD_SPMI_PMIC) += qcom-spmi-pmic.o
>  obj-$(CONFIG_TPS65911_COMPARATOR)      += tps65911-comparator.o
>  obj-$(CONFIG_MFD_TPS65090)     += tps65090.o
>  obj-$(CONFIG_MFD_AAT2870_CORE) += aat2870-core.o
> +obj-$(CONFIG_MFD_AT91_USART)   += at91-usart.o
>  obj-$(CONFIG_MFD_ATMEL_FLEXCOM)        += atmel-flexcom.o
>  obj-$(CONFIG_MFD_ATMEL_HLCDC)  += atmel-hlcdc.o
>  obj-$(CONFIG_MFD_ATMEL_SMC)    += atmel-smc.o
> diff --git a/drivers/mfd/at91-usart.c b/drivers/mfd/at91-usart.c
> new file mode 100644
> index 000000000000..262e7affba22
> --- /dev/null
> +++ b/drivers/mfd/at91-usart.c
> @@ -0,0 +1,67 @@
> +// SPDX-License-Identifier: GPL-2.0+

This means GPL-2.0 or later ...

> +/*
> + * Driver for AT91 USART
> + *
> + * Copyright (C) 2018 Microchip Technology
> + *
> + * Author: Radu Pirea <radu.pirea@microchip.com>
> + *
> + */
> +
> +#include <dt-bindings/mfd/at91-usart.h>
> +
> +#include <linux/module.h>
> +#include <linux/mfd/core.h>
> +#include <linux/property.h>
> +
> +static struct mfd_cell at91_usart_spi_subdev = {
> +               .name = "at91_usart_spi",
> +               .of_compatible = "microchip,at91sam9g45-usart-spi",
> +       };
> +
> +static struct mfd_cell at91_usart_serial_subdev = {
> +               .name = "atmel_usart_serial",
> +               .of_compatible = "atmel,at91rm9200-usart-serial",
> +       };
> +
> +static int at91_usart_mode_probe(struct platform_device *pdev)
> +{
> +       struct mfd_cell cell;
> +       u32 opmode;
> +       int err;
> +
> +       err = device_property_read_u32(&pdev->dev, "atmel,usart-mode", &opmode);
> +

The returned value is not checked (and probably has no sense to
check), so I think you can just remove the err variable.

> +       switch (opmode) {
> +       case AT91_USART_MODE_SPI:
> +               cell = at91_usart_spi_subdev;
> +               break;
> +       case AT91_USART_MODE_SERIAL:
> +       default:
> +               cell = at91_usart_serial_subdev;
> +       }
> +
> +       return devm_mfd_add_devices(&pdev->dev, PLATFORM_DEVID_AUTO, &cell, 1,
> +                             NULL, 0, NULL);
> +}
> +
> +static const struct of_device_id at91_usart_mode_of_match[] = {
> +       { .compatible = "atmel,at91rm9200-usart" },
> +       { .compatible = "atmel,at91sam9260-usart" },
> +       { /* sentinel */ }
> +};
> +MODULE_DEVICE_TABLE(of, at91_flexcom_of_match);
> +
> +static struct platform_driver at91_usart_mfd = {
> +       .probe  = at91_usart_mode_probe,
> +       .driver = {
> +               .name           = "at91_usart_mode",
> +               .of_match_table = at91_usart_mode_of_match,
> +       },
> +};
> +
> +module_platform_driver(at91_usart_mfd);
> +
> +MODULE_AUTHOR("Radu Pirea <radu.pirea@microchip.com>");
> +MODULE_DESCRIPTION("AT91 USART MFD driver");
> +MODULE_LICENSE("GPL v2");

And this means GPL-2.0-only, so there is a mismatch. The SPDX
identifier should be GPL-2.0 or the module license should be GPL.

> --
> 2.17.1
>
Cheers,
  Enric

^ permalink raw reply

* [PATCH v3 0/6] add virt-dma support for imx-sdma
From: Robin Gong @ 2018-06-11 14:59 UTC (permalink / raw)
  To: linux-arm-kernel

The legacy sdma driver has below limitations or drawbacks:
  1. Hardcode the max BDs number as "PAGE_SIZE / sizeof(*)", and alloc
     one page size for one channel regardless of only few BDs needed
     most time. But in few cases, the max PAGE_SIZE maybe not enough.
  2. One SDMA channel can't stop immediatley once channel disabled which
     means SDMA interrupt may come in after this channel terminated.There
     are some patches for this corner case such as commit "2746e2c389f9",
     but not cover non-cyclic.

The common virt-dma overcomes the above limitations. It can alloc bd
dynamically and free bd once this tx transfer done. No memory wasted or
maximum limititation here, only depends on how many memory can be requested
from kernel. For No.2, such issue can be workaround by checking if there
is available descript("sdmac->desc") now once the unwanted interrupt
coming. At last the common virt-dma is easier for sdma driver maintain.

Change from v2:
  1. include Sascha's patch to make the main patch easier to review.
     Thanks Sacha.
  2. remove useless 'desc'/'chan' in struct sdma_channe.

Change from v1:
  1. split v1 patch into 5 patches.
  2. remove some unnecessary condition check.
  3. remove unnecessary 'pending' list.

Robin Gong (5):
  dmaengine: imx-sdma: add virt-dma support
  Revert "dmaengine: imx-sdma: fix pagefault when channel is disabled
    during interrupt"
  dmaengine: imx-sdma: remove usless lock
  dmaengine: imx-sdma: remove the maximum limation for bd numbers
  dmaengine: imx-sdma: add sdma_transfer_init to decrease code overlap

 drivers/dma/Kconfig    |   1 +
 drivers/dma/imx-sdma.c | 392 ++++++++++++++++++++++++++++---------------------
 2 files changed, 227 insertions(+), 166 deletions(-)

-- 
2.7.4

Robin Gong (5):
  dmaengine: imx-sdma: add virt-dma support
  Revert "dmaengine: imx-sdma: fix pagefault when channel is disabled
    during interrupt"
  dmaengine: imx-sdma: remove usless lock
  dmaengine: imx-sdma: remove the maximum limation for bd numbers
  dmaengine: imx-sdma: add sdma_transfer_init to decrease code overlap

Sascha Hauer (1):
  dmaengine: imx-sdma: factor out a struct sdma_desc from struct
    sdma_channel

 drivers/dma/Kconfig    |   1 +
 drivers/dma/imx-sdma.c | 391 ++++++++++++++++++++++++++++---------------------
 2 files changed, 226 insertions(+), 166 deletions(-)

-- 
2.7.4

^ permalink raw reply

* [PATCH v3 1/6] dmaengine: imx-sdma: factor out a struct sdma_desc from struct sdma_channel
From: Robin Gong @ 2018-06-11 14:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1528729173-28684-1-git-send-email-yibin.gong@nxp.com>

From: Sascha Hauer <s.hauer@pengutronix.de>

This is a preparation step to make the adding of virt-dma easier.
We create a struct sdma_desc, move some fields from struct sdma_channel
there and add a pointer from the former to the latter. For now we
allocate the data statically in struct sdma_channel, but with
virt-dma support it will be dynamically allocated.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
---
 drivers/dma/imx-sdma.c | 137 ++++++++++++++++++++++++++++++-------------------
 1 file changed, 83 insertions(+), 54 deletions(-)

diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index ccd03c3..556d087 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -296,6 +296,30 @@ struct sdma_context_data {
 struct sdma_engine;
 
 /**
+ * struct sdma_desc - descriptor structor for one transfer
+ * @vd			descriptor for virt dma
+ * @num_bd		max NUM_BD. number of descriptors currently handling
+ * @buf_tail		ID of the buffer that was processed
+ * @buf_ptail		ID of the previous buffer that was processed
+ * @period_len		period length, used in cyclic.
+ * @chn_real_count	the real count updated from bd->mode.count
+ * @chn_count		the transfer count setuped
+ * @sdmac		sdma_channel pointer
+ * @bd			pointer of alloced bd
+ */
+struct sdma_desc {
+	unsigned int		num_bd;
+	dma_addr_t		bd_phys;
+	unsigned int		buf_tail;
+	unsigned int		buf_ptail;
+	unsigned int		period_len;
+	unsigned int		chn_real_count;
+	unsigned int		chn_count;
+	struct sdma_channel	*sdmac;
+	struct sdma_buffer_descriptor *bd;
+};
+
+/**
  * struct sdma_channel - housekeeping for a SDMA channel
  *
  * @sdma		pointer to the SDMA engine for this channel
@@ -305,11 +329,10 @@ struct sdma_engine;
  * @event_id0		aka dma request line
  * @event_id1		for channels that use 2 events
  * @word_size		peripheral access size
- * @buf_tail		ID of the buffer that was processed
- * @buf_ptail		ID of the previous buffer that was processed
- * @num_bd		max NUM_BD. number of descriptors currently handling
  */
 struct sdma_channel {
+	struct sdma_desc		*desc;
+	struct sdma_desc		_desc;
 	struct sdma_engine		*sdma;
 	unsigned int			channel;
 	enum dma_transfer_direction		direction;
@@ -317,12 +340,6 @@ struct sdma_channel {
 	unsigned int			event_id0;
 	unsigned int			event_id1;
 	enum dma_slave_buswidth		word_size;
-	unsigned int			buf_tail;
-	unsigned int			buf_ptail;
-	unsigned int			num_bd;
-	unsigned int			period_len;
-	struct sdma_buffer_descriptor	*bd;
-	dma_addr_t			bd_phys;
 	unsigned int			pc_from_device, pc_to_device;
 	unsigned int			device_to_device;
 	unsigned long			flags;
@@ -332,10 +349,8 @@ struct sdma_channel {
 	u32				shp_addr, per_addr;
 	struct dma_chan			chan;
 	spinlock_t			lock;
-	struct dma_async_tx_descriptor	desc;
+	struct dma_async_tx_descriptor	txdesc;
 	enum dma_status			status;
-	unsigned int			chn_count;
-	unsigned int			chn_real_count;
 	struct tasklet_struct		tasklet;
 	struct imx_dma_data		data;
 	bool				enabled;
@@ -398,6 +413,8 @@ struct sdma_engine {
 	u32				spba_start_addr;
 	u32				spba_end_addr;
 	unsigned int			irq;
+	dma_addr_t			bd0_phys;
+	struct sdma_buffer_descriptor	*bd0;
 };
 
 static struct sdma_driver_data sdma_imx31 = {
@@ -632,7 +649,7 @@ static int sdma_run_channel0(struct sdma_engine *sdma)
 static int sdma_load_script(struct sdma_engine *sdma, void *buf, int size,
 		u32 address)
 {
-	struct sdma_buffer_descriptor *bd0 = sdma->channel[0].bd;
+	struct sdma_buffer_descriptor *bd0 = sdma->bd0;
 	void *buf_virt;
 	dma_addr_t buf_phys;
 	int ret;
@@ -707,7 +724,9 @@ static void sdma_update_channel_loop(struct sdma_channel *sdmac)
 	 * call callback function.
 	 */
 	while (1) {
-		bd = &sdmac->bd[sdmac->buf_tail];
+		struct sdma_desc *desc = sdmac->desc;
+
+		bd = &desc->bd[desc->buf_tail];
 
 		if (bd->mode.status & BD_DONE)
 			break;
@@ -723,11 +742,11 @@ static void sdma_update_channel_loop(struct sdma_channel *sdmac)
 		* the number of bytes present in the current buffer descriptor.
 		*/
 
-		sdmac->chn_real_count = bd->mode.count;
+		desc->chn_real_count = bd->mode.count;
 		bd->mode.status |= BD_DONE;
-		bd->mode.count = sdmac->period_len;
-		sdmac->buf_ptail = sdmac->buf_tail;
-		sdmac->buf_tail = (sdmac->buf_tail + 1) % sdmac->num_bd;
+		bd->mode.count = desc->period_len;
+		desc->buf_ptail = desc->buf_tail;
+		desc->buf_tail = (desc->buf_tail + 1) % desc->num_bd;
 
 		/*
 		 * The callback is called from the interrupt context in order
@@ -736,7 +755,7 @@ static void sdma_update_channel_loop(struct sdma_channel *sdmac)
 		 * executed.
 		 */
 
-		dmaengine_desc_get_callback_invoke(&sdmac->desc, NULL);
+		dmaengine_desc_get_callback_invoke(&sdmac->txdesc, NULL);
 
 		if (error)
 			sdmac->status = old_status;
@@ -749,17 +768,17 @@ static void mxc_sdma_handle_channel_normal(unsigned long data)
 	struct sdma_buffer_descriptor *bd;
 	int i, error = 0;
 
-	sdmac->chn_real_count = 0;
+	sdmac->desc->chn_real_count = 0;
 	/*
 	 * non loop mode. Iterate over all descriptors, collect
 	 * errors and call callback function
 	 */
-	for (i = 0; i < sdmac->num_bd; i++) {
-		bd = &sdmac->bd[i];
+	for (i = 0; i < sdmac->desc->num_bd; i++) {
+		bd = &sdmac->desc->bd[i];
 
 		 if (bd->mode.status & (BD_DONE | BD_RROR))
 			error = -EIO;
-		 sdmac->chn_real_count += bd->mode.count;
+		 sdmac->desc->chn_real_count += bd->mode.count;
 	}
 
 	if (error)
@@ -767,9 +786,9 @@ static void mxc_sdma_handle_channel_normal(unsigned long data)
 	else
 		sdmac->status = DMA_COMPLETE;
 
-	dma_cookie_complete(&sdmac->desc);
+	dma_cookie_complete(&sdmac->txdesc);
 
-	dmaengine_desc_get_callback_invoke(&sdmac->desc, NULL);
+	dmaengine_desc_get_callback_invoke(&sdmac->txdesc, NULL);
 }
 
 static irqreturn_t sdma_int_handler(int irq, void *dev_id)
@@ -897,7 +916,7 @@ static int sdma_load_context(struct sdma_channel *sdmac)
 	int channel = sdmac->channel;
 	int load_address;
 	struct sdma_context_data *context = sdma->context;
-	struct sdma_buffer_descriptor *bd0 = sdma->channel[0].bd;
+	struct sdma_buffer_descriptor *bd0 = sdma->bd0;
 	int ret;
 	unsigned long flags;
 
@@ -1100,18 +1119,22 @@ static int sdma_set_channel_priority(struct sdma_channel *sdmac,
 static int sdma_request_channel(struct sdma_channel *sdmac)
 {
 	struct sdma_engine *sdma = sdmac->sdma;
+	struct sdma_desc *desc;
 	int channel = sdmac->channel;
 	int ret = -EBUSY;
 
-	sdmac->bd = dma_zalloc_coherent(NULL, PAGE_SIZE, &sdmac->bd_phys,
+	sdmac->desc = &sdmac->_desc;
+	desc = sdmac->desc;
+
+	desc->bd = dma_zalloc_coherent(NULL, PAGE_SIZE, &desc->bd_phys,
 					GFP_KERNEL);
-	if (!sdmac->bd) {
+	if (!desc->bd) {
 		ret = -ENOMEM;
 		goto out;
 	}
 
-	sdma->channel_control[channel].base_bd_ptr = sdmac->bd_phys;
-	sdma->channel_control[channel].current_bd_ptr = sdmac->bd_phys;
+	sdma->channel_control[channel].base_bd_ptr = desc->bd_phys;
+	sdma->channel_control[channel].current_bd_ptr = desc->bd_phys;
 
 	sdma_set_channel_priority(sdmac, MXC_SDMA_DEFAULT_PRIORITY);
 	return 0;
@@ -1176,10 +1199,10 @@ static int sdma_alloc_chan_resources(struct dma_chan *chan)
 	if (ret)
 		goto disable_clk_ahb;
 
-	dma_async_tx_descriptor_init(&sdmac->desc, chan);
-	sdmac->desc.tx_submit = sdma_tx_submit;
+	dma_async_tx_descriptor_init(&sdmac->txdesc, chan);
+	sdmac->txdesc.tx_submit = sdma_tx_submit;
 	/* txd.flags will be overwritten in prep funcs */
-	sdmac->desc.flags = DMA_CTRL_ACK;
+	sdmac->txdesc.flags = DMA_CTRL_ACK;
 
 	return 0;
 
@@ -1194,6 +1217,7 @@ static void sdma_free_chan_resources(struct dma_chan *chan)
 {
 	struct sdma_channel *sdmac = to_sdma_chan(chan);
 	struct sdma_engine *sdma = sdmac->sdma;
+	struct sdma_desc *desc = sdmac->desc;
 
 	sdma_disable_channel(chan);
 
@@ -1207,7 +1231,7 @@ static void sdma_free_chan_resources(struct dma_chan *chan)
 
 	sdma_set_channel_priority(sdmac, 0);
 
-	dma_free_coherent(NULL, PAGE_SIZE, sdmac->bd, sdmac->bd_phys);
+	dma_free_coherent(NULL, PAGE_SIZE, desc->bd, desc->bd_phys);
 
 	clk_disable(sdma->clk_ipg);
 	clk_disable(sdma->clk_ahb);
@@ -1223,6 +1247,7 @@ static struct dma_async_tx_descriptor *sdma_prep_slave_sg(
 	int ret, i, count;
 	int channel = sdmac->channel;
 	struct scatterlist *sg;
+	struct sdma_desc *desc = sdmac->desc;
 
 	if (sdmac->status == DMA_IN_PROGRESS)
 		return NULL;
@@ -1230,9 +1255,9 @@ static struct dma_async_tx_descriptor *sdma_prep_slave_sg(
 
 	sdmac->flags = 0;
 
-	sdmac->buf_tail = 0;
-	sdmac->buf_ptail = 0;
-	sdmac->chn_real_count = 0;
+	desc->buf_tail = 0;
+	desc->buf_ptail = 0;
+	desc->chn_real_count = 0;
 
 	dev_dbg(sdma->dev, "setting up %d entries for channel %d.\n",
 			sg_len, channel);
@@ -1249,9 +1274,9 @@ static struct dma_async_tx_descriptor *sdma_prep_slave_sg(
 		goto err_out;
 	}
 
-	sdmac->chn_count = 0;
+	desc->chn_count = 0;
 	for_each_sg(sgl, sg, sg_len, i) {
-		struct sdma_buffer_descriptor *bd = &sdmac->bd[i];
+		struct sdma_buffer_descriptor *bd = &desc->bd[i];
 		int param;
 
 		bd->buffer_addr = sg->dma_address;
@@ -1266,7 +1291,7 @@ static struct dma_async_tx_descriptor *sdma_prep_slave_sg(
 		}
 
 		bd->mode.count = count;
-		sdmac->chn_count += count;
+		desc->chn_count += count;
 
 		if (sdmac->word_size > DMA_SLAVE_BUSWIDTH_4_BYTES) {
 			ret =  -EINVAL;
@@ -1307,10 +1332,10 @@ static struct dma_async_tx_descriptor *sdma_prep_slave_sg(
 		bd->mode.status = param;
 	}
 
-	sdmac->num_bd = sg_len;
-	sdma->channel_control[channel].current_bd_ptr = sdmac->bd_phys;
+	desc->num_bd = sg_len;
+	sdma->channel_control[channel].current_bd_ptr = desc->bd_phys;
 
-	return &sdmac->desc;
+	return &sdmac->txdesc;
 err_out:
 	sdmac->status = DMA_ERROR;
 	return NULL;
@@ -1326,6 +1351,7 @@ static struct dma_async_tx_descriptor *sdma_prep_dma_cyclic(
 	int num_periods = buf_len / period_len;
 	int channel = sdmac->channel;
 	int ret, i = 0, buf = 0;
+	struct sdma_desc *desc = sdmac->desc;
 
 	dev_dbg(sdma->dev, "%s channel: %d\n", __func__, channel);
 
@@ -1334,10 +1360,10 @@ static struct dma_async_tx_descriptor *sdma_prep_dma_cyclic(
 
 	sdmac->status = DMA_IN_PROGRESS;
 
-	sdmac->buf_tail = 0;
-	sdmac->buf_ptail = 0;
-	sdmac->chn_real_count = 0;
-	sdmac->period_len = period_len;
+	desc->buf_tail = 0;
+	desc->buf_ptail = 0;
+	desc->chn_real_count = 0;
+	desc->period_len = period_len;
 
 	sdmac->flags |= IMX_DMA_SG_LOOP;
 	sdmac->direction = direction;
@@ -1358,7 +1384,7 @@ static struct dma_async_tx_descriptor *sdma_prep_dma_cyclic(
 	}
 
 	while (buf < buf_len) {
-		struct sdma_buffer_descriptor *bd = &sdmac->bd[i];
+		struct sdma_buffer_descriptor *bd = &desc->bd[i];
 		int param;
 
 		bd->buffer_addr = dma_addr;
@@ -1389,10 +1415,10 @@ static struct dma_async_tx_descriptor *sdma_prep_dma_cyclic(
 		i++;
 	}
 
-	sdmac->num_bd = num_periods;
-	sdma->channel_control[channel].current_bd_ptr = sdmac->bd_phys;
+	desc->num_bd = num_periods;
+	sdma->channel_control[channel].current_bd_ptr = desc->bd_phys;
 
-	return &sdmac->desc;
+	return &sdmac->txdesc;
 err_out:
 	sdmac->status = DMA_ERROR;
 	return NULL;
@@ -1431,13 +1457,14 @@ static enum dma_status sdma_tx_status(struct dma_chan *chan,
 				      struct dma_tx_state *txstate)
 {
 	struct sdma_channel *sdmac = to_sdma_chan(chan);
+	struct sdma_desc *desc = sdmac->desc;
 	u32 residue;
 
 	if (sdmac->flags & IMX_DMA_SG_LOOP)
-		residue = (sdmac->num_bd - sdmac->buf_ptail) *
-			   sdmac->period_len - sdmac->chn_real_count;
+		residue = (desc->num_bd - desc->buf_ptail) *
+			   desc->period_len - desc->chn_real_count;
 	else
-		residue = sdmac->chn_count - sdmac->chn_real_count;
+		residue = desc->chn_count - desc->chn_real_count;
 
 	dma_set_tx_state(txstate, chan->completed_cookie, chan->cookie,
 			 residue);
@@ -1661,6 +1688,8 @@ static int sdma_init(struct sdma_engine *sdma)
 	if (ret)
 		goto err_dma_alloc;
 
+	sdma->bd0 = sdma->channel[0].desc->bd;
+
 	sdma_config_ownership(&sdma->channel[0], false, true, false);
 
 	/* Set Command Channel (Channel Zero) */
-- 
2.7.4

^ permalink raw reply related

* [PATCH v3 2/6] dmaengine: imx-sdma: add virt-dma support
From: Robin Gong @ 2018-06-11 14:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1528729173-28684-1-git-send-email-yibin.gong@nxp.com>

The legacy sdma driver has below limitations or drawbacks:
  1. Hardcode the max BDs number as "PAGE_SIZE / sizeof(*)", and alloc
     one page size for one channel regardless of only few BDs needed
     most time. But in few cases, the max PAGE_SIZE maybe not enough.
  2. One SDMA channel can't stop immediatley once channel disabled which
     means SDMA interrupt may come in after this channel terminated.There
     are some patches for this corner case such as commit "2746e2c389f9",
     but not cover non-cyclic.

The common virt-dma overcomes the above limitations. It can alloc bd
dynamically and free bd once this tx transfer done. No memory wasted or
maximum limititation here, only depends on how many memory can be requested
from kernel. For No.2, such issue can be workaround by checking if there
is available descript("sdmac->desc") now once the unwanted interrupt
coming. At last the common virt-dma is easier for sdma driver maintain.

Signed-off-by: Robin Gong <yibin.gong@nxp.com>
---
 drivers/dma/Kconfig    |   1 +
 drivers/dma/imx-sdma.c | 258 ++++++++++++++++++++++++++++++++-----------------
 2 files changed, 168 insertions(+), 91 deletions(-)

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 6d61cd0..78715a2 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -257,6 +257,7 @@ config IMX_SDMA
 	tristate "i.MX SDMA support"
 	depends on ARCH_MXC
 	select DMA_ENGINE
+	select DMA_VIRTUAL_CHANNELS
 	help
 	  Support the i.MX SDMA engine. This engine is integrated into
 	  Freescale i.MX25/31/35/51/53/6 chips.
diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index 556d087..474c105 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -48,6 +48,7 @@
 #include <linux/mfd/syscon/imx6q-iomuxc-gpr.h>
 
 #include "dmaengine.h"
+#include "virt-dma.h"
 
 /* SDMA registers */
 #define SDMA_H_C0PTR		0x000
@@ -308,6 +309,7 @@ struct sdma_engine;
  * @bd			pointer of alloced bd
  */
 struct sdma_desc {
+	struct virt_dma_desc	vd;
 	unsigned int		num_bd;
 	dma_addr_t		bd_phys;
 	unsigned int		buf_tail;
@@ -331,8 +333,8 @@ struct sdma_desc {
  * @word_size		peripheral access size
  */
 struct sdma_channel {
+	struct virt_dma_chan		vc;
 	struct sdma_desc		*desc;
-	struct sdma_desc		_desc;
 	struct sdma_engine		*sdma;
 	unsigned int			channel;
 	enum dma_transfer_direction		direction;
@@ -347,11 +349,8 @@ struct sdma_channel {
 	unsigned long			event_mask[2];
 	unsigned long			watermark_level;
 	u32				shp_addr, per_addr;
-	struct dma_chan			chan;
 	spinlock_t			lock;
-	struct dma_async_tx_descriptor	txdesc;
 	enum dma_status			status;
-	struct tasklet_struct		tasklet;
 	struct imx_dma_data		data;
 	bool				enabled;
 };
@@ -705,6 +704,35 @@ static void sdma_event_disable(struct sdma_channel *sdmac, unsigned int event)
 	writel_relaxed(val, sdma->regs + chnenbl);
 }
 
+static struct sdma_desc *to_sdma_desc(struct dma_async_tx_descriptor *t)
+{
+	return container_of(t, struct sdma_desc, vd.tx);
+}
+
+static void sdma_start_desc(struct sdma_channel *sdmac)
+{
+	struct virt_dma_desc *vd = vchan_next_desc(&sdmac->vc);
+	struct sdma_desc *desc;
+	struct sdma_engine *sdma = sdmac->sdma;
+	int channel = sdmac->channel;
+
+	if (!vd) {
+		sdmac->desc = NULL;
+		return;
+	}
+	sdmac->desc = desc = to_sdma_desc(&vd->tx);
+	/*
+	 * Do not delete the node in desc_issued list in cyclic mode, otherwise
+	 * the desc alloced will never be freed in vchan_dma_desc_free_list
+	 */
+	if (!(sdmac->flags & IMX_DMA_SG_LOOP))
+		list_del(&vd->node);
+
+	sdma->channel_control[channel].base_bd_ptr = desc->bd_phys;
+	sdma->channel_control[channel].current_bd_ptr = desc->bd_phys;
+	sdma_enable_channel(sdma, sdmac->channel);
+}
+
 static void sdma_update_channel_loop(struct sdma_channel *sdmac)
 {
 	struct sdma_buffer_descriptor *bd;
@@ -723,7 +751,7 @@ static void sdma_update_channel_loop(struct sdma_channel *sdmac)
 	 * loop mode. Iterate over descriptors, re-setup them and
 	 * call callback function.
 	 */
-	while (1) {
+	while (sdmac->desc) {
 		struct sdma_desc *desc = sdmac->desc;
 
 		bd = &desc->bd[desc->buf_tail];
@@ -755,14 +783,14 @@ static void sdma_update_channel_loop(struct sdma_channel *sdmac)
 		 * executed.
 		 */
 
-		dmaengine_desc_get_callback_invoke(&sdmac->txdesc, NULL);
+		dmaengine_desc_get_callback_invoke(&desc->vd.tx, NULL);
 
 		if (error)
 			sdmac->status = old_status;
 	}
 }
 
-static void mxc_sdma_handle_channel_normal(unsigned long data)
+static void mxc_sdma_handle_channel_normal(struct sdma_channel *data)
 {
 	struct sdma_channel *sdmac = (struct sdma_channel *) data;
 	struct sdma_buffer_descriptor *bd;
@@ -785,10 +813,6 @@ static void mxc_sdma_handle_channel_normal(unsigned long data)
 		sdmac->status = DMA_ERROR;
 	else
 		sdmac->status = DMA_COMPLETE;
-
-	dma_cookie_complete(&sdmac->txdesc);
-
-	dmaengine_desc_get_callback_invoke(&sdmac->txdesc, NULL);
 }
 
 static irqreturn_t sdma_int_handler(int irq, void *dev_id)
@@ -804,12 +828,21 @@ static irqreturn_t sdma_int_handler(int irq, void *dev_id)
 	while (stat) {
 		int channel = fls(stat) - 1;
 		struct sdma_channel *sdmac = &sdma->channel[channel];
+		struct sdma_desc *desc;
+
+		spin_lock(&sdmac->vc.lock);
+		desc = sdmac->desc;
+		if (desc) {
+			if (sdmac->flags & IMX_DMA_SG_LOOP) {
+				sdma_update_channel_loop(sdmac);
+			} else {
+				mxc_sdma_handle_channel_normal(sdmac);
+				vchan_cookie_complete(&desc->vd);
+				sdma_start_desc(sdmac);
+			}
+		}
 
-		if (sdmac->flags & IMX_DMA_SG_LOOP)
-			sdma_update_channel_loop(sdmac);
-		else
-			tasklet_schedule(&sdmac->tasklet);
-
+		spin_unlock(&sdmac->vc.lock);
 		__clear_bit(channel, &stat);
 	}
 
@@ -965,7 +998,7 @@ static int sdma_load_context(struct sdma_channel *sdmac)
 
 static struct sdma_channel *to_sdma_chan(struct dma_chan *chan)
 {
-	return container_of(chan, struct sdma_channel, chan);
+	return container_of(chan, struct sdma_channel, vc.chan);
 }
 
 static int sdma_disable_channel(struct dma_chan *chan)
@@ -987,7 +1020,16 @@ static int sdma_disable_channel(struct dma_chan *chan)
 
 static int sdma_disable_channel_with_delay(struct dma_chan *chan)
 {
+	struct sdma_channel *sdmac = to_sdma_chan(chan);
+	unsigned long flags;
+	LIST_HEAD(head);
+
 	sdma_disable_channel(chan);
+	spin_lock_irqsave(&sdmac->vc.lock, flags);
+	vchan_get_all_descriptors(&sdmac->vc, &head);
+	sdmac->desc = NULL;
+	spin_unlock_irqrestore(&sdmac->vc.lock, flags);
+	vchan_dma_desc_free_list(&sdmac->vc, &head);
 
 	/*
 	 * According to NXP R&D team a delay of one BD SDMA cost time
@@ -1116,46 +1158,56 @@ static int sdma_set_channel_priority(struct sdma_channel *sdmac,
 	return 0;
 }
 
-static int sdma_request_channel(struct sdma_channel *sdmac)
+static int sdma_request_channel0(struct sdma_engine *sdma)
 {
-	struct sdma_engine *sdma = sdmac->sdma;
-	struct sdma_desc *desc;
-	int channel = sdmac->channel;
 	int ret = -EBUSY;
 
-	sdmac->desc = &sdmac->_desc;
-	desc = sdmac->desc;
-
-	desc->bd = dma_zalloc_coherent(NULL, PAGE_SIZE, &desc->bd_phys,
+	sdma->bd0 = dma_zalloc_coherent(NULL, PAGE_SIZE, &sdma->bd0_phys,
 					GFP_KERNEL);
-	if (!desc->bd) {
+	if (!sdma->bd0) {
 		ret = -ENOMEM;
 		goto out;
 	}
 
-	sdma->channel_control[channel].base_bd_ptr = desc->bd_phys;
-	sdma->channel_control[channel].current_bd_ptr = desc->bd_phys;
+	sdma->channel_control[0].base_bd_ptr = sdma->bd0_phys;
+	sdma->channel_control[0].current_bd_ptr = sdma->bd0_phys;
 
-	sdma_set_channel_priority(sdmac, MXC_SDMA_DEFAULT_PRIORITY);
+	sdma_set_channel_priority(&sdma->channel[0], MXC_SDMA_DEFAULT_PRIORITY);
 	return 0;
 out:
 
 	return ret;
 }
 
-static dma_cookie_t sdma_tx_submit(struct dma_async_tx_descriptor *tx)
+
+static int sdma_alloc_bd(struct sdma_desc *desc)
 {
-	unsigned long flags;
-	struct sdma_channel *sdmac = to_sdma_chan(tx->chan);
-	dma_cookie_t cookie;
+	u32 bd_size = desc->num_bd * sizeof(struct sdma_buffer_descriptor);
+	int ret = 0;
 
-	spin_lock_irqsave(&sdmac->lock, flags);
+	desc->bd = dma_zalloc_coherent(NULL, bd_size, &desc->bd_phys,
+					GFP_ATOMIC);
+	if (!desc->bd) {
+		ret = -ENOMEM;
+		goto out;
+	}
+out:
+	return ret;
+}
 
-	cookie = dma_cookie_assign(tx);
+static void sdma_free_bd(struct sdma_desc *desc)
+{
+	u32 bd_size = desc->num_bd * sizeof(struct sdma_buffer_descriptor);
 
-	spin_unlock_irqrestore(&sdmac->lock, flags);
+	dma_free_coherent(NULL, bd_size, desc->bd, desc->bd_phys);
+}
 
-	return cookie;
+static void sdma_desc_free(struct virt_dma_desc *vd)
+{
+	struct sdma_desc *desc = container_of(vd, struct sdma_desc, vd);
+
+	sdma_free_bd(desc);
+	kfree(desc);
 }
 
 static int sdma_alloc_chan_resources(struct dma_chan *chan)
@@ -1191,19 +1243,10 @@ static int sdma_alloc_chan_resources(struct dma_chan *chan)
 	if (ret)
 		goto disable_clk_ipg;
 
-	ret = sdma_request_channel(sdmac);
-	if (ret)
-		goto disable_clk_ahb;
-
 	ret = sdma_set_channel_priority(sdmac, prio);
 	if (ret)
 		goto disable_clk_ahb;
 
-	dma_async_tx_descriptor_init(&sdmac->txdesc, chan);
-	sdmac->txdesc.tx_submit = sdma_tx_submit;
-	/* txd.flags will be overwritten in prep funcs */
-	sdmac->txdesc.flags = DMA_CTRL_ACK;
-
 	return 0;
 
 disable_clk_ahb:
@@ -1217,9 +1260,8 @@ static void sdma_free_chan_resources(struct dma_chan *chan)
 {
 	struct sdma_channel *sdmac = to_sdma_chan(chan);
 	struct sdma_engine *sdma = sdmac->sdma;
-	struct sdma_desc *desc = sdmac->desc;
 
-	sdma_disable_channel(chan);
+	sdma_disable_channel_with_delay(chan);
 
 	if (sdmac->event_id0)
 		sdma_event_disable(sdmac, sdmac->event_id0);
@@ -1231,8 +1273,6 @@ static void sdma_free_chan_resources(struct dma_chan *chan)
 
 	sdma_set_channel_priority(sdmac, 0);
 
-	dma_free_coherent(NULL, PAGE_SIZE, desc->bd, desc->bd_phys);
-
 	clk_disable(sdma->clk_ipg);
 	clk_disable(sdma->clk_ahb);
 }
@@ -1247,7 +1287,7 @@ static struct dma_async_tx_descriptor *sdma_prep_slave_sg(
 	int ret, i, count;
 	int channel = sdmac->channel;
 	struct scatterlist *sg;
-	struct sdma_desc *desc = sdmac->desc;
+	struct sdma_desc *desc;
 
 	if (sdmac->status == DMA_IN_PROGRESS)
 		return NULL;
@@ -1255,23 +1295,34 @@ static struct dma_async_tx_descriptor *sdma_prep_slave_sg(
 
 	sdmac->flags = 0;
 
+	desc = kzalloc((sizeof(*desc)), GFP_KERNEL);
+	if (!desc)
+		goto err_out;
+
 	desc->buf_tail = 0;
 	desc->buf_ptail = 0;
+	desc->sdmac = sdmac;
+	desc->num_bd = sg_len;
 	desc->chn_real_count = 0;
 
+	if (sdma_alloc_bd(desc)) {
+		kfree(desc);
+		goto err_out;
+	}
+
 	dev_dbg(sdma->dev, "setting up %d entries for channel %d.\n",
 			sg_len, channel);
 
 	sdmac->direction = direction;
 	ret = sdma_load_context(sdmac);
 	if (ret)
-		goto err_out;
+		goto err_bd_out;
 
 	if (sg_len > NUM_BD) {
 		dev_err(sdma->dev, "SDMA channel %d: maximum number of sg exceeded: %d > %d\n",
 				channel, sg_len, NUM_BD);
 		ret = -EINVAL;
-		goto err_out;
+		goto err_bd_out;
 	}
 
 	desc->chn_count = 0;
@@ -1287,7 +1338,7 @@ static struct dma_async_tx_descriptor *sdma_prep_slave_sg(
 			dev_err(sdma->dev, "SDMA channel %d: maximum bytes for sg entry exceeded: %d > %d\n",
 					channel, count, 0xffff);
 			ret = -EINVAL;
-			goto err_out;
+			goto err_bd_out;
 		}
 
 		bd->mode.count = count;
@@ -1295,25 +1346,25 @@ static struct dma_async_tx_descriptor *sdma_prep_slave_sg(
 
 		if (sdmac->word_size > DMA_SLAVE_BUSWIDTH_4_BYTES) {
 			ret =  -EINVAL;
-			goto err_out;
+			goto err_bd_out;
 		}
 
 		switch (sdmac->word_size) {
 		case DMA_SLAVE_BUSWIDTH_4_BYTES:
 			bd->mode.command = 0;
 			if (count & 3 || sg->dma_address & 3)
-				return NULL;
+				goto err_bd_out;
 			break;
 		case DMA_SLAVE_BUSWIDTH_2_BYTES:
 			bd->mode.command = 2;
 			if (count & 1 || sg->dma_address & 1)
-				return NULL;
+				goto err_bd_out;
 			break;
 		case DMA_SLAVE_BUSWIDTH_1_BYTE:
 			bd->mode.command = 1;
 			break;
 		default:
-			return NULL;
+			goto err_bd_out;
 		}
 
 		param = BD_DONE | BD_EXTD | BD_CONT;
@@ -1332,10 +1383,10 @@ static struct dma_async_tx_descriptor *sdma_prep_slave_sg(
 		bd->mode.status = param;
 	}
 
-	desc->num_bd = sg_len;
-	sdma->channel_control[channel].current_bd_ptr = desc->bd_phys;
-
-	return &sdmac->txdesc;
+	return vchan_tx_prep(&sdmac->vc, &desc->vd, flags);
+err_bd_out:
+	sdma_free_bd(desc);
+	kfree(desc);
 err_out:
 	sdmac->status = DMA_ERROR;
 	return NULL;
@@ -1351,7 +1402,7 @@ static struct dma_async_tx_descriptor *sdma_prep_dma_cyclic(
 	int num_periods = buf_len / period_len;
 	int channel = sdmac->channel;
 	int ret, i = 0, buf = 0;
-	struct sdma_desc *desc = sdmac->desc;
+	struct sdma_desc *desc;
 
 	dev_dbg(sdma->dev, "%s channel: %d\n", __func__, channel);
 
@@ -1360,27 +1411,39 @@ static struct dma_async_tx_descriptor *sdma_prep_dma_cyclic(
 
 	sdmac->status = DMA_IN_PROGRESS;
 
+	desc = kzalloc((sizeof(*desc)), GFP_KERNEL);
+	if (!desc)
+		goto err_out;
+
 	desc->buf_tail = 0;
 	desc->buf_ptail = 0;
+	desc->sdmac = sdmac;
+	desc->num_bd = num_periods;
 	desc->chn_real_count = 0;
 	desc->period_len = period_len;
 
 	sdmac->flags |= IMX_DMA_SG_LOOP;
 	sdmac->direction = direction;
+
+	if (sdma_alloc_bd(desc)) {
+		kfree(desc);
+		goto err_bd_out;
+	}
+
 	ret = sdma_load_context(sdmac);
 	if (ret)
-		goto err_out;
+		goto err_bd_out;
 
 	if (num_periods > NUM_BD) {
 		dev_err(sdma->dev, "SDMA channel %d: maximum number of sg exceeded: %d > %d\n",
 				channel, num_periods, NUM_BD);
-		goto err_out;
+		goto err_bd_out;
 	}
 
 	if (period_len > 0xffff) {
 		dev_err(sdma->dev, "SDMA channel %d: maximum period size exceeded: %zu > %d\n",
 				channel, period_len, 0xffff);
-		goto err_out;
+		goto err_bd_out;
 	}
 
 	while (buf < buf_len) {
@@ -1392,7 +1455,7 @@ static struct dma_async_tx_descriptor *sdma_prep_dma_cyclic(
 		bd->mode.count = period_len;
 
 		if (sdmac->word_size > DMA_SLAVE_BUSWIDTH_4_BYTES)
-			goto err_out;
+			goto err_bd_out;
 		if (sdmac->word_size == DMA_SLAVE_BUSWIDTH_4_BYTES)
 			bd->mode.command = 0;
 		else
@@ -1415,10 +1478,10 @@ static struct dma_async_tx_descriptor *sdma_prep_dma_cyclic(
 		i++;
 	}
 
-	desc->num_bd = num_periods;
-	sdma->channel_control[channel].current_bd_ptr = desc->bd_phys;
-
-	return &sdmac->txdesc;
+	return vchan_tx_prep(&sdmac->vc, &desc->vd, flags);
+err_bd_out:
+	sdma_free_bd(desc);
+	kfree(desc);
 err_out:
 	sdmac->status = DMA_ERROR;
 	return NULL;
@@ -1457,14 +1520,31 @@ static enum dma_status sdma_tx_status(struct dma_chan *chan,
 				      struct dma_tx_state *txstate)
 {
 	struct sdma_channel *sdmac = to_sdma_chan(chan);
-	struct sdma_desc *desc = sdmac->desc;
+	struct sdma_desc *desc;
 	u32 residue;
+	struct virt_dma_desc *vd;
+	enum dma_status ret;
+	unsigned long flags;
 
-	if (sdmac->flags & IMX_DMA_SG_LOOP)
-		residue = (desc->num_bd - desc->buf_ptail) *
-			   desc->period_len - desc->chn_real_count;
-	else
-		residue = desc->chn_count - desc->chn_real_count;
+	ret = dma_cookie_status(chan, cookie, txstate);
+	if (ret == DMA_COMPLETE || !txstate)
+		return ret;
+
+	spin_lock_irqsave(&sdmac->vc.lock, flags);
+	vd = vchan_find_desc(&sdmac->vc, cookie);
+	if (vd) {
+		desc = to_sdma_desc(&vd->tx);
+		if (sdmac->flags & IMX_DMA_SG_LOOP)
+			residue = (desc->num_bd - desc->buf_ptail) *
+				desc->period_len - desc->chn_real_count;
+		else
+			residue = desc->chn_count - desc->chn_real_count;
+	} else if (sdmac->desc && sdmac->desc->vd.tx.cookie == cookie) {
+		residue = sdmac->desc->chn_count - sdmac->desc->chn_real_count;
+	} else {
+		residue = 0;
+	}
+	spin_unlock_irqrestore(&sdmac->vc.lock, flags);
 
 	dma_set_tx_state(txstate, chan->completed_cookie, chan->cookie,
 			 residue);
@@ -1475,10 +1555,12 @@ static enum dma_status sdma_tx_status(struct dma_chan *chan,
 static void sdma_issue_pending(struct dma_chan *chan)
 {
 	struct sdma_channel *sdmac = to_sdma_chan(chan);
-	struct sdma_engine *sdma = sdmac->sdma;
+	unsigned long flags;
 
-	if (sdmac->status == DMA_IN_PROGRESS)
-		sdma_enable_channel(sdma, sdmac->channel);
+	spin_lock_irqsave(&sdmac->vc.lock, flags);
+	if (vchan_issue_pending(&sdmac->vc) && !sdmac->desc)
+		sdma_start_desc(sdmac);
+	spin_unlock_irqrestore(&sdmac->vc.lock, flags);
 }
 
 #define SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V1	34
@@ -1684,12 +1766,10 @@ static int sdma_init(struct sdma_engine *sdma)
 	for (i = 0; i < MAX_DMA_CHANNELS; i++)
 		writel_relaxed(0, sdma->regs + SDMA_CHNPRI_0 + i * 4);
 
-	ret = sdma_request_channel(&sdma->channel[0]);
+	ret = sdma_request_channel0(sdma);
 	if (ret)
 		goto err_dma_alloc;
 
-	sdma->bd0 = sdma->channel[0].desc->bd;
-
 	sdma_config_ownership(&sdma->channel[0], false, true, false);
 
 	/* Set Command Channel (Channel Zero) */
@@ -1850,20 +1930,15 @@ static int sdma_probe(struct platform_device *pdev)
 		sdmac->sdma = sdma;
 		spin_lock_init(&sdmac->lock);
 
-		sdmac->chan.device = &sdma->dma_device;
-		dma_cookie_init(&sdmac->chan);
 		sdmac->channel = i;
-
-		tasklet_init(&sdmac->tasklet, mxc_sdma_handle_channel_normal,
-			     (unsigned long) sdmac);
+		sdmac->vc.desc_free = sdma_desc_free;
 		/*
 		 * Add the channel to the DMAC list. Do not add channel 0 though
 		 * because we need it internally in the SDMA driver. This also means
 		 * that channel 0 in dmaengine counting matches sdma channel 1.
 		 */
 		if (i)
-			list_add_tail(&sdmac->chan.device_node,
-					&sdma->dma_device.channels);
+			vchan_init(&sdmac->vc, &sdma->dma_device);
 	}
 
 	ret = sdma_init(sdma);
@@ -1968,7 +2043,8 @@ static int sdma_remove(struct platform_device *pdev)
 	for (i = 0; i < MAX_DMA_CHANNELS; i++) {
 		struct sdma_channel *sdmac = &sdma->channel[i];
 
-		tasklet_kill(&sdmac->tasklet);
+		tasklet_kill(&sdmac->vc.task);
+		sdma_free_chan_resources(&sdmac->vc.chan);
 	}
 
 	platform_set_drvdata(pdev, NULL);
-- 
2.7.4

^ permalink raw reply related

* [PATCH v3 3/6] Revert "dmaengine: imx-sdma: fix pagefault when channel is disabled during interrupt"
From: Robin Gong @ 2018-06-11 14:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1528729173-28684-1-git-send-email-yibin.gong@nxp.com>

This reverts commit 2746e2c389f9d50043d21e2204270403efb9d62f.
Don't need this patch anymore,since we can easily check 'sdmac->desc' to avoid
handling dma interrupt after channel disabled if virt-dma used.

Signed-off-by: Robin Gong <yibin.gong@nxp.com>
---
 drivers/dma/imx-sdma.c | 21 ---------------------
 1 file changed, 21 deletions(-)

diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index 474c105..e0af8ee 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -352,7 +352,6 @@ struct sdma_channel {
 	spinlock_t			lock;
 	enum dma_status			status;
 	struct imx_dma_data		data;
-	bool				enabled;
 };
 
 #define IMX_DMA_SG_LOOP		BIT(0)
@@ -613,14 +612,7 @@ static int sdma_config_ownership(struct sdma_channel *sdmac,
 
 static void sdma_enable_channel(struct sdma_engine *sdma, int channel)
 {
-	unsigned long flags;
-	struct sdma_channel *sdmac = &sdma->channel[channel];
-
 	writel(BIT(channel), sdma->regs + SDMA_H_START);
-
-	spin_lock_irqsave(&sdmac->lock, flags);
-	sdmac->enabled = true;
-	spin_unlock_irqrestore(&sdmac->lock, flags);
 }
 
 /*
@@ -738,14 +730,6 @@ static void sdma_update_channel_loop(struct sdma_channel *sdmac)
 	struct sdma_buffer_descriptor *bd;
 	int error = 0;
 	enum dma_status	old_status = sdmac->status;
-	unsigned long flags;
-
-	spin_lock_irqsave(&sdmac->lock, flags);
-	if (!sdmac->enabled) {
-		spin_unlock_irqrestore(&sdmac->lock, flags);
-		return;
-	}
-	spin_unlock_irqrestore(&sdmac->lock, flags);
 
 	/*
 	 * loop mode. Iterate over descriptors, re-setup them and
@@ -1006,15 +990,10 @@ static int sdma_disable_channel(struct dma_chan *chan)
 	struct sdma_channel *sdmac = to_sdma_chan(chan);
 	struct sdma_engine *sdma = sdmac->sdma;
 	int channel = sdmac->channel;
-	unsigned long flags;
 
 	writel_relaxed(BIT(channel), sdma->regs + SDMA_H_STATSTOP);
 	sdmac->status = DMA_ERROR;
 
-	spin_lock_irqsave(&sdmac->lock, flags);
-	sdmac->enabled = false;
-	spin_unlock_irqrestore(&sdmac->lock, flags);
-
 	return 0;
 }
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH v3 4/6] dmaengine: imx-sdma: remove usless lock
From: Robin Gong @ 2018-06-11 14:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1528729173-28684-1-git-send-email-yibin.gong@nxp.com>

No need anymore for 'lock' now since virtual dma will provide
the common lock instead.

Signed-off-by: Robin Gong <yibin.gong@nxp.com>
---
 drivers/dma/imx-sdma.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index e0af8ee..f150b38 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -349,7 +349,6 @@ struct sdma_channel {
 	unsigned long			event_mask[2];
 	unsigned long			watermark_level;
 	u32				shp_addr, per_addr;
-	spinlock_t			lock;
 	enum dma_status			status;
 	struct imx_dma_data		data;
 };
@@ -1907,7 +1906,6 @@ static int sdma_probe(struct platform_device *pdev)
 		struct sdma_channel *sdmac = &sdma->channel[i];
 
 		sdmac->sdma = sdma;
-		spin_lock_init(&sdmac->lock);
 
 		sdmac->channel = i;
 		sdmac->vc.desc_free = sdma_desc_free;
-- 
2.7.4

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox