Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/5] KVM: arm64: Allow creating the PMU without the in-kernel GIC
From: Christoffer Dall @ 2017-05-04  8:38 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <db5a1fb7-e77e-b9f8-9be4-161e94279d10@arm.com>

On Thu, May 04, 2017 at 09:28:50AM +0100, Marc Zyngier wrote:
> On 04/05/17 09:13, Christoffer Dall wrote:
> > On Thu, May 04, 2017 at 09:09:47AM +0100, Marc Zyngier wrote:
> >> On 03/05/17 19:32, Christoffer Dall wrote:
> >>> Since we got support for devices in userspace which allows reporting the
> >>> PMU overflow output status to userspace, we should actually allow
> >>> creating the PMU on systems without an in-kernel irqchip, which in turn
> >>> requires us to slightly clarify error codes for the ABI and move things
> >>> around for the initialization phase.
> >>>
> >>> Signed-off-by: Christoffer Dall <cdall@linaro.org>
> >>> ---
> >>>  Documentation/virtual/kvm/devices/vcpu.txt | 16 +++++++++-------
> >>>  virt/kvm/arm/pmu.c                         | 27 +++++++++++++++++----------
> >>>  2 files changed, 26 insertions(+), 17 deletions(-)
> >>>
> >>> diff --git a/Documentation/virtual/kvm/devices/vcpu.txt b/Documentation/virtual/kvm/devices/vcpu.txt
> >>> index 02f5068..352af6e 100644
> >>> --- a/Documentation/virtual/kvm/devices/vcpu.txt
> >>> +++ b/Documentation/virtual/kvm/devices/vcpu.txt
> >>> @@ -16,7 +16,9 @@ Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a
> >>>  Returns: -EBUSY: The PMU overflow interrupt is already set
> >>>           -ENXIO: The overflow interrupt not set when attempting to get it
> >>>           -ENODEV: PMUv3 not supported
> >>> -         -EINVAL: Invalid PMU overflow interrupt number supplied
> >>> +         -EINVAL: Invalid PMU overflow interrupt number supplied or
> >>> +                  trying to set the IRQ number without using an in-kernel
> >>> +                  irqchip.
> >>>  
> >>>  A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
> >>>  number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt
> >>> @@ -25,11 +27,11 @@ all vcpus, while as an SPI it must be a separate number per vcpu.
> >>>  
> >>>  1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
> >>>  Parameters: no additional parameter in kvm_device_attr.addr
> >>> -Returns: -ENODEV: PMUv3 not supported
> >>> -         -ENXIO: PMUv3 not properly configured as required prior to calling this
> >>> -                 attribute
> >>> +Returns: -ENODEV: PMUv3 not supported or GIC not initialized
> >>> +         -ENXIO: PMUv3 not properly configured or in-kernel irqchip not
> >>> +                 conigured as required prior to calling this attribute
> >>>           -EBUSY: PMUv3 already initialized
> >>>  
> >>> -Request the initialization of the PMUv3.  This must be done after creating the
> >>> -in-kernel irqchip.  Creating a PMU with a userspace irqchip is currently not
> >>> -supported.
> >>> +Request the initialization of the PMUv3.  If using the PMUv3 with an in-kernel
> >>> +virtual GIC implementation, this must be done after initializing the in-kernel
> >>> +irqchip.
> >>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> >>> index 4b43e7f..f046b08 100644
> >>> --- a/virt/kvm/arm/pmu.c
> >>> +++ b/virt/kvm/arm/pmu.c
> >>> @@ -456,21 +456,25 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
> >>>  	if (!kvm_arm_support_pmu_v3())
> >>>  		return -ENODEV;
> >>>  
> >>> -	/*
> >>> -	 * We currently require an in-kernel VGIC to use the PMU emulation,
> >>> -	 * because we do not support forwarding PMU overflow interrupts to
> >>> -	 * userspace yet.
> >>> -	 */
> >>> -	if (!irqchip_in_kernel(vcpu->kvm) || !vgic_initialized(vcpu->kvm))
> >>> -		return -ENODEV;
> >>> -
> >>> -	if (!test_bit(KVM_ARM_VCPU_PMU_V3, vcpu->arch.features) ||
> >>> -	    !kvm_arm_pmu_irq_initialized(vcpu))
> >>> +	if (!test_bit(KVM_ARM_VCPU_PMU_V3, vcpu->arch.features))
> >>>  		return -ENXIO;
> >>>  
> >>>  	if (kvm_arm_pmu_v3_ready(vcpu))
> >>>  		return -EBUSY;
> >>>  
> >>> +	if (irqchip_in_kernel(vcpu->kvm)) {
> >>> +		/*
> >>> +		 * If using the PMU with an in-kernel virtual GIC
> >>> +		 * implementation, we require the GIC to be already
> >>> +		 * initialized when initializing the PMU.
> >>> +		 */
> >>> +		if (!vgic_initialized(vcpu->kvm))
> >>> +			return -ENODEV;
> >>> +
> >>> +		if (!kvm_arm_pmu_irq_initialized(vcpu))
> >>> +			return -ENXIO;
> >>> +	}
> >>> +
> >>>  	kvm_pmu_vcpu_reset(vcpu);
> >>>  	vcpu->arch.pmu.ready = true;
> >>>  
> >>> @@ -512,6 +516,9 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> >>>  		int __user *uaddr = (int __user *)(long)attr->addr;
> >>>  		int irq;
> >>>  
> >>> +		if (!irqchip_in_kernel(vcpu->kvm))
> >>> +			return -EINVAL;
> >>> +
> >>
> >> Shouldn't we fail the same way for {get,has}_attr? get_attr is going to
> >> generate a -ENXIO, and has_attr is going to lie about the availability
> >> of KVM_ARM_VCPU_PMU_V3_IRQ...
> >>
> > 
> > Here's the text from api.txt:
> > 
> >   Tests whether a device supports a particular attribute.  A successful
> >   return indicates the attribute is implemented.  It does not necessarily
> >   indicate that the attribute can be read or written in the device's
> >   current state.  "addr" is ignored.
> > 
> > My interpretation therefore is that QEMU can use this ioctl to figure
> > out if the feature is supported (sort of like a capability), but that
> > doesn't mean that the configuration of the VM is such that the attribute
> > can be get or set at that moment.
> > 
> > For example, there will also alway be situations where you can get an
> > attr, but not set an attr, what should the has_attr return then?
> 
> My issue here is that whether we can get/set the interrupt or not is not
> a function of the device itself, but of the way it is "wired". No matter
> what "the device's current state" is, we'll never be able to get/set the
> interrupt.
> 
> I'd tend to err on the side of caution and return something that is
> unambiguous, be maybe I have too strict an interpretation of the API.
> 

Hmm, I see the has_attr as a method for userspace to discover "does this
kernel have this feature".  If we make has_attr return a value depending
on the VM having an in-kernel gic or not, we implicitly require
userspace to create a VM with an in-kernel GIC to discover if this
kernel has that feature, and therefore also impose an ordering
requirement of figuring out the capabilities of the kernel (i.e. create
the GIC before checking this API).

I think QEMU should be able to do:

  if (has_attr()) {
     kvm_supports_set_timer_irq = true;
     vtimer_irq = foo;
  } else {
     kvm_supports_set_timer_irq = false;
     vtimer_irq = 27; /* default, we're stuck with it */
  }

  create_board_definition();
  create_dt();
  create_acpi();

  /* do whatever */

  if (kvm_supports_set_timer_irq && kvm_irqchip_in_kernel()) {
  	kvm_arm_timer_set_irq(...);
  }

And all this should not be coupled to when we create the irqchip device.

But I may be failing to see the case where the current implementation
creates a problem for userspace, in which case we should document the
ordering requirement.

Note: that I don't think it's expected that has_attr implies get_attr
and set_attr succeed.  For example, has_attr is currently typically
called with a null pointer to the addr field.

Thanks,
-Christoffer

^ permalink raw reply

* [PATCH v5 19/22] KVM: arm64: vgic-its: ITT save and restore
From: Auger Eric @ 2017-05-04  8:44 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170504082339.GG5923@cbox>

Hi,
On 04/05/2017 10:23, Christoffer Dall wrote:
> On Thu, May 04, 2017 at 09:40:35AM +0200, Auger Eric wrote:
>> Hi Christoffer,
>>
>> On 04/05/2017 09:31, Christoffer Dall wrote:
>>> On Wed, May 03, 2017 at 11:55:34PM +0200, Auger Eric wrote:
>>>> Hi Christoffer,
>>>>
>>>> On 03/05/2017 18:37, Christoffer Dall wrote:
>>>>> On Wed, May 03, 2017 at 06:08:58PM +0200, Auger Eric wrote:
>>>>>> Hi Christoffer,
>>>>>>
>>>>>> On 30/04/2017 22:14, Christoffer Dall wrote:
>>>>>>> On Fri, Apr 14, 2017 at 12:15:31PM +0200, Eric Auger wrote:
>>>>>>>> Introduce routines to save and restore device ITT and their
>>>>>>>> interrupt table entries (ITE).
>>>>>>>>
>>>>>>>> The routines will be called on device table save and
>>>>>>>> restore. They will become static in subsequent patches.
>>>>>>>
>>>>>>> Why this bottom-up approach?  Couldn't you start by having the patch
>>>>>>> that restores the device table and define the static functions that
>>>>>>> return an error there
>>>>>> done
>>>>>> , and then fill them in with subsequent patches
>>>>>>> (liek this one)?
>>>>>>>
>>>>>>> That would have the added benefit of being able to tell how things are
>>>>>>> designed to be called.
>>>>>>>
>>>>>>>>
>>>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>>>>
>>>>>>>> ---
>>>>>>>> v4 -> v5:
>>>>>>>> - ITE are now sorted by eventid on the flush
>>>>>>>> - rename *flush* into *save*
>>>>>>>> - use macros for shits and masks
>>>>>>>> - pass ite_esz to vgic_its_save_ite
>>>>>>>>
>>>>>>>> v3 -> v4:
>>>>>>>> - lookup_table and compute_next_eventid_offset become static in this
>>>>>>>>   patch
>>>>>>>> - remove static along with vgic_its_flush/restore_itt to avoid
>>>>>>>>   compilation warnings
>>>>>>>> - next field only computed with a shift (mask removed)
>>>>>>>> - handle the case where the last element has not been found
>>>>>>>>
>>>>>>>> v2 -> v3:
>>>>>>>> - add return 0 in vgic_its_restore_ite (was in subsequent patch)
>>>>>>>>
>>>>>>>> v2: creation
>>>>>>>> ---
>>>>>>>>  virt/kvm/arm/vgic/vgic-its.c | 128 ++++++++++++++++++++++++++++++++++++++++++-
>>>>>>>>  virt/kvm/arm/vgic/vgic.h     |   4 ++
>>>>>>>>  2 files changed, 129 insertions(+), 3 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>>>>>>>> index 35b2ca1..b02fc3f 100644
>>>>>>>> --- a/virt/kvm/arm/vgic/vgic-its.c
>>>>>>>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>>>>>>>> @@ -23,6 +23,7 @@
>>>>>>>>  #include <linux/interrupt.h>
>>>>>>>>  #include <linux/list.h>
>>>>>>>>  #include <linux/uaccess.h>
>>>>>>>> +#include <linux/list_sort.h>
>>>>>>>>  
>>>>>>>>  #include <linux/irqchip/arm-gic-v3.h>
>>>>>>>>  
>>>>>>>> @@ -1695,7 +1696,7 @@ u32 compute_next_devid_offset(struct list_head *h, struct its_device *dev)
>>>>>>>>  	return min_t(u32, next_offset, VITS_DTE_MAX_DEVID_OFFSET);
>>>>>>>>  }
>>>>>>>>  
>>>>>>>> -u32 compute_next_eventid_offset(struct list_head *h, struct its_ite *ite)
>>>>>>>> +static u32 compute_next_eventid_offset(struct list_head *h, struct its_ite *ite)
>>>>>>>>  {
>>>>>>>>  	struct list_head *e = &ite->ite_list;
>>>>>>>>  	struct its_ite *next;
>>>>>>>> @@ -1737,8 +1738,8 @@ typedef int (*entry_fn_t)(struct vgic_its *its, u32 id, void *entry,
>>>>>>>>   *
>>>>>>>>   * Return: < 0 on error, 1 if last element identified, 0 otherwise
>>>>>>>>   */
>>>>>>>> -int lookup_table(struct vgic_its *its, gpa_t base, int size, int esz,
>>>>>>>> -		 int start_id, entry_fn_t fn, void *opaque)
>>>>>>>> +static int lookup_table(struct vgic_its *its, gpa_t base, int size, int esz,
>>>>>>>> +			int start_id, entry_fn_t fn, void *opaque)
>>>>>>>>  {
>>>>>>>>  	void *entry = kzalloc(esz, GFP_KERNEL);
>>>>>>>>  	struct kvm *kvm = its->dev->kvm;
>>>>>>>> @@ -1773,6 +1774,127 @@ int lookup_table(struct vgic_its *its, gpa_t base, int size, int esz,
>>>>>>>>  }
>>>>>>>>  
>>>>>>>>  /**
>>>>>>>> + * vgic_its_save_ite - Save an interrupt translation entry at @gpa
>>>>>>>> + */
>>>>>>>> +static int vgic_its_save_ite(struct vgic_its *its, struct its_device *dev,
>>>>>>>> +			      struct its_ite *ite, gpa_t gpa, int ite_esz)
>>>>>>>> +{
>>>>>>>> +	struct kvm *kvm = its->dev->kvm;
>>>>>>>> +	u32 next_offset;
>>>>>>>> +	u64 val;
>>>>>>>> +
>>>>>>>> +	next_offset = compute_next_eventid_offset(&dev->itt_head, ite);
>>>>>>>> +	val = ((u64)next_offset << KVM_ITS_ITE_NEXT_SHIFT) |
>>>>>>>> +	       ((u64)ite->lpi << KVM_ITS_ITE_PINTID_SHIFT) |
>>>>>>>> +		ite->collection->collection_id;
>>>>>>>> +	val = cpu_to_le64(val);
>>>>>>>> +	return kvm_write_guest(kvm, gpa, &val, ite_esz);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * vgic_its_restore_ite - restore an interrupt translation entry
>>>>>>>> + * @event_id: id used for indexing
>>>>>>>> + * @ptr: pointer to the ITE entry
>>>>>>>> + * @opaque: pointer to the its_device
>>>>>>>> + * @next: id offset to the next entry
>>>>>>>> + */
>>>>>>>> +static int vgic_its_restore_ite(struct vgic_its *its, u32 event_id,
>>>>>>>> +				void *ptr, void *opaque, u32 *next)
>>>>>>>> +{
>>>>>>>> +	struct its_device *dev = (struct its_device *)opaque;
>>>>>>>> +	struct its_collection *collection;
>>>>>>>> +	struct kvm *kvm = its->dev->kvm;
>>>>>>>> +	u64 val, *p = (u64 *)ptr;
>>>>>>>
>>>>>>> nit: initializations on separate line (and possible do that just above
>>>>>>> assigning val).
>>>>>> done
>>>>>>>
>>>>>>>> +	struct vgic_irq *irq;
>>>>>>>> +	u32 coll_id, lpi_id;
>>>>>>>> +	struct its_ite *ite;
>>>>>>>> +	int ret;
>>>>>>>> +
>>>>>>>> +	val = *p;
>>>>>>>> +	*next = 1;
>>>>>>>> +
>>>>>>>> +	val = le64_to_cpu(val);
>>>>>>>> +
>>>>>>>> +	coll_id = val & KVM_ITS_ITE_ICID_MASK;
>>>>>>>> +	lpi_id = (val & KVM_ITS_ITE_PINTID_MASK) >> KVM_ITS_ITE_PINTID_SHIFT;
>>>>>>>> +
>>>>>>>> +	if (!lpi_id)
>>>>>>>> +		return 0;
>>>>>>>
>>>>>>> are all non-zero LPI IDs valid?  Don't we have a wrapper that tests if
>>>>>>> the ID is valid?
>>>>>> no, lpi_id must be >= GIC_MIN_LPI=8192; added that check.
>>>>>> ABI Doc says lpi_id==0 is interpreted as invalid. Other values <
>>>>>> GIC_MIN_LPI cause an -EINVAL error
>>>>>>>
>>>>>>> (looks like it's possible to add LPIs with the INTID range of SPIs, SGIs
>>>>>>> and PPIs here)
>>>>>>
>>>>>>>
>>>>>>>> +
>>>>>>>> +	*next = val >> KVM_ITS_ITE_NEXT_SHIFT;
>>>>>>>
>>>>>>> Don't we need to validate this somehow since it will presumably be used
>>>>>>> to forward a pointer somehow by the caller?
>>>>>> checked against max number of eventids supported by the device
>>>>>>>
>>>>>>>> +
>>>>>>>> +	collection = find_collection(its, coll_id);
>>>>>>>> +	if (!collection)
>>>>>>>> +		return -EINVAL;
>>>>>>>> +
>>>>>>>> +	ret = vgic_its_alloc_ite(dev, &ite, collection,
>>>>>>>> +				  lpi_id, event_id);
>>>>>>>> +	if (ret)
>>>>>>>> +		return ret;
>>>>>>>> +
>>>>>>>> +	irq = vgic_add_lpi(kvm, lpi_id);
>>>>>>>> +	if (IS_ERR(irq))
>>>>>>>> +		return PTR_ERR(irq);
>>>>>>>> +	ite->irq = irq;
>>>>>>>> +
>>>>>>>> +	/* restore the configuration of the LPI */
>>>>>>>> +	ret = update_lpi_config(kvm, irq, NULL);
>>>>>>>> +	if (ret)
>>>>>>>> +		return ret;
>>>>>>>> +
>>>>>>>> +	update_affinity_ite(kvm, ite);
>>>>>>>> +	return 0;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static int vgic_its_ite_cmp(void *priv, struct list_head *a,
>>>>>>>> +			    struct list_head *b)
>>>>>>>> +{
>>>>>>>> +	struct its_ite *itea = container_of(a, struct its_ite, ite_list);
>>>>>>>> +	struct its_ite *iteb = container_of(b, struct its_ite, ite_list);
>>>>>>>> +
>>>>>>>> +	if (itea->event_id < iteb->event_id)
>>>>>>>> +		return -1;
>>>>>>>> +	else
>>>>>>>> +		return 1;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +int vgic_its_save_itt(struct vgic_its *its, struct its_device *device)
>>>>>>>> +{
>>>>>>>> +	const struct vgic_its_abi *abi = vgic_its_get_abi(its);
>>>>>>>> +	gpa_t base = device->itt_addr;
>>>>>>>> +	struct its_ite *ite;
>>>>>>>> +	int ret, ite_esz = abi->ite_esz;
>>>>>>>
>>>>>>> nit: initializations on separate line
>>>>>> OK
>>>>>>>
>>>>>>>> +
>>>>>>>> +	list_sort(NULL, &device->itt_head, vgic_its_ite_cmp);
>>>>>>>> +
>>>>>>>> +	list_for_each_entry(ite, &device->itt_head, ite_list) {
>>>>>>>> +		gpa_t gpa = base + ite->event_id * ite_esz;
>>>>>>>> +
>>>>>>>> +		ret = vgic_its_save_ite(its, device, ite, gpa, ite_esz);
>>>>>>>> +		if (ret)
>>>>>>>> +			return ret;
>>>>>>>> +	}
>>>>>>>> +	return 0;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +int vgic_its_restore_itt(struct vgic_its *its, struct its_device *dev)
>>>>>>>> +{
>>>>>>>> +	const struct vgic_its_abi *abi = vgic_its_get_abi(its);
>>>>>>>> +	gpa_t base = dev->itt_addr;
>>>>>>>> +	int ret, ite_esz = abi->ite_esz;
>>>>>>>> +	size_t max_size = BIT_ULL(dev->nb_eventid_bits) * ite_esz;
>>>>>>>
>>>>>>> nit: initializations on separate line
>>>>>> OK
>>>>>>>
>>>>>>>> +
>>>>>>>> +	ret =  lookup_table(its, base, max_size, ite_esz, 0,
>>>>>>>> +			    vgic_its_restore_ite, dev);
>>>>>>>
>>>>>>> nit: extra white space
>>>>>>>
>>>>>>>> +
>>>>>>>> +	if (ret < 0)
>>>>>>>> +		return ret;
>>>>>>>> +
>>>>>>>> +	/* if the last element has not been found we are in trouble */
>>>>>>>> +	return ret ? 0 : -EINVAL;
>>>>>>>
>>>>>>> hmm, these are values potentially created by the guest in guest RAM,
>>>>>>> right?  So do we really abort migration and return an error to userspace
>>>>>>> in this case?
>>>>>> So we discussed with Peter/dave we shouldn't abort() in qemu in case of
>>>>>> such error. The restore table IOCTL will return an error. Up to qemu to
>>>>>> print the error. Destination guest will not be functional though.
>>>>>>
>>>>>
>>>>> ok, I'm just wondering if userspace can make a qualified decision based
>>>>> on this error code.  EINVAL typically means that userspace provided
>>>>> something incorrect, which I suppose in a sense is true, but this should
>>>>> be the only case where we return EINVAL here.
>>>>   Userspace must be able to
>>>>> tell the cases apart where the guest programmed bogus into memory before
>>>>> migration started, in which case we should ignore-and-resume, and where
>>>>> QEMU errornously provide some bogus value where the machine state
>>>>> becomes unreliable and must be powered down.
>>>> guest does not feed much besides few registers the ITS table restore
>>>> depends on. In case we want a more subtle error management at userspace
>>>> level all the error codes need to be revisited I am afraid. My plan was
>>>> to be more rough at the beginning and ignore & resume if ITS table
>>>> restore fails.
>>>>
>>>
>>> Do we require that the VM is quiesced the entire time between saving the
>>> ITS state to memory and copying all memory over the wire and capturing
>>> all register state?  If so, then an error to restore would be because of
>>> userspace doing something wrong and handling that accordingly is fine.
>>
>> yes the ITS table save into RAM starts when we have a guarantee that all
>> the VCPUS are stopped (we take all locks). 
> 
> The important bit is whether or not userspace is allowed to start any
> VCPUs again before copying over all RAM etc.  I suppose not.
no it is not meant to happen.
> 
>> The restore happens before
>> the VM gets resumed. At least this is the QEMU integration as of today.
>>
> 
> Does our ABI mandate this behavior (document it somewhere) ?
I will add this

Thanks

Eric
> 
> Thanks,
> -Christoffer
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

^ permalink raw reply

* [PATCH 1/5] KVM: arm64: Allow creating the PMU without the in-kernel GIC
From: Marc Zyngier @ 2017-05-04  9:05 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170504083845.GH5923@cbox>

On 04/05/17 09:38, Christoffer Dall wrote:
> On Thu, May 04, 2017 at 09:28:50AM +0100, Marc Zyngier wrote:
>> On 04/05/17 09:13, Christoffer Dall wrote:
>>> On Thu, May 04, 2017 at 09:09:47AM +0100, Marc Zyngier wrote:
>>>> On 03/05/17 19:32, Christoffer Dall wrote:
>>>>> Since we got support for devices in userspace which allows reporting the
>>>>> PMU overflow output status to userspace, we should actually allow
>>>>> creating the PMU on systems without an in-kernel irqchip, which in turn
>>>>> requires us to slightly clarify error codes for the ABI and move things
>>>>> around for the initialization phase.
>>>>>
>>>>> Signed-off-by: Christoffer Dall <cdall@linaro.org>
>>>>> ---
>>>>>  Documentation/virtual/kvm/devices/vcpu.txt | 16 +++++++++-------
>>>>>  virt/kvm/arm/pmu.c                         | 27 +++++++++++++++++----------
>>>>>  2 files changed, 26 insertions(+), 17 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/virtual/kvm/devices/vcpu.txt b/Documentation/virtual/kvm/devices/vcpu.txt
>>>>> index 02f5068..352af6e 100644
>>>>> --- a/Documentation/virtual/kvm/devices/vcpu.txt
>>>>> +++ b/Documentation/virtual/kvm/devices/vcpu.txt
>>>>> @@ -16,7 +16,9 @@ Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a
>>>>>  Returns: -EBUSY: The PMU overflow interrupt is already set
>>>>>           -ENXIO: The overflow interrupt not set when attempting to get it
>>>>>           -ENODEV: PMUv3 not supported
>>>>> -         -EINVAL: Invalid PMU overflow interrupt number supplied
>>>>> +         -EINVAL: Invalid PMU overflow interrupt number supplied or
>>>>> +                  trying to set the IRQ number without using an in-kernel
>>>>> +                  irqchip.
>>>>>  
>>>>>  A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
>>>>>  number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt
>>>>> @@ -25,11 +27,11 @@ all vcpus, while as an SPI it must be a separate number per vcpu.
>>>>>  
>>>>>  1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
>>>>>  Parameters: no additional parameter in kvm_device_attr.addr
>>>>> -Returns: -ENODEV: PMUv3 not supported
>>>>> -         -ENXIO: PMUv3 not properly configured as required prior to calling this
>>>>> -                 attribute
>>>>> +Returns: -ENODEV: PMUv3 not supported or GIC not initialized
>>>>> +         -ENXIO: PMUv3 not properly configured or in-kernel irqchip not
>>>>> +                 conigured as required prior to calling this attribute
>>>>>           -EBUSY: PMUv3 already initialized
>>>>>  
>>>>> -Request the initialization of the PMUv3.  This must be done after creating the
>>>>> -in-kernel irqchip.  Creating a PMU with a userspace irqchip is currently not
>>>>> -supported.
>>>>> +Request the initialization of the PMUv3.  If using the PMUv3 with an in-kernel
>>>>> +virtual GIC implementation, this must be done after initializing the in-kernel
>>>>> +irqchip.
>>>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>>>> index 4b43e7f..f046b08 100644
>>>>> --- a/virt/kvm/arm/pmu.c
>>>>> +++ b/virt/kvm/arm/pmu.c
>>>>> @@ -456,21 +456,25 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
>>>>>  	if (!kvm_arm_support_pmu_v3())
>>>>>  		return -ENODEV;
>>>>>  
>>>>> -	/*
>>>>> -	 * We currently require an in-kernel VGIC to use the PMU emulation,
>>>>> -	 * because we do not support forwarding PMU overflow interrupts to
>>>>> -	 * userspace yet.
>>>>> -	 */
>>>>> -	if (!irqchip_in_kernel(vcpu->kvm) || !vgic_initialized(vcpu->kvm))
>>>>> -		return -ENODEV;
>>>>> -
>>>>> -	if (!test_bit(KVM_ARM_VCPU_PMU_V3, vcpu->arch.features) ||
>>>>> -	    !kvm_arm_pmu_irq_initialized(vcpu))
>>>>> +	if (!test_bit(KVM_ARM_VCPU_PMU_V3, vcpu->arch.features))
>>>>>  		return -ENXIO;
>>>>>  
>>>>>  	if (kvm_arm_pmu_v3_ready(vcpu))
>>>>>  		return -EBUSY;
>>>>>  
>>>>> +	if (irqchip_in_kernel(vcpu->kvm)) {
>>>>> +		/*
>>>>> +		 * If using the PMU with an in-kernel virtual GIC
>>>>> +		 * implementation, we require the GIC to be already
>>>>> +		 * initialized when initializing the PMU.
>>>>> +		 */
>>>>> +		if (!vgic_initialized(vcpu->kvm))
>>>>> +			return -ENODEV;
>>>>> +
>>>>> +		if (!kvm_arm_pmu_irq_initialized(vcpu))
>>>>> +			return -ENXIO;
>>>>> +	}
>>>>> +
>>>>>  	kvm_pmu_vcpu_reset(vcpu);
>>>>>  	vcpu->arch.pmu.ready = true;
>>>>>  
>>>>> @@ -512,6 +516,9 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>>>>>  		int __user *uaddr = (int __user *)(long)attr->addr;
>>>>>  		int irq;
>>>>>  
>>>>> +		if (!irqchip_in_kernel(vcpu->kvm))
>>>>> +			return -EINVAL;
>>>>> +
>>>>
>>>> Shouldn't we fail the same way for {get,has}_attr? get_attr is going to
>>>> generate a -ENXIO, and has_attr is going to lie about the availability
>>>> of KVM_ARM_VCPU_PMU_V3_IRQ...
>>>>
>>>
>>> Here's the text from api.txt:
>>>
>>>   Tests whether a device supports a particular attribute.  A successful
>>>   return indicates the attribute is implemented.  It does not necessarily
>>>   indicate that the attribute can be read or written in the device's
>>>   current state.  "addr" is ignored.
>>>
>>> My interpretation therefore is that QEMU can use this ioctl to figure
>>> out if the feature is supported (sort of like a capability), but that
>>> doesn't mean that the configuration of the VM is such that the attribute
>>> can be get or set at that moment.
>>>
>>> For example, there will also alway be situations where you can get an
>>> attr, but not set an attr, what should the has_attr return then?
>>
>> My issue here is that whether we can get/set the interrupt or not is not
>> a function of the device itself, but of the way it is "wired". No matter
>> what "the device's current state" is, we'll never be able to get/set the
>> interrupt.
>>
>> I'd tend to err on the side of caution and return something that is
>> unambiguous, be maybe I have too strict an interpretation of the API.
>>
> 
> Hmm, I see the has_attr as a method for userspace to discover "does this
> kernel have this feature".  If we make has_attr return a value depending
> on the VM having an in-kernel gic or not, we implicitly require
> userspace to create a VM with an in-kernel GIC to discover if this
> kernel has that feature, and therefore also impose an ordering
> requirement of figuring out the capabilities of the kernel (i.e. create
> the GIC before checking this API).
> 
> I think QEMU should be able to do:
> 
>   if (has_attr()) {
>      kvm_supports_set_timer_irq = true;
>      vtimer_irq = foo;
>   } else {
>      kvm_supports_set_timer_irq = false;
>      vtimer_irq = 27; /* default, we're stuck with it */
>   }
> 
>   create_board_definition();
>   create_dt();
>   create_acpi();
> 
>   /* do whatever */
> 
>   if (kvm_supports_set_timer_irq && kvm_irqchip_in_kernel()) {
>   	kvm_arm_timer_set_irq(...);
>   }
> 
> And all this should not be coupled to when we create the irqchip device.
> 
> But I may be failing to see the case where the current implementation
> creates a problem for userspace, in which case we should document the
> ordering requirement.

I'm not sure it would create any problem, at least not for the PMU
(there is no working code that would have created a PMU without an irqchip).

It is just that we have a slightly diverging interpretation of what
has_attr means. You see it as "attribute that the device supports", and
I see it as "attribute that the device supports in this configuration".
I'm happy to use your semantics from now on.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* [PATCH] arch: arm64: Define big endian of IFC for LS1043a/LS1046a
From: Prabhakar Kushwaha @ 2017-05-04  9:10 UTC (permalink / raw)
  To: linux-arm-kernel

Integrated flash controller present in LS1043A and LS1046A is big endian.

So add big endian property in the devive tree.

Signed-off-by: Prabhakar Kushwaha <prabhakar.kushwaha@nxp.com>
---
 arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi | 1 +
 arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
index 45cface..09f1775 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
@@ -223,6 +223,7 @@
 		ifc: ifc at 1530000 {
 			compatible = "fsl,ifc", "simple-bus";
 			reg = <0x0 0x1530000 0x0 0x10000>;
+			big-endian;
 			interrupts = <0 43 0x4>;
 		};
 
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi
index f4b8b7e..509e6c2 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi
@@ -190,6 +190,7 @@
 		ifc: ifc at 1530000 {
 			compatible = "fsl,ifc", "simple-bus";
 			reg = <0x0 0x1530000 0x0 0x10000>;
+			big-endian;
 			interrupts = <GIC_SPI 43 IRQ_TYPE_LEVEL_HIGH>;
 		};
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH] arm: arm64: Add flash node for ls1088a qds and rdb
From: Prabhakar Kushwaha @ 2017-05-04  9:10 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493889032-20857-1-git-send-email-prabhakar.kushwaha@nxp.com>

LS1088AQDS consist of NOR, NAND and FPGA connected over IFC
LS1088ARDB consist of NAND and FPGA connected over IFC.

So add flash information in ifc node of device tree.

Signed-off-by: Prabhakar Kushwaha <prabhakar.kushwaha@nxp.com>
---
 arch/arm64/boot/dts/freescale/fsl-ls1088a-qds.dts | 27 +++++++++++++++++++++++
 arch/arm64/boot/dts/freescale/fsl-ls1088a-rdb.dts | 20 +++++++++++++++++
 arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi    |  4 ----
 3 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a-qds.dts b/arch/arm64/boot/dts/freescale/fsl-ls1088a-qds.dts
index 8c3cae5..ca7eb41 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1088a-qds.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a-qds.dts
@@ -53,6 +53,33 @@
 	compatible = "fsl,ls1088a-qds", "fsl,ls1088a";
 };
 
+&ifc {
+	#address-cells = <2>;
+	#size-cells = <1>;
+	status = "okay";
+
+	ranges = <0 0 0x5 0x80000000 0x08000000
+		  2 0 0x5 0x30000000 0x00010000
+		  3 0 0x5 0x20000000 0x00010000>;
+
+	nor at 0,0 {
+		compatible = "cfi-flash";
+		reg = <0x0 0x0 0x8000000>;
+		bank-width = <2>;
+		device-width = <1>;
+	};
+
+	nand at 2,0 {
+		compatible = "fsl,ifc-nand";
+		reg = <0x2 0x0 0x10000>;
+	};
+
+	fpga: board-control at 3,0 {
+		compatible = "fsl,ls1088aqds-fpga", "fsl,fpga-qixis";
+		reg = <0x3 0x0 0x0000100>;
+	};
+};
+
 &i2c0 {
 	status = "okay";
 
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a-rdb.dts b/arch/arm64/boot/dts/freescale/fsl-ls1088a-rdb.dts
index 8a04fbb..726e167 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1088a-rdb.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a-rdb.dts
@@ -53,6 +53,26 @@
 	compatible = "fsl,ls1088a-rdb", "fsl,ls1088a";
 };
 
+&ifc {
+	#address-cells = <2>;
+	#size-cells = <1>;
+	status = "okay";
+
+	ranges = <0 0 0x5 0x30000000 0x00010000
+		  2 0 0x5 0x20000000 0x00010000>;
+
+	nand at 0,0 {
+		compatible = "fsl,ifc-nand";
+		reg = <0x0 0x0 0x10000>;
+	};
+
+	fpga: board-control at 2,0 {
+		compatible = "fsl,ls1088ardb-fpga", "fsl,fpga-qixis";
+		reg = <0x2 0x0 0x0000100>;
+	};
+};
+
+
 &i2c0 {
 	status = "okay";
 
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
index 2946fd7..9f6bcb4 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
@@ -217,10 +217,6 @@
 			#address-cells = <2>;
 			#size-cells = <1>;
 
-			ranges = <0 0 0x5 0x80000000 0x08000000
-				  2 0 0x5 0x30000000 0x00010000
-				  3 0 0x5 0x20000000 0x00010000>;
-			status = "disabled";
 		};
 
 		i2c0: i2c at 2000000 {
-- 
2.7.4

^ permalink raw reply related

* [PATCH v1 0/2] Add PCIe phy driver for some Mediatek SoCs
From: Ryder Lee @ 2017-05-04  9:31 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

This patch series add PCIe phy driver and related dt-binding file for
Mediatek mt7623 SoCs families.

Changes since v1:
- revise binding document:
  drop 'status' properties.
  add a description to 'phy-switch' property and add vendor prefix.

Ryder Lee (2):
  phy: add PCIe PHY driver for mt7623 SoCs families
  dt-bindings: phy: Add documentation for Mediatek PCIe PHY

 .../devicetree/bindings/phy/phy-mt7623-pcie.txt    |  63 +++++
 drivers/phy/Kconfig                                |   9 +
 drivers/phy/Makefile                               |   1 +
 drivers/phy/phy-mt7623-pcie.c                      | 290 +++++++++++++++++++++
 4 files changed, 363 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/phy-mt7623-pcie.txt
 create mode 100644 drivers/phy/phy-mt7623-pcie.c

-- 
1.9.1

^ permalink raw reply

* [PATCH v1 1/2] phy: add PCIe PHY driver for mt7623 SoCs families
From: Ryder Lee @ 2017-05-04  9:31 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493890308-6159-1-git-send-email-ryder.lee@mediatek.com>

support PCIe PHY of MT7623 SoCs families

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
---
 drivers/phy/Kconfig           |   9 ++
 drivers/phy/Makefile          |   1 +
 drivers/phy/phy-mt7623-pcie.c | 290 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 300 insertions(+)
 create mode 100644 drivers/phy/phy-mt7623-pcie.c

diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
index dc5277a..00ab313 100644
--- a/drivers/phy/Kconfig
+++ b/drivers/phy/Kconfig
@@ -518,4 +518,13 @@ config PHY_NSP_USB3
 	help
 	  Enable this to support the Broadcom Northstar plus USB3 PHY.
 	  If unsure, say N.
+
+config PHY_MT7623_PCIE
+	tristate "Mediatek PCIe PHY driver for MT7623 SoC families"
+	depends on ARCH_MEDIATEK && OF
+	select GENERIC_PHY
+	select MFD_SYSCON
+	help
+	  Say 'Y' here to add support for Mediatek PCIe PHY driver which
+	  can be found on the MT7623 SoC families.
 endmenu
diff --git a/drivers/phy/Makefile b/drivers/phy/Makefile
index e7b0feb..95032d6 100644
--- a/drivers/phy/Makefile
+++ b/drivers/phy/Makefile
@@ -63,3 +63,4 @@ obj-$(CONFIG_ARCH_TEGRA) += tegra/
 obj-$(CONFIG_PHY_NS2_PCIE)		+= phy-bcm-ns2-pcie.o
 obj-$(CONFIG_PHY_MESON8B_USB2)		+= phy-meson8b-usb2.o
 obj-$(CONFIG_PHY_NSP_USB3)		+= phy-bcm-nsp-usb3.o
+obj-$(CONFIG_PHY_MT7623_PCIE)		+= phy-mt7623-pcie.o
diff --git a/drivers/phy/phy-mt7623-pcie.c b/drivers/phy/phy-mt7623-pcie.c
new file mode 100644
index 0000000..ee575ce
--- /dev/null
+++ b/drivers/phy/phy-mt7623-pcie.c
@@ -0,0 +1,290 @@
+/*
+ * Copyright (c) 2017 MediaTek Inc.
+ * Author: Ryder Lee <ryder.lee@mediatek.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+#include <linux/mfd/syscon.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_platform.h>
+#include <linux/phy/phy.h>
+#include <linux/platform_device.h>
+#include <linux/regmap.h>
+
+/* Offsets of sub-segment in each port registers */
+#define PCIE_SIFSLV_PHYD_BANK2_BASE	0xa00
+#define SSUSB_SIFSLV_PHYA_BASE		0xb00
+#define SSUSB_SIFSLV_PHYA_DA_BASE	0xc00
+
+/*
+ * RX detection stable - 1 scale represent 8 reference cycles
+ * cover reference clock from 1M~100MHz, 7us~40us
+ */
+#define B2_PHYD_RXDET1			(PCIE_SIFSLV_PHYD_BANK2_BASE + 0x28)
+#define RG_SSUSB_RXDET_STB2		GENMASK(17, 9)
+#define RG_SSUSB_RXDET_STB2_VAL(x)	((0x1ff & (x)) << 9)
+
+#define B2_PHYD_RXDET2			(PCIE_SIFSLV_PHYD_BANK2_BASE + 0x2c)
+#define RG_SSUSB_RXDET_STB2_P3		GENMASK(8, 0)
+#define RG_SSUSB_RXDET_STB2_P3_VAL(x)	(0x1ff & (x))
+
+#define U3_PHYA_REG0			(SSUSB_SIFSLV_PHYA_BASE + 0x00)
+#define RG_PCIE_CLKDRV_OFFSET		GENMASK(3, 1)
+#define RG_PCIE_CLKDRV_OFFSET_VAL(x)	((0x3 & (x)) << 2)
+
+#define U3_PHYA_REG1			(SSUSB_SIFSLV_PHYA_BASE + 0x04)
+#define RG_PCIE_CLKDRV_AMP		GENMASK(31, 29)
+#define RG_PCIE_CLKDRV_AMP_VAL(x)	((0x7 & (x)) << 29)
+
+#define DA_SSUSB_CDR_REFCK_SEL		(SSUSB_SIFSLV_PHYA_DA_BASE + 0x00)
+#define RG_SSUSB_XTAL_EXT_PE1H		GENMASK(13, 12)
+#define RG_SSUSB_XTAL_EXT_PE1H_VAL(x)	((0x3 & (x)) << 12)
+#define RG_SSUSB_XTAL_EXT_PE2H		GENMASK(17, 16)
+#define RG_SSUSB_XTAL_EXT_PE2H_VAL(x)		((0x3 & (x)) << 16)
+
+#define DA_SSUSB_PLL_IC			(SSUSB_SIFSLV_PHYA_DA_BASE + 0x0c)
+#define RG_SSUSB_PLL_IC_PE2H		GENMASK(15, 12)
+#define RG_SSUSB_PLL_IC_PE2H_VAL(x)	((0xf & (x)) << 12)
+#define RG_SSUSB_PLL_BR_PE2H		GENMASK(29, 28)
+#define RG_SSUSB_PLL_BR_PE2H_VAL(x)	((0x3 & (x)) << 28)
+
+#define DA_SSUSB_PLL_BC			(SSUSB_SIFSLV_PHYA_DA_BASE + 0x08)
+#define RG_SSUSB_PLL_DIVEN_PE2H		GENMASK(21, 19)
+#define RG_SSUSB_PLL_BC_PE2H		GENMASK(7, 6)
+#define RG_SSUSB_PLL_BC_PE2H_VAL(x)	((0x3 & (x)) << 6)
+
+#define DA_SSUSB_PLL_IR			(SSUSB_SIFSLV_PHYA_DA_BASE + 0x10)
+#define RG_SSUSB_PLL_IR_PE2H		GENMASK(19, 16)
+#define RG_SSUSB_PLL_IR_PE2H_VAL(x)	((0xf & (x)) << 16)
+
+#define DA_SSUSB_PLL_BP			(SSUSB_SIFSLV_PHYA_DA_BASE + 0x14)
+#define RG_SSUSB_PLL_BP_PE2H		GENMASK(19, 16)
+#define RG_SSUSB_PLL_BP_PE2H_VAL(x)	((0xf & (x)) << 16)
+
+#define DA_SSUSB_PLL_SSC_DELTA1_REG20	(SSUSB_SIFSLV_PHYA_DA_BASE + 0x3c)
+#define RG_SSUSB_PLL_SSC_DELTA1_PE2H		GENMASK(31, 16)
+#define RG_SSUSB_PLL_SSC_DELTA1_PE2H_VAL(x)	((0xffff & (x)) << 16)
+
+#define DA_SSUSB_PLL_SSC_DELTA_REG25	(SSUSB_SIFSLV_PHYA_DA_BASE + 0x48)
+#define RG_SSUSB_PLL_SSC_DELTA_PE2H		GENMASK(15, 0)
+#define RG_SSUSB_PLL_SSC_DELTA_PE2H_VAL(x)	(0xffff & (x))
+
+#define HIF_SYSCFG1			0x14
+#define HIF_SYSCFG1_PHY2_MASK		(0x3 << 20)
+
+struct mtk_pcie_phy {
+	struct device *dev;
+	void __iomem *base;
+	struct regmap *hif;
+	struct clk *phya_ref;
+	struct phy *phy;
+};
+
+static inline u32 phy_read(struct mtk_pcie_phy *phy, u32 reg)
+{
+	return readl(phy->base + reg);
+}
+
+static inline void phy_write(struct mtk_pcie_phy *phy, u32 val, u32 reg)
+{
+	writel(val, phy->base + reg);
+}
+
+static int mtk_pcie_phy_power_on(struct phy *phy)
+{
+	struct mtk_pcie_phy *mtk_phy = phy_get_drvdata(phy);
+	int err;
+	u32 val;
+
+	/* PCIe port2 PHY is shared with USB u3phy2 */
+	if (mtk_phy->hif)
+		regmap_update_bits(mtk_phy->hif, HIF_SYSCFG1,
+				   HIF_SYSCFG1_PHY2_MASK, 0);
+
+	err = clk_prepare_enable(mtk_phy->phya_ref);
+	if (err) {
+		dev_err(mtk_phy->dev, "failed to enable PCIe phy clock\n");
+		return err;
+	}
+
+	val = phy_read(mtk_phy, DA_SSUSB_CDR_REFCK_SEL);
+	val &= ~(RG_SSUSB_XTAL_EXT_PE1H | RG_SSUSB_XTAL_EXT_PE2H);
+	val |= RG_SSUSB_XTAL_EXT_PE1H_VAL(0x2) |
+	       RG_SSUSB_XTAL_EXT_PE2H_VAL(0x2);
+	phy_write(mtk_phy, val, DA_SSUSB_CDR_REFCK_SEL);
+
+	/* ref clk drive */
+	val = phy_read(mtk_phy, U3_PHYA_REG1);
+	val &= ~RG_PCIE_CLKDRV_AMP;
+	val |= RG_PCIE_CLKDRV_AMP_VAL(0x4);
+	phy_write(mtk_phy, val, U3_PHYA_REG1);
+
+	val = phy_read(mtk_phy, U3_PHYA_REG0);
+	val &= ~RG_PCIE_CLKDRV_OFFSET;
+	val |= RG_PCIE_CLKDRV_OFFSET_VAL(0x1);
+	phy_write(mtk_phy, val, U3_PHYA_REG0);
+
+	/* SSC delta -5000ppm */
+	val = phy_read(mtk_phy, DA_SSUSB_PLL_SSC_DELTA1_REG20);
+	val &= ~RG_SSUSB_PLL_SSC_DELTA1_PE2H;
+	val |= RG_SSUSB_PLL_SSC_DELTA1_PE2H_VAL(0x3c);
+	phy_write(mtk_phy, val, DA_SSUSB_PLL_SSC_DELTA1_REG20);
+
+	val = phy_read(mtk_phy, DA_SSUSB_PLL_SSC_DELTA_REG25);
+	val &= ~RG_SSUSB_PLL_SSC_DELTA_PE2H;
+	val |= RG_SSUSB_PLL_SSC_DELTA_PE2H_VAL(0x36);
+	phy_write(mtk_phy, val, DA_SSUSB_PLL_SSC_DELTA_REG25);
+
+	/* change pll BW 0.6M */
+	val = phy_read(mtk_phy, DA_SSUSB_PLL_IC);
+	val &= ~RG_SSUSB_PLL_BR_PE2H;
+	val |= RG_SSUSB_PLL_BR_PE2H_VAL(0x1);
+	phy_write(mtk_phy, val, DA_SSUSB_PLL_IC);
+
+	val = phy_read(mtk_phy, DA_SSUSB_PLL_BC);
+	val &= ~(RG_SSUSB_PLL_DIVEN_PE2H | RG_SSUSB_PLL_BC_PE2H);
+	val |= RG_SSUSB_PLL_BC_PE2H_VAL(0x3);
+	phy_write(mtk_phy, val, DA_SSUSB_PLL_BC);
+
+	val = phy_read(mtk_phy, DA_SSUSB_PLL_IR);
+	val &= ~RG_SSUSB_PLL_IR_PE2H;
+	val |= RG_SSUSB_PLL_IR_PE2H_VAL(0x2);
+	phy_write(mtk_phy, val, DA_SSUSB_PLL_IR);
+
+	val = phy_read(mtk_phy, DA_SSUSB_PLL_IC);
+	val &= ~RG_SSUSB_PLL_IC_PE2H;
+	val |= RG_SSUSB_PLL_IC_PE2H_VAL(0x1);
+	phy_write(mtk_phy, val, DA_SSUSB_PLL_IC);
+
+	val = phy_read(mtk_phy, DA_SSUSB_PLL_BP);
+	val &= ~RG_SSUSB_PLL_BP_PE2H;
+	val |= RG_SSUSB_PLL_BP_PE2H_VAL(0xa);
+	phy_write(mtk_phy, val, DA_SSUSB_PLL_BP);
+
+	/* Tx Detect Rx Timing: 10us -> 5us */
+	val = phy_read(mtk_phy, B2_PHYD_RXDET1);
+	val &= ~RG_SSUSB_RXDET_STB2;
+	val |= RG_SSUSB_RXDET_STB2_VAL(0x10);
+	phy_write(mtk_phy, val, B2_PHYD_RXDET1);
+
+	val = phy_read(mtk_phy, B2_PHYD_RXDET2);
+	val &= ~RG_SSUSB_RXDET_STB2_P3;
+	val |= RG_SSUSB_RXDET_STB2_P3_VAL(0x10);
+	phy_write(mtk_phy, val, B2_PHYD_RXDET2);
+
+	/* wait for PCIe subsys(MAC layer) register to active */
+	usleep_range(2500, 3000);
+
+	return 0;
+}
+
+static int mtk_pcie_phy_power_off(struct phy *phy)
+{
+	struct mtk_pcie_phy *mtk_phy = phy_get_drvdata(phy);
+
+	clk_disable_unprepare(mtk_phy->phya_ref);
+
+	return 0;
+}
+
+static struct phy_ops mtk_pcie_phy_ops = {
+	.power_on	= mtk_pcie_phy_power_on,
+	.power_off	= mtk_pcie_phy_power_off,
+	.owner		= THIS_MODULE,
+};
+
+static const struct of_device_id mtk_pcie_phy_of_match[];
+
+static int mtk_pcie_phy_probe(struct platform_device *pdev)
+{
+	struct device_node *np = pdev->dev.of_node;
+	const struct of_device_id *match;
+	struct phy_provider *phy_provider;
+	struct mtk_pcie_phy *mtk_phy;
+	struct resource *res;
+	struct phy *phy;
+
+	match = of_match_device(mtk_pcie_phy_of_match, &pdev->dev);
+	if (!match)
+		return -ENODEV;
+
+	mtk_phy = devm_kzalloc(&pdev->dev, sizeof(*mtk_phy), GFP_KERNEL);
+	if (!mtk_phy)
+		return -ENOMEM;
+
+	mtk_phy->dev = &pdev->dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	mtk_phy->base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(mtk_phy->base)) {
+		dev_err(&pdev->dev, "failed to get phy base\n");
+		return PTR_ERR(mtk_phy->base);
+	}
+
+	mtk_phy->phya_ref = devm_clk_get(&pdev->dev, "pciephya_ref");
+	if (IS_ERR(mtk_phy->phya_ref)) {
+		dev_err(&pdev->dev, "error to get pciephya_ref\n");
+		return PTR_ERR(mtk_phy->phya_ref);
+	}
+
+	if (of_find_property(np, "mediatek,phy-switch", NULL)) {
+		mtk_phy->hif = syscon_regmap_lookup_by_phandle(
+						np, "mediatek,phy-switch");
+		if (IS_ERR(mtk_phy->hif)) {
+			dev_err(&pdev->dev, "missing \"mediatek,phy-switch\" phandle\n");
+			return PTR_ERR(mtk_phy->hif);
+		}
+	}
+
+	platform_set_drvdata(pdev, mtk_phy);
+	phy = devm_phy_create(&pdev->dev, NULL, &mtk_pcie_phy_ops);
+	if (IS_ERR(phy)) {
+		dev_err(&pdev->dev, "failed to create phy device\n");
+		return PTR_ERR(phy);
+	}
+
+	mtk_phy->phy = phy;
+	phy_set_drvdata(phy, mtk_phy);
+
+	phy_provider = devm_of_phy_provider_register(&pdev->dev,
+						     of_phy_simple_xlate);
+	if (IS_ERR(phy_provider)) {
+		dev_err(&pdev->dev, "failed to register phy provider\n");
+		return PTR_ERR(phy_provider);
+	}
+
+	return 0;
+}
+
+static const struct of_device_id mtk_pcie_phy_of_match[] = {
+	{ .compatible = "mediatek,mt7623-pcie-phy"},
+	{ .compatible = "mediatek,mt2701-pcie-phy"},
+	{},
+};
+MODULE_DEVICE_TABLE(of, mtk_pcie_phy_of_match);
+
+static struct platform_driver mtk_pcie_phy_driver = {
+	.probe	= mtk_pcie_phy_probe,
+	.driver = {
+		.name	= "mtk-pcie-phy",
+		.of_match_table	= mtk_pcie_phy_of_match,
+	}
+};
+module_platform_driver(mtk_pcie_phy_driver);
+
+MODULE_AUTHOR("Ryder Lee <ryder.lee@mediatek.com>");
+MODULE_DESCRIPTION("Mediatek PCIe PHY driver");
+MODULE_LICENSE("GPL v2");
-- 
1.9.1

^ permalink raw reply related

* [PATCH v1 2/2] dt-bindings: phy: Add documentation for Mediatek PCIe PHY
From: Ryder Lee @ 2017-05-04  9:31 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493890308-6159-1-git-send-email-ryder.lee@mediatek.com>

Add documentation for PCIe PHY available in MT7623 series SoCs.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
---
 .../devicetree/bindings/phy/phy-mt7623-pcie.txt    | 63 ++++++++++++++++++++++
 1 file changed, 63 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/phy-mt7623-pcie.txt

diff --git a/Documentation/devicetree/bindings/phy/phy-mt7623-pcie.txt b/Documentation/devicetree/bindings/phy/phy-mt7623-pcie.txt
new file mode 100644
index 0000000..1309500
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/phy-mt7623-pcie.txt
@@ -0,0 +1,63 @@
+Mediatek MT7623 PCIe PHY
+-----------------------
+
+Required properties:
+ - compatible: Should contain "mediatek,mt7623-pcie-phy".
+ - reg: Base address and length of the registers.
+ - clocks: Must contain an entry in clock-names.
+   See ../clocks/clock-bindings.txt for details.
+ - clock-names: Must be "pciephya_ref"
+ - #phy-cells: Must be 0.
+
+Optional properties:
+ - mediatek,phy-switch: A phandle to the system controller, used to
+   switch the PHY on PCIe port2 which is shared with USB u3phy2.
+
+Example:
+
+	pcie0_phy: pcie-phy at 1a149000 {
+		compatible = "mediatek,mt7623-pcie-phy";
+		reg = <0 0x1a149000 0 0x1000>;
+		clocks = <&clk26m>;
+		clock-names = "pciephya_ref";
+		#phy-cells = <0>;
+	};
+
+	pcie1_phy: pcie-phy at 1a14a000 {
+		compatible = "mediatek,mt7623-pcie-phy";
+		reg = <0 0x1a14a000 0 0x1000>;
+		clocks = <&clk26m>;
+		clock-names = "pciephya_ref";
+		#phy-cells = <0>;
+	};
+
+	pcie2_phy: pcie-phy at 1a244000 {
+		compatible = "mediatek,mt7623-pcie-phy";
+		reg = <0 0x1a244000 0 0x1000>;
+		clocks = <&clk26m>;
+		clock-names = "pciephya_ref";
+		#phy-cells = <0>;
+
+		mediatek,phy-switch = <&hifsys>;
+	};
+
+Specifying phy control of devices
+---------------------------------
+
+Device nodes should specify the configuration required in their "phys"
+property, containing a phandle to the phy node and phy-names.
+
+Example:
+
+#include <dt-bindings/phy/phy.h>
+
+pcie: pcie at 1a140000 {
+	...
+	pcie at 1,0 {
+		...
+		phys = <&pcie0_phy>;
+		phy-names = "pcie-phy0";
+	}
+	...
+};
+
-- 
1.9.1

^ permalink raw reply related

* [PATCH v2 0/2] Add PCIe host driver support for Mediatek SoCs
From: Ryder Lee @ 2017-05-04  9:32 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

This patch series add Mediatek Gen2 PCIe host controller driver and
dt-binding document. It can be found on MT7623 series SoCs.

This driver was validated using Broadcom Tigon3 and Intel(R) 82575/82576
gigabit ethernet card.

Changes since v2:
- modify Kconfig to avoid kbuild test error on some architecture.
- change compatible string.
- revise binding document:
  add missing interrupt-names.
  remove the board dts example and drop 'status' properties.
  remove unnecessary descriptions bout standard PCI bus binding.

Changes since v1:
- add .suppress_bind_attrs.
- remove unnecessary *_valid_device() pattern.
- remove PCI_PROBE_ONLY.
- use the regular readl() instead of readl_relaxed().
- add .map_bus() and change to use pci_generic_config_read/pci_generic_config_write.
- revise dt-binding document and move nonstandard properties to root node.
- change compatible string.
- use interrupt-map property and replace mtk_pcie_map_irq() with of_irq_parse_and_map_pci().
- use the new pci_register_host_bridge() method instead of pci_scan_root_bus().

Ryder Lee (2):
  PCI: mediatek: Add Mediatek PCIe host controller support
  dt-bindings: pcie: Add documentation for Mediatek PCIe

 .../bindings/pci/mediatek,mt7623-pcie.txt          | 149 ++++++
 drivers/pci/host/Kconfig                           |  11 +
 drivers/pci/host/Makefile                          |   1 +
 drivers/pci/host/pcie-mediatek.c                   | 563 +++++++++++++++++++++
 4 files changed, 724 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt
 create mode 100644 drivers/pci/host/pcie-mediatek.c

-- 
1.9.1

^ permalink raw reply

* [PATCH v2 1/2] PCI: mediatek: Add Mediatek PCIe host controller support
From: Ryder Lee @ 2017-05-04  9:32 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493890376-6713-1-git-send-email-ryder.lee@mediatek.com>

Add support for the Mediatek PCIe Gen2 controller which can
be found on MT7623 series SoCs.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
---
 drivers/pci/host/Kconfig         |  11 +
 drivers/pci/host/Makefile        |   1 +
 drivers/pci/host/pcie-mediatek.c | 563 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 575 insertions(+)
 create mode 100644 drivers/pci/host/pcie-mediatek.c

diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
index f7c1d4d..aef0de9 100644
--- a/drivers/pci/host/Kconfig
+++ b/drivers/pci/host/Kconfig
@@ -174,6 +174,17 @@ config PCIE_ROCKCHIP
 	  There is 1 internal PCIe port available to support GEN2 with
 	  4 slots.
 
+config PCIE_MEDIATEK
+	bool "Mediatek PCIe controller"
+	depends on ARM && (ARCH_MEDIATEK || COMPILE_TEST)
+	depends on OF
+	depends on PCI
+	select PCIEPORTBUS
+	help
+	  Say Y here if you want to enable PCIe controller support on MT7623 series
+	  SoCs. There is one single root complex with 3 root ports available.
+	  Each port supports Gen2 lane x1.
+
 config VMD
 	depends on PCI_MSI && X86_64 && SRCU
 	tristate "Intel Volume Management Device Driver"
diff --git a/drivers/pci/host/Makefile b/drivers/pci/host/Makefile
index 4d36866..265adff 100644
--- a/drivers/pci/host/Makefile
+++ b/drivers/pci/host/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_PCIE_IPROC_BCMA) += pcie-iproc-bcma.o
 obj-$(CONFIG_PCIE_ALTERA) += pcie-altera.o
 obj-$(CONFIG_PCIE_ALTERA_MSI) += pcie-altera-msi.o
 obj-$(CONFIG_PCIE_ROCKCHIP) += pcie-rockchip.o
+obj-$(CONFIG_PCIE_MEDIATEK) += pcie-mediatek.o
 obj-$(CONFIG_VMD) += vmd.o
 
 # The following drivers are for devices that use the generic ACPI
diff --git a/drivers/pci/host/pcie-mediatek.c b/drivers/pci/host/pcie-mediatek.c
new file mode 100644
index 0000000..5e8c1bf
--- /dev/null
+++ b/drivers/pci/host/pcie-mediatek.c
@@ -0,0 +1,563 @@
+/*
+ * Mediatek PCIe host controller driver.
+ *
+ * Copyright (c) 2017 MediaTek Inc.
+ * Author: Ryder Lee <ryder.lee@mediatek.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_pci.h>
+#include <linux/of_platform.h>
+#include <linux/pci.h>
+#include <linux/phy/phy.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/reset.h>
+
+/* PCIe shared registers */
+#define PCIE_SYS_CFG		0x00
+#define PCIE_INT_ENABLE		0x0c
+#define PCIE_CFG_ADDR		0x20
+#define PCIE_CFG_DATA		0x24
+
+/* PCIe per port registers */
+#define PCIE_BAR0_SETUP		0x10
+#define PCIE_BAR1_SETUP		0x14
+#define PCIE_BAR0_MEM_BASE	0x18
+#define PCIE_CLASS		0x34
+#define PCIE_LINK_STATUS	0x50
+
+#define PCIE_PORT_INT_EN(x)	BIT(20 + (x))
+#define PCIE_PORT_PERST(x)	BIT(1 + (x))
+#define PCIE_PORT_LINKUP	BIT(0)
+#define PCIE_BAR_MAP_MAX	GENMASK(31, 16)
+
+#define PCIE_BAR_ENABLE		BIT(0)
+#define PCIE_REVISION_ID	BIT(0)
+#define PCIE_CLASS_CODE		(0x60400 << 8)
+#define PCIE_CONF_REG(regn)	(((regn) & GENMASK(7, 2)) | \
+				((((regn) >> 8) & GENMASK(3, 0)) << 24))
+#define PCIE_CONF_FUN(fun)	(((fun) << 8) & GENMASK(10, 8))
+#define PCIE_CONF_DEV(dev)	(((dev) << 11) & GENMASK(15, 11))
+#define PCIE_CONF_BUS(bus)	(((bus) << 16) & GENMASK(23, 16))
+#define PCIE_CONF_ADDR(regn, fun, dev, bus) \
+	(PCIE_CONF_REG(regn) | PCIE_CONF_FUN(fun) | \
+	 PCIE_CONF_DEV(dev) | PCIE_CONF_BUS(bus))
+
+/* Mediatek specific configuration registers */
+#define PCIE_FTS_NUM		0x70c
+#define PCIE_FTS_NUM_MASK	GENMASK(15, 8)
+#define PCIE_FTS_NUM_L0(x)	((x) & 0xff << 8)
+
+#define PCIE_FC_CREDIT		0x73c
+#define PCIE_FC_CREDIT_MASK	(GENMASK(31, 31) | GENMASK(28, 16))
+#define PCIE_FC_CREDIT_VAL(x)	((x) << 16)
+
+/**
+ * struct mtk_pcie_port - PCIe port information
+ * @base: IO mapped register base
+ * @list: port list
+ * @pcie: pointer to PCIe host info
+ * @reset: pointer to port reset control
+ * @regs: port memory region
+ * @sys_ck: pointer to bus clock
+ * @phy: pointer to phy control block
+ * @lane: lane count
+ * @index: port index
+ */
+struct mtk_pcie_port {
+	void __iomem *base;
+	struct list_head list;
+	struct mtk_pcie *pcie;
+	struct reset_control *reset;
+	struct resource regs;
+	struct clk *sys_ck;
+	struct phy *phy;
+	u32 lane;
+	u32 index;
+};
+
+/**
+ * struct mtk_pcie - PCIe host information
+ * @dev: pointer to PCIe device
+ * @base: IO mapped register Base
+ * @free_ck: free-run reference clock
+ * @io: IO resource
+ * @pio: PIO resource
+ * @mem: non-prefetchable memory resource
+ * @busn: bus range
+ * @offset: IO / Memory offset
+ * @ports: pointer to PCIe port information
+ */
+struct mtk_pcie {
+	struct device *dev;
+	void __iomem *base;
+	struct clk *free_ck;
+
+	struct resource io;
+	struct resource pio;
+	struct resource mem;
+	struct resource busn;
+	struct {
+		resource_size_t mem;
+		resource_size_t io;
+	} offset;
+	struct list_head ports;
+};
+
+static inline bool mtk_pcie_link_is_up(struct mtk_pcie_port *port)
+{
+	return !!(readl(port->base + PCIE_LINK_STATUS) &
+		  PCIE_PORT_LINKUP);
+}
+
+static void mtk_pcie_port_free(struct mtk_pcie_port *port)
+{
+	struct mtk_pcie *pcie = port->pcie;
+	struct device *dev = pcie->dev;
+
+	devm_iounmap(dev, port->base);
+	devm_release_mem_region(dev, port->regs.start,
+				resource_size(&port->regs));
+	list_del(&port->list);
+	devm_kfree(dev, port);
+}
+
+static void mtk_pcie_put_resources(struct mtk_pcie *pcie)
+{
+	struct device *dev = pcie->dev;
+	struct mtk_pcie_port *port, *tmp;
+
+	list_for_each_entry_safe(port, tmp, &pcie->ports, list) {
+		phy_power_off(port->phy);
+		clk_disable_unprepare(port->sys_ck);
+		mtk_pcie_port_free(port);
+	}
+
+	clk_disable_unprepare(pcie->free_ck);
+	pm_runtime_put_sync(dev);
+	pm_runtime_disable(dev);
+}
+
+static void __iomem *mtk_pcie_map_bus(struct pci_bus *bus,
+				      unsigned int devfn, int where)
+{
+	struct pci_host_bridge *host = pci_find_host_bridge(bus);
+	struct mtk_pcie *pcie = pci_host_bridge_priv(host);
+
+	writel(PCIE_CONF_ADDR(where, PCI_FUNC(devfn), PCI_SLOT(devfn),
+			      bus->number), pcie->base + PCIE_CFG_ADDR);
+
+	return pcie->base + PCIE_CFG_DATA + (where & 3);
+}
+
+static struct pci_ops mtk_pcie_ops = {
+	.map_bus = mtk_pcie_map_bus,
+	.read  = pci_generic_config_read,
+	.write = pci_generic_config_write,
+};
+
+static void mtk_pcie_configure_rc(struct mtk_pcie_port *port)
+{
+	struct mtk_pcie *pcie = port->pcie;
+	u32 func = PCI_FUNC(port->index << 3);
+	u32 slot = PCI_SLOT(port->index << 3);
+	u32 val;
+
+	/* enable interrupt */
+	val = readl(pcie->base + PCIE_INT_ENABLE);
+	val |= PCIE_PORT_INT_EN(port->index);
+	writel(val, pcie->base + PCIE_INT_ENABLE);
+
+	/* map to all DDR region. We need to set it before cfg operation. */
+	writel(PCIE_BAR_MAP_MAX | PCIE_BAR_ENABLE,
+	       port->base + PCIE_BAR0_SETUP);
+
+	/* configure class Code and revision ID */
+	writel(PCIE_CLASS_CODE | PCIE_REVISION_ID,
+	       port->base + PCIE_CLASS);
+
+	/* configure FC credit */
+	writel(PCIE_CONF_ADDR(PCIE_FC_CREDIT, func, slot, 0),
+	       pcie->base + PCIE_CFG_ADDR);
+	val = readl(pcie->base + PCIE_CFG_DATA);
+	val &= ~PCIE_FC_CREDIT_MASK;
+	val |= PCIE_FC_CREDIT_VAL(0x806c);
+	writel(PCIE_CONF_ADDR(PCIE_FC_CREDIT, func, slot, 0),
+	       pcie->base + PCIE_CFG_ADDR);
+	writel(val, pcie->base + PCIE_CFG_DATA);
+
+	/* configure RC FTS number to 250 when it leaves L0s */
+	writel(PCIE_CONF_ADDR(PCIE_FTS_NUM, func, slot, 0),
+	       pcie->base + PCIE_CFG_ADDR);
+	val = readl(pcie->base + PCIE_CFG_DATA);
+	val &= ~PCIE_FTS_NUM_MASK;
+	val |= PCIE_FTS_NUM_L0(0x50);
+	writel(PCIE_CONF_ADDR(PCIE_FTS_NUM, func, slot, 0),
+	       pcie->base + PCIE_CFG_ADDR);
+	writel(val, pcie->base + PCIE_CFG_DATA);
+}
+
+static void mtk_pcie_assert_ports(struct mtk_pcie_port *port)
+{
+	struct mtk_pcie *pcie = port->pcie;
+	u32 val;
+
+	/* assert port PERST_N */
+	val = readl(pcie->base + PCIE_SYS_CFG);
+	val |= PCIE_PORT_PERST(port->index);
+	writel(val, pcie->base + PCIE_SYS_CFG);
+
+	/* de-assert port PERST_N */
+	val = readl(pcie->base + PCIE_SYS_CFG);
+	val &= ~PCIE_PORT_PERST(port->index);
+	writel(val, pcie->base + PCIE_SYS_CFG);
+
+	/* PCIe v2.0 need at least 100ms delay to train from Gen1 to Gen2 */
+	msleep(100);
+}
+
+static int mtk_pcie_enable_ports(struct mtk_pcie_port *port)
+{
+	struct device *dev = port->pcie->dev;
+	int err;
+
+	err = clk_prepare_enable(port->sys_ck);
+	if (err) {
+		dev_err(dev, "failed to enable port%d clock\n", port->index);
+		goto err_sys_clk;
+	}
+
+	reset_control_assert(port->reset);
+	reset_control_deassert(port->reset);
+
+	err = phy_power_on(port->phy);
+	if (err) {
+		dev_err(dev, "failed to power on port%d phy\n", port->index);
+		goto err_phy_on;
+	}
+
+	mtk_pcie_assert_ports(port);
+
+	/* if link up, then setup root port configuration space */
+	if (mtk_pcie_link_is_up(port)) {
+		mtk_pcie_configure_rc(port);
+		return 0;
+	}
+
+	dev_info(dev, "Port%d link down\n", port->index);
+
+	phy_power_off(port->phy);
+err_phy_on:
+	clk_disable_unprepare(port->sys_ck);
+	mtk_pcie_port_free(port);
+err_sys_clk:
+	return err;
+}
+
+static int mtk_pcie_parse_ports(struct mtk_pcie *pcie,
+				struct mtk_pcie_port **p,
+				struct device_node *node,
+				int index)
+{
+	struct mtk_pcie_port *port;
+	struct device *dev = pcie->dev;
+	char name[10];
+	int err;
+
+	*p = devm_kzalloc(dev, sizeof(*port), GFP_KERNEL);
+	if (!*p)
+		return -ENOMEM;
+
+	port = *p;
+
+	err = of_property_read_u32(node, "num-lanes", &port->lane);
+	if (err) {
+		dev_err(dev, "missing num-lanes property\n");
+		return err;
+	}
+
+	err = of_address_to_resource(node, 0, &port->regs);
+	if (err) {
+		dev_err(dev, "failed to parse address: %d\n", err);
+		return err;
+	}
+
+	port->base = devm_ioremap_resource(dev, &port->regs);
+	if (IS_ERR(port->base)) {
+		dev_err(dev, "failed to map port%d base\n", index);
+		return PTR_ERR(port->base);
+	}
+
+	snprintf(name, sizeof(name), "sys_ck%d", index);
+	port->sys_ck = devm_clk_get(dev, name);
+	if (IS_ERR(port->sys_ck)) {
+		dev_err(dev, "failed to get port%d clock\n", index);
+		return PTR_ERR(port->sys_ck);
+	}
+
+	snprintf(name, sizeof(name), "pcie-rst%d", index);
+	port->reset = devm_reset_control_get(dev, name);
+	if (IS_ERR(port->reset)) {
+		dev_err(dev, "failed to get port%d reset\n", index);
+		return PTR_ERR(port->reset);
+	}
+
+	snprintf(name, sizeof(name), "pcie-phy%d", index);
+	port->phy = devm_phy_get(dev, name);
+	if (IS_ERR(port->phy)) {
+		dev_err(dev, "failed to get port%d phy\n", index);
+		return PTR_ERR(port->phy);
+	}
+
+	port->index = index;
+	port->pcie = pcie;
+
+	INIT_LIST_HEAD(&port->list);
+	list_add_tail(&port->list, &pcie->ports);
+	return 0;
+}
+
+static int mtk_pcie_handle_shared_resource(struct mtk_pcie *pcie)
+{
+	struct device *dev = pcie->dev;
+	struct platform_device *pdev = to_platform_device(dev);
+	struct resource *regs;
+	int err;
+
+	/* get shared registers */
+	regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	pcie->base = devm_ioremap_resource(dev, regs);
+	if (IS_ERR(pcie->base)) {
+		dev_err(dev, "failed to map shared register\n");
+		return PTR_ERR(pcie->base);
+	}
+
+	pcie->free_ck = devm_clk_get(dev, "free_ck");
+	if (IS_ERR(pcie->free_ck))
+		return PTR_ERR(pcie->free_ck);
+
+	pm_runtime_enable(dev);
+	err = pm_runtime_get_sync(dev);
+	if (err)
+		goto err_pm;
+
+	/* enable top level clock */
+	err = clk_prepare_enable(pcie->free_ck);
+	if (err) {
+		dev_err(dev, "failed to enable free_ck\n");
+		goto err_free_ck;
+	}
+
+	return 0;
+
+err_free_ck:
+	pm_runtime_put_sync(dev);
+err_pm:
+	pm_runtime_disable(dev);
+
+	return err;
+}
+
+static int mtk_pcie_parse_and_add_res(struct mtk_pcie *pcie)
+{
+	struct device *dev = pcie->dev;
+	struct device_node *node = dev->of_node, *child;
+	struct of_pci_range_parser parser;
+	struct of_pci_range range;
+	struct resource res;
+	int err, linkup = 0;
+
+	/* parse shared resources */
+	err = mtk_pcie_handle_shared_resource(pcie);
+	if (err)
+		return err;
+
+	if (of_pci_range_parser_init(&parser, node)) {
+		dev_err(dev, "missing \"ranges\" property\n");
+		return -EINVAL;
+	}
+
+	for_each_of_pci_range(&parser, &range) {
+		err = of_pci_range_to_resource(&range, node, &res);
+		if (err < 0)
+			return err;
+
+		switch (res.flags & IORESOURCE_TYPE_BITS) {
+		case IORESOURCE_IO:
+			pcie->offset.io = res.start - range.pci_addr;
+
+			memcpy(&pcie->pio, &res, sizeof(res));
+			pcie->pio.name = node->full_name;
+
+			pcie->io.start = range.cpu_addr;
+			pcie->io.end = range.cpu_addr + range.size - 1;
+			pcie->io.flags = IORESOURCE_MEM;
+			pcie->io.name = "I/O";
+
+			memcpy(&res, &pcie->io, sizeof(res));
+			break;
+
+		case IORESOURCE_MEM:
+			pcie->offset.mem = res.start - range.pci_addr;
+
+			memcpy(&pcie->mem, &res, sizeof(res));
+			pcie->mem.name = "non-prefetchable";
+			break;
+		}
+	}
+
+	err = of_pci_parse_bus_range(node, &pcie->busn);
+	if (err < 0) {
+		dev_err(dev, "failed to parse ranges property: %d\n", err);
+		pcie->busn.name = node->name;
+		pcie->busn.start = 0;
+		pcie->busn.end = 0xff;
+		pcie->busn.flags = IORESOURCE_BUS;
+	}
+
+	for_each_child_of_node(node, child) {
+		struct mtk_pcie_port *port;
+		int index;
+
+		err = of_pci_get_devfn(child);
+		if (err < 0) {
+			dev_err(dev, "failed to parse devfn: %d\n", err);
+			return err;
+		}
+
+		index = PCI_SLOT(err);
+
+		if (!of_device_is_available(child))
+			continue;
+
+		err = mtk_pcie_parse_ports(pcie, &port, child, index);
+		if (err)
+			return err;
+
+		/* enable each port, and then check link status */
+		err = mtk_pcie_enable_ports(port);
+		if (!err)
+			linkup++;
+	}
+
+	return !linkup;
+}
+
+static int mtk_pcie_request_resources(struct mtk_pcie *pcie)
+{
+	struct pci_host_bridge *host = pci_host_bridge_from_priv(pcie);
+	struct list_head *windows = &host->windows;
+	struct device *dev = pcie->dev;
+	int err;
+
+	pci_add_resource_offset(windows, &pcie->pio, pcie->offset.io);
+	pci_add_resource_offset(windows, &pcie->mem, pcie->offset.mem);
+	pci_add_resource(windows, &pcie->busn);
+
+	err = devm_request_pci_bus_resources(dev, windows);
+	if (err < 0)
+		return err;
+
+	pci_remap_iospace(&pcie->pio, pcie->io.start);
+
+	return 0;
+}
+
+static int mtk_pcie_register_host(struct pci_host_bridge *host)
+{
+	struct mtk_pcie *pcie = pci_host_bridge_priv(host);
+	struct pci_bus *child;
+	int err;
+
+	pci_add_flags(PCI_REASSIGN_ALL_RSRC | PCI_REASSIGN_ALL_BUS);
+	host->busnr = pcie->busn.start;
+	host->dev.parent = pcie->dev;
+	host->ops = &mtk_pcie_ops;
+
+	err = pci_register_host_bridge(host);
+	if (err < 0)
+		return err;
+
+	pci_scan_child_bus(host->bus);
+
+	pci_fixup_irqs(pci_common_swizzle, of_irq_parse_and_map_pci);
+	pci_bus_size_bridges(host->bus);
+	pci_bus_assign_resources(host->bus);
+
+	list_for_each_entry(child, &host->bus->children, node)
+		pcie_bus_configure_settings(child);
+
+	pci_bus_add_devices(host->bus);
+
+	return 0;
+}
+
+static int mtk_pcie_probe(struct platform_device *pdev)
+{
+	struct mtk_pcie *pcie;
+	struct pci_host_bridge *host;
+	int err;
+
+	host = pci_alloc_host_bridge(sizeof(*pcie));
+	if (!host)
+		return -ENOMEM;
+
+	pcie = pci_host_bridge_priv(host);
+
+	pcie->dev = &pdev->dev;
+	platform_set_drvdata(pdev, pcie);
+	INIT_LIST_HEAD(&pcie->ports);
+
+	err = mtk_pcie_parse_and_add_res(pcie);
+	if (err)
+		return err;
+
+	err = mtk_pcie_request_resources(pcie);
+	if (err)
+		goto put_resources;
+
+	err = mtk_pcie_register_host(host);
+	if (err)
+		goto put_resources;
+
+	return 0;
+
+put_resources:
+	mtk_pcie_put_resources(pcie);
+	return err;
+}
+
+static const struct of_device_id mtk_pcie_ids[] = {
+	{ .compatible = "mediatek,mt7623-pcie"},
+	{ .compatible = "mediatek,mt2701-pcie"},
+	{},
+};
+MODULE_DEVICE_TABLE(of, mtk_pcie_ids);
+
+static struct platform_driver mtk_pcie_driver = {
+	.probe = mtk_pcie_probe,
+	.driver = {
+		.name = "mtk-pcie",
+		.of_match_table = mtk_pcie_ids,
+		.suppress_bind_attrs = true,
+	},
+};
+
+builtin_platform_driver(mtk_pcie_driver);
+
+MODULE_DESCRIPTION("Mediatek PCIe host controller driver.");
+MODULE_LICENSE("GPL v2");
-- 
1.9.1

^ permalink raw reply related

* [PATCH v2 2/2] dt-bindings: pcie: Add documentation for Mediatek PCIe
From: Ryder Lee @ 2017-05-04  9:32 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493890376-6713-1-git-send-email-ryder.lee@mediatek.com>

Add documentation for PCIe host driver available in MT7623
series SoCs.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
---
 .../bindings/pci/mediatek,mt7623-pcie.txt          | 149 +++++++++++++++++++++
 1 file changed, 149 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt

diff --git a/Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt b/Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt
new file mode 100644
index 0000000..9c5d5a7
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/mediatek,mt7623-pcie.txt
@@ -0,0 +1,149 @@
+Mediatek Gen2 PCIe controller which is available on MT7623 series SoCs
+
+PCIe subsys supports single root complex (RC) with 3 Root Ports. Each root
+ports supports a Gen2 1-lane Link and has PIPE interface to PHY.
+
+Required properties:
+- compatible: Should contain "mediatek,mt7623-pcie".
+- device_type: Must be "pci"
+- reg: Base addresses and lengths of the PCIe controller.
+- #address-cells: Address representation for root ports (must be 3)
+- #size-cells: Size representation for root ports (must be 2)
+- #interrupt-cells: Size representation for interrupts (must be 1)
+- interrupts: Three interrupt outputs of the controller. Must contain an
+  entry for each entry in the interrupt-names property.
+- interrupt-names: Must include the following names
+  - "pcie-int0"
+  - "pcie-int1"
+  - "pcie-int2"
+- interrupt-map-mask and interrupt-map: Standard PCI IRQ mapping properties
+  Please refer to the standard PCI bus binding document for a more detailed
+  explanation.
+- clocks: Must contain an entry for each entry in clock-names.
+  See ../clocks/clock-bindings.txt for details.
+- clock-names: Must include the following entries:
+  - free_ck :for reference clock of PCIe subsys
+  - sys_ck0 :for clock of Port0
+  - sys_ck1 :for clock of Port1
+  - sys_ck2 :for clock of Port2
+- resets: Must contain an entry for each entry in reset-names.
+  See ../reset/reset.txt for details.
+- reset-names: Must include the following entries:
+  - pcie-rst0 :port0 reset
+  - pcie-rst1 :port1 reset
+  - pcie-rst2 :port2 reset
+- phys: list of PHY specifiers (used by generic PHY framework).
+- phy-names : must be "pcie-phy0", "pcie-phy1", "pcie-phyN".. based on the
+  number of PHYs as specified in *phys* property.
+- power-domains: A phandle and power domain specifier pair to the power domain
+  which is responsible for collapsing and restoring power to the peripheral.
+- bus-range: Range of bus numbers associated with this controller.
+- ranges:
+  - The first three entries are expected to translate the addresses for the root
+    port registers, which are referenced by the assigned-addresses property of
+    the root port nodes (see below).
+  - The remaining entries setup the mapping for the standard I/O and memory
+	regions.
+  Please refer to the standard PCI bus binding document for a more detailed
+  explanation.
+
+In addition, the device tree node must have sub-nodes describing each
+PCIe port interface, having the following mandatory properties:
+
+Required properties:
+- device_type: Must be "pci"
+- assigned-addresses: Address and size of the port configuration registers
+- reg: Only the first four bytes are used to refer to the correct bus number
+  and device number.
+- #address-cells: Must be 3
+- #size-cells: Must be 2
+- #interrupt-cells: Must be 1
+- interrupt-map-mask and interrupt-map: Standard PCI IRQ mapping properties
+  Please refer to the standard PCI bus binding document for a more detailed
+  explanation.
+- ranges: Sub-ranges distributed from the PCIe controller node. An empty
+  property is sufficient.
+- num-lanes: Number of lanes to use for this port.
+
+Examples:
+
+	hifsys: syscon at 1a000000 {
+		compatible = "mediatek,mt7623-hifsys", "syscon";
+		reg = <0 0x1a000000 0 0x1000>;
+		#clock-cells = <1>;
+		#reset-cells = <1>;
+	};
+
+	pcie: pcie-controller at 1a140000 {
+		compatible = "mediatek,mt7623-pcie";
+		device_type = "pci";
+		reg = <0 0x1a140000 0 0x1000>; /* PCIe shared registers */
+		#address-cells = <3>;
+		#size-cells = <2>;
+		#interrupt-cells = <1>;
+		interrupts = <GIC_SPI 193 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_SPI 194 IRQ_TYPE_LEVEL_LOW>,
+			     <GIC_SPI 195 IRQ_TYPE_LEVEL_LOW>;
+		interrupt-names = "pcie-int0", "pcie-int1", "pcie-int2";
+		interrupt-map-mask = <0xf800 0 0 0>;
+		interrupt-map = <0x0000 0 0 0 &gic GIC_SPI 193 IRQ_TYPE_NONE>,
+				 <0x0800 0 0 0 &gic GIC_SPI 194 IRQ_TYPE_NONE>,
+			     <0x1000 0 0 0 &gic GIC_SPI 195 IRQ_TYPE_NONE>;
+		clocks = <&topckgen CLK_TOP_ETHIF_SEL>,
+			     <&hifsys CLK_HIFSYS_PCIE0>,
+				 <&hifsys CLK_HIFSYS_PCIE1>,
+				 <&hifsys CLK_HIFSYS_PCIE2>;
+		clock-names = "free_ck", "sys_ck0", "sys_ck1", "sys_ck2";
+		resets = <&hifsys MT2701_HIFSYS_PCIE0_RST>,
+			     <&hifsys MT2701_HIFSYS_PCIE1_RST>,
+			     <&hifsys MT2701_HIFSYS_PCIE2_RST>;
+		reset-names = "pcie-rst0", "pcie-rst1", "pcie-rst2";
+		phys = <&pcie0_phy>, <&pcie1_phy>, <&pcie2_phy>;
+		phy-names = "pcie-phy0", "pcie-phy1", "pcie-phy2";
+		power-domains = <&scpsys MT2701_POWER_DOMAIN_HIF>;
+		bus-range = <0x00 0xff>;
+		ranges = <0x82000000 0 0x1a142000 0 0x1a142000 0 0x1000 /* Port0 registers */
+			  0x82000000 0 0x1a143000 0 0x1a143000 0 0x1000 /* Port1 registers */
+			  0x82000000 0 0x1a144000 0 0x1a144000 0 0x1000 /* Port2 registers */
+			  0x81000000 0 0x1a160000 0 0x1a160000 0 0x00010000 /* I/O space */
+			  0x83000000 0 0x60000000 0 0x60000000 0 0x10000000>;	/* memory space */
+
+		pcie at 1,0 {
+			device_type = "pci";
+			assigned-addresses = <0x82000000 0 0x1a142000 0 0x1000>;
+			reg = <0x0000 0 0 0 0>;
+			#address-cells = <3>;
+			#size-cells = <2>;
+			#interrupt-cells = <1>;
+			interrupt-map-mask = <0 0 0 0>;
+			interrupt-map = <0 0 0 0 &gic GIC_SPI 193 IRQ_TYPE_NONE>;
+			ranges;
+			num-lanes = <1>;
+		};
+
+		pcie at 2,0 {
+			device_type = "pci";
+			assigned-addresses = <0x82000800 0 0x1a143000 0 0x1000>;
+			reg = <0x0800 0 0 0 0>;
+			#address-cells = <3>;
+			#size-cells = <2>;
+			#interrupt-cells = <1>;
+			interrupt-map-mask = <0 0 0 0>;
+			interrupt-map = <0 0 0 0 &gic GIC_SPI 194 IRQ_TYPE_NONE>;
+			ranges;
+			num-lanes = <1>;
+		};
+
+		pcie at 3,0 {
+			device_type = "pci";
+			assigned-addresses = <0x82001000 0 0x1a144000 0 0x1000>;
+			reg = <0x1000 0 0 0 0>;
+			#address-cells = <3>;
+			#size-cells = <2>;
+			#interrupt-cells = <1>;
+			interrupt-map-mask = <0 0 0 0>;
+			interrupt-map = <0 0 0 0 &gic GIC_SPI 195 IRQ_TYPE_NONE>;
+			ranges;
+			num-lanes = <1>;
+		};
+	};
-- 
1.9.1

^ permalink raw reply related

* [PATCH 5/5] KVM: arm/arm64: Allow setting the timer IRQ numbers from userspace
From: Marc Zyngier @ 2017-05-04  9:34 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170503183300.4618-6-cdall@linaro.org>

On 03/05/17 19:33, Christoffer Dall wrote:
> First we define an ABI using the vcpu devices that lets userspace set
> the interrupt numbers for the various timers on both the 32-bit and
> 64-bit KVM/ARM implementations.
> 
> Second, we add the definitions for the groups and attributes introduced
> by the above ABI.  (We add the PMU define on the 32-bit side as well for
> symmetry and it may get used some day.)
> 
> Third, we set up the arch-specific vcpu device operation handlers to
> call into the timer code for anything related to the
> KVM_ARM_VCPU_TIMER_CTRL group.
> 
> Fourth, we implement support for getting and setting the timer interrupt
> numbers using the above defined ABI in the arch timer code.
> 
> Fifth, we introduce error checking upon enabling the arch timer (which
> is called when first running a VCPU) to check that all VCPUs are
> configured to use the same PPI for the timer (as mandated by the
> architecture) and that the virtual and physical timers are not
> configured to use the same IRQ number.
> 
> Signed-off-by: Christoffer Dall <cdall@linaro.org>
> ---
>  Documentation/virtual/kvm/devices/vcpu.txt |  25 +++++++
>  arch/arm/include/uapi/asm/kvm.h            |   8 +++
>  arch/arm/kvm/guest.c                       |   9 +++
>  arch/arm64/include/uapi/asm/kvm.h          |   3 +
>  arch/arm64/kvm/guest.c                     |   9 +++
>  include/kvm/arm_arch_timer.h               |   4 ++
>  virt/kvm/arm/arch_timer.c                  | 104 +++++++++++++++++++++++++++++
>  7 files changed, 162 insertions(+)
> 
> diff --git a/Documentation/virtual/kvm/devices/vcpu.txt b/Documentation/virtual/kvm/devices/vcpu.txt
> index 352af6e..013e3f1 100644
> --- a/Documentation/virtual/kvm/devices/vcpu.txt
> +++ b/Documentation/virtual/kvm/devices/vcpu.txt
> @@ -35,3 +35,28 @@ Returns: -ENODEV: PMUv3 not supported or GIC not initialized
>  Request the initialization of the PMUv3.  If using the PMUv3 with an in-kernel
>  virtual GIC implementation, this must be done after initializing the in-kernel
>  irqchip.
> +
> +
> +2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
> +Architectures: ARM,ARM64
> +
> +2.1. ATTRIBUTE: KVM_ARM_VCPU_TIMER_IRQ_VTIMER
> +2.2. ATTRIBUTE: KVM_ARM_VCPU_TIMER_IRQ_PTIMER
> +Parameters: in kvm_device_attr.addr the address for the timer interrupt is a
> +            pointer to an int
> +Returns: -EINVAL: Invalid timer interrupt number
> +         -EBUSY:  One or more VCPUs has already run
> +
> +A value describing the architected timer interrupt number when connected to an
> +in-kernel virtual GIC.  These must be a PPI (16 <= intid < 32).  If an
> +attribute is not set, a default value is applied (see below).
> +
> +KVM_ARM_VCPU_TIMER_IRQ_VTIMER: The EL1 virtual timer intid (default: 27)
> +KVM_ARM_VCPU_TIMER_IRQ_PTIMER: The EL1 physical timer intid (default: 30)

uber nit: my reading of the code is that the default is set at VCPU
creation time. So setting the attribute overrides the default, not that
the default is applied if no attribute is set (i.e. there is always a
valid value).

> +
> +Setting the same PPI for different timers will prevent the VCPUs from running.
> +Setting the interrupt number on a VCPU configures all VCPUs created at that
> +time to use the number provided for a given timer, overwriting any previously
> +configured values on other VCPUs.  Userspace should configure the interrupt
> +numbers on at least one VCPU after creating all VCPUs and before running any
> +VCPUs.
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index a5838d6..6a75ec4 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -200,6 +200,14 @@ struct kvm_arch_memory_slot {
>  #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INTID_MASK 0x3ff
>  #define VGIC_LEVEL_INFO_LINE_LEVEL	0
>  
> +/* Device Control API on vcpu fd */
> +#define KVM_ARM_VCPU_PMU_V3_CTRL	0
> +#define   KVM_ARM_VCPU_PMU_V3_IRQ	0
> +#define   KVM_ARM_VCPU_PMU_V3_INIT	1
> +#define KVM_ARM_VCPU_TIMER_CTRL		1
> +#define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
> +#define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
> +
>  #define   KVM_DEV_ARM_VGIC_CTRL_INIT    0
>  
>  /* KVM_IRQ_LINE irq field index values */
> diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
> index acea05e..1e0784e 100644
> --- a/arch/arm/kvm/guest.c
> +++ b/arch/arm/kvm/guest.c
> @@ -308,6 +308,9 @@ int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>  	int ret;
>  
>  	switch (attr->group) {
> +	case KVM_ARM_VCPU_TIMER_CTRL:
> +		ret = kvm_arm_timer_set_attr(vcpu, attr);
> +		break;
>  	default:
>  		ret = -ENXIO;
>  		break;
> @@ -322,6 +325,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>  	int ret;
>  
>  	switch (attr->group) {
> +	case KVM_ARM_VCPU_TIMER_CTRL:
> +		ret = kvm_arm_timer_get_attr(vcpu, attr);
> +		break;
>  	default:
>  		ret = -ENXIO;
>  		break;
> @@ -336,6 +342,9 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>  	int ret;
>  
>  	switch (attr->group) {
> +	case KVM_ARM_VCPU_TIMER_CTRL:
> +		ret = kvm_arm_timer_has_attr(vcpu, attr);
> +		break;
>  	default:
>  		ret = -ENXIO;
>  		break;
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index cd6bea4..fc64518 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -226,6 +226,9 @@ struct kvm_arch_memory_slot {
>  #define KVM_ARM_VCPU_PMU_V3_CTRL	0
>  #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
>  #define   KVM_ARM_VCPU_PMU_V3_INIT	1
> +#define KVM_ARM_VCPU_TIMER_CTRL		1
> +#define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
> +#define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
>  
>  /* KVM_IRQ_LINE irq field index values */
>  #define KVM_ARM_IRQ_TYPE_SHIFT		24
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index b37446a..5c7f657 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -390,6 +390,9 @@ int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>  	case KVM_ARM_VCPU_PMU_V3_CTRL:
>  		ret = kvm_arm_pmu_v3_set_attr(vcpu, attr);
>  		break;
> +	case KVM_ARM_VCPU_TIMER_CTRL:
> +		ret = kvm_arm_timer_set_attr(vcpu, attr);
> +		break;
>  	default:
>  		ret = -ENXIO;
>  		break;
> @@ -407,6 +410,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>  	case KVM_ARM_VCPU_PMU_V3_CTRL:
>  		ret = kvm_arm_pmu_v3_get_attr(vcpu, attr);
>  		break;
> +	case KVM_ARM_VCPU_TIMER_CTRL:
> +		ret = kvm_arm_timer_get_attr(vcpu, attr);
> +		break;
>  	default:
>  		ret = -ENXIO;
>  		break;
> @@ -424,6 +430,9 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>  	case KVM_ARM_VCPU_PMU_V3_CTRL:
>  		ret = kvm_arm_pmu_v3_has_attr(vcpu, attr);
>  		break;
> +	case KVM_ARM_VCPU_TIMER_CTRL:
> +		ret = kvm_arm_timer_has_attr(vcpu, attr);
> +		break;
>  	default:
>  		ret = -ENXIO;
>  		break;
> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
> index f1c967a..f0053f8 100644
> --- a/include/kvm/arm_arch_timer.h
> +++ b/include/kvm/arm_arch_timer.h
> @@ -68,6 +68,10 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu);
>  u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
>  int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
>  
> +int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
> +int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
> +int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
> +
>  bool kvm_timer_should_fire(struct arch_timer_context *timer_ctx);
>  void kvm_timer_schedule(struct kvm_vcpu *vcpu);
>  void kvm_timer_unschedule(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 6c5a064..d3d1dce 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -21,6 +21,7 @@
>  #include <linux/kvm_host.h>
>  #include <linux/interrupt.h>
>  #include <linux/irq.h>
> +#include <linux/uaccess.h>
>  
>  #include <clocksource/arm_arch_timer.h>
>  #include <asm/arch_timer.h>
> @@ -617,6 +618,28 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
>  	kvm_vgic_unmap_phys_irq(vcpu, vtimer->irq.irq);
>  }
>  
> +static bool timer_irqs_are_valid(struct kvm *kvm)
> +{
> +	struct kvm_vcpu *vcpu;
> +	int vtimer_irq, ptimer_irq;
> +	int i;
> +
> +	vcpu = kvm_get_vcpu(kvm, 0);
> +	vtimer_irq = vcpu_vtimer(vcpu)->irq.irq;
> +	ptimer_irq = vcpu_ptimer(vcpu)->irq.irq;
> +
> +	if (vtimer_irq == ptimer_irq)
> +		return false;
> +
> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		if (vcpu_vtimer(vcpu)->irq.irq != vtimer_irq ||
> +		    vcpu_ptimer(vcpu)->irq.irq != ptimer_irq)
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
>  int kvm_timer_enable(struct kvm_vcpu *vcpu)
>  {
>  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
> @@ -636,6 +659,11 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>  	if (!vgic_initialized(vcpu->kvm))
>  		return -ENODEV;
>  
> +	if (!timer_irqs_are_valid(vcpu->kvm)) {
> +		kvm_debug("incorrectly configured timer irqs\n");
> +		return -EINVAL;
> +	}
> +
>  	/*
>  	 * Find the physical IRQ number corresponding to the host_vtimer_irq
>  	 */
> @@ -685,3 +713,79 @@ void kvm_timer_init_vhe(void)
>  	val |= (CNTHCTL_EL1PCTEN << cnthctl_shift);
>  	write_sysreg(val, cnthctl_el2);
>  }
> +
> +static void set_timer_irqs(struct kvm *kvm, int vtimer_irq, int ptimer_irq)
> +{
> +	struct kvm_vcpu *vcpu;
> +	int i;
> +
> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		vcpu_vtimer(vcpu)->irq.irq = vtimer_irq;
> +		vcpu_ptimer(vcpu)->irq.irq = ptimer_irq;
> +	}
> +}
> +
> +int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> +{
> +	int __user *uaddr = (int __user *)(long)attr->addr;
> +	struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
> +	struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
> +	int irq;
> +
> +	if (!irqchip_in_kernel(vcpu->kvm))
> +		return -EINVAL;
> +
> +	if (get_user(irq, uaddr))
> +		return -EFAULT;
> +
> +	if (!(irq_is_ppi(irq)))
> +		return -EINVAL;
> +
> +	if (vcpu->arch.timer_cpu.enabled)
> +		return -EBUSY;
> +
> +	switch (attr->attr) {
> +	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
> +		set_timer_irqs(vcpu->kvm, irq, ptimer->irq.irq);

Could we move the (vtimer_irq == ptimer_irq) check here? It is probably
better to fail early than to wait until the first run to discover that
something was wrong... Or does that clash horribly with the default values?

Another issue that just popped in my head: how do we ensure that we
don't clash between the PMU and the timers? Yes, that's a foolish thing
to do, but that's no different from avoiding the same interrupt on the
timers...

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* [PATCH] ARM: dts: imx6sx-sdb: Remove cpufreq OPP override
From: Leonard Crestez @ 2017-05-04  9:42 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <51fb1bd0-028d-c431-acc8-3c4f1d25fdba@denx.de>

On Wed, 2017-05-03 at 21:33 +0200, Marek Vasut wrote:
> On 05/03/2017 07:58 PM, Leonard Crestez wrote:
> > On Wed, 2017-05-03 at 17:59 +0200, Marek Vasut wrote:
> > > On 05/03/2017 04:58 PM, Leonard Crestez wrote:
> > > > On Wed, 2017-05-03 at 16:26 +0200, Marek Vasut wrote:

> > > > > 2) It actually fixes a problem with the voltage rails such that the DVFS
> > > > > ?  works without leaving the system in unstable or dead state. You do
> > > > > ?  need the second part of my patch if you drop the OPP hackery, without
> > > > > ?  it the power framework cannot correctly configure the core voltages,
> > > > > ?  so the patch from Leonard makes things worse.

> > > > No, I think there is a misunderstanding here. The second part of your
> > > > patch will cause cpufreq poking at LDOs to indirectly adjust the input
> > > > from the PMIC to the minimum required (this is LDO target +
> > > > min_dropout_uv). Without it by default VDD_ARM_SOC_IN will remain fixed
> > > > as 1375mV from boot.

> > > Who sets / guarantees that default value for ARM and SOC rails ?

> > I think it's from the PMIC hardware itself (but maybe uboot plays with
> > it). VDD_ARM_SOC_IN on this board is tied to SW1AB from MMPF0200:
> > 
> > http://www.nxp.com/assets/documents/data/en/data-sheets/MMPF0200.pdf
> > 
> > It seems reasonable to rely on such voltages set externally.

> Isn't it an established rule that Linux should not depend on bootloader
> settings ? Or did that change ?

I don't actually know. Is there a hard and fast rule about this, even when it comes to voltages?

In theory it is possible for a bootloader to set a low cpu frequency and low voltage and then have the chip fail when the cpufreq driver attempts to go higher. Setting vin-supply on reg_arm/reg_soc would fix that.

> Well the regulator(s) cannot be correctly configured if the kernel
> doesn't have the correct power distribution described in the DT .

It depends on your definition of "correctness". It it certainly
possible to get a functional system while only partially describing
regulator relationships.

I think there is a further misunderstanding here. I have a problem
where imx6sx-sdb rev C boards crash on boot with upstream (but are
reported to work fine with rev B). Removing the OPP overrides fixes
this specific issue.

I don't object to the second part of your patch, setting correct supply links is a good thing for various reasons. It is just not necessary for fixing the concrete crash mentioned above (and I tested this). It should probably go in a separate patch.

It might seem a pedantic difference but it's good to accurately describe the effect of patches in commit messages. For example it might help somebody looking to backport various fixes.

--
Regards,
Leonard

^ permalink raw reply

* [PATCH 1/5] KVM: arm64: Allow creating the PMU without the in-kernel GIC
From: Christoffer Dall @ 2017-05-04  9:44 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <17cb39a1-f0a9-a1f1-fef5-78379d99de2f@arm.com>

On Thu, May 04, 2017 at 10:05:43AM +0100, Marc Zyngier wrote:
> On 04/05/17 09:38, Christoffer Dall wrote:
> > On Thu, May 04, 2017 at 09:28:50AM +0100, Marc Zyngier wrote:
> >> On 04/05/17 09:13, Christoffer Dall wrote:
> >>> On Thu, May 04, 2017 at 09:09:47AM +0100, Marc Zyngier wrote:
> >>>> On 03/05/17 19:32, Christoffer Dall wrote:
> >>>>> Since we got support for devices in userspace which allows reporting the
> >>>>> PMU overflow output status to userspace, we should actually allow
> >>>>> creating the PMU on systems without an in-kernel irqchip, which in turn
> >>>>> requires us to slightly clarify error codes for the ABI and move things
> >>>>> around for the initialization phase.
> >>>>>
> >>>>> Signed-off-by: Christoffer Dall <cdall@linaro.org>
> >>>>> ---
> >>>>>  Documentation/virtual/kvm/devices/vcpu.txt | 16 +++++++++-------
> >>>>>  virt/kvm/arm/pmu.c                         | 27 +++++++++++++++++----------
> >>>>>  2 files changed, 26 insertions(+), 17 deletions(-)
> >>>>>
> >>>>> diff --git a/Documentation/virtual/kvm/devices/vcpu.txt b/Documentation/virtual/kvm/devices/vcpu.txt
> >>>>> index 02f5068..352af6e 100644
> >>>>> --- a/Documentation/virtual/kvm/devices/vcpu.txt
> >>>>> +++ b/Documentation/virtual/kvm/devices/vcpu.txt
> >>>>> @@ -16,7 +16,9 @@ Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a
> >>>>>  Returns: -EBUSY: The PMU overflow interrupt is already set
> >>>>>           -ENXIO: The overflow interrupt not set when attempting to get it
> >>>>>           -ENODEV: PMUv3 not supported
> >>>>> -         -EINVAL: Invalid PMU overflow interrupt number supplied
> >>>>> +         -EINVAL: Invalid PMU overflow interrupt number supplied or
> >>>>> +                  trying to set the IRQ number without using an in-kernel
> >>>>> +                  irqchip.
> >>>>>  
> >>>>>  A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
> >>>>>  number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt
> >>>>> @@ -25,11 +27,11 @@ all vcpus, while as an SPI it must be a separate number per vcpu.
> >>>>>  
> >>>>>  1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
> >>>>>  Parameters: no additional parameter in kvm_device_attr.addr
> >>>>> -Returns: -ENODEV: PMUv3 not supported
> >>>>> -         -ENXIO: PMUv3 not properly configured as required prior to calling this
> >>>>> -                 attribute
> >>>>> +Returns: -ENODEV: PMUv3 not supported or GIC not initialized
> >>>>> +         -ENXIO: PMUv3 not properly configured or in-kernel irqchip not
> >>>>> +                 conigured as required prior to calling this attribute
> >>>>>           -EBUSY: PMUv3 already initialized
> >>>>>  
> >>>>> -Request the initialization of the PMUv3.  This must be done after creating the
> >>>>> -in-kernel irqchip.  Creating a PMU with a userspace irqchip is currently not
> >>>>> -supported.
> >>>>> +Request the initialization of the PMUv3.  If using the PMUv3 with an in-kernel
> >>>>> +virtual GIC implementation, this must be done after initializing the in-kernel
> >>>>> +irqchip.
> >>>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> >>>>> index 4b43e7f..f046b08 100644
> >>>>> --- a/virt/kvm/arm/pmu.c
> >>>>> +++ b/virt/kvm/arm/pmu.c
> >>>>> @@ -456,21 +456,25 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
> >>>>>  	if (!kvm_arm_support_pmu_v3())
> >>>>>  		return -ENODEV;
> >>>>>  
> >>>>> -	/*
> >>>>> -	 * We currently require an in-kernel VGIC to use the PMU emulation,
> >>>>> -	 * because we do not support forwarding PMU overflow interrupts to
> >>>>> -	 * userspace yet.
> >>>>> -	 */
> >>>>> -	if (!irqchip_in_kernel(vcpu->kvm) || !vgic_initialized(vcpu->kvm))
> >>>>> -		return -ENODEV;
> >>>>> -
> >>>>> -	if (!test_bit(KVM_ARM_VCPU_PMU_V3, vcpu->arch.features) ||
> >>>>> -	    !kvm_arm_pmu_irq_initialized(vcpu))
> >>>>> +	if (!test_bit(KVM_ARM_VCPU_PMU_V3, vcpu->arch.features))
> >>>>>  		return -ENXIO;
> >>>>>  
> >>>>>  	if (kvm_arm_pmu_v3_ready(vcpu))
> >>>>>  		return -EBUSY;
> >>>>>  
> >>>>> +	if (irqchip_in_kernel(vcpu->kvm)) {
> >>>>> +		/*
> >>>>> +		 * If using the PMU with an in-kernel virtual GIC
> >>>>> +		 * implementation, we require the GIC to be already
> >>>>> +		 * initialized when initializing the PMU.
> >>>>> +		 */
> >>>>> +		if (!vgic_initialized(vcpu->kvm))
> >>>>> +			return -ENODEV;
> >>>>> +
> >>>>> +		if (!kvm_arm_pmu_irq_initialized(vcpu))
> >>>>> +			return -ENXIO;
> >>>>> +	}
> >>>>> +
> >>>>>  	kvm_pmu_vcpu_reset(vcpu);
> >>>>>  	vcpu->arch.pmu.ready = true;
> >>>>>  
> >>>>> @@ -512,6 +516,9 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> >>>>>  		int __user *uaddr = (int __user *)(long)attr->addr;
> >>>>>  		int irq;
> >>>>>  
> >>>>> +		if (!irqchip_in_kernel(vcpu->kvm))
> >>>>> +			return -EINVAL;
> >>>>> +
> >>>>
> >>>> Shouldn't we fail the same way for {get,has}_attr? get_attr is going to
> >>>> generate a -ENXIO, and has_attr is going to lie about the availability
> >>>> of KVM_ARM_VCPU_PMU_V3_IRQ...
> >>>>
> >>>
> >>> Here's the text from api.txt:
> >>>
> >>>   Tests whether a device supports a particular attribute.  A successful
> >>>   return indicates the attribute is implemented.  It does not necessarily
> >>>   indicate that the attribute can be read or written in the device's
> >>>   current state.  "addr" is ignored.
> >>>
> >>> My interpretation therefore is that QEMU can use this ioctl to figure
> >>> out if the feature is supported (sort of like a capability), but that
> >>> doesn't mean that the configuration of the VM is such that the attribute
> >>> can be get or set at that moment.
> >>>
> >>> For example, there will also alway be situations where you can get an
> >>> attr, but not set an attr, what should the has_attr return then?
> >>
> >> My issue here is that whether we can get/set the interrupt or not is not
> >> a function of the device itself, but of the way it is "wired". No matter
> >> what "the device's current state" is, we'll never be able to get/set the
> >> interrupt.
> >>
> >> I'd tend to err on the side of caution and return something that is
> >> unambiguous, be maybe I have too strict an interpretation of the API.
> >>
> > 
> > Hmm, I see the has_attr as a method for userspace to discover "does this
> > kernel have this feature".  If we make has_attr return a value depending
> > on the VM having an in-kernel gic or not, we implicitly require
> > userspace to create a VM with an in-kernel GIC to discover if this
> > kernel has that feature, and therefore also impose an ordering
> > requirement of figuring out the capabilities of the kernel (i.e. create
> > the GIC before checking this API).
> > 
> > I think QEMU should be able to do:
> > 
> >   if (has_attr()) {
> >      kvm_supports_set_timer_irq = true;
> >      vtimer_irq = foo;
> >   } else {
> >      kvm_supports_set_timer_irq = false;
> >      vtimer_irq = 27; /* default, we're stuck with it */
> >   }
> > 
> >   create_board_definition();
> >   create_dt();
> >   create_acpi();
> > 
> >   /* do whatever */
> > 
> >   if (kvm_supports_set_timer_irq && kvm_irqchip_in_kernel()) {
> >   	kvm_arm_timer_set_irq(...);
> >   }
> > 
> > And all this should not be coupled to when we create the irqchip device.
> > 
> > But I may be failing to see the case where the current implementation
> > creates a problem for userspace, in which case we should document the
> > ordering requirement.
> 
> I'm not sure it would create any problem, at least not for the PMU
> (there is no working code that would have created a PMU without an irqchip).
> 
> It is just that we have a slightly diverging interpretation of what
> has_attr means. You see it as "attribute that the device supports", and
> I see it as "attribute that the device supports in this configuration".
> I'm happy to use your semantics from now on.

In either case, we should make sure this is clear in the ABI.  I thought
that the "It does not necessarily indicate that the attribute can be
read or written in the device's current state." implied my
interpretation, but maybe I'm missing some subtlety there?

Do you think we should clarify the API?

By the way, I now realize that we are not maintining the same
understanding between get/set_attr, which I really think we should, so
I'll add the following:


diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
index 9b3e3ea..0cf62b7 100644
--- a/virt/kvm/arm/pmu.c
+++ b/virt/kvm/arm/pmu.c
@@ -551,6 +551,9 @@ int kvm_arm_pmu_v3_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
 		int __user *uaddr = (int __user *)(long)attr->addr;
 		int irq;
 
+		if (!irqchip_in_kernel(vcpu->kvm))
+			return -EINVAL;
+
 		if (!test_bit(KVM_ARM_VCPU_PMU_V3, vcpu->arch.features))
 			return -ENODEV;
 

I was only talking about has_attr above.

Let me know.

Thanks,
-Christoffer

^ permalink raw reply related

* [PATCH] staging: vc04_services: Fix bulk cache maintenance
From: Phil Elwell @ 2017-05-04  9:58 UTC (permalink / raw)
  To: linux-arm-kernel

vchiq_arm supports transfers less than one page and at arbitrary
alignment, using the dma-mapping API to perform its cache maintenance
(even though the VPU drives the DMA hardware). Read (DMA_FROM_DEVICE)
operations use cache invalidation for speed, falling back to
clean+invalidate on partial cache lines, with writes (DMA_TO_DEVICE)
using flushes.

If a read transfer has ends which aren't page-aligned, performing cache
maintenance as if they were whole pages can lead to memory corruption
since the partial cache lines at the ends (and any cache lines before or
after the transfer area) will be invalidated. This bug was masked until
the disabling of the cache flush in flush_dcache_page().

Honouring the requested transfer start- and end-points prevents the
corruption.

Fixes: cf9caf192988 ("staging: vc04_services: Replace dmac_map_area with dmac_map_sg")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
---
 .../interface/vchiq_arm/vchiq_2835_arm.c           | 31 +++++++++++++---------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
index 3aeffcb..02e9736 100644
--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
+++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
@@ -501,8 +501,15 @@ int vchiq_platform_init(struct platform_device *pdev, VCHIQ_STATE_T *state)
 	 */
 	sg_init_table(scatterlist, num_pages);
 	/* Now set the pages for each scatterlist */
-	for (i = 0; i < num_pages; i++)
-		sg_set_page(scatterlist + i, pages[i], PAGE_SIZE, 0);
+	for (i = 0; i < num_pages; i++)	{
+		unsigned int len = PAGE_SIZE - offset;
+
+		if (len > count)
+			len = count;
+		sg_set_page(scatterlist + i, pages[i], len, offset);
+		offset = 0;
+		count -= len;
+	}
 
 	dma_buffers = dma_map_sg(g_dev,
 				 scatterlist,
@@ -523,20 +530,20 @@ int vchiq_platform_init(struct platform_device *pdev, VCHIQ_STATE_T *state)
 		u32 addr = sg_dma_address(sg);
 
 		/* Note: addrs is the address + page_count - 1
-		 * The firmware expects the block to be page
+		 * The firmware expects blocks after the first to be page-
 		 * aligned and a multiple of the page size
 		 */
 		WARN_ON(len == 0);
-		WARN_ON(len & ~PAGE_MASK);
-		WARN_ON(addr & ~PAGE_MASK);
+		WARN_ON(i && (i != (dma_buffers - 1)) && (len & ~PAGE_MASK));
+		WARN_ON(i && (addr & ~PAGE_MASK));
 		if (k > 0 &&
-		    ((addrs[k - 1] & PAGE_MASK) |
-			((addrs[k - 1] & ~PAGE_MASK) + 1) << PAGE_SHIFT)
-		    == addr) {
-			addrs[k - 1] += (len >> PAGE_SHIFT);
-		} else {
-			addrs[k++] = addr | ((len >> PAGE_SHIFT) - 1);
-		}
+		    ((addrs[k - 1] & PAGE_MASK) +
+		     (((addrs[k - 1] & ~PAGE_MASK) + 1) << PAGE_SHIFT))
+		    == (addr & PAGE_MASK))
+			addrs[k - 1] += ((len + PAGE_SIZE - 1) >> PAGE_SHIFT);
+		else
+			addrs[k++] = (addr & PAGE_MASK) |
+				(((len + PAGE_SIZE - 1) >> PAGE_SHIFT) - 1);
 	}
 
 	/* Partial cache lines (fragments) require special measures */
-- 
1.9.1

^ permalink raw reply related

* [PATCH 5/5] KVM: arm/arm64: Allow setting the timer IRQ numbers from userspace
From: Christoffer Dall @ 2017-05-04  9:59 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <b08c50b4-6aba-1a59-80f4-343cd906f1fa@arm.com>

On Thu, May 04, 2017 at 10:34:32AM +0100, Marc Zyngier wrote:
> On 03/05/17 19:33, Christoffer Dall wrote:
> > First we define an ABI using the vcpu devices that lets userspace set
> > the interrupt numbers for the various timers on both the 32-bit and
> > 64-bit KVM/ARM implementations.
> > 
> > Second, we add the definitions for the groups and attributes introduced
> > by the above ABI.  (We add the PMU define on the 32-bit side as well for
> > symmetry and it may get used some day.)
> > 
> > Third, we set up the arch-specific vcpu device operation handlers to
> > call into the timer code for anything related to the
> > KVM_ARM_VCPU_TIMER_CTRL group.
> > 
> > Fourth, we implement support for getting and setting the timer interrupt
> > numbers using the above defined ABI in the arch timer code.
> > 
> > Fifth, we introduce error checking upon enabling the arch timer (which
> > is called when first running a VCPU) to check that all VCPUs are
> > configured to use the same PPI for the timer (as mandated by the
> > architecture) and that the virtual and physical timers are not
> > configured to use the same IRQ number.
> > 
> > Signed-off-by: Christoffer Dall <cdall@linaro.org>
> > ---
> >  Documentation/virtual/kvm/devices/vcpu.txt |  25 +++++++
> >  arch/arm/include/uapi/asm/kvm.h            |   8 +++
> >  arch/arm/kvm/guest.c                       |   9 +++
> >  arch/arm64/include/uapi/asm/kvm.h          |   3 +
> >  arch/arm64/kvm/guest.c                     |   9 +++
> >  include/kvm/arm_arch_timer.h               |   4 ++
> >  virt/kvm/arm/arch_timer.c                  | 104 +++++++++++++++++++++++++++++
> >  7 files changed, 162 insertions(+)
> > 
> > diff --git a/Documentation/virtual/kvm/devices/vcpu.txt b/Documentation/virtual/kvm/devices/vcpu.txt
> > index 352af6e..013e3f1 100644
> > --- a/Documentation/virtual/kvm/devices/vcpu.txt
> > +++ b/Documentation/virtual/kvm/devices/vcpu.txt
> > @@ -35,3 +35,28 @@ Returns: -ENODEV: PMUv3 not supported or GIC not initialized
> >  Request the initialization of the PMUv3.  If using the PMUv3 with an in-kernel
> >  virtual GIC implementation, this must be done after initializing the in-kernel
> >  irqchip.
> > +
> > +
> > +2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
> > +Architectures: ARM,ARM64
> > +
> > +2.1. ATTRIBUTE: KVM_ARM_VCPU_TIMER_IRQ_VTIMER
> > +2.2. ATTRIBUTE: KVM_ARM_VCPU_TIMER_IRQ_PTIMER
> > +Parameters: in kvm_device_attr.addr the address for the timer interrupt is a
> > +            pointer to an int
> > +Returns: -EINVAL: Invalid timer interrupt number
> > +         -EBUSY:  One or more VCPUs has already run
> > +
> > +A value describing the architected timer interrupt number when connected to an
> > +in-kernel virtual GIC.  These must be a PPI (16 <= intid < 32).  If an
> > +attribute is not set, a default value is applied (see below).
> > +
> > +KVM_ARM_VCPU_TIMER_IRQ_VTIMER: The EL1 virtual timer intid (default: 27)
> > +KVM_ARM_VCPU_TIMER_IRQ_PTIMER: The EL1 physical timer intid (default: 30)
> 
> uber nit: my reading of the code is that the default is set at VCPU
> creation time. So setting the attribute overrides the default, not that
> the default is applied if no attribute is set (i.e. there is always a
> valid value).
> 

uh, right, I don't see the distinction though, so not sure how to
correct the text.

Would "Setting the attribute overrides the default values (see below)."
work instead?

> > +
> > +Setting the same PPI for different timers will prevent the VCPUs from running.
> > +Setting the interrupt number on a VCPU configures all VCPUs created at that
> > +time to use the number provided for a given timer, overwriting any previously
> > +configured values on other VCPUs.  Userspace should configure the interrupt
> > +numbers on at least one VCPU after creating all VCPUs and before running any
> > +VCPUs.
> > diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> > index a5838d6..6a75ec4 100644
> > --- a/arch/arm/include/uapi/asm/kvm.h
> > +++ b/arch/arm/include/uapi/asm/kvm.h
> > @@ -200,6 +200,14 @@ struct kvm_arch_memory_slot {
> >  #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INTID_MASK 0x3ff
> >  #define VGIC_LEVEL_INFO_LINE_LEVEL	0
> >  
> > +/* Device Control API on vcpu fd */
> > +#define KVM_ARM_VCPU_PMU_V3_CTRL	0
> > +#define   KVM_ARM_VCPU_PMU_V3_IRQ	0
> > +#define   KVM_ARM_VCPU_PMU_V3_INIT	1
> > +#define KVM_ARM_VCPU_TIMER_CTRL		1
> > +#define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
> > +#define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
> > +
> >  #define   KVM_DEV_ARM_VGIC_CTRL_INIT    0
> >  
> >  /* KVM_IRQ_LINE irq field index values */
> > diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
> > index acea05e..1e0784e 100644
> > --- a/arch/arm/kvm/guest.c
> > +++ b/arch/arm/kvm/guest.c
> > @@ -308,6 +308,9 @@ int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
> >  	int ret;
> >  
> >  	switch (attr->group) {
> > +	case KVM_ARM_VCPU_TIMER_CTRL:
> > +		ret = kvm_arm_timer_set_attr(vcpu, attr);
> > +		break;
> >  	default:
> >  		ret = -ENXIO;
> >  		break;
> > @@ -322,6 +325,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
> >  	int ret;
> >  
> >  	switch (attr->group) {
> > +	case KVM_ARM_VCPU_TIMER_CTRL:
> > +		ret = kvm_arm_timer_get_attr(vcpu, attr);
> > +		break;
> >  	default:
> >  		ret = -ENXIO;
> >  		break;
> > @@ -336,6 +342,9 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
> >  	int ret;
> >  
> >  	switch (attr->group) {
> > +	case KVM_ARM_VCPU_TIMER_CTRL:
> > +		ret = kvm_arm_timer_has_attr(vcpu, attr);
> > +		break;
> >  	default:
> >  		ret = -ENXIO;
> >  		break;
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index cd6bea4..fc64518 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -226,6 +226,9 @@ struct kvm_arch_memory_slot {
> >  #define KVM_ARM_VCPU_PMU_V3_CTRL	0
> >  #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
> >  #define   KVM_ARM_VCPU_PMU_V3_INIT	1
> > +#define KVM_ARM_VCPU_TIMER_CTRL		1
> > +#define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
> > +#define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
> >  
> >  /* KVM_IRQ_LINE irq field index values */
> >  #define KVM_ARM_IRQ_TYPE_SHIFT		24
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index b37446a..5c7f657 100644
> > --- a/arch/arm64/kvm/guest.c
> > +++ b/arch/arm64/kvm/guest.c
> > @@ -390,6 +390,9 @@ int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
> >  	case KVM_ARM_VCPU_PMU_V3_CTRL:
> >  		ret = kvm_arm_pmu_v3_set_attr(vcpu, attr);
> >  		break;
> > +	case KVM_ARM_VCPU_TIMER_CTRL:
> > +		ret = kvm_arm_timer_set_attr(vcpu, attr);
> > +		break;
> >  	default:
> >  		ret = -ENXIO;
> >  		break;
> > @@ -407,6 +410,9 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
> >  	case KVM_ARM_VCPU_PMU_V3_CTRL:
> >  		ret = kvm_arm_pmu_v3_get_attr(vcpu, attr);
> >  		break;
> > +	case KVM_ARM_VCPU_TIMER_CTRL:
> > +		ret = kvm_arm_timer_get_attr(vcpu, attr);
> > +		break;
> >  	default:
> >  		ret = -ENXIO;
> >  		break;
> > @@ -424,6 +430,9 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
> >  	case KVM_ARM_VCPU_PMU_V3_CTRL:
> >  		ret = kvm_arm_pmu_v3_has_attr(vcpu, attr);
> >  		break;
> > +	case KVM_ARM_VCPU_TIMER_CTRL:
> > +		ret = kvm_arm_timer_has_attr(vcpu, attr);
> > +		break;
> >  	default:
> >  		ret = -ENXIO;
> >  		break;
> > diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
> > index f1c967a..f0053f8 100644
> > --- a/include/kvm/arm_arch_timer.h
> > +++ b/include/kvm/arm_arch_timer.h
> > @@ -68,6 +68,10 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu);
> >  u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
> >  int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
> >  
> > +int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
> > +int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
> > +int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
> > +
> >  bool kvm_timer_should_fire(struct arch_timer_context *timer_ctx);
> >  void kvm_timer_schedule(struct kvm_vcpu *vcpu);
> >  void kvm_timer_unschedule(struct kvm_vcpu *vcpu);
> > diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> > index 6c5a064..d3d1dce 100644
> > --- a/virt/kvm/arm/arch_timer.c
> > +++ b/virt/kvm/arm/arch_timer.c
> > @@ -21,6 +21,7 @@
> >  #include <linux/kvm_host.h>
> >  #include <linux/interrupt.h>
> >  #include <linux/irq.h>
> > +#include <linux/uaccess.h>
> >  
> >  #include <clocksource/arm_arch_timer.h>
> >  #include <asm/arch_timer.h>
> > @@ -617,6 +618,28 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
> >  	kvm_vgic_unmap_phys_irq(vcpu, vtimer->irq.irq);
> >  }
> >  
> > +static bool timer_irqs_are_valid(struct kvm *kvm)
> > +{
> > +	struct kvm_vcpu *vcpu;
> > +	int vtimer_irq, ptimer_irq;
> > +	int i;
> > +
> > +	vcpu = kvm_get_vcpu(kvm, 0);
> > +	vtimer_irq = vcpu_vtimer(vcpu)->irq.irq;
> > +	ptimer_irq = vcpu_ptimer(vcpu)->irq.irq;
> > +
> > +	if (vtimer_irq == ptimer_irq)
> > +		return false;
> > +
> > +	kvm_for_each_vcpu(i, vcpu, kvm) {
> > +		if (vcpu_vtimer(vcpu)->irq.irq != vtimer_irq ||
> > +		    vcpu_ptimer(vcpu)->irq.irq != ptimer_irq)
> > +			return false;
> > +	}
> > +
> > +	return true;
> > +}
> > +
> >  int kvm_timer_enable(struct kvm_vcpu *vcpu)
> >  {
> >  	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
> > @@ -636,6 +659,11 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
> >  	if (!vgic_initialized(vcpu->kvm))
> >  		return -ENODEV;
> >  
> > +	if (!timer_irqs_are_valid(vcpu->kvm)) {
> > +		kvm_debug("incorrectly configured timer irqs\n");
> > +		return -EINVAL;
> > +	}
> > +
> >  	/*
> >  	 * Find the physical IRQ number corresponding to the host_vtimer_irq
> >  	 */
> > @@ -685,3 +713,79 @@ void kvm_timer_init_vhe(void)
> >  	val |= (CNTHCTL_EL1PCTEN << cnthctl_shift);
> >  	write_sysreg(val, cnthctl_el2);
> >  }
> > +
> > +static void set_timer_irqs(struct kvm *kvm, int vtimer_irq, int ptimer_irq)
> > +{
> > +	struct kvm_vcpu *vcpu;
> > +	int i;
> > +
> > +	kvm_for_each_vcpu(i, vcpu, kvm) {
> > +		vcpu_vtimer(vcpu)->irq.irq = vtimer_irq;
> > +		vcpu_ptimer(vcpu)->irq.irq = ptimer_irq;
> > +	}
> > +}
> > +
> > +int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
> > +{
> > +	int __user *uaddr = (int __user *)(long)attr->addr;
> > +	struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
> > +	struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
> > +	int irq;
> > +
> > +	if (!irqchip_in_kernel(vcpu->kvm))
> > +		return -EINVAL;
> > +
> > +	if (get_user(irq, uaddr))
> > +		return -EFAULT;
> > +
> > +	if (!(irq_is_ppi(irq)))
> > +		return -EINVAL;
> > +
> > +	if (vcpu->arch.timer_cpu.enabled)
> > +		return -EBUSY;
> > +
> > +	switch (attr->attr) {
> > +	case KVM_ARM_VCPU_TIMER_IRQ_VTIMER:
> > +		set_timer_irqs(vcpu->kvm, irq, ptimer->irq.irq);
> 
> Could we move the (vtimer_irq == ptimer_irq) check here? It is probably
> better to fail early than to wait until the first run to discover that
> something was wrong... Or does that clash horribly with the default values?

It clashes horribly with the default value.  I tried it and gave up.  I
don't think it's reasonable to require userspace to set stuff to a
non-default value first etc., and I experimented with initializing the
irq numbers to 0 and then check stuff later, but it became a huge mess
basically.

> 
> Another issue that just popped in my head: how do we ensure that we
> don't clash between the PMU and the timers? Yes, that's a foolish thing
> to do, but that's no different from avoiding the same interrupt on the
> timers...
> 
Argh, good point.

So actually, nothing *really bad* happens from using the same IRQ number
except that the VM gets confused.  However, it's living life dangerously
because we use the HW bit for the vtimer, so we if ever start using it
for other timers or the PMU, then you could end up in a situation where
you unmap the phys mapping on behalf of another device, and the physical
interrupt could be left in an active state.

On the other hand, the vtimer currently belongs only to VMs and we set
the required physical active state before entering the VM, so maybe it
doesn't matter.

Should we just forego these checks and let the user shoot itself in the
foot if he/she wants to?

Thanks,
-Christoffer

^ permalink raw reply

* [PATCH] ARM: dts: imx6sx-sdb: Remove cpufreq OPP override
From: Marek Vasut @ 2017-05-04 10:06 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1493890943.11226.42.camel@nxp.com>

On 05/04/2017 11:42 AM, Leonard Crestez wrote:
> On Wed, 2017-05-03 at 21:33 +0200, Marek Vasut wrote:
>> On 05/03/2017 07:58 PM, Leonard Crestez wrote:
>>> On Wed, 2017-05-03 at 17:59 +0200, Marek Vasut wrote:
>>>> On 05/03/2017 04:58 PM, Leonard Crestez wrote:
>>>>> On Wed, 2017-05-03 at 16:26 +0200, Marek Vasut wrote:
> 
>>>>>> 2) It actually fixes a problem with the voltage rails such that the DVFS
>>>>>>    works without leaving the system in unstable or dead state. You do
>>>>>>    need the second part of my patch if you drop the OPP hackery, without
>>>>>>    it the power framework cannot correctly configure the core voltages,
>>>>>>    so the patch from Leonard makes things worse.
> 
>>>>> No, I think there is a misunderstanding here. The second part of your
>>>>> patch will cause cpufreq poking at LDOs to indirectly adjust the input
>>>>> from the PMIC to the minimum required (this is LDO target +
>>>>> min_dropout_uv). Without it by default VDD_ARM_SOC_IN will remain fixed
>>>>> as 1375mV from boot.
> 
>>>> Who sets / guarantees that default value for ARM and SOC rails ?
> 
>>> I think it's from the PMIC hardware itself (but maybe uboot plays with
>>> it). VDD_ARM_SOC_IN on this board is tied to SW1AB from MMPF0200:
>>>
>>> http://www.nxp.com/assets/documents/data/en/data-sheets/MMPF0200.pdf
>>>
>>> It seems reasonable to rely on such voltages set externally.
> 
>> Isn't it an established rule that Linux should not depend on bootloader
>> settings ? Or did that change ?
> 
> I don't actually know. Is there a hard and fast rule about this, even when it comes to voltages?

Unless something changed recently, you are not supposed to depend on
bootloader behavior.

> In theory it is possible for a bootloader to set a low cpu frequency and low voltage and then have the chip fail when the cpufreq driver attempts to go higher. Setting vin-supply on reg_arm/reg_soc would fix that.
> 
>> Well the regulator(s) cannot be correctly configured if the kernel
>> doesn't have the correct power distribution described in the DT .
> 
> It depends on your definition of "correctness". It it certainly
> possible to get a functional system while only partially describing
> regulator relationships.

Then your description of the hardware in DT is not correct.

> I think there is a further misunderstanding here. I have a problem
> where imx6sx-sdb rev C boards crash on boot with upstream (but are
> reported to work fine with rev B). Removing the OPP overrides fixes
> this specific issue.
> 
> I don't object to the second part of your patch, setting correct supply links is a good thing for various reasons. It is just not necessary for fixing the concrete crash mentioned above (and I tested this). It should probably go in a separate patch.

Mind you, my patch is not fixing any crash, it's correcting the
regulator binding and removing the OPP override which is at that
point useless.

> It might seem a pedantic difference but it's good to accurately describe the effect of patches in commit messages. For example it might help somebody looking to backport various fixes.

Which part of the following commit message is unclear?

"
The imx6sx-sdb has one power supply that drives both VDDARM_IN
and VDDSOC_IN, which is the sw1a regulator on the PFUZE PMIC.
Connect both inputs to the sw1a regulator on the PMIC and drop
the OPP hackery which is no longer needed as the power framework
will take care of the regulator configuration as needed.
"

btw if sending obvious bugfix upstream is met with this sort of
resistance and pointless discussion, it is extremely demotivating.
Waiting for a maintainer reply for 2-4 weeks, only to get a kurt
reply like "I don't like the commit message" doesn't help either.

-- 
Best regards,
Marek Vasut

^ permalink raw reply

* [PATCH 1/5] KVM: arm64: Allow creating the PMU without the in-kernel GIC
From: Marc Zyngier @ 2017-05-04 10:16 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170504094404.GI5923@cbox>

On 04/05/17 10:44, Christoffer Dall wrote:
> On Thu, May 04, 2017 at 10:05:43AM +0100, Marc Zyngier wrote:
>> On 04/05/17 09:38, Christoffer Dall wrote:
>>> On Thu, May 04, 2017 at 09:28:50AM +0100, Marc Zyngier wrote:
>>>> On 04/05/17 09:13, Christoffer Dall wrote:
>>>>> On Thu, May 04, 2017 at 09:09:47AM +0100, Marc Zyngier wrote:
>>>>>> On 03/05/17 19:32, Christoffer Dall wrote:
>>>>>>> Since we got support for devices in userspace which allows reporting the
>>>>>>> PMU overflow output status to userspace, we should actually allow
>>>>>>> creating the PMU on systems without an in-kernel irqchip, which in turn
>>>>>>> requires us to slightly clarify error codes for the ABI and move things
>>>>>>> around for the initialization phase.
>>>>>>>
>>>>>>> Signed-off-by: Christoffer Dall <cdall@linaro.org>
>>>>>>> ---
>>>>>>>  Documentation/virtual/kvm/devices/vcpu.txt | 16 +++++++++-------
>>>>>>>  virt/kvm/arm/pmu.c                         | 27 +++++++++++++++++----------
>>>>>>>  2 files changed, 26 insertions(+), 17 deletions(-)
>>>>>>>
>>>>>>> diff --git a/Documentation/virtual/kvm/devices/vcpu.txt b/Documentation/virtual/kvm/devices/vcpu.txt
>>>>>>> index 02f5068..352af6e 100644
>>>>>>> --- a/Documentation/virtual/kvm/devices/vcpu.txt
>>>>>>> +++ b/Documentation/virtual/kvm/devices/vcpu.txt
>>>>>>> @@ -16,7 +16,9 @@ Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a
>>>>>>>  Returns: -EBUSY: The PMU overflow interrupt is already set
>>>>>>>           -ENXIO: The overflow interrupt not set when attempting to get it
>>>>>>>           -ENODEV: PMUv3 not supported
>>>>>>> -         -EINVAL: Invalid PMU overflow interrupt number supplied
>>>>>>> +         -EINVAL: Invalid PMU overflow interrupt number supplied or
>>>>>>> +                  trying to set the IRQ number without using an in-kernel
>>>>>>> +                  irqchip.
>>>>>>>  
>>>>>>>  A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
>>>>>>>  number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt
>>>>>>> @@ -25,11 +27,11 @@ all vcpus, while as an SPI it must be a separate number per vcpu.
>>>>>>>  
>>>>>>>  1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
>>>>>>>  Parameters: no additional parameter in kvm_device_attr.addr
>>>>>>> -Returns: -ENODEV: PMUv3 not supported
>>>>>>> -         -ENXIO: PMUv3 not properly configured as required prior to calling this
>>>>>>> -                 attribute
>>>>>>> +Returns: -ENODEV: PMUv3 not supported or GIC not initialized
>>>>>>> +         -ENXIO: PMUv3 not properly configured or in-kernel irqchip not
>>>>>>> +                 conigured as required prior to calling this attribute
>>>>>>>           -EBUSY: PMUv3 already initialized
>>>>>>>  
>>>>>>> -Request the initialization of the PMUv3.  This must be done after creating the
>>>>>>> -in-kernel irqchip.  Creating a PMU with a userspace irqchip is currently not
>>>>>>> -supported.
>>>>>>> +Request the initialization of the PMUv3.  If using the PMUv3 with an in-kernel
>>>>>>> +virtual GIC implementation, this must be done after initializing the in-kernel
>>>>>>> +irqchip.
>>>>>>> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
>>>>>>> index 4b43e7f..f046b08 100644
>>>>>>> --- a/virt/kvm/arm/pmu.c
>>>>>>> +++ b/virt/kvm/arm/pmu.c
>>>>>>> @@ -456,21 +456,25 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
>>>>>>>  	if (!kvm_arm_support_pmu_v3())
>>>>>>>  		return -ENODEV;
>>>>>>>  
>>>>>>> -	/*
>>>>>>> -	 * We currently require an in-kernel VGIC to use the PMU emulation,
>>>>>>> -	 * because we do not support forwarding PMU overflow interrupts to
>>>>>>> -	 * userspace yet.
>>>>>>> -	 */
>>>>>>> -	if (!irqchip_in_kernel(vcpu->kvm) || !vgic_initialized(vcpu->kvm))
>>>>>>> -		return -ENODEV;
>>>>>>> -
>>>>>>> -	if (!test_bit(KVM_ARM_VCPU_PMU_V3, vcpu->arch.features) ||
>>>>>>> -	    !kvm_arm_pmu_irq_initialized(vcpu))
>>>>>>> +	if (!test_bit(KVM_ARM_VCPU_PMU_V3, vcpu->arch.features))
>>>>>>>  		return -ENXIO;
>>>>>>>  
>>>>>>>  	if (kvm_arm_pmu_v3_ready(vcpu))
>>>>>>>  		return -EBUSY;
>>>>>>>  
>>>>>>> +	if (irqchip_in_kernel(vcpu->kvm)) {
>>>>>>> +		/*
>>>>>>> +		 * If using the PMU with an in-kernel virtual GIC
>>>>>>> +		 * implementation, we require the GIC to be already
>>>>>>> +		 * initialized when initializing the PMU.
>>>>>>> +		 */
>>>>>>> +		if (!vgic_initialized(vcpu->kvm))
>>>>>>> +			return -ENODEV;
>>>>>>> +
>>>>>>> +		if (!kvm_arm_pmu_irq_initialized(vcpu))
>>>>>>> +			return -ENXIO;
>>>>>>> +	}
>>>>>>> +
>>>>>>>  	kvm_pmu_vcpu_reset(vcpu);
>>>>>>>  	vcpu->arch.pmu.ready = true;
>>>>>>>  
>>>>>>> @@ -512,6 +516,9 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>>>>>>>  		int __user *uaddr = (int __user *)(long)attr->addr;
>>>>>>>  		int irq;
>>>>>>>  
>>>>>>> +		if (!irqchip_in_kernel(vcpu->kvm))
>>>>>>> +			return -EINVAL;
>>>>>>> +
>>>>>>
>>>>>> Shouldn't we fail the same way for {get,has}_attr? get_attr is going to
>>>>>> generate a -ENXIO, and has_attr is going to lie about the availability
>>>>>> of KVM_ARM_VCPU_PMU_V3_IRQ...
>>>>>>
>>>>>
>>>>> Here's the text from api.txt:
>>>>>
>>>>>   Tests whether a device supports a particular attribute.  A successful
>>>>>   return indicates the attribute is implemented.  It does not necessarily
>>>>>   indicate that the attribute can be read or written in the device's
>>>>>   current state.  "addr" is ignored.
>>>>>
>>>>> My interpretation therefore is that QEMU can use this ioctl to figure
>>>>> out if the feature is supported (sort of like a capability), but that
>>>>> doesn't mean that the configuration of the VM is such that the attribute
>>>>> can be get or set at that moment.
>>>>>
>>>>> For example, there will also alway be situations where you can get an
>>>>> attr, but not set an attr, what should the has_attr return then?
>>>>
>>>> My issue here is that whether we can get/set the interrupt or not is not
>>>> a function of the device itself, but of the way it is "wired". No matter
>>>> what "the device's current state" is, we'll never be able to get/set the
>>>> interrupt.
>>>>
>>>> I'd tend to err on the side of caution and return something that is
>>>> unambiguous, be maybe I have too strict an interpretation of the API.
>>>>
>>>
>>> Hmm, I see the has_attr as a method for userspace to discover "does this
>>> kernel have this feature".  If we make has_attr return a value depending
>>> on the VM having an in-kernel gic or not, we implicitly require
>>> userspace to create a VM with an in-kernel GIC to discover if this
>>> kernel has that feature, and therefore also impose an ordering
>>> requirement of figuring out the capabilities of the kernel (i.e. create
>>> the GIC before checking this API).
>>>
>>> I think QEMU should be able to do:
>>>
>>>   if (has_attr()) {
>>>      kvm_supports_set_timer_irq = true;
>>>      vtimer_irq = foo;
>>>   } else {
>>>      kvm_supports_set_timer_irq = false;
>>>      vtimer_irq = 27; /* default, we're stuck with it */
>>>   }
>>>
>>>   create_board_definition();
>>>   create_dt();
>>>   create_acpi();
>>>
>>>   /* do whatever */
>>>
>>>   if (kvm_supports_set_timer_irq && kvm_irqchip_in_kernel()) {
>>>   	kvm_arm_timer_set_irq(...);
>>>   }
>>>
>>> And all this should not be coupled to when we create the irqchip device.
>>>
>>> But I may be failing to see the case where the current implementation
>>> creates a problem for userspace, in which case we should document the
>>> ordering requirement.
>>
>> I'm not sure it would create any problem, at least not for the PMU
>> (there is no working code that would have created a PMU without an irqchip).
>>
>> It is just that we have a slightly diverging interpretation of what
>> has_attr means. You see it as "attribute that the device supports", and
>> I see it as "attribute that the device supports in this configuration".
>> I'm happy to use your semantics from now on.
> 
> In either case, we should make sure this is clear in the ABI.  I thought
> that the "It does not necessarily indicate that the attribute can be
> read or written in the device's current state." implied my
> interpretation, but maybe I'm missing some subtlety there?

Well, that's what I said above. The interrupt number is not a function
of the device state, but one of the integration of that device in the
system. The PMU itself is not concerned with an interrupt number, only
the GIC is.

> Do you think we should clarify the API?

Not sure. I admit my interpretation is a bit borderline. If I was brave,
I'd say that all the interrupt numbers should be a property of the GIC,
and not of the device. But that ship has sailed a while ago, and I'm
weak ;-).

More seriously, I don't think this is likely to cause any issue.

> 
> By the way, I now realize that we are not maintining the same
> understanding between get/set_attr, which I really think we should, so
> I'll add the following:
> 
> 
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index 9b3e3ea..0cf62b7 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -551,6 +551,9 @@ int kvm_arm_pmu_v3_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>  		int __user *uaddr = (int __user *)(long)attr->addr;
>  		int irq;
>  
> +		if (!irqchip_in_kernel(vcpu->kvm))
> +			return -EINVAL;
> +
>  		if (!test_bit(KVM_ARM_VCPU_PMU_V3, vcpu->arch.features))
>  			return -ENODEV;
>  

Looks good to me.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* [PATCH] MAINTAINERS: add arm@kernel.org as the list for arm64 defconfig changes
From: Will Deacon @ 2017-05-04 10:21 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <0ad7d305-d323-339f-9534-d559f1cf656a@codeaurora.org>

On Wed, May 03, 2017 at 12:47:16PM -0500, Timur Tabi wrote:
> On 05/03/2017 12:39 PM, Olof Johansson wrote:
> > No, I'm not saying that, and there is a better way: Send it through
> > Andy if in doubt -- it should be his job as your platform maintainer
> > to help guide you through the system if needed.
> 
> Will told me that defconfig patches should go to arm at kernel.org, but you're
> telling me they should go to Andy instead.

defconfig patches go via arm at kernel.org, because they pick them up from the
platform maintainers. However, given that the merge window is open (and my
understanding is that you want these in for 4.12!), I suggested mailing them
directly to arm at kernel.org to get their opinion, which they promptly provided ;)

Apologies for the confusion,

Will

^ permalink raw reply

* [PATCH] ARM: socfpga: Increase max number of GPIOs
From: Marek Vasut @ 2017-05-04 10:26 UTC (permalink / raw)
  To: linux-arm-kernel

Increase the maximum number of GPIOs on SoCFPGA as this platform
can have many GPIO controllers in the FPGA part.

Signed-off-by: Marek Vasut <marex@denx.de>
---
 arch/arm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 4c1a35f15838..108c203a8b29 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1460,6 +1460,7 @@ config ARM_PSCI
 # selected platforms.
 config ARCH_NR_GPIO
 	int
+	default 2048 if ARCH_SOCFPGA
 	default 1024 if ARCH_BRCMSTB || ARCH_SHMOBILE || ARCH_TEGRA || \
 		ARCH_ZYNQ
 	default 512 if ARCH_EXYNOS || ARCH_KEYSTONE || SOC_OMAP5 || \
-- 
2.11.0

^ permalink raw reply related

* [PATCH] ARM: dts: socfpga: Fix the ethernet clock phandle
From: Marek Vasut @ 2017-05-04 10:27 UTC (permalink / raw)
  To: linux-arm-kernel

The ethernet block clock phandle must point to the clock node which
represents the clock which directly supply the ethernet block. This
is emac_x_clk , not emacx_clk , so fix this.

From: Pavel Machek <pavel@denx.de>
Signed-off-by: Marek Vasut <marex@denx.de>
---
 arch/arm/boot/dts/socfpga.dtsi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/socfpga.dtsi b/arch/arm/boot/dts/socfpga.dtsi
index b2674bdb8e6a..7e24dc8e82d4 100644
--- a/arch/arm/boot/dts/socfpga.dtsi
+++ b/arch/arm/boot/dts/socfpga.dtsi
@@ -557,7 +557,7 @@
 			interrupts = <0 115 4>;
 			interrupt-names = "macirq";
 			mac-address = [00 00 00 00 00 00];/* Filled in by U-Boot */
-			clocks = <&emac0_clk>;
+			clocks = <&emac_0_clk>;
 			clock-names = "stmmaceth";
 			resets = <&rst EMAC0_RESET>;
 			reset-names = "stmmaceth";
@@ -575,7 +575,7 @@
 			interrupts = <0 120 4>;
 			interrupt-names = "macirq";
 			mac-address = [00 00 00 00 00 00];/* Filled in by U-Boot */
-			clocks = <&emac1_clk>;
+			clocks = <&emac_1_clk>;
 			clock-names = "stmmaceth";
 			resets = <&rst EMAC1_RESET>;
 			reset-names = "stmmaceth";
-- 
2.11.0

^ permalink raw reply related

* [PATCH 1/4] ARM: dts: socfpga: Enable QSPI support on VINING FPGA
From: Marek Vasut @ 2017-05-04 10:29 UTC (permalink / raw)
  To: linux-arm-kernel

Enable the QSPI node and add the flash chips.

Signed-off-by: Marek Vasut <marex@denx.de>
---
 arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts | 38 ++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts b/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
index 893198049397..cb4bdbcf54ee 100644
--- a/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
+++ b/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
@@ -300,6 +300,44 @@
 	};
 };
 
+&qspi {
+	status = "okay";
+
+	n25q128 at 0 {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		compatible = "n25q128";
+		reg = <0>;		/* chip select */
+		spi-max-frequency = <100000000>;
+		m25p,fast-read;
+
+		cdns,page-size = <256>;
+		cdns,block-size = <16>;
+		cdns,read-delay = <4>;
+		cdns,tshsl-ns = <50>;
+		cdns,tsd2d-ns = <50>;
+		cdns,tchsh-ns = <4>;
+		cdns,tslch-ns = <4>;
+	};
+
+	n25q00 at 1 {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		compatible = "n25q00";
+		reg = <1>;		/* chip select */
+		spi-max-frequency = <100000000>;
+		m25p,fast-read;
+
+		cdns,page-size = <256>;
+		cdns,block-size = <16>;
+		cdns,read-delay = <4>;
+		cdns,tshsl-ns = <50>;
+		cdns,tsd2d-ns = <50>;
+		cdns,tchsh-ns = <4>;
+		cdns,tslch-ns = <4>;
+	};
+};
+
 &usb0 {
 	dr_mode = "host";
 	status = "okay";
-- 
2.11.0

^ permalink raw reply related

* [PATCH 2/4] ARM: dts: socfpga: Remove I2C EEPROMs from VINING FPGA
From: Marek Vasut @ 2017-05-04 10:29 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170504102950.10667-1-marex@denx.de>

Remove the EEPROMs attached to the I2C expander ports which
lead to the backplane slots from the main VIN|ING DTS file.
These EEPROMs are bound using separate DTO files, which lets
us handle both two-slot and six-slot configuration of the
backplane.

Signed-off-by: Marek Vasut <marex@denx.de>
---
 arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts | 34 ++--------------------
 1 file changed, 2 insertions(+), 32 deletions(-)

diff --git a/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts b/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
index cb4bdbcf54ee..268785ecb82a 100644
--- a/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
+++ b/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
@@ -203,69 +203,39 @@
 			#address-cells = <1>;
 			#size-cells = <0>;
 			reg = <0>;
-			eeprom at 51 {
-				compatible = "at,24c01";
-				pagesize = <8>;
-				reg = <0x51>;
-			};
 		};
 
 		i2c at 1 {
 			#address-cells = <1>;
 			#size-cells = <0>;
 			reg = <1>;
-			eeprom at 51 {
-				compatible = "at,24c01";
-				pagesize = <8>;
-				reg = <0x51>;
-			};
 		};
 
 		i2c at 2 {
 			#address-cells = <1>;
 			#size-cells = <0>;
 			reg = <2>;
-			eeprom at 51 {
-				compatible = "at,24c01";
-				pagesize = <8>;
-				reg = <0x51>;
-			};
 		};
 
 		i2c at 3 {
 			#address-cells = <1>;
 			#size-cells = <0>;
 			reg = <3>;
-			eeprom at 51 {
-				compatible = "at,24c01";
-				pagesize = <8>;
-				reg = <0x51>;
-			};
 		};
 
 		i2c at 4 {
 			#address-cells = <1>;
 			#size-cells = <0>;
 			reg = <4>;
-			eeprom at 51 {
-				compatible = "at,24c01";
-				pagesize = <8>;
-				reg = <0x51>;
-			};
 		};
 
 		i2c at 5 {
 			#address-cells = <1>;
 			#size-cells = <0>;
 			reg = <5>;
-			eeprom at 51 {
-				compatible = "at,24c01";
-				pagesize = <8>;
-				reg = <0x51>;
-			};
 		};
 
-		i2c at 6 {
+		i2c at 6 {	/* Backplane EEPROM */
 			#address-cells = <1>;
 			#size-cells = <0>;
 			reg = <6>;
@@ -276,7 +246,7 @@
 			};
 		};
 
-		i2c at 7 {
+		i2c at 7 {	/* Power board EEPROM */
 			#address-cells = <1>;
 			#size-cells = <0>;
 			reg = <7>;
-- 
2.11.0

^ permalink raw reply related

* [PATCH 3/4] ARM: dts: socfpga: Drop LED node from VINING FPGA
From: Marek Vasut @ 2017-05-04 10:29 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170504102950.10667-1-marex@denx.de>

Drop the LED node from VINing FPGA DT because the LED wiring is
different on each mainboard revision. This wiring is therefore
handled in mainboard DT Overlays.

Signed-off-by: Marek Vasut <marex@denx.de>
---
 arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts | 28 ----------------------
 1 file changed, 28 deletions(-)

diff --git a/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts b/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
index 268785ecb82a..9195f11dacdc 100644
--- a/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
+++ b/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
@@ -71,34 +71,6 @@
 		ethernet0 = &gmac1;
 	};
 
-	leds {
-		compatible = "gpio-leds";
-
-		hps_led0 {
-			label = "hps:green:led0";	/* ALIVE_LED_GR */
-			gpios = <&portb 19 0>;	/* HPS_GPIO48 */
-			linux,default-trigger = "heartbeat";
-		};
-
-		hps_led1 {
-			label = "hps:red:led0";		/* ALIVE_LED_RD */
-			gpios = <&portb 24 0>;	/* HPS_GPIO53 */
-			linux,default-trigger = "none";
-		};
-
-		hps_led2 {
-			label = "hps:green:led1";	/* LINK2HOST_LED_GR */
-			gpios = <&portb 25 0>;	/* HPS_GPIO54 */
-			linux,default-trigger = "heartbeat";
-		};
-
-		hps_led3 {
-			label = "hps:red:led1";		/* LINK2HOST_LED_RD */
-			gpios = <&portc 7 0>;	/* HPS_GPIO65 */
-			linux,default-trigger = "none";
-		};
-	};
-
 	gpio-keys {
 		compatible = "gpio-keys";
 
-- 
2.11.0

^ permalink raw reply related

* [PATCH 4/4] ARM: dts: socfpga: Add second ethernet alias to VINING FPGA
From: Marek Vasut @ 2017-05-04 10:29 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20170504102950.10667-1-marex@denx.de>

Add DT alias for the second ethernet present on mainboard rev 1.10.

Signed-off-by: Marek Vasut <marex@denx.de>
---
 arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts b/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
index 9195f11dacdc..655fe87e272d 100644
--- a/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
+++ b/arch/arm/boot/dts/socfpga_cyclone5_vining_fpga.dts
@@ -69,6 +69,7 @@
 		 * to be added to the gmac1 device tree blob.
 		 */
 		ethernet0 = &gmac1;
+		ethernet1 = &gmac0;
 	};
 
 	gpio-keys {
-- 
2.11.0

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox