Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [Linaro-acpi] ACPI namespace details for ARM64
From: Hanjun Guo @ 2016-11-11 14:44 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161111093226.GA13333@red-moon>

On 11/11/2016 05:32 PM, Lorenzo Pieralisi wrote:
> On Thu, Nov 10, 2016 at 04:18:54PM -0700, Al Stone wrote:
>> On 11/09/2016 03:05 PM, Bjorn Helgaas wrote:
>>> Hi all,
>>>
>>> We've been working through the details of getting ACPI to work on
>>> arm64, and there have been lots of questions about what this means for
>>> PCI.  I've outlined this for several people individually, but I'm
>>> going to send this separately, apart from a specific patch series, to
>>> make sure we're all on the same page.  Please correct my errors and
>>> misunderstandings.
>>>
>>> Bjorn
>>>
>>> [snip....]
>>
>> A big +1 to all of this.  This also looks like something that should
>> be added to either PCI, ACPI or arm64 documentation (or even all three).
>
> And to arm64 platforms FW :)
>
>> What do you think?
>
> I do not think there is anything ARM64 specific in Bjorn's description,
> but I do think it is very useful to have it in documentation, these
> bits of information are scattered around ACPI specs and PCI FW specs,
> having a single source would help and would have prevented asking
> Bjorn the same questions 100 times.
>
>> Thank you for putting this together, Bjorn.
>
> +1, Thank you very much for this nice summary Bjorn.

+1, thanks a lot :)

Hanjun

^ permalink raw reply

* [RFC v2 8/8] iommu/arm-smmu: implement add_reserved_regions callback
From: Robin Murphy @ 2016-11-11 14:34 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161110161619.GK2078@8bytes.org>

On 10/11/16 16:16, Joerg Roedel wrote:
> On Thu, Nov 10, 2016 at 04:07:08PM +0000, Robin Murphy wrote:
>> On 10/11/16 15:46, Joerg Roedel wrote:
>>> On Fri, Nov 04, 2016 at 11:24:06AM +0000, Eric Auger wrote:
>>>> +	resource_list_for_each_entry(window, &bridge->windows) {
>>>> +		if (resource_type(window->res) != IORESOURCE_MEM &&
>>>> +		    resource_type(window->res) != IORESOURCE_IO)
>>>> +			continue;
>>>
>>> Why do you care about IO resources?
>>
>> [since this is essentially code I wrote]
>>
>> Because they occupy some area of the PCI address space, therefore I
>> assumed that, like memory windows, they would be treated as P2P. Is that
>> not the case?
> 
> No, not at all. The IO-space is completly seperate from the MEM-space.
> They are two different address-spaces, addressing different things. And
> the IO-space is also not translated by any IOMMU I am aware of.

OK. On the particular root complex I have to hand, though, any DMA to
IOVAs between 0x5f800000 and 0x5fffffff sends an error back to the
endpoint, and that just so happens to be where the I/O window is placed
(both on the PCI side and the AXI (i.e. CPU MMIO) side. Whether it's
that the external MMIO view of the RC's I/O window is explicitly
duplicated in its PCI memory space as some side-effect of the PCI/AXI
bridge, or that the thing just doesn't actually respect the access type
on the PCI side I don't know, but that's how it is (and I spent this
morning recreating it to make sure I wasn't mistaken).

This thing's ECAM space is similarly visible from the PCI side and
aborts DMA the same way - I've not tried decoding the "PCI hardware
error (0x1010)" that the sky2 network driver reports, but I do note it's
a slightly different number from the one it gives when trying to access
an address matching another device's BAR (actual peer-to-peer is
explicitly unsupported). Admittedly that's not something we can detect
from the host bridge windows at all.

Anyway, you are of course right that in the normal case this should only
matter if devices were doing I/O accesses in the first place, which
makes especially no sense in the context of the DMA API, so perhaps we
could drop the unintuitive IORESOURCE_IO check from here and consider
weird PCI controllers a separate problem to solve.

Robin.

^ permalink raw reply

* [Linaro-acpi] ACPI namespace details for ARM64
From: Sinan Kaya @ 2016-11-11 14:24 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <a0829be4-6324-758b-2fe6-a274f715a3a9@linaro.org>

On 11/10/2016 6:18 PM, Al Stone wrote:
> On 11/09/2016 03:05 PM, Bjorn Helgaas wrote:
>> Hi all,
>>
>> We've been working through the details of getting ACPI to work on
>> arm64, and there have been lots of questions about what this means for
>> PCI.  I've outlined this for several people individually, but I'm
>> going to send this separately, apart from a specific patch series, to
>> make sure we're all on the same page.  Please correct my errors and
>> misunderstandings.
>>
>> Bjorn
>>
>> [snip....]
> 
> A big +1 to all of this.  This also looks like something that should
> be added to either PCI, ACPI or arm64 documentation (or even all three).
> What do you think?


I agree. In order to have compliant systems, we have to make PNP0C02 required
in the PCIe appendix of the SBSA specification.

> 
> Thank you for putting this together, Bjorn.
> 
> 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply

* [PATCH v5 6/8] Documentation: bindings: add compatible specific to legacy SCPI protocol
From: Sudeep Holla @ 2016-11-11 14:19 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CAL_JsqJ5PxzM_VMP55m+6snkqyWSMSsdCOiUzYWkMTb3LkD5Mw@mail.gmail.com>



On 11/11/16 13:34, Rob Herring wrote:
> On Fri, Nov 11, 2016 at 1:48 AM, Sudeep Holla <sudeep.holla@arm.com> wrote:
>> On 10/11/16 19:03, Olof Johansson wrote:
>>> On Thu, Nov 10, 2016 at 6:34 AM, Sudeep Holla <sudeep.holla@arm.com>
>>> wrote:
>
> [...]
>
>>>> E.g. Amlogic follows most of the legacy protocol though it deviates in
>>>> couple of things which we can handle with platform specific compatible
>>>> (in the following patch in the series). When another user(Rockchip ?)
>>>> make use of this legacy protocol, we can start using those platform
>>>> specific compatible for deviations only.
>>>>
>>>> Is that not acceptable ?
>>>
>>>
>>> If there's no shared legacy feature set, then it's probably less
>>> useful to have a shared less precise compatible value.
>>>
>>
>> There is and will be some shared feature set for sure. At the least the
>> standard command set will be shared.
>>
>>> What the main point I was trying to get across was that we shouldn't
>>> expand the generic binding with per-vendor compatible fields, instead
>>> we should have those as extensions on the side.
>>>
>>
>> Yes I get the point. We will have per-vendor compatibles for handle the
>> deviations but generic one to handle the shared set.
>>
>>> I'm also a little apprehensive of using "legacy", it goes in the same
>>> bucket as "misc". At some point 1.0 will be legacy too, etc.
>>>
>>
>> True and I agree, how about "arm,scpi-pre-1.0" instead ?
>
> That's still meaningless. Convince me that multiple implementations
> are identical, then we can have a common property. For example, how
> many releases did ARM make before 1.0.
>

None officially, so I tend to agree with you on this.

But so far we have seen some commonality between Rockchip and Amlogic
implementations, which in fact shares some commonality with early
release of SCPI from ARM (there are based on the same SCP code base,
which is closed source and released to partners only). ARM improved the
specification and the code base before the official release but by then
it was adopted(as usual we were late ;))

IMO, it's might be useful to have more generic say "arm,scpi-pre-1.0"
and platform specific "amlogic,meson-gxbb-scpi"

-- 
Regards,
Sudeep

^ permalink raw reply

* [PATCH v14 4/9] acpi/arm64: Add GTDT table parse driver
From: Hanjun Guo @ 2016-11-11 13:58 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <5825CBB5.8090104@linaro.org>

On 11/11/2016 09:46 PM, Hanjun Guo wrote:
> Hi Mark,
>
> Sorry for the late reply.
>
> On 10/21/2016 12:37 AM, Mark Rutland wrote:
>> Hi,
>>
>> As a heads-up, on v4.9-rc1 I see conflicts at least against
>> arch/arm64/Kconfig. Luckily git am -3 seems to be able to fix that up
>> automatically, but this will need to be rebased before the next posting
>> and/or merging.
>>
>> On Thu, Sep 29, 2016 at 02:17:12AM +0800, fu.wei at linaro.org wrote:
>>> +static int __init map_gt_gsi(u32 interrupt, u32 flags)
>>> +{
>>> +    int trigger, polarity;
>>> +
>>> +    if (!interrupt)
>>> +        return 0;
>>
>> Urgh.
>>
>> Only the secure interrupt (which we do not need) is optional in this
>> manner, and (hilariously), zero appears to also be a valid GSIV, per
>> figure 5-24 in the ACPI 6.1 spec.
>>
>> So, I think that:
>>
>> (a) we should not bother parsing the secure interrupt
>> (b) we should drop the check above
>> (c) we should report the spec issue to the ASWG
>
> Sorry, I willing to do that, but I need to figure out the issue here.
> What kind of issue in detail? do you mean that zero should not be valid
> for arch timer interrupts?

OK, I think you are referring to "we don't need the secure interrupt",
correct me if I'm wrong (still in jet lag...).

Thanks
Hanjun

^ permalink raw reply

* [PATCH v6 2/9] drm/hisilicon/hibmc: Add video memory management
From: Rongrong Zou @ 2016-11-11 13:57 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CAOw6vbJFitYby+cNezQvdtwJqNkSjhX2=S0QwUrL40gmv-Kozg@mail.gmail.com>

? 2016/11/11 21:25, Sean Paul ??:
> On Fri, Nov 11, 2016 at 6:16 AM, Rongrong Zou <zourongrong@huawei.com> wrote:
>> ? 2016/11/11 1:35, Sean Paul ??:
>>>
>>> On Fri, Oct 28, 2016 at 3:27 AM, Rongrong Zou <zourongrong@gmail.com>
>>> wrote:
>>>>
>>>> Hibmc have 32m video memory which can be accessed through PCIe by host,
>>>> we use ttm to manage these memory.
>>>>
>>>> Signed-off-by: Rongrong Zou <zourongrong@gmail.com>
>>>> ---
>>>>    drivers/gpu/drm/hisilicon/hibmc/Kconfig         |   1 +
>>>>    drivers/gpu/drm/hisilicon/hibmc/Makefile        |   2 +-
>>>>    drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c |  12 +
>>>>    drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h |  46 +++
>>>>    drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c     | 490
>>>> ++++++++++++++++++++++++
>>>>    5 files changed, 550 insertions(+), 1 deletion(-)
>>>>    create mode 100644 drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>>>>
>>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/Kconfig
>>>> b/drivers/gpu/drm/hisilicon/hibmc/Kconfig
>>>> index a9af90d..bcb8c18 100644
>>>> --- a/drivers/gpu/drm/hisilicon/hibmc/Kconfig
>>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/Kconfig
>>>> @@ -1,6 +1,7 @@
>>>>    config DRM_HISI_HIBMC
>>>>           tristate "DRM Support for Hisilicon Hibmc"
>>>>           depends on DRM && PCI
>>>> +       select DRM_TTM
>>>>
>>>>           help
>>>>             Choose this option if you have a Hisilicon Hibmc soc chipset.
>>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/Makefile
>>>> b/drivers/gpu/drm/hisilicon/hibmc/Makefile
>>>> index 97cf4a0..d5c40b8 100644
>>>> --- a/drivers/gpu/drm/hisilicon/hibmc/Makefile
>>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/Makefile
>>>> @@ -1,5 +1,5 @@
>>>>    ccflags-y := -Iinclude/drm
>>>> -hibmc-drm-y := hibmc_drm_drv.o hibmc_drm_power.o
>>>> +hibmc-drm-y := hibmc_drm_drv.o hibmc_drm_power.o hibmc_ttm.o
>>>>
>>>>    obj-$(CONFIG_DRM_HISI_HIBMC)   +=hibmc-drm.o
>>>>    #obj-y += hibmc-drm.o
>>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>>>> b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>>>> index 4669d42..81f4301 100644
>>>> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>>>> @@ -31,6 +31,7 @@
>>>>    #ifdef CONFIG_COMPAT
>>>>           .compat_ioctl   = drm_compat_ioctl,
>>>>    #endif
>>>> +       .mmap           = hibmc_mmap,
>>>>           .poll           = drm_poll,
>>>>           .read           = drm_read,
>>>>           .llseek         = no_llseek,
>>>> @@ -46,6 +47,8 @@ static void hibmc_disable_vblank(struct drm_device
>>>> *dev, unsigned int pipe)
>>>>    }
>>>>
>>>>    static struct drm_driver hibmc_driver = {
>>>> +       .driver_features        = DRIVER_GEM,
>>>> +
>>>
>>>
>>> nit: extra space
>>>
>>>>           .fops                   = &hibmc_fops,
>>>>           .name                   = "hibmc",
>>>>           .date                   = "20160828",
>>>> @@ -55,6 +58,10 @@ static void hibmc_disable_vblank(struct drm_device
>>>> *dev, unsigned int pipe)
>>>>           .get_vblank_counter     = drm_vblank_no_hw_counter,
>>>>           .enable_vblank          = hibmc_enable_vblank,
>>>>           .disable_vblank         = hibmc_disable_vblank,
>>>> +       .gem_free_object_unlocked = hibmc_gem_free_object,
>>>> +       .dumb_create            = hibmc_dumb_create,
>>>> +       .dumb_map_offset        = hibmc_dumb_mmap_offset,
>>>> +       .dumb_destroy           = drm_gem_dumb_destroy,
>>>>    };
>>>>
>>>>    static int hibmc_pm_suspend(struct device *dev)
>>>> @@ -163,6 +170,7 @@ static int hibmc_unload(struct drm_device *dev)
>>>>    {
>>>>           struct hibmc_drm_device *hidev = dev->dev_private;
>>>>
>>>> +       hibmc_mm_fini(hidev);
>>>>           hibmc_hw_fini(hidev);
>>>>           dev->dev_private = NULL;
>>>>           return 0;
>>>> @@ -183,6 +191,10 @@ static int hibmc_load(struct drm_device *dev,
>>>> unsigned long flags)
>>>>           if (ret)
>>>>                   goto err;
>>>>
>>>> +       ret = hibmc_mm_init(hidev);
>>>> +       if (ret)
>>>> +               goto err;
>>>> +
>>>>           return 0;
>>>>
>>>>    err:
>>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>>>> b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>>>> index 0037341..db8d80e 100644
>>>> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>>>> @@ -20,6 +20,8 @@
>>>>    #define HIBMC_DRM_DRV_H
>>>>
>>>>    #include <drm/drmP.h>
>>>> +#include <drm/ttm/ttm_bo_driver.h>
>>>> +#include <drm/drm_gem.h>
>>>
>>>
>>> nit: alphabetize
>>
>>
>> will fix it, thanks.
>>
>>>
>>>>
>>>>    struct hibmc_drm_device {
>>>>           /* hw */
>>>> @@ -30,6 +32,50 @@ struct hibmc_drm_device {
>>>>
>>>>           /* drm */
>>>>           struct drm_device  *dev;
>>>> +
>>>> +       /* ttm */
>>>> +       struct {
>>>> +               struct drm_global_reference mem_global_ref;
>>>> +               struct ttm_bo_global_ref bo_global_ref;
>>>> +               struct ttm_bo_device bdev;
>>>> +               bool initialized;
>>>> +       } ttm;
>>>
>>>
>>> I don't think you gain anything other than keystrokes from the substruct
>>
>>
>> I'm sorry i didn't catch you, i looked at the all drivers used ttm such
>> as ast/bochs/cirrus/mgag200/qxl/virtio_gpu, they all embedded the ttm
>> substruct
>> into the driver-private struct.
>>
>> so do you mean
>> struct hibmc_drm_device {
>>          /* hw */
>>          void __iomem   *mmio;
>>          void __iomem   *fb_map;
>>          unsigned long  fb_base;
>>          unsigned long  fb_size;
>>
>>          /* drm */
>>          struct drm_device  *dev;
>>          struct drm_plane plane;
>>          struct drm_crtc crtc;
>>          struct drm_encoder encoder;
>>          struct drm_connector connector;
>>          bool mode_config_initialized;
>>
>>          /* ttm */
>>          struct drm_global_reference mem_global_ref;
>>          struct ttm_bo_global_ref bo_global_ref;
>>          struct ttm_bo_device bdev;
>>          bool initialized;
>>          ...
>>          };
>> ?
>
> Yeah, that's what I was thinking
>
>>
>>>
>>>> +
>>>> +       bool mm_inited;
>>>>    };
>>>>
>>>> +struct hibmc_bo {
>>>> +       struct ttm_buffer_object bo;
>>>> +       struct ttm_placement placement;
>>>> +       struct ttm_bo_kmap_obj kmap;
>>>> +       struct drm_gem_object gem;
>>>> +       struct ttm_place placements[3];
>>>> +       int pin_count;
>>>> +};
>>>> +
>>>> +static inline struct hibmc_bo *hibmc_bo(struct ttm_buffer_object *bo)
>>>> +{
>>>> +       return container_of(bo, struct hibmc_bo, bo);
>>>> +}
>>>> +
>>>> +static inline struct hibmc_bo *gem_to_hibmc_bo(struct drm_gem_object
>>>> *gem)
>>>> +{
>>>> +       return container_of(gem, struct hibmc_bo, gem);
>>>> +}
>>>> +
>>>> +#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
>>>
>>>
>>> Hide this in ttm.c
>>
>>
>> ok, will do that.
>> thanks for pointing it out.
>>
>>
>>>
>>>> +
>>>> +int hibmc_gem_create(struct drm_device *dev, u32 size, bool iskernel,
>>>> +                    struct drm_gem_object **obj);
>>>> +
>>>> +int hibmc_mm_init(struct hibmc_drm_device *hibmc);
>>>> +void hibmc_mm_fini(struct hibmc_drm_device *hibmc);
>>>> +int hibmc_bo_pin(struct hibmc_bo *bo, u32 pl_flag, u64 *gpu_addr);
>>>> +void hibmc_gem_free_object(struct drm_gem_object *obj);
>>>> +int hibmc_dumb_create(struct drm_file *file, struct drm_device *dev,
>>>> +                     struct drm_mode_create_dumb *args);
>>>> +int hibmc_dumb_mmap_offset(struct drm_file *file, struct drm_device
>>>> *dev,
>>>> +                          u32 handle, u64 *offset);
>>>> +int hibmc_mmap(struct file *filp, struct vm_area_struct *vma);
>>>> +
>>>>    #endif
>>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>>>> b/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>>>> new file mode 100644
>>>> index 0000000..0802ebd
>>>> --- /dev/null
>>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>>>> @@ -0,0 +1,490 @@
>>>> +/* Hisilicon Hibmc SoC drm driver
>>>> + *
>>>> + * Based on the bochs drm driver.
>>>> + *
>>>> + * Copyright (c) 2016 Huawei Limited.
>>>> + *
>>>> + * Author:
>>>> + *     Rongrong Zou <zourongrong@huawei.com>
>>>> + *     Rongrong Zou <zourongrong@gmail.com>
>>>> + *     Jianhua Li <lijianhua@huawei.com>
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify
>>>> + * it under the terms of the GNU General Public License as published by
>>>> + * the Free Software Foundation; either version 2 of the License, or
>>>> + * (at your option) any later version.
>>>> + *
>>>> + */
>>>> +
>>>> +#include "hibmc_drm_drv.h"
>>>> +#include <ttm/ttm_page_alloc.h>
>>>> +#include <drm/drm_crtc_helper.h>
>>>> +#include <drm/drm_atomic_helper.h>
>>>> +
>>>> +static inline struct hibmc_drm_device *
>>>> +hibmc_bdev(struct ttm_bo_device *bd)
>>>> +{
>>>> +       return container_of(bd, struct hibmc_drm_device, ttm.bdev);
>>>> +}
>>>> +
>>>> +static int
>>>> +hibmc_ttm_mem_global_init(struct drm_global_reference *ref)
>>>> +{
>>>> +       return ttm_mem_global_init(ref->object);
>>>> +}
>>>> +
>>>> +static void
>>>> +hibmc_ttm_mem_global_release(struct drm_global_reference *ref)
>>>> +{
>>>> +       ttm_mem_global_release(ref->object);
>>>> +}
>>>> +
>>>> +static int hibmc_ttm_global_init(struct hibmc_drm_device *hibmc)
>>>> +{
>>>> +       struct drm_global_reference *global_ref;
>>>> +       int r;
>>>
>>>
>>> nit: try not to use one character variable names unless it's for the
>>> purpose of a loop (ie: i,j). You also use ret elsewhere in the driver,
>>> so it'd be nice to remain consistent
>>
>>
>> the whole file is delivered from bochs ttm, i didn't modify anything except
>> some checkpatch warnings and the 'hibmc_' prefix. Unfortunately, some
>> problems were delivered too.
>
> Yeah, seems like it. Perhaps you can post patches to fix these issues
> in the other drivers too :)

i will do after the this one get merged :)

>
>>
>>>
>>>> +
>>>> +       global_ref = &hibmc->ttm.mem_global_ref;
>>>
>>>
>>> I think using the global_ref local obfuscates what you're doing here.
>>> It saves you 6 characters while typing, but adds a layer of
>>> indirection for all future readers.
>>>
>>>> +       global_ref->global_type = DRM_GLOBAL_TTM_MEM;
>>>> +       global_ref->size = sizeof(struct ttm_mem_global);
>>>> +       global_ref->init = &hibmc_ttm_mem_global_init;
>>>> +       global_ref->release = &hibmc_ttm_mem_global_release;
>>>> +       r = drm_global_item_ref(global_ref);
>>>> +       if (r != 0) {
>>>
>>>
>>> nit: if (r)
>>
>>
>> will fix it,
>> thanks.
>> BTW, i wonder why checkpatch.pl didn't report it.
>>
>>
>>>
>>>> +               DRM_ERROR("Failed setting up TTM memory accounting
>>>> subsystem.\n"
>>>> +                        );
>>>
>>>
>>> Breaking up the line for one character is probably not worthwhile, and
>>> you should really print the error. How about:
>>>
>>> DRM_ERROR("Could not get ref on ttm global ret=%d.\n", ret);
>>
>>
>> i like your solution, thanks.
>>
>>
>>>
>>>
>>>> +               return r;
>>>> +       }
>>>> +
>>>> +       hibmc->ttm.bo_global_ref.mem_glob =
>>>> +               hibmc->ttm.mem_global_ref.object;
>>>> +       global_ref = &hibmc->ttm.bo_global_ref.ref;
>>>> +       global_ref->global_type = DRM_GLOBAL_TTM_BO;
>>>> +       global_ref->size = sizeof(struct ttm_bo_global);
>>>> +       global_ref->init = &ttm_bo_global_init;
>>>> +       global_ref->release = &ttm_bo_global_release;
>>>> +       r = drm_global_item_ref(global_ref);
>>>> +       if (r != 0) {
>>>> +               DRM_ERROR("Failed setting up TTM BO subsystem.\n");
>>>> +               drm_global_item_unref(&hibmc->ttm.mem_global_ref);
>>>> +               return r;
>>>> +       }
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +static void
>>>> +hibmc_ttm_global_release(struct hibmc_drm_device *hibmc)
>>>> +{
>>>> +       if (!hibmc->ttm.mem_global_ref.release)
>>>
>>>
>>> Are you actually hitting this condition? This seems like it's papering
>>> over something else.
>>
>>
>> it was also delivered from others, i looked at the xxx_ttm_global_init
>> function, 'mem_global_ref.release' is assigned unconditionally, so i
>> think this condition never be hit, it may be hit when release twice,
>> but this won't take place in my driver.
>>
>
> Yeah, that's what I was hoping for. So perhaps we can remove this?

yes, we can.

Regards,
Rongrong.

>
>>>
>>>> +               return;
>>>> +
>>>> +       drm_global_item_unref(&hibmc->ttm.bo_global_ref.ref);
>>>> +       drm_global_item_unref(&hibmc->ttm.mem_global_ref);
>>>> +       hibmc->ttm.mem_global_ref.release = NULL;
>>>> +}
>>>> +
>>>> +static void hibmc_bo_ttm_destroy(struct ttm_buffer_object *tbo)
>>>> +{
>>>> +       struct hibmc_bo *bo;
>>>> +
>>>> +       bo = container_of(tbo, struct hibmc_bo, bo);
>>>
>>>
>>> nit: No need to split this into a separate line.
>>
>>
>> agreed, thanks.
>>
>>>
>>>> +
>>>> +       drm_gem_object_release(&bo->gem);
>>>> +       kfree(bo);
>>>> +}
>>>> +
>>>> +static bool hibmc_ttm_bo_is_hibmc_bo(struct ttm_buffer_object *bo)
>>>> +{
>>>> +       if (bo->destroy == &hibmc_bo_ttm_destroy)
>>>> +               return true;
>>>> +       return false;
>>>
>>>
>>> return bo->destroy == &hibmc_bo_ttm_destroy;
>>
>>
>> looks better to me.
>>
>>
>>>
>>>> +}
>>>> +
>>>> +static int
>>>> +hibmc_bo_init_mem_type(struct ttm_bo_device *bdev, u32 type,
>>>> +                      struct ttm_mem_type_manager *man)
>>>> +{
>>>> +       switch (type) {
>>>> +       case TTM_PL_SYSTEM:
>>>> +               man->flags = TTM_MEMTYPE_FLAG_MAPPABLE;
>>>> +               man->available_caching = TTM_PL_MASK_CACHING;
>>>> +               man->default_caching = TTM_PL_FLAG_CACHED;
>>>> +               break;
>>>> +       case TTM_PL_VRAM:
>>>> +               man->func = &ttm_bo_manager_func;
>>>> +               man->flags = TTM_MEMTYPE_FLAG_FIXED |
>>>> +                       TTM_MEMTYPE_FLAG_MAPPABLE;
>>>> +               man->available_caching = TTM_PL_FLAG_UNCACHED |
>>>> +                       TTM_PL_FLAG_WC;
>>>> +               man->default_caching = TTM_PL_FLAG_WC;
>>>> +               break;
>>>> +       default:
>>>> +               DRM_ERROR("Unsupported memory type %u\n", type);
>>>> +               return -EINVAL;
>>>> +       }
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +void hibmc_ttm_placement(struct hibmc_bo *bo, int domain)
>>>> +{
>>>> +       u32 c = 0;
>>>
>>>
>>> Can you please use a more descriptive name than 'c'?
>>
>>
>> ok, will do that.
>>
>>>
>>>> +       u32 i;
>>>> +
>>>> +       bo->placement.placement = bo->placements;
>>>> +       bo->placement.busy_placement = bo->placements;
>>>> +       if (domain & TTM_PL_FLAG_VRAM)
>>>> +               bo->placements[c++].flags = TTM_PL_FLAG_WC |
>>>> +               TTM_PL_FLAG_UNCACHED | TTM_PL_FLAG_VRAM;
>>>
>>>
>>> nit: you're alignment is off here and below
>>
>>
>> is it correct?
>>
>>          if (domain & TTM_PL_FLAG_VRAM)
>>                  bo->placements[c++].flags = TTM_PL_FLAG_WC |
>>                          TTM_PL_FLAG_UNCACHED | TTM_PL_FLAG_VRAM;
>>          if (domain & TTM_PL_FLAG_SYSTEM)
>>                  bo->placements[c++].flags = TTM_PL_MASK_CACHING |
>>                          TTM_PL_FLAG_SYSTEM;
>>          if (!c)
>>                  bo->placements[c++].flags = TTM_PL_MASK_CACHING |
>>                          TTM_PL_FLAG_SYSTEM;
>>
>
> Pretty much anything other than lining them up one under the other is better
>
>>>
>>>> +       if (domain & TTM_PL_FLAG_SYSTEM)
>>>> +               bo->placements[c++].flags = TTM_PL_MASK_CACHING |
>>>> +               TTM_PL_FLAG_SYSTEM;
>>>> +       if (!c)
>>>> +               bo->placements[c++].flags = TTM_PL_MASK_CACHING |
>>>> +               TTM_PL_FLAG_SYSTEM;
>>>> +
>>>> +       bo->placement.num_placement = c;
>>>> +       bo->placement.num_busy_placement = c;
>>>> +       for (i = 0; i < c; ++i) {
>>>
>>>
>>> nit: we tend towards post-increment in kernel
>>
>>
>> agreed, thanks.
>>
>>
>>>
>>>> +               bo->placements[i].fpfn = 0;
>>>> +               bo->placements[i].lpfn = 0;
>>>> +       }
>>>> +}
>>>> +
>>>> +static void
>>>> +hibmc_bo_evict_flags(struct ttm_buffer_object *bo, struct ttm_placement
>>>> *pl)
>>>> +{
>>>> +       struct hibmc_bo *hibmcbo = hibmc_bo(bo);
>>>> +
>>>> +       if (!hibmc_ttm_bo_is_hibmc_bo(bo))
>>>> +               return;
>>>> +
>>>> +       hibmc_ttm_placement(hibmcbo, TTM_PL_FLAG_SYSTEM);
>>>> +       *pl = hibmcbo->placement;
>>>> +}
>>>> +
>>>> +static int hibmc_bo_verify_access(struct ttm_buffer_object *bo,
>>>> +                                 struct file *filp)
>>>> +{
>>>> +       struct hibmc_bo *hibmcbo = hibmc_bo(bo);
>>>> +
>>>> +       return drm_vma_node_verify_access(&hibmcbo->gem.vma_node,
>>>> +                                         filp->private_data);
>>>> +}
>>>> +
>>>> +static int hibmc_ttm_io_mem_reserve(struct ttm_bo_device *bdev,
>>>> +                                   struct ttm_mem_reg *mem)
>>>> +{
>>>> +       struct ttm_mem_type_manager *man = &bdev->man[mem->mem_type];
>>>> +       struct hibmc_drm_device *hibmc = hibmc_bdev(bdev);
>>>> +
>>>> +       mem->bus.addr = NULL;
>>>> +       mem->bus.offset = 0;
>>>> +       mem->bus.size = mem->num_pages << PAGE_SHIFT;
>>>> +       mem->bus.base = 0;
>>>> +       mem->bus.is_iomem = false;
>>>> +       if (!(man->flags & TTM_MEMTYPE_FLAG_MAPPABLE))
>>>> +               return -EINVAL;
>>>> +       switch (mem->mem_type) {
>>>> +       case TTM_PL_SYSTEM:
>>>> +               /* system memory */
>>>> +               return 0;
>>>> +       case TTM_PL_VRAM:
>>>> +               mem->bus.offset = mem->start << PAGE_SHIFT;
>>>> +               mem->bus.base = pci_resource_start(hibmc->dev->pdev, 0);
>>>> +               mem->bus.is_iomem = true;
>>>> +               break;
>>>> +       default:
>>>> +               return -EINVAL;
>>>> +       }
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +static void hibmc_ttm_io_mem_free(struct ttm_bo_device *bdev,
>>>> +                                 struct ttm_mem_reg *mem)
>>>> +{
>>>> +}
>>>
>>>
>>> No need to stub this, the caller does a NULL-check before invoking
>>
>>
>> will delete it, thanks.
>>
>>>
>>>> +
>>>> +static void hibmc_ttm_backend_destroy(struct ttm_tt *tt)
>>>> +{
>>>> +       ttm_tt_fini(tt);
>>>> +       kfree(tt);
>>>> +}
>>>> +
>>>> +static struct ttm_backend_func hibmc_tt_backend_func = {
>>>> +       .destroy = &hibmc_ttm_backend_destroy,
>>>> +};
>>>> +
>>>> +static struct ttm_tt *hibmc_ttm_tt_create(struct ttm_bo_device *bdev,
>>>> +                                         unsigned long size,
>>>> +                                         u32 page_flags,
>>>> +                                         struct page *dummy_read_page)
>>>> +{
>>>> +       struct ttm_tt *tt;
>>>> +
>>>> +       tt = kzalloc(sizeof(*tt), GFP_KERNEL);
>>>> +       if (!tt)
>>>
>>>
>>> Print error
>>
>>
>> ok, will do that, thanks.
>>
>>>
>>>> +               return NULL;
>>>> +       tt->func = &hibmc_tt_backend_func;
>>>> +       if (ttm_tt_init(tt, bdev, size, page_flags, dummy_read_page)) {
>>>
>>>
>>> Here too?
>>
>>
>> ditto
>>
>>
>>>
>>>> +               kfree(tt);
>>>> +               return NULL;
>>>> +       }
>>>> +       return tt;
>>>> +}
>>>> +
>>>> +static int hibmc_ttm_tt_populate(struct ttm_tt *ttm)
>>>> +{
>>>> +       return ttm_pool_populate(ttm);
>>>> +}
>>>> +
>>>> +static void hibmc_ttm_tt_unpopulate(struct ttm_tt *ttm)
>>>> +{
>>>> +       ttm_pool_unpopulate(ttm);
>>>> +}
>>>> +
>>>> +struct ttm_bo_driver hibmc_bo_driver = {
>>>> +       .ttm_tt_create          = hibmc_ttm_tt_create,
>>>> +       .ttm_tt_populate        = hibmc_ttm_tt_populate,
>>>> +       .ttm_tt_unpopulate      = hibmc_ttm_tt_unpopulate,
>>>> +       .init_mem_type          = hibmc_bo_init_mem_type,
>>>> +       .evict_flags            = hibmc_bo_evict_flags,
>>>> +       .move                   = NULL,
>>>> +       .verify_access          = hibmc_bo_verify_access,
>>>> +       .io_mem_reserve         = &hibmc_ttm_io_mem_reserve,
>>>> +       .io_mem_free            = &hibmc_ttm_io_mem_free,
>>>> +       .lru_tail               = &ttm_bo_default_lru_tail,
>>>> +       .swap_lru_tail          = &ttm_bo_default_swap_lru_tail,
>>>> +};
>>>> +
>>>> +int hibmc_mm_init(struct hibmc_drm_device *hibmc)
>>>> +{
>>>> +       int ret;
>>>> +       struct drm_device *dev = hibmc->dev;
>>>> +       struct ttm_bo_device *bdev = &hibmc->ttm.bdev;
>>>> +
>>>> +       ret = hibmc_ttm_global_init(hibmc);
>>>> +       if (ret)
>>>> +               return ret;
>>>> +
>>>> +       ret = ttm_bo_device_init(&hibmc->ttm.bdev,
>>>> +                                hibmc->ttm.bo_global_ref.ref.object,
>>>> +                                &hibmc_bo_driver,
>>>> +                                dev->anon_inode->i_mapping,
>>>> +                                DRM_FILE_PAGE_OFFSET,
>>>> +                                true);
>>>> +       if (ret) {
>>>
>>>
>>> Call hibmc_ttm_global_release here?
>>
>>
>> agreed, thanks for pointing it out.
>>
>>>
>>>> +               DRM_ERROR("Error initialising bo driver; %d\n", ret);
>>>> +               return ret;
>>>> +       }
>>>> +
>>>> +       ret = ttm_bo_init_mm(bdev, TTM_PL_VRAM,
>>>> +                            hibmc->fb_size >> PAGE_SHIFT);
>>>> +       if (ret) {
>>>
>>>
>>> Clean up here as well?
>>
>>
>> ditto
>>
>>
>>>
>>>> +               DRM_ERROR("Failed ttm VRAM init: %d\n", ret);
>>>> +               return ret;
>>>> +       }
>>>> +
>>>> +       hibmc->mm_inited = true;
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +void hibmc_mm_fini(struct hibmc_drm_device *hibmc)
>>>> +{
>>>> +       if (!hibmc->mm_inited)
>>>> +               return;
>>>> +
>>>> +       ttm_bo_device_release(&hibmc->ttm.bdev);
>>>> +       hibmc_ttm_global_release(hibmc);
>>>> +       hibmc->mm_inited = false;
>>>> +}
>>>> +
>>>> +int hibmc_bo_create(struct drm_device *dev, int size, int align,
>>>> +                   u32 flags, struct hibmc_bo **phibmcbo)
>>>> +{
>>>> +       struct hibmc_drm_device *hibmc = dev->dev_private;
>>>> +       struct hibmc_bo *hibmcbo;
>>>> +       size_t acc_size;
>>>> +       int ret;
>>>> +
>>>> +       hibmcbo = kzalloc(sizeof(*hibmcbo), GFP_KERNEL);
>>>> +       if (!hibmcbo)
>>>> +               return -ENOMEM;
>>>> +
>>>> +       ret = drm_gem_object_init(dev, &hibmcbo->gem, size);
>>>> +       if (ret) {
>>>> +               kfree(hibmcbo);
>>>> +               return ret;
>>>> +       }
>>>> +
>>>> +       hibmcbo->bo.bdev = &hibmc->ttm.bdev;
>>>> +
>>>> +       hibmc_ttm_placement(hibmcbo, TTM_PL_FLAG_VRAM |
>>>> TTM_PL_FLAG_SYSTEM);
>>>> +
>>>> +       acc_size = ttm_bo_dma_acc_size(&hibmc->ttm.bdev, size,
>>>> +                                      sizeof(struct hibmc_bo));
>>>> +
>>>> +       ret = ttm_bo_init(&hibmc->ttm.bdev, &hibmcbo->bo, size,
>>>> +                         ttm_bo_type_device, &hibmcbo->placement,
>>>> +                         align >> PAGE_SHIFT, false, NULL, acc_size,
>>>> +                         NULL, NULL, hibmc_bo_ttm_destroy);
>>>> +       if (ret)
>>>
>>>
>>> Missing hibmcbo clean up here
>>
>>
>> i looked at all other ttm drivers and all of them return directly when
>> ttm_bo_init
>> failed, however, i think it is better to clean up here, should i call
>> hibmc_bo_unref(&hibmc_bo) here ?
>>
>
> Yeah, that should work (might want to test it, though ;)
>
>
>>>
>>>> +               return ret;
>>>> +
>>>> +       *phibmcbo = hibmcbo;
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +static inline u64 hibmc_bo_gpu_offset(struct hibmc_bo *bo)
>>>> +{
>>>> +       return bo->bo.offset;
>>>> +}
>>>
>>>
>>> I don't think this function provides any value
>>
>>
>> do you nean i use bo->bo.offset instead of calling hibmc_bo_gpu_offset()?
>>
>
> yes
>
>>>
>>>> +
>>>> +int hibmc_bo_pin(struct hibmc_bo *bo, u32 pl_flag, u64 *gpu_addr)
>>>> +{
>>>> +       int i, ret;
>>>> +
>>>> +       if (bo->pin_count) {
>>>> +               bo->pin_count++;
>>>> +               if (gpu_addr)
>>>> +                       *gpu_addr = hibmc_bo_gpu_offset(bo);
>>>
>>>
>>> Are you missing a return here?
>>
>>
>> Thanks for pointing it out!
>>
>>
>>>
>>>> +       }
>>>> +
>>>> +       hibmc_ttm_placement(bo, pl_flag);
>>>> +       for (i = 0; i < bo->placement.num_placement; i++)
>>>> +               bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
>>>> +       ret = ttm_bo_validate(&bo->bo, &bo->placement, false, false);
>>>> +       if (ret)
>>>> +               return ret;
>>>> +
>>>> +       bo->pin_count = 1;
>>>> +       if (gpu_addr)
>>>> +               *gpu_addr = hibmc_bo_gpu_offset(bo);
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +int hibmc_bo_push_sysram(struct hibmc_bo *bo)
>>>> +{
>>>> +       int i, ret;
>>>> +
>>>> +       if (!bo->pin_count) {
>>>> +               DRM_ERROR("unpin bad %p\n", bo);
>>>> +               return 0;
>>>> +       }
>>>> +       bo->pin_count--;
>>>> +       if (bo->pin_count)
>>>> +               return 0;
>>>> +
>>>> +       if (bo->kmap.virtual)
>>>
>>>
>>> ttm_bo_kunmap already does this check so you don't have to
>>
>>
>> agreed. will remove this condition.
>>
>>>
>>>> +               ttm_bo_kunmap(&bo->kmap);
>>>> +
>>>> +       hibmc_ttm_placement(bo, TTM_PL_FLAG_SYSTEM);
>>>> +       for (i = 0; i < bo->placement.num_placement ; i++)
>>>> +               bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
>>>> +
>>>> +       ret = ttm_bo_validate(&bo->bo, &bo->placement, false, false);
>>>> +       if (ret) {
>>>> +               DRM_ERROR("pushing to VRAM failed\n");
>>>
>>>
>>> Print ret
>>
>>
>> ok, thanks.
>>
>>
>>>
>>>> +               return ret;
>>>> +       }
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +int hibmc_mmap(struct file *filp, struct vm_area_struct *vma)
>>>> +{
>>>> +       struct drm_file *file_priv;
>>>> +       struct hibmc_drm_device *hibmc;
>>>> +
>>>> +       if (unlikely(vma->vm_pgoff < DRM_FILE_PAGE_OFFSET))
>>>> +               return -EINVAL;
>>>> +
>>>> +       file_priv = filp->private_data;
>>>> +       hibmc = file_priv->minor->dev->dev_private;
>>>> +       return ttm_bo_mmap(filp, vma, &hibmc->ttm.bdev);
>>>> +}
>>>> +
>>>> +int hibmc_gem_create(struct drm_device *dev, u32 size, bool iskernel,
>>>> +                    struct drm_gem_object **obj)
>>>> +{
>>>> +       struct hibmc_bo *hibmcbo;
>>>> +       int ret;
>>>> +
>>>> +       *obj = NULL;
>>>> +
>>>> +       size = PAGE_ALIGN(size);
>>>> +       if (size == 0)
>>>
>>>
>>> Print error
>>
>>
>> ditto
>>
>>>
>>>> +               return -EINVAL;
>>>> +
>>>> +       ret = hibmc_bo_create(dev, size, 0, 0, &hibmcbo);
>>>> +       if (ret) {
>>>> +               if (ret != -ERESTARTSYS)
>>>> +                       DRM_ERROR("failed to allocate GEM object\n");
>>>
>>>
>>> Print ret
>>
>>
>> ditto
>>
>>>
>>>> +               return ret;
>>>> +       }
>>>> +       *obj = &hibmcbo->gem;
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +int hibmc_dumb_create(struct drm_file *file, struct drm_device *dev,
>>>> +                     struct drm_mode_create_dumb *args)
>>>> +{
>>>> +       struct drm_gem_object *gobj;
>>>> +       u32 handle;
>>>> +       int ret;
>>>> +
>>>> +       args->pitch = ALIGN(args->width * ((args->bpp + 7) / 8), 16);
>>>
>>>
>>> What's up with the bpp + 7 here? Perhaps you're looking for DIV_ROUND_UP?
>>
>>
>> Yes, that sounds sane.
>>
>
> sane is what i usually aim for :)
>
> Sean
>
>
>>>
>>>
>>>> +       args->size = args->pitch * args->height;
>>>> +
>>>> +       ret = hibmc_gem_create(dev, args->size, false,
>>>> +                              &gobj);
>>>> +       if (ret)
>>>> +               return ret;
>>>> +
>>>> +       ret = drm_gem_handle_create(file, gobj, &handle);
>>>> +       drm_gem_object_unreference_unlocked(gobj);
>>>> +       if (ret)
>>>
>>>
>>> Print error here
>>
>>
>> agreed.
>>
>>
>>>
>>>> +               return ret;
>>>> +
>>>> +       args->handle = handle;
>>>> +       return 0;
>>>> +}
>>>> +
>>>> +static void hibmc_bo_unref(struct hibmc_bo **bo)
>>>> +{
>>>> +       struct ttm_buffer_object *tbo;
>>>> +
>>>> +       if ((*bo) == NULL)
>>>> +               return;
>>>> +
>>>> +       tbo = &((*bo)->bo);
>>>> +       ttm_bo_unref(&tbo);
>>>> +       *bo = NULL;
>>>> +}
>>>> +
>>>> +void hibmc_gem_free_object(struct drm_gem_object *obj)
>>>> +{
>>>> +       struct hibmc_bo *hibmcbo = gem_to_hibmc_bo(obj);
>>>> +
>>>> +       hibmc_bo_unref(&hibmcbo);
>>>> +}
>>>> +
>>>> +static u64 hibmc_bo_mmap_offset(struct hibmc_bo *bo)
>>>> +{
>>>> +       return drm_vma_node_offset_addr(&bo->bo.vma_node);
>>>> +}
>>>> +
>>>> +int hibmc_dumb_mmap_offset(struct drm_file *file, struct drm_device
>>>> *dev,
>>>> +                          u32 handle, u64 *offset)
>>>> +{
>>>> +       struct drm_gem_object *obj;
>>>> +       struct hibmc_bo *bo;
>>>> +
>>>> +       obj = drm_gem_object_lookup(file, handle);
>>>> +       if (!obj)
>>>> +               return -ENOENT;
>>>> +
>>>> +       bo = gem_to_hibmc_bo(obj);
>>>> +       *offset = hibmc_bo_mmap_offset(bo);
>>>> +
>>>> +       drm_gem_object_unreference_unlocked(obj);
>>>> +       return 0;
>>>> +}
>>
>>
>> Regards,
>> Rongrong.
>>
>>>> --
>>>> 1.9.1
>>>>
>>>>
>>>> _______________________________________________
>>>> linux-arm-kernel mailing list
>>>> linux-arm-kernel at lists.infradead.org
>>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>>
>>> _______________________________________________
>>> linuxarm mailing list
>>> linuxarm at huawei.com
>>> http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm
>>>
>>> .
>>>
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
> .
>


-- 
Regards, Rongrong

^ permalink raw reply

* [PATCH v3 2/3] pinctrl: sunxi: Add support for fetching pinconf settings from hardware
From: Chen-Yu Tsai @ 2016-11-11 13:57 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161111135159.zqjqjjn7azg4gkzx@lukather>

On Fri, Nov 11, 2016 at 9:51 PM, Maxime Ripard
<maxime.ripard@free-electrons.com> wrote:
> On Fri, Nov 11, 2016 at 05:50:35PM +0800, Chen-Yu Tsai wrote:
>> The sunxi pinctrl driver only caches whatever pinconf setting was last
>> set on a given pingroup. This is not particularly helpful, nor is it
>> correct.
>>
>> Fix this by actually reading the hardware registers and returning
>> the correct results or error codes. Also filter out unsupported
>> pinconf settings. Since this driver has a peculiar setup of 1 pin
>> per group, we can support both pin and pingroup pinconf setting
>> read back with the same code. The sunxi_pconf_reg helper and code
>> structure is inspired by pinctrl-msm.
>>
>> With this done we can also claim to support generic pinconf, by
>> setting .is_generic = true in pinconf_ops.
>>
>> Also remove the cached config value. The behavior of this was never
>> correct, as it only cached 1 setting instead of all of them. Since
>> we can now read back settings directly from the hardware, it is no
>> longer required.
>>
>> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
>> ---
>>  drivers/pinctrl/sunxi/pinctrl-sunxi.c | 86 +++++++++++++++++++++++++++++++++--
>>  drivers/pinctrl/sunxi/pinctrl-sunxi.h |  1 -
>>  2 files changed, 81 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/pinctrl/sunxi/pinctrl-sunxi.c b/drivers/pinctrl/sunxi/pinctrl-sunxi.c
>> index e04edda8629d..ed71bff39869 100644
>> --- a/drivers/pinctrl/sunxi/pinctrl-sunxi.c
>> +++ b/drivers/pinctrl/sunxi/pinctrl-sunxi.c
>> @@ -438,15 +438,91 @@ static const struct pinctrl_ops sunxi_pctrl_ops = {
>>       .get_group_pins         = sunxi_pctrl_get_group_pins,
>>  };
>>
>> +static int sunxi_pconf_reg(unsigned pin, enum pin_config_param param,
>
> Sorry, this went unnoticed in your previous version, but checkpatch
> reports:
>
> WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
>
> For this one and the next function.

I know, but the function pointer definition in pinconf.h says
'unsigned', not 'unsigned int'.

Having a mismatch between the two is even more confusing...

ChenYu

>
> Once fixed,
> Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
>
> Thanks!
> Maxime
>
> --
> Maxime Ripard, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com

^ permalink raw reply

* [PATCH v14 5/9] clocksource/drivers/arm_arch_timer: Simplify ACPI support code.
From: Hanjun Guo @ 2016-11-11 13:55 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161021112120.GC16630@leverpostej>

On 10/21/2016 07:21 PM, Mark Rutland wrote:
> On Fri, Oct 21, 2016 at 12:14:01PM +0100, Mark Rutland wrote:
>> On Thu, Oct 20, 2016 at 05:58:17PM +0100, Mark Rutland wrote:
>>> On Thu, Sep 29, 2016 at 02:17:13AM +0800, fu.wei at linaro.org wrote:
>>>> +	arch_timer_ppi[PHYS_NONSECURE_PPI] = acpi_gtdt_map_ppi(PHYS_NONSECURE_PPI);
>>>> +	arch_timer_ppi[VIRT_PPI] = acpi_gtdt_map_ppi(VIRT_PPI);
>>>> +	arch_timer_ppi[HYP_PPI] = acpi_gtdt_map_ppi(HYP_PPI);
>>>> +	/* Always-on capability */
>>>> +	arch_timer_c3stop = acpi_gtdt_c3stop();
>>>
>>> ... I think we should check the flag on the relevant interrupt, though
>>> that's worth clarifying.
>>
>> I see I misread the spec; this is part of the common flags.
>>
>> Please ignore this point; sorry for the noise.
>
> Actually, I misread the spec this time around; the flag *can* differ per
> interrupt for the sysreg/cp15 timer, but not for the MMIO timers where
> the flag is in a common field.
>
> So please *do* consider the above.

Sorry, misread the email as well...

Check the spec again and it's per interrupt flag.

Thanks
Hanjun

^ permalink raw reply

* [PATCH v3 2/3] pinctrl: sunxi: Add support for fetching pinconf settings from hardware
From: Maxime Ripard @ 2016-11-11 13:51 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161111095036.11803-3-wens@csie.org>

On Fri, Nov 11, 2016 at 05:50:35PM +0800, Chen-Yu Tsai wrote:
> The sunxi pinctrl driver only caches whatever pinconf setting was last
> set on a given pingroup. This is not particularly helpful, nor is it
> correct.
> 
> Fix this by actually reading the hardware registers and returning
> the correct results or error codes. Also filter out unsupported
> pinconf settings. Since this driver has a peculiar setup of 1 pin
> per group, we can support both pin and pingroup pinconf setting
> read back with the same code. The sunxi_pconf_reg helper and code
> structure is inspired by pinctrl-msm.
> 
> With this done we can also claim to support generic pinconf, by
> setting .is_generic = true in pinconf_ops.
> 
> Also remove the cached config value. The behavior of this was never
> correct, as it only cached 1 setting instead of all of them. Since
> we can now read back settings directly from the hardware, it is no
> longer required.
> 
> Signed-off-by: Chen-Yu Tsai <wens@csie.org>
> ---
>  drivers/pinctrl/sunxi/pinctrl-sunxi.c | 86 +++++++++++++++++++++++++++++++++--
>  drivers/pinctrl/sunxi/pinctrl-sunxi.h |  1 -
>  2 files changed, 81 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pinctrl/sunxi/pinctrl-sunxi.c b/drivers/pinctrl/sunxi/pinctrl-sunxi.c
> index e04edda8629d..ed71bff39869 100644
> --- a/drivers/pinctrl/sunxi/pinctrl-sunxi.c
> +++ b/drivers/pinctrl/sunxi/pinctrl-sunxi.c
> @@ -438,15 +438,91 @@ static const struct pinctrl_ops sunxi_pctrl_ops = {
>  	.get_group_pins		= sunxi_pctrl_get_group_pins,
>  };
>  
> +static int sunxi_pconf_reg(unsigned pin, enum pin_config_param param,

Sorry, this went unnoticed in your previous version, but checkpatch
reports:

WARNING: Prefer 'unsigned int' to bare use of 'unsigned'

For this one and the next function.

Once fixed,
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>

Thanks!
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20161111/78072c0f/attachment.sig>

^ permalink raw reply

* [PATCH] crypto: arm64/sha2: integrate OpenSSL implementations of SHA256/SHA512
From: Ard Biesheuvel @ 2016-11-11 13:51 UTC (permalink / raw)
  To: linux-arm-kernel

This integrates both the accelerated scalar and the NEON implementations
of SHA-224/256 as well as SHA-384/512 from the OpenSSL project.

Relative performance compared to the respective generic C versions:

                 |  SHA256-scalar  | SHA256-NEON* |  SHA512  |
     ------------+-----------------+--------------+----------+
     Cortex-A53  |      1.63x      |     1.63x    |   2.34x  |
     Cortex-A57  |      1.43x      |     1.59x    |   1.95x  |
     Cortex-A73  |      1.26x      |     1.56x    |     ?    |

The core crypto code was authored by Andy Polyakov of the OpenSSL
project, in collaboration with whom the upstream code was adapted so
that this module can be built from the same version of sha512-armv8.pl.

The version in this patch was taken from OpenSSL commit

   866e505e0d66 sha/asm/sha512-armv8.pl: add NEON version of SHA256.

* The core SHA algorithm is fundamentally sequential, but there is a
  secondary transformation involved, called the schedule update, which
  can be performed independently. The NEON version of SHA-224/SHA-256
  only implements this part of the algorithm using NEON instructions,
  the sequential part is always done using scalar instructions.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---

This supersedes the SHA-256-NEON-only patch I sent out about 6 weeks ago.

Will, Catalin: note that this pulls in a .pl script, and adds a build rule
locally in arch/arm64/crypto to generate .S files on the fly from Perl
scripts. I will leave it to you to decide whether you are ok with this as
is, or whether you prefer .S_shipped files, in which case the Perl script
is only included as a reference (this is how we did it for arch/arm in the
past, but given that it adds about 3000 lines of generated code to the patch,
I think we may want to simply keep it as below)
 
 arch/arm64/crypto/Kconfig         |   8 +
 arch/arm64/crypto/Makefile        |  15 +
 arch/arm64/crypto/sha256-glue.c   | 185 +++++
 arch/arm64/crypto/sha512-armv8.pl | 778 ++++++++++++++++++++
 arch/arm64/crypto/sha512-glue.c   |  94 +++
 5 files changed, 1080 insertions(+)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 2cf32e9887e1..5f4a617e2957 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -8,6 +8,14 @@ menuconfig ARM64_CRYPTO
 
 if ARM64_CRYPTO
 
+config CRYPTO_SHA256_ARM64
+	tristate "SHA-224/SHA-256 digest algorithm for arm64"
+	select CRYPTO_HASH
+
+config CRYPTO_SHA512_ARM64
+	tristate "SHA-384/SHA-512 digest algorithm for arm64"
+	select CRYPTO_HASH
+
 config CRYPTO_SHA1_ARM64_CE
 	tristate "SHA-1 digest algorithm (ARMv8 Crypto Extensions)"
 	depends on ARM64 && KERNEL_MODE_NEON
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index abb79b3cfcfe..861589faf6ef 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -29,6 +29,12 @@ aes-ce-blk-y := aes-glue-ce.o aes-ce.o
 obj-$(CONFIG_CRYPTO_AES_ARM64_NEON_BLK) += aes-neon-blk.o
 aes-neon-blk-y := aes-glue-neon.o aes-neon.o
 
+obj-$(CONFIG_CRYPTO_SHA256_ARM64) += sha256-arm64.o
+sha256-arm64-y := sha256-glue.o sha256-core.o
+
+obj-$(CONFIG_CRYPTO_SHA512_ARM64) += sha512-arm64.o
+sha512-arm64-y := sha512-glue.o sha512-core.o
+
 AFLAGS_aes-ce.o		:= -DINTERLEAVE=4
 AFLAGS_aes-neon.o	:= -DINTERLEAVE=4
 
@@ -40,3 +46,12 @@ CFLAGS_crc32-arm64.o	:= -mcpu=generic+crc
 
 $(obj)/aes-glue-%.o: $(src)/aes-glue.c FORCE
 	$(call if_changed_rule,cc_o_c)
+
+quiet_cmd_perl = PERLASM $@
+      cmd_perl = $(PERL) $(<) void $(@)
+
+$(obj)/sha256-core.S: $(src)/sha512-armv8.pl
+	$(call cmd,perl)
+
+$(obj)/sha512-core.S: $(src)/sha512-armv8.pl
+	$(call cmd,perl)
diff --git a/arch/arm64/crypto/sha256-glue.c b/arch/arm64/crypto/sha256-glue.c
new file mode 100644
index 000000000000..a2226f841960
--- /dev/null
+++ b/arch/arm64/crypto/sha256-glue.c
@@ -0,0 +1,185 @@
+/*
+ * Linux/arm64 port of the OpenSSL SHA256 implementation for AArch64
+ *
+ * Copyright (c) 2016 Linaro Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ */
+
+#include <asm/hwcap.h>
+#include <asm/neon.h>
+#include <asm/simd.h>
+#include <crypto/internal/hash.h>
+#include <crypto/sha.h>
+#include <crypto/sha256_base.h>
+#include <linux/cryptohash.h>
+#include <linux/types.h>
+#include <linux/string.h>
+
+MODULE_DESCRIPTION("SHA-224/SHA-256 secure hash for arm64");
+MODULE_AUTHOR("Andy Polyakov <appro@openssl.org>");
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("sha224");
+MODULE_ALIAS_CRYPTO("sha256");
+
+asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
+					unsigned int num_blks);
+
+asmlinkage void sha256_block_neon(u32 *digest, const void *data,
+				  unsigned int num_blks);
+
+static int sha256_update(struct shash_desc *desc, const u8 *data,
+			 unsigned int len)
+{
+	return sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_data_order);
+}
+
+static int sha256_finup(struct shash_desc *desc, const u8 *data,
+			unsigned int len, u8 *out)
+{
+	if (len)
+		sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_data_order);
+	sha256_base_do_finalize(desc,
+				(sha256_block_fn *)sha256_block_data_order);
+
+	return sha256_base_finish(desc, out);
+}
+
+static int sha256_final(struct shash_desc *desc, u8 *out)
+{
+	return sha256_finup(desc, NULL, 0, out);
+}
+
+static struct shash_alg algs[] = { {
+	.digestsize		= SHA256_DIGEST_SIZE,
+	.init			= sha256_base_init,
+	.update			= sha256_update,
+	.final			= sha256_final,
+	.finup			= sha256_finup,
+	.descsize		= sizeof(struct sha256_state),
+	.base.cra_name		= "sha256",
+	.base.cra_driver_name	= "sha256-arm64",
+	.base.cra_priority	= 100,
+	.base.cra_flags		= CRYPTO_ALG_TYPE_SHASH,
+	.base.cra_blocksize	= SHA256_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+}, {
+	.digestsize		= SHA224_DIGEST_SIZE,
+	.init			= sha224_base_init,
+	.update			= sha256_update,
+	.final			= sha256_final,
+	.finup			= sha256_finup,
+	.descsize		= sizeof(struct sha256_state),
+	.base.cra_name		= "sha224",
+	.base.cra_driver_name	= "sha224-arm64",
+	.base.cra_priority	= 100,
+	.base.cra_flags		= CRYPTO_ALG_TYPE_SHASH,
+	.base.cra_blocksize	= SHA224_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+} };
+
+static int sha256_update_neon(struct shash_desc *desc, const u8 *data,
+			      unsigned int len)
+{
+	/*
+	 * Stacking and unstacking a substantial slice of the NEON register
+	 * file may significantly affect performance for small updates when
+	 * executing in interrupt context, so fall back to the scalar code
+	 * in that case.
+	 */
+	if (!may_use_simd())
+		return sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_data_order);
+
+	kernel_neon_begin();
+	sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_neon);
+	kernel_neon_end();
+
+	return 0;
+}
+
+static int sha256_finup_neon(struct shash_desc *desc, const u8 *data,
+			     unsigned int len, u8 *out)
+{
+	if (!may_use_simd()) {
+		if (len)
+			sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_data_order);
+		sha256_base_do_finalize(desc,
+				(sha256_block_fn *)sha256_block_data_order);
+	} else {
+		kernel_neon_begin();
+		if (len)
+			sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_neon);
+		sha256_base_do_finalize(desc,
+				(sha256_block_fn *)sha256_block_neon);
+		kernel_neon_end();
+	}
+	return sha256_base_finish(desc, out);
+}
+
+static int sha256_final_neon(struct shash_desc *desc, u8 *out)
+{
+	return sha256_finup_neon(desc, NULL, 0, out);
+}
+
+static struct shash_alg neon_algs[] = { {
+	.digestsize		= SHA256_DIGEST_SIZE,
+	.init			= sha256_base_init,
+	.update			= sha256_update_neon,
+	.final			= sha256_final_neon,
+	.finup			= sha256_finup_neon,
+	.descsize		= sizeof(struct sha256_state),
+	.base.cra_name		= "sha256",
+	.base.cra_driver_name	= "sha256-arm64-neon",
+	.base.cra_priority	= 150,
+	.base.cra_flags		= CRYPTO_ALG_TYPE_SHASH,
+	.base.cra_blocksize	= SHA256_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+}, {
+	.digestsize		= SHA224_DIGEST_SIZE,
+	.init			= sha224_base_init,
+	.update			= sha256_update_neon,
+	.final			= sha256_final_neon,
+	.finup			= sha256_finup_neon,
+	.descsize		= sizeof(struct sha256_state),
+	.base.cra_name		= "sha224",
+	.base.cra_driver_name	= "sha224-arm64-neon",
+	.base.cra_priority	= 150,
+	.base.cra_flags		= CRYPTO_ALG_TYPE_SHASH,
+	.base.cra_blocksize	= SHA224_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+} };
+
+static int __init sha256_mod_init(void)
+{
+	int ret = crypto_register_shashes(algs, ARRAY_SIZE(algs));
+	if (ret)
+		return ret;
+
+	if (elf_hwcap & HWCAP_ASIMD) {
+		ret = crypto_register_shashes(neon_algs, ARRAY_SIZE(neon_algs));
+		if (ret)
+			crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
+	}
+	return ret;
+}
+
+static void __exit sha256_mod_fini(void)
+{
+	if (elf_hwcap & HWCAP_ASIMD)
+		crypto_unregister_shashes(neon_algs, ARRAY_SIZE(neon_algs));
+	crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
+}
+
+module_init(sha256_mod_init);
+module_exit(sha256_mod_fini);
diff --git a/arch/arm64/crypto/sha512-armv8.pl b/arch/arm64/crypto/sha512-armv8.pl
new file mode 100644
index 000000000000..ffae5f23bcd8
--- /dev/null
+++ b/arch/arm64/crypto/sha512-armv8.pl
@@ -0,0 +1,778 @@
+#! /usr/bin/env perl
+# Copyright 2014-2016 The OpenSSL Project Authors. All Rights Reserved.
+#
+# Licensed under the OpenSSL license (the "License").  You may not use
+# this file except in compliance with the License.  You can obtain a copy
+# in the file LICENSE in the source distribution or at
+# https://www.openssl.org/source/license.html
+
+# ====================================================================
+# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
+# project. The module is, however, dual licensed under OpenSSL and
+# CRYPTOGAMS licenses depending on where you obtain it. For further
+# details see http://www.openssl.org/~appro/cryptogams/.
+#
+# Permission to use under GPLv2 terms is granted.
+# ====================================================================
+#
+# SHA256/512 for ARMv8.
+#
+# Performance in cycles per processed byte and improvement coefficient
+# over code generated with "default" compiler:
+#
+#		SHA256-hw	SHA256(*)	SHA512
+# Apple A7	1.97		10.5 (+33%)	6.73 (-1%(**))
+# Cortex-A53	2.38		15.5 (+115%)	10.0 (+150%(***))
+# Cortex-A57	2.31		11.6 (+86%)	7.51 (+260%(***))
+# Denver	2.01		10.5 (+26%)	6.70 (+8%)
+# X-Gene			20.0 (+100%)	12.8 (+300%(***))
+# Mongoose	2.36		13.0 (+50%)	8.36 (+33%)
+#
+# (*)	Software SHA256 results are of lesser relevance, presented
+#	mostly for informational purposes.
+# (**)	The result is a trade-off: it's possible to improve it by
+#	10% (or by 1 cycle per round), but at the cost of 20% loss
+#	on Cortex-A53 (or by 4 cycles per round).
+# (***)	Super-impressive coefficients over gcc-generated code are
+#	indication of some compiler "pathology", most notably code
+#	generated with -mgeneral-regs-only is significanty faster
+#	and the gap is only 40-90%.
+#
+# October 2016.
+#
+# Originally it was reckoned that it makes no sense to implement NEON
+# version of SHA256 for 64-bit processors. This is because performance
+# improvement on most wide-spread Cortex-A5x processors was observed
+# to be marginal, same on Cortex-A53 and ~10% on A57. But then it was
+# observed that 32-bit NEON SHA256 performs significantly better than
+# 64-bit scalar version on *some* of the more recent processors. As
+# result 64-bit NEON version of SHA256 was added to provide best
+# all-round performance. For example it executes ~30% faster on X-Gene
+# and Mongoose. [For reference, NEON version of SHA512 is bound to
+# deliver much less improvement, likely *negative* on Cortex-A5x.
+# Which is why NEON support is limited to SHA256.]
+
+$output=pop;
+$flavour=pop;
+
+if ($flavour && $flavour ne "void") {
+    $0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
+    ( $xlate="${dir}arm-xlate.pl" and -f $xlate ) or
+    ( $xlate="${dir}../../perlasm/arm-xlate.pl" and -f $xlate) or
+    die "can't locate arm-xlate.pl";
+
+    open OUT,"| \"$^X\" $xlate $flavour $output";
+    *STDOUT=*OUT;
+} else {
+    open STDOUT,">$output";
+}
+
+if ($output =~ /512/) {
+	$BITS=512;
+	$SZ=8;
+	@Sigma0=(28,34,39);
+	@Sigma1=(14,18,41);
+	@sigma0=(1,  8, 7);
+	@sigma1=(19,61, 6);
+	$rounds=80;
+	$reg_t="x";
+} else {
+	$BITS=256;
+	$SZ=4;
+	@Sigma0=( 2,13,22);
+	@Sigma1=( 6,11,25);
+	@sigma0=( 7,18, 3);
+	@sigma1=(17,19,10);
+	$rounds=64;
+	$reg_t="w";
+}
+
+$func="sha${BITS}_block_data_order";
+
+($ctx,$inp,$num,$Ktbl)=map("x$_",(0..2,30));
+
+ at X=map("$reg_t$_",(3..15,0..2));
+ at V=($A,$B,$C,$D,$E,$F,$G,$H)=map("$reg_t$_",(20..27));
+($t0,$t1,$t2,$t3)=map("$reg_t$_",(16,17,19,28));
+
+sub BODY_00_xx {
+my ($i,$a,$b,$c,$d,$e,$f,$g,$h)=@_;
+my $j=($i+1)&15;
+my ($T0,$T1,$T2)=(@X[($i-8)&15], at X[($i-9)&15], at X[($i-10)&15]);
+   $T0=@X[$i+3] if ($i<11);
+
+$code.=<<___	if ($i<16);
+#ifndef	__ARMEB__
+	rev	@X[$i], at X[$i]			// $i
+#endif
+___
+$code.=<<___	if ($i<13 && ($i&1));
+	ldp	@X[$i+1], at X[$i+2],[$inp],#2*$SZ
+___
+$code.=<<___	if ($i==13);
+	ldp	@X[14],@X[15],[$inp]
+___
+$code.=<<___	if ($i>=14);
+	ldr	@X[($i-11)&15],[sp,#`$SZ*(($i-11)%4)`]
+___
+$code.=<<___	if ($i>0 && $i<16);
+	add	$a,$a,$t1			// h+=Sigma0(a)
+___
+$code.=<<___	if ($i>=11);
+	str	@X[($i-8)&15],[sp,#`$SZ*(($i-8)%4)`]
+___
+# While ARMv8 specifies merged rotate-n-logical operation such as
+# 'eor x,y,z,ror#n', it was found to negatively affect performance
+# on Apple A7. The reason seems to be that it requires even 'y' to
+# be available earlier. This means that such merged instruction is
+# not necessarily best choice on critical path... On the other hand
+# Cortex-A5x handles merged instructions much better than disjoint
+# rotate and logical... See (**) footnote above.
+$code.=<<___	if ($i<15);
+	ror	$t0,$e,#$Sigma1[0]
+	add	$h,$h,$t2			// h+=K[i]
+	eor	$T0,$e,$e,ror#`$Sigma1[2]-$Sigma1[1]`
+	and	$t1,$f,$e
+	bic	$t2,$g,$e
+	add	$h,$h,@X[$i&15]			// h+=X[i]
+	orr	$t1,$t1,$t2			// Ch(e,f,g)
+	eor	$t2,$a,$b			// a^b, b^c in next round
+	eor	$t0,$t0,$T0,ror#$Sigma1[1]	// Sigma1(e)
+	ror	$T0,$a,#$Sigma0[0]
+	add	$h,$h,$t1			// h+=Ch(e,f,g)
+	eor	$t1,$a,$a,ror#`$Sigma0[2]-$Sigma0[1]`
+	add	$h,$h,$t0			// h+=Sigma1(e)
+	and	$t3,$t3,$t2			// (b^c)&=(a^b)
+	add	$d,$d,$h			// d+=h
+	eor	$t3,$t3,$b			// Maj(a,b,c)
+	eor	$t1,$T0,$t1,ror#$Sigma0[1]	// Sigma0(a)
+	add	$h,$h,$t3			// h+=Maj(a,b,c)
+	ldr	$t3,[$Ktbl],#$SZ		// *K++, $t2 in next round
+	//add	$h,$h,$t1			// h+=Sigma0(a)
+___
+$code.=<<___	if ($i>=15);
+	ror	$t0,$e,#$Sigma1[0]
+	add	$h,$h,$t2			// h+=K[i]
+	ror	$T1, at X[($j+1)&15],#$sigma0[0]
+	and	$t1,$f,$e
+	ror	$T2, at X[($j+14)&15],#$sigma1[0]
+	bic	$t2,$g,$e
+	ror	$T0,$a,#$Sigma0[0]
+	add	$h,$h, at X[$i&15]			// h+=X[i]
+	eor	$t0,$t0,$e,ror#$Sigma1[1]
+	eor	$T1,$T1, at X[($j+1)&15],ror#$sigma0[1]
+	orr	$t1,$t1,$t2			// Ch(e,f,g)
+	eor	$t2,$a,$b			// a^b, b^c in next round
+	eor	$t0,$t0,$e,ror#$Sigma1[2]	// Sigma1(e)
+	eor	$T0,$T0,$a,ror#$Sigma0[1]
+	add	$h,$h,$t1			// h+=Ch(e,f,g)
+	and	$t3,$t3,$t2			// (b^c)&=(a^b)
+	eor	$T2,$T2, at X[($j+14)&15],ror#$sigma1[1]
+	eor	$T1,$T1, at X[($j+1)&15],lsr#$sigma0[2]	// sigma0(X[i+1])
+	add	$h,$h,$t0			// h+=Sigma1(e)
+	eor	$t3,$t3,$b			// Maj(a,b,c)
+	eor	$t1,$T0,$a,ror#$Sigma0[2]	// Sigma0(a)
+	eor	$T2,$T2, at X[($j+14)&15],lsr#$sigma1[2]	// sigma1(X[i+14])
+	add	@X[$j], at X[$j], at X[($j+9)&15]
+	add	$d,$d,$h			// d+=h
+	add	$h,$h,$t3			// h+=Maj(a,b,c)
+	ldr	$t3,[$Ktbl],#$SZ		// *K++, $t2 in next round
+	add	@X[$j], at X[$j],$T1
+	add	$h,$h,$t1			// h+=Sigma0(a)
+	add	@X[$j], at X[$j],$T2
+___
+	($t2,$t3)=($t3,$t2);
+}
+
+$code.=<<___;
+#ifndef	__KERNEL__
+# include "arm_arch.h"
+#endif
+
+.text
+
+.extern	OPENSSL_armcap_P
+.globl	$func
+.type	$func,%function
+.align	6
+$func:
+___
+$code.=<<___	if ($SZ==4);
+#ifndef	__KERNEL__
+# ifdef	__ILP32__
+	ldrsw	x16,.LOPENSSL_armcap_P
+# else
+	ldr	x16,.LOPENSSL_armcap_P
+# endif
+	adr	x17,.LOPENSSL_armcap_P
+	add	x16,x16,x17
+	ldr	w16,[x16]
+	tst	w16,#ARMV8_SHA256
+	b.ne	.Lv8_entry
+	tst	w16,#ARMV7_NEON
+	b.ne	.Lneon_entry
+#endif
+___
+$code.=<<___;
+	stp	x29,x30,[sp,#-128]!
+	add	x29,sp,#0
+
+	stp	x19,x20,[sp,#16]
+	stp	x21,x22,[sp,#32]
+	stp	x23,x24,[sp,#48]
+	stp	x25,x26,[sp,#64]
+	stp	x27,x28,[sp,#80]
+	sub	sp,sp,#4*$SZ
+
+	ldp	$A,$B,[$ctx]				// load context
+	ldp	$C,$D,[$ctx,#2*$SZ]
+	ldp	$E,$F,[$ctx,#4*$SZ]
+	add	$num,$inp,$num,lsl#`log(16*$SZ)/log(2)`	// end of input
+	ldp	$G,$H,[$ctx,#6*$SZ]
+	adr	$Ktbl,.LK$BITS
+	stp	$ctx,$num,[x29,#96]
+
+.Loop:
+	ldp	@X[0], at X[1],[$inp],#2*$SZ
+	ldr	$t2,[$Ktbl],#$SZ			// *K++
+	eor	$t3,$B,$C				// magic seed
+	str	$inp,[x29,#112]
+___
+for ($i=0;$i<16;$i++)	{ &BODY_00_xx($i, at V); unshift(@V,pop(@V)); }
+$code.=".Loop_16_xx:\n";
+for (;$i<32;$i++)	{ &BODY_00_xx($i, at V); unshift(@V,pop(@V)); }
+$code.=<<___;
+	cbnz	$t2,.Loop_16_xx
+
+	ldp	$ctx,$num,[x29,#96]
+	ldr	$inp,[x29,#112]
+	sub	$Ktbl,$Ktbl,#`$SZ*($rounds+1)`		// rewind
+
+	ldp	@X[0], at X[1],[$ctx]
+	ldp	@X[2], at X[3],[$ctx,#2*$SZ]
+	add	$inp,$inp,#14*$SZ			// advance input pointer
+	ldp	@X[4], at X[5],[$ctx,#4*$SZ]
+	add	$A,$A, at X[0]
+	ldp	@X[6], at X[7],[$ctx,#6*$SZ]
+	add	$B,$B, at X[1]
+	add	$C,$C, at X[2]
+	add	$D,$D, at X[3]
+	stp	$A,$B,[$ctx]
+	add	$E,$E, at X[4]
+	add	$F,$F, at X[5]
+	stp	$C,$D,[$ctx,#2*$SZ]
+	add	$G,$G, at X[6]
+	add	$H,$H,@X[7]
+	cmp	$inp,$num
+	stp	$E,$F,[$ctx,#4*$SZ]
+	stp	$G,$H,[$ctx,#6*$SZ]
+	b.ne	.Loop
+
+	ldp	x19,x20,[x29,#16]
+	add	sp,sp,#4*$SZ
+	ldp	x21,x22,[x29,#32]
+	ldp	x23,x24,[x29,#48]
+	ldp	x25,x26,[x29,#64]
+	ldp	x27,x28,[x29,#80]
+	ldp	x29,x30,[sp],#128
+	ret
+.size	$func,.-$func
+
+.align	6
+.type	.LK$BITS,%object
+.LK$BITS:
+___
+$code.=<<___ if ($SZ==8);
+	.quad	0x428a2f98d728ae22,0x7137449123ef65cd
+	.quad	0xb5c0fbcfec4d3b2f,0xe9b5dba58189dbbc
+	.quad	0x3956c25bf348b538,0x59f111f1b605d019
+	.quad	0x923f82a4af194f9b,0xab1c5ed5da6d8118
+	.quad	0xd807aa98a3030242,0x12835b0145706fbe
+	.quad	0x243185be4ee4b28c,0x550c7dc3d5ffb4e2
+	.quad	0x72be5d74f27b896f,0x80deb1fe3b1696b1
+	.quad	0x9bdc06a725c71235,0xc19bf174cf692694
+	.quad	0xe49b69c19ef14ad2,0xefbe4786384f25e3
+	.quad	0x0fc19dc68b8cd5b5,0x240ca1cc77ac9c65
+	.quad	0x2de92c6f592b0275,0x4a7484aa6ea6e483
+	.quad	0x5cb0a9dcbd41fbd4,0x76f988da831153b5
+	.quad	0x983e5152ee66dfab,0xa831c66d2db43210
+	.quad	0xb00327c898fb213f,0xbf597fc7beef0ee4
+	.quad	0xc6e00bf33da88fc2,0xd5a79147930aa725
+	.quad	0x06ca6351e003826f,0x142929670a0e6e70
+	.quad	0x27b70a8546d22ffc,0x2e1b21385c26c926
+	.quad	0x4d2c6dfc5ac42aed,0x53380d139d95b3df
+	.quad	0x650a73548baf63de,0x766a0abb3c77b2a8
+	.quad	0x81c2c92e47edaee6,0x92722c851482353b
+	.quad	0xa2bfe8a14cf10364,0xa81a664bbc423001
+	.quad	0xc24b8b70d0f89791,0xc76c51a30654be30
+	.quad	0xd192e819d6ef5218,0xd69906245565a910
+	.quad	0xf40e35855771202a,0x106aa07032bbd1b8
+	.quad	0x19a4c116b8d2d0c8,0x1e376c085141ab53
+	.quad	0x2748774cdf8eeb99,0x34b0bcb5e19b48a8
+	.quad	0x391c0cb3c5c95a63,0x4ed8aa4ae3418acb
+	.quad	0x5b9cca4f7763e373,0x682e6ff3d6b2b8a3
+	.quad	0x748f82ee5defb2fc,0x78a5636f43172f60
+	.quad	0x84c87814a1f0ab72,0x8cc702081a6439ec
+	.quad	0x90befffa23631e28,0xa4506cebde82bde9
+	.quad	0xbef9a3f7b2c67915,0xc67178f2e372532b
+	.quad	0xca273eceea26619c,0xd186b8c721c0c207
+	.quad	0xeada7dd6cde0eb1e,0xf57d4f7fee6ed178
+	.quad	0x06f067aa72176fba,0x0a637dc5a2c898a6
+	.quad	0x113f9804bef90dae,0x1b710b35131c471b
+	.quad	0x28db77f523047d84,0x32caab7b40c72493
+	.quad	0x3c9ebe0a15c9bebc,0x431d67c49c100d4c
+	.quad	0x4cc5d4becb3e42b6,0x597f299cfc657e2a
+	.quad	0x5fcb6fab3ad6faec,0x6c44198c4a475817
+	.quad	0	// terminator
+___
+$code.=<<___ if ($SZ==4);
+	.long	0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5
+	.long	0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5
+	.long	0xd807aa98,0x12835b01,0x243185be,0x550c7dc3
+	.long	0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174
+	.long	0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc
+	.long	0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da
+	.long	0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7
+	.long	0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967
+	.long	0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13
+	.long	0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85
+	.long	0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3
+	.long	0xd192e819,0xd6990624,0xf40e3585,0x106aa070
+	.long	0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5
+	.long	0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3
+	.long	0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208
+	.long	0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
+	.long	0	//terminator
+___
+$code.=<<___;
+.size	.LK$BITS,.-.LK$BITS
+#ifndef	__KERNEL__
+.align	3
+.LOPENSSL_armcap_P:
+# ifdef	__ILP32__
+	.long	OPENSSL_armcap_P-.
+# else
+	.quad	OPENSSL_armcap_P-.
+# endif
+#endif
+.asciz	"SHA$BITS block transform for ARMv8, CRYPTOGAMS by <appro\@openssl.org>"
+.align	2
+___
+
+if ($SZ==4) {
+my $Ktbl="x3";
+
+my ($ABCD,$EFGH,$abcd)=map("v$_.16b",(0..2));
+my @MSG=map("v$_.16b",(4..7));
+my ($W0,$W1)=("v16.4s","v17.4s");
+my ($ABCD_SAVE,$EFGH_SAVE)=("v18.16b","v19.16b");
+
+$code.=<<___;
+#ifndef	__KERNEL__
+.type	sha256_block_armv8,%function
+.align	6
+sha256_block_armv8:
+.Lv8_entry:
+	stp		x29,x30,[sp,#-16]!
+	add		x29,sp,#0
+
+	ld1.32		{$ABCD,$EFGH},[$ctx]
+	adr		$Ktbl,.LK256
+
+.Loop_hw:
+	ld1		{@MSG[0]- at MSG[3]},[$inp],#64
+	sub		$num,$num,#1
+	ld1.32		{$W0},[$Ktbl],#16
+	rev32		@MSG[0], at MSG[0]
+	rev32		@MSG[1], at MSG[1]
+	rev32		@MSG[2], at MSG[2]
+	rev32		@MSG[3], at MSG[3]
+	orr		$ABCD_SAVE,$ABCD,$ABCD		// offload
+	orr		$EFGH_SAVE,$EFGH,$EFGH
+___
+for($i=0;$i<12;$i++) {
+$code.=<<___;
+	ld1.32		{$W1},[$Ktbl],#16
+	add.i32		$W0,$W0, at MSG[0]
+	sha256su0	@MSG[0], at MSG[1]
+	orr		$abcd,$ABCD,$ABCD
+	sha256h		$ABCD,$EFGH,$W0
+	sha256h2	$EFGH,$abcd,$W0
+	sha256su1	@MSG[0], at MSG[2], at MSG[3]
+___
+	($W0,$W1)=($W1,$W0);	push(@MSG,shift(@MSG));
+}
+$code.=<<___;
+	ld1.32		{$W1},[$Ktbl],#16
+	add.i32		$W0,$W0, at MSG[0]
+	orr		$abcd,$ABCD,$ABCD
+	sha256h		$ABCD,$EFGH,$W0
+	sha256h2	$EFGH,$abcd,$W0
+
+	ld1.32		{$W0},[$Ktbl],#16
+	add.i32		$W1,$W1, at MSG[1]
+	orr		$abcd,$ABCD,$ABCD
+	sha256h		$ABCD,$EFGH,$W1
+	sha256h2	$EFGH,$abcd,$W1
+
+	ld1.32		{$W1},[$Ktbl]
+	add.i32		$W0,$W0, at MSG[2]
+	sub		$Ktbl,$Ktbl,#$rounds*$SZ-16	// rewind
+	orr		$abcd,$ABCD,$ABCD
+	sha256h		$ABCD,$EFGH,$W0
+	sha256h2	$EFGH,$abcd,$W0
+
+	add.i32		$W1,$W1, at MSG[3]
+	orr		$abcd,$ABCD,$ABCD
+	sha256h		$ABCD,$EFGH,$W1
+	sha256h2	$EFGH,$abcd,$W1
+
+	add.i32		$ABCD,$ABCD,$ABCD_SAVE
+	add.i32		$EFGH,$EFGH,$EFGH_SAVE
+
+	cbnz		$num,.Loop_hw
+
+	st1.32		{$ABCD,$EFGH},[$ctx]
+
+	ldr		x29,[sp],#16
+	ret
+.size	sha256_block_armv8,.-sha256_block_armv8
+#endif
+___
+}
+
+if ($SZ==4) {	######################################### NEON stuff #
+# You'll surely note a lot of similarities with sha256-armv4 module,
+# and of course it's not a coincidence. sha256-armv4 was used as
+# initial template, but was adapted for ARMv8 instruction set and
+# extensively re-tuned for all-round performance.
+
+my @V = ($A,$B,$C,$D,$E,$F,$G,$H) = map("w$_",(3..10));
+my ($t0,$t1,$t2,$t3,$t4) = map("w$_",(11..15));
+my $Ktbl="x16";
+my $Xfer="x17";
+my @X = map("q$_",(0..3));
+my ($T0,$T1,$T2,$T3,$T4,$T5,$T6,$T7) = map("q$_",(4..7,16..19));
+my $j=0;
+
+sub AUTOLOAD()          # thunk [simplified] x86-style perlasm
+{ my $opcode = $AUTOLOAD; $opcode =~ s/.*:://; $opcode =~ s/_/\./;
+  my $arg = pop;
+    $arg = "#$arg" if ($arg*1 eq $arg);
+    $code .= "\t$opcode\t".join(',', at _,$arg)."\n";
+}
+
+sub Dscalar { shift =~ m|[qv]([0-9]+)|?"d$1":""; }
+sub Dlo     { shift =~ m|[qv]([0-9]+)|?"v$1.d[0]":""; }
+sub Dhi     { shift =~ m|[qv]([0-9]+)|?"v$1.d[1]":""; }
+
+sub Xupdate()
+{ use integer;
+  my $body = shift;
+  my @insns = (&$body,&$body,&$body,&$body);
+  my ($a,$b,$c,$d,$e,$f,$g,$h);
+
+	&ext_8		($T0, at X[0], at X[1],4);	# X[1..4]
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&ext_8		($T3, at X[2], at X[3],4);	# X[9..12]
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&mov		(&Dscalar($T7),&Dhi(@X[3]));	# X[14..15]
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&ushr_32	($T2,$T0,$sigma0[0]);
+	 eval(shift(@insns));
+	&ushr_32	($T1,$T0,$sigma0[2]);
+	 eval(shift(@insns));
+	&add_32 	(@X[0], at X[0],$T3);	# X[0..3] += X[9..12]
+	 eval(shift(@insns));
+	&sli_32		($T2,$T0,32-$sigma0[0]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&ushr_32	($T3,$T0,$sigma0[1]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&eor_8		($T1,$T1,$T2);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&sli_32		($T3,$T0,32-$sigma0[1]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &ushr_32	($T4,$T7,$sigma1[0]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&eor_8		($T1,$T1,$T3);		# sigma0(X[1..4])
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &sli_32	($T4,$T7,32-$sigma1[0]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &ushr_32	($T5,$T7,$sigma1[2]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &ushr_32	($T3,$T7,$sigma1[1]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&add_32		(@X[0], at X[0],$T1);	# X[0..3] += sigma0(X[1..4])
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &sli_u32	($T3,$T7,32-$sigma1[1]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &eor_8	($T5,$T5,$T4);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &eor_8	($T5,$T5,$T3);		# sigma1(X[14..15])
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&add_32		(@X[0], at X[0],$T5);	# X[0..1] += sigma1(X[14..15])
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &ushr_32	($T6, at X[0],$sigma1[0]);
+	 eval(shift(@insns));
+	  &ushr_32	($T7, at X[0],$sigma1[2]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &sli_32	($T6, at X[0],32-$sigma1[0]);
+	 eval(shift(@insns));
+	  &ushr_32	($T5, at X[0],$sigma1[1]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &eor_8	($T7,$T7,$T6);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	  &sli_32	($T5, at X[0],32-$sigma1[1]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&ld1_32		("{$T0}","[$Ktbl], #16");
+	 eval(shift(@insns));
+	  &eor_8	($T7,$T7,$T5);		# sigma1(X[16..17])
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&eor_8		($T5,$T5,$T5);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&mov		(&Dhi($T5), &Dlo($T7));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&add_32		(@X[0], at X[0],$T5);	# X[2..3] += sigma1(X[16..17])
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&add_32		($T0,$T0,@X[0]);
+	 while($#insns>=1) { eval(shift(@insns)); }
+	&st1_32		("{$T0}","[$Xfer], #16");
+	 eval(shift(@insns));
+
+	push(@X,shift(@X));		# "rotate" X[]
+}
+
+sub Xpreload()
+{ use integer;
+  my $body = shift;
+  my @insns = (&$body,&$body,&$body,&$body);
+  my ($a,$b,$c,$d,$e,$f,$g,$h);
+
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&ld1_8		("{@X[0]}","[$inp],#16");
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&ld1_32		("{$T0}","[$Ktbl],#16");
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&rev32		(@X[0], at X[0]);
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	 eval(shift(@insns));
+	&add_32		($T0,$T0, at X[0]);
+	 foreach (@insns) { eval; }	# remaining instructions
+	&st1_32		("{$T0}","[$Xfer], #16");
+
+	push(@X,shift(@X));		# "rotate" X[]
+}
+
+sub body_00_15 () {
+	(
+	'($a,$b,$c,$d,$e,$f,$g,$h)=@V;'.
+	'&add	($h,$h,$t1)',			# h+=X[i]+K[i]
+	'&add	($a,$a,$t4);'.			# h+=Sigma0(a) from the past
+	'&and	($t1,$f,$e)',
+	'&bic	($t4,$g,$e)',
+	'&eor	($t0,$e,$e,"ror#".($Sigma1[1]-$Sigma1[0]))',
+	'&add	($a,$a,$t2)',			# h+=Maj(a,b,c) from the past
+	'&orr	($t1,$t1,$t4)',			# Ch(e,f,g)
+	'&eor	($t0,$t0,$e,"ror#".($Sigma1[2]-$Sigma1[0]))',	# Sigma1(e)
+	'&eor	($t4,$a,$a,"ror#".($Sigma0[1]-$Sigma0[0]))',
+	'&add	($h,$h,$t1)',			# h+=Ch(e,f,g)
+	'&ror	($t0,$t0,"#$Sigma1[0]")',
+	'&eor	($t2,$a,$b)',			# a^b, b^c in next round
+	'&eor	($t4,$t4,$a,"ror#".($Sigma0[2]-$Sigma0[0]))',	# Sigma0(a)
+	'&add	($h,$h,$t0)',			# h+=Sigma1(e)
+	'&ldr	($t1,sprintf "[sp,#%d]",4*(($j+1)&15))	if (($j&15)!=15);'.
+	'&ldr	($t1,"[$Ktbl]")				if ($j==15);'.
+	'&and	($t3,$t3,$t2)',			# (b^c)&=(a^b)
+	'&ror	($t4,$t4,"#$Sigma0[0]")',
+	'&add	($d,$d,$h)',			# d+=h
+	'&eor	($t3,$t3,$b)',			# Maj(a,b,c)
+	'$j++;	unshift(@V,pop(@V)); ($t2,$t3)=($t3,$t2);'
+	)
+}
+
+$code.=<<___;
+#ifdef	__KERNEL__
+.globl	sha256_block_neon
+#endif
+.type	sha256_block_neon,%function
+.align	4
+sha256_block_neon:
+.Lneon_entry:
+	stp	x29, x30, [sp, #-16]!
+	mov	x29, sp
+	sub	sp,sp,#16*4
+
+	adr	$Ktbl,.LK256
+	add	$num,$inp,$num,lsl#6	// len to point at the end of inp
+
+	ld1.8	{@X[0]},[$inp], #16
+	ld1.8	{@X[1]},[$inp], #16
+	ld1.8	{@X[2]},[$inp], #16
+	ld1.8	{@X[3]},[$inp], #16
+	ld1.32	{$T0},[$Ktbl], #16
+	ld1.32	{$T1},[$Ktbl], #16
+	ld1.32	{$T2},[$Ktbl], #16
+	ld1.32	{$T3},[$Ktbl], #16
+	rev32	@X[0], at X[0]		// yes, even on
+	rev32	@X[1], at X[1]		// big-endian
+	rev32	@X[2], at X[2]
+	rev32	@X[3], at X[3]
+	mov	$Xfer,sp
+	add.32	$T0,$T0, at X[0]
+	add.32	$T1,$T1, at X[1]
+	add.32	$T2,$T2, at X[2]
+	st1.32	{$T0-$T1},[$Xfer], #32
+	add.32	$T3,$T3,@X[3]
+	st1.32	{$T2-$T3},[$Xfer]
+	sub	$Xfer,$Xfer,#32
+
+	ldp	$A,$B,[$ctx]
+	ldp	$C,$D,[$ctx,#8]
+	ldp	$E,$F,[$ctx,#16]
+	ldp	$G,$H,[$ctx,#24]
+	ldr	$t1,[sp,#0]
+	mov	$t2,wzr
+	eor	$t3,$B,$C
+	mov	$t4,wzr
+	b	.L_00_48
+
+.align	4
+.L_00_48:
+___
+	&Xupdate(\&body_00_15);
+	&Xupdate(\&body_00_15);
+	&Xupdate(\&body_00_15);
+	&Xupdate(\&body_00_15);
+$code.=<<___;
+	cmp	$t1,#0				// check for K256 terminator
+	ldr	$t1,[sp,#0]
+	sub	$Xfer,$Xfer,#64
+	bne	.L_00_48
+
+	sub	$Ktbl,$Ktbl,#256		// rewind $Ktbl
+	cmp	$inp,$num
+	mov	$Xfer, #64
+	csel	$Xfer, $Xfer, xzr, eq
+	sub	$inp,$inp,$Xfer			// avoid SEGV
+	mov	$Xfer,sp
+___
+	&Xpreload(\&body_00_15);
+	&Xpreload(\&body_00_15);
+	&Xpreload(\&body_00_15);
+	&Xpreload(\&body_00_15);
+$code.=<<___;
+	add	$A,$A,$t4			// h+=Sigma0(a) from the past
+	ldp	$t0,$t1,[$ctx,#0]
+	add	$A,$A,$t2			// h+=Maj(a,b,c) from the past
+	ldp	$t2,$t3,[$ctx,#8]
+	add	$A,$A,$t0			// accumulate
+	add	$B,$B,$t1
+	ldp	$t0,$t1,[$ctx,#16]
+	add	$C,$C,$t2
+	add	$D,$D,$t3
+	ldp	$t2,$t3,[$ctx,#24]
+	add	$E,$E,$t0
+	add	$F,$F,$t1
+	 ldr	$t1,[sp,#0]
+	stp	$A,$B,[$ctx,#0]
+	add	$G,$G,$t2
+	 mov	$t2,wzr
+	stp	$C,$D,[$ctx,#8]
+	add	$H,$H,$t3
+	stp	$E,$F,[$ctx,#16]
+	 eor	$t3,$B,$C
+	stp	$G,$H,[$ctx,#24]
+	 mov	$t4,wzr
+	 mov	$Xfer,sp
+	b.ne	.L_00_48
+
+	ldr	x29,[x29]
+	add	sp,sp,#16*4+16
+	ret
+.size	sha256_block_neon,.-sha256_block_neon
+___
+}
+
+$code.=<<___;
+#ifndef	__KERNEL__
+.comm	OPENSSL_armcap_P,4,4
+#endif
+___
+
+{   my  %opcode = (
+	"sha256h"	=> 0x5e004000,	"sha256h2"	=> 0x5e005000,
+	"sha256su0"	=> 0x5e282800,	"sha256su1"	=> 0x5e006000	);
+
+    sub unsha256 {
+	my ($mnemonic,$arg)=@_;
+
+	$arg =~ m/[qv]([0-9]+)[^,]*,\s*[qv]([0-9]+)[^,]*(?:,\s*[qv]([0-9]+))?/o
+	&&
+	sprintf ".inst\t0x%08x\t//%s %s",
+			$opcode{$mnemonic}|$1|($2<<5)|($3<<16),
+			$mnemonic,$arg;
+    }
+}
+
+open SELF,$0;
+while(<SELF>) {
+        next if (/^#!/);
+        last if (!s/^#/\/\// and !/^$/);
+        print;
+}
+close SELF;
+
+foreach(split("\n",$code)) {
+
+	s/\`([^\`]*)\`/eval($1)/ge;
+
+	s/\b(sha256\w+)\s+([qv].*)/unsha256($1,$2)/ge;
+
+	s/\bq([0-9]+)\b/v$1.16b/g;		# old->new registers
+
+	s/\.[ui]?8(\s)/$1/;
+	s/\.\w?32\b//		and s/\.16b/\.4s/g;
+	m/(ld|st)1[^\[]+\[0\]/	and s/\.4s/\.s/g;
+
+	print $_,"\n";
+}
+
+close STDOUT;
diff --git a/arch/arm64/crypto/sha512-glue.c b/arch/arm64/crypto/sha512-glue.c
new file mode 100644
index 000000000000..aff35c9992a4
--- /dev/null
+++ b/arch/arm64/crypto/sha512-glue.c
@@ -0,0 +1,94 @@
+/*
+ * Linux/arm64 port of the OpenSSL SHA512 implementation for AArch64
+ *
+ * Copyright (c) 2016 Linaro Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ */
+
+#include <crypto/internal/hash.h>
+#include <linux/cryptohash.h>
+#include <linux/types.h>
+#include <linux/string.h>
+#include <crypto/sha.h>
+#include <crypto/sha512_base.h>
+#include <asm/neon.h>
+
+MODULE_DESCRIPTION("SHA-384/SHA-512 secure hash for arm64");
+MODULE_AUTHOR("Andy Polyakov <appro@openssl.org>");
+MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("sha384");
+MODULE_ALIAS_CRYPTO("sha512");
+
+asmlinkage void sha512_block_data_order(u32 *digest, const void *data,
+					unsigned int num_blks);
+
+static int sha512_update(struct shash_desc *desc, const u8 *data,
+			 unsigned int len)
+{
+	return sha512_base_do_update(desc, data, len,
+			(sha512_block_fn *)sha512_block_data_order);
+}
+
+static int sha512_finup(struct shash_desc *desc, const u8 *data,
+			unsigned int len, u8 *out)
+{
+	if (len)
+		sha512_base_do_update(desc, data, len,
+			(sha512_block_fn *)sha512_block_data_order);
+	sha512_base_do_finalize(desc,
+			(sha512_block_fn *)sha512_block_data_order);
+
+	return sha512_base_finish(desc, out);
+}
+
+static int sha512_final(struct shash_desc *desc, u8 *out)
+{
+	return sha512_finup(desc, NULL, 0, out);
+}
+
+static struct shash_alg algs[] = { {
+	.digestsize		= SHA512_DIGEST_SIZE,
+	.init			= sha512_base_init,
+	.update			= sha512_update,
+	.final			= sha512_final,
+	.finup			= sha512_finup,
+	.descsize		= sizeof(struct sha512_state),
+	.base.cra_name		= "sha512",
+	.base.cra_driver_name	= "sha512-arm64",
+	.base.cra_priority	= 150,
+	.base.cra_flags		= CRYPTO_ALG_TYPE_SHASH,
+	.base.cra_blocksize	= SHA512_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+}, {
+	.digestsize		= SHA384_DIGEST_SIZE,
+	.init			= sha384_base_init,
+	.update			= sha512_update,
+	.final			= sha512_final,
+	.finup			= sha512_finup,
+	.descsize		= sizeof(struct sha512_state),
+	.base.cra_name		= "sha384",
+	.base.cra_driver_name	= "sha384-arm64",
+	.base.cra_priority	= 150,
+	.base.cra_flags		= CRYPTO_ALG_TYPE_SHASH,
+	.base.cra_blocksize	= SHA384_BLOCK_SIZE,
+	.base.cra_module	= THIS_MODULE,
+} };
+
+static int __init sha512_mod_init(void)
+{
+	return crypto_register_shashes(algs, ARRAY_SIZE(algs));
+}
+
+static void __exit sha512_mod_fini(void)
+{
+	crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
+}
+
+module_init(sha512_mod_init);
+module_exit(sha512_mod_fini);
-- 
2.7.4

^ permalink raw reply related

* [PATCH v14 4/9] acpi/arm64: Add GTDT table parse driver
From: Hanjun Guo @ 2016-11-11 13:46 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161020163719.GC27598@leverpostej>

Hi Mark,

Sorry for the late reply.

On 10/21/2016 12:37 AM, Mark Rutland wrote:
> Hi,
>
> As a heads-up, on v4.9-rc1 I see conflicts at least against
> arch/arm64/Kconfig. Luckily git am -3 seems to be able to fix that up
> automatically, but this will need to be rebased before the next posting
> and/or merging.
>
> On Thu, Sep 29, 2016 at 02:17:12AM +0800, fu.wei at linaro.org wrote:
>> +static int __init map_gt_gsi(u32 interrupt, u32 flags)
>> +{
>> +	int trigger, polarity;
>> +
>> +	if (!interrupt)
>> +		return 0;
>
> Urgh.
>
> Only the secure interrupt (which we do not need) is optional in this
> manner, and (hilariously), zero appears to also be a valid GSIV, per
> figure 5-24 in the ACPI 6.1 spec.
>
> So, I think that:
>
> (a) we should not bother parsing the secure interrupt
> (b) we should drop the check above
> (c) we should report the spec issue to the ASWG

Sorry, I willing to do that, but I need to figure out the issue here.
What kind of issue in detail? do you mean that zero should not be valid
for arch timer interrupts?

Thanks
Hanjun

^ permalink raw reply

* [PATCH v14 4/9] acpi/arm64: Add GTDT table parse driver
From: Hanjun Guo @ 2016-11-11 13:43 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CADyBb7tLJpPiMqWs4KZE=uCpo3HX3WcNVju2UKNuPOR0+M4UDw@mail.gmail.com>

On 10/26/2016 07:10 PM, Fu Wei wrote:
> Hi Mark,
>
> On 21 October 2016 at 00:37, Mark Rutland <mark.rutland@arm.com> wrote:
>> Hi,
>>
>> As a heads-up, on v4.9-rc1 I see conflicts at least against
>> arch/arm64/Kconfig. Luckily git am -3 seems to be able to fix that up
>> automatically, but this will need to be rebased before the next posting
>> and/or merging.
>>
>> On Thu, Sep 29, 2016 at 02:17:12AM +0800, fu.wei at linaro.org wrote:
>>> +static int __init map_gt_gsi(u32 interrupt, u32 flags)
>>> +{
>>> +     int trigger, polarity;
>>> +
>>> +     if (!interrupt)
>>> +             return 0;
>>
>> Urgh.
>>
>> Only the secure interrupt (which we do not need) is optional in this
>> manner, and (hilariously), zero appears to also be a valid GSIV, per
>> figure 5-24 in the ACPI 6.1 spec.
>>
>> So, I think that:
>>
>> (a) we should not bother parsing the secure interrupt
>
> If I understand correctly, from this point of view, kernel don't
> handle the secure interrupt.
> But the current arm_arch_timer driver still enable/disable/request
> PHYS_SECURE_PPI
> with PHYS_NONSECURE_PPI.
> That means we still need to parse the secure interrupt.
> Please correct me, if I misunderstand something? :-)
>
>> (b) we should drop the check above
>
> yes, if zero is a valid GSIV, this makes sense.
>
>> (c) we should report the spec issue to the ASWG
>>
>>> +/*
>>> + * acpi_gtdt_c3stop - got c3stop info from GTDT
>>> + *
>>> + * Returns 1 if the timer is powered in deep idle state, 0 otherwise.
>>> + */
>>> +bool __init acpi_gtdt_c3stop(void)
>>> +{
>>> +     struct acpi_table_gtdt *gtdt = acpi_gtdt_desc.gtdt;
>>> +
>>> +     return !(gtdt->non_secure_el1_flags & ACPI_GTDT_ALWAYS_ON);
>>> +}
>>
>> It looks like this can differ per interrupt. Shouldn't we check the
>> appropriate one?
>
> yes, I think you are right.

I think Mark already clarified this it's a global flag which defined
in the spec, and we don't need to update it.

Thanks
Hanjun

^ permalink raw reply

* [PATCH v3 0/2] arm64: Support systems without FP/ASIMD
From: Marc Zyngier @ 2016-11-11 13:41 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1478613381-5718-1-git-send-email-suzuki.poulose@arm.com>

On 08/11/16 13:56, Suzuki K Poulose wrote:
> This series adds supports to the kernel and KVM hyp to handle
> systems without FP/ASIMD properly. At the moment the kernel
> doesn't check if the FP unit is available before accessing
> the registers (e.g during context switch). Also for KVM,
> we trap the FP/ASIMD accesses and handle it by injecting an
> undefined instruction into the VM on systems without FP.
> 
> Tested on a FVP_Base-AEM-v8A model by disabling VFP on at
> least one CPU ( -C clusterX.cpuY.vfp-present=0 ).
> 
> Changes since V2:
>  - Dropped cleanup patch for arm64/crypto/aes-ce-ccm-glue.c
>  - Removed static_key check from cpus_have_cap. All users with
>    constant caps should use the new API to make use of static_keys.
>  - Removed a dedicated static_key used in irqchip-gic-v3.c for
>    Cavium errata with the new API.
> 
> Applies on v4.9-rc4 + [1] (which is pushed for rc5)
> 
> [1] http://marc.info/?l=linux-arm-kernel&m=147819889813214&w=2
> 
> 
> Suzuki K Poulose (2):
>   arm64: Add hypervisor safe helper for checking constant capabilities
>   arm64: Support systems without FP/ASIMD
> 
>  arch/arm64/include/asm/cpucaps.h    |  3 ++-
>  arch/arm64/include/asm/cpufeature.h | 24 +++++++++++++++++-------
>  arch/arm64/include/asm/neon.h       |  3 ++-
>  arch/arm64/kernel/cpufeature.c      | 17 ++++++++++++++++-
>  arch/arm64/kernel/fpsimd.c          | 14 ++++++++++++++
>  arch/arm64/kernel/process.c         |  2 +-
>  arch/arm64/kvm/handle_exit.c        | 11 +++++++++++
>  arch/arm64/kvm/hyp/hyp-entry.S      |  9 ++++++++-
>  arch/arm64/kvm/hyp/switch.c         |  5 ++++-
>  drivers/irqchip/irq-gic-v3.c        | 13 +------------
>  10 files changed, 76 insertions(+), 25 deletions(-)
> 

For the series:

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

How do we plan on merging this? Catalin, are you willing to take it all?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply

* [PATCH V5 3/3] ARM64 LPC: LPC driver implementation on Hip06
From: Gabriele Paoloni @ 2016-11-11 13:39 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <10334260.ztLXZ2Oynd@wuerfel>

Hi Arnd

> -----Original Message-----
> From: Arnd Bergmann [mailto:arnd at arndb.de]
> Sent: 10 November 2016 16:07
> To: Gabriele Paoloni
> Cc: linux-arm-kernel at lists.infradead.org; Yuanzhichang;
> mark.rutland at arm.com; devicetree at vger.kernel.org;
> lorenzo.pieralisi at arm.com; minyard at acm.org; linux-pci at vger.kernel.org;
> benh at kernel.crashing.org; John Garry; will.deacon at arm.com; linux-
> kernel at vger.kernel.org; xuwei (O); Linuxarm; zourongrong at gmail.com;
> robh+dt at kernel.org; kantyzc at 163.com; linux-serial at vger.kernel.org;
> catalin.marinas at arm.com; olof at lixom.net; liviu.dudau at arm.com;
> bhelgaas at googl e.com; zhichang.yuan02 at gmail.com
> Subject: Re: [PATCH V5 3/3] ARM64 LPC: LPC driver implementation on
> Hip06
> 
> On Thursday, November 10, 2016 3:36:49 PM CET Gabriele Paoloni wrote:
> >
> > Where should we get the range from? For LPC we know that it is going
> > Work on anything that is not used by PCI I/O space, and this is
> > why we use [0, PCIBIOS_MIN_IO]
> 
> It should be allocated the same way we allocate PCI config space
> segments. This is currently done with the io_range list in
> drivers/pci/pci.c, which isn't perfect but could be extended
> if necessary. Based on what others commented here, I'd rather
> make the differences between ISA/LPC and PCI I/O ranges smaller
> than larger.

I am not sure this would make sense...

IMHO all the mechanism around io_range_list is needed to provide the
"mapping" between I/O tokens and physical CPU addresses.

Currently the available tokens range from 0 to IO_SPACE_LIMIT.

As you know the I/O memory accessors operate on whatever
__of_address_to_resource sets into the resource (start, end).

With this special device in place we cannot know if a resource is
assigned with an I/O token or a physical address, unless we forbid
the I/O tokens to be in a specific range.

So this is why we are changing the offsets of all the functions
handling io_range_list (to make sure that a range is forbidden to
the tokens and is available to the physical addresses).

We have chosen this forbidden range to be [0, PCIBIOS_MIN_IO)
because this is the maximum physical I/O range that a non PCI device
can operate on and because we believe this does not impose much
restriction on the available I/O token range; that now is 
[PCIBIOS_MIN_IO, IO_SPACE_LIMIT].
So we believe that the chosen forbidden range can accommodate
any special ISA bus device with no much constraint on the rest
of I/O tokens...

> 
> > > Your current version has
> > >
> > >         if (arm64_extio_ops->pfout)                             \
> > >                 arm64_extio_ops->pfout(arm64_extio_ops->devpara,\
> > >                        addr, value, sizeof(type));             \
> > >
> > > Instead, just subtract the start of the range from the logical
> > > port number to transform it back into a bus-local port number:
> >
> > These accessors do not operate on IO tokens:
> >
> > If (arm64_extio_ops->start > addr || arm64_extio_ops->end < addr)
> > addr is not going to be an I/O token; in fact patch 2/3 imposes that
> > the I/O tokens will start at PCIBIOS_MIN_IO. So from 0 to
> PCIBIOS_MIN_IO
> > we have free physical addresses that the accessors can operate on.
> 
> Ah, I missed that part. I'd rather not use PCIBIOS_MIN_IO to refer to
> the logical I/O tokens, the purpose of that macro is really meant
> for allocating PCI I/O port numbers within the address space of
> one bus.

As I mentioned above, special devices operate on CPU addresses directly,
not I/O tokens. For them there is no way to distinguish....

> 
> Note that it's equally likely that whichever next platform needs
> non-mapped I/O access like this actually needs them for PCI I/O space,
> and that will use it on addresses registered to a PCI host bridge.

Ok so here you are talking about a platform that has got an I/O range
under the PCI host controller, right?
And this I/O range cannot be directly memory mapped but needs special
redirections for the I/O tokens, right?

In this scenario registering the I/O ranges with the forbidden range
implemented by the current patch would still allow to redirect I/O
tokens as long as arm64_extio_ops->start >= PCIBIOS_MIN_IO

So effectively the special PCI host controller
1) knows the physical range that needs special redirection
2) register such range
3) uses pci_pio_to_address() to retrieve the IO tokens for the
   special accessors
4) sets arm64_extio_ops->start/end to the IO tokens retrieved in 3)

So to be honest I think this patch can fit well both with
special PCI controllers that need I/O tokens redirection and with
special non-PCI controllers that need non-PCI I/O physical
address redirection...

Thanks (and sorry for the long reply but I didn't know how
to make the explanation shorter :) )

Gab

> 
> If we separate the two steps:
> 
> a) assign a range of logical I/O port numbers to a bus
> b) register a set of helpers for redirecting logical I/O
>    port to a helper function
> 
> then I think the code will get cleaner and more flexible.
> It should actually then be able to replace the powerpc
> specific implementation.
> 
> 	Arnd

^ permalink raw reply

* [PATCH v5 6/8] Documentation: bindings: add compatible specific to legacy SCPI protocol
From: Rob Herring @ 2016-11-11 13:34 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <ed58424e-e538-227a-37b5-b35fbe4f96ba@arm.com>

On Fri, Nov 11, 2016 at 1:48 AM, Sudeep Holla <sudeep.holla@arm.com> wrote:
> On 10/11/16 19:03, Olof Johansson wrote:
>> On Thu, Nov 10, 2016 at 6:34 AM, Sudeep Holla <sudeep.holla@arm.com>
>> wrote:

[...]

>>> E.g. Amlogic follows most of the legacy protocol though it deviates in
>>> couple of things which we can handle with platform specific compatible
>>> (in the following patch in the series). When another user(Rockchip ?)
>>> make use of this legacy protocol, we can start using those platform
>>> specific compatible for deviations only.
>>>
>>> Is that not acceptable ?
>>
>>
>> If there's no shared legacy feature set, then it's probably less
>> useful to have a shared less precise compatible value.
>>
>
> There is and will be some shared feature set for sure. At the least the
> standard command set will be shared.
>
>> What the main point I was trying to get across was that we shouldn't
>> expand the generic binding with per-vendor compatible fields, instead
>> we should have those as extensions on the side.
>>
>
> Yes I get the point. We will have per-vendor compatibles for handle the
> deviations but generic one to handle the shared set.
>
>> I'm also a little apprehensive of using "legacy", it goes in the same
>> bucket as "misc". At some point 1.0 will be legacy too, etc.
>>
>
> True and I agree, how about "arm,scpi-pre-1.0" instead ?

That's still meaningless. Convince me that multiple implementations
are identical, then we can have a common property. For example, how
many releases did ARM make before 1.0.

Rob

^ permalink raw reply

* [PATCH V6 2/3] ACPI: Add support for ResourceSource/IRQ domain mapping
From: Hanjun Guo @ 2016-11-11 13:33 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161110175814.GA12446@red-moon>

Hi Lorenzo,

On 11/11/2016 01:58 AM, Lorenzo Pieralisi wrote:
> On Thu, Nov 10, 2016 at 10:02:35AM -0500, agustinv at codeaurora.org wrote:
>> Hey Hanjun,
>>
>> On 2016-11-09 21:36, Hanjun Guo wrote:
>>> Hi Marc, Rafael, Lorenzo,
>>>
>>> Since we agreed to add a probe deferral if we failed to get irq
>>> resources which mirroring the DT does (patch 1 in this patch set),
>>> I think the last blocker to make things work both for Agustin and
>>> me [1] is this patch, which makes the interrupt producer and consumer
>>> work in ACPI, we have two different solution for one thing, we'd happy
>>> to work together for one solution, could you give some suggestions
>>> please?
>>>
>>> [1]: https://mail-archive.com/linux-kernel at vger.kernel.org/msg1257419.html
>>>
>>> Agustin, I have some comments below.
>>>
>>> On 2016/10/29 4:48, Agustin Vega-Frias wrote:
>>>> This allows irqchip drivers to associate an ACPI DSDT device to
>>>> an IRQ domain and provides support for using the ResourceSource
>>>> in Extended IRQ Resources to find the domain and map the IRQs
>>>> specified on that domain.
>>>>
>>>> Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
>>>> ---
>>>> drivers/acpi/Makefile    |   1 +
>>>> drivers/acpi/irqdomain.c | 119
>>>> +++++++++++++++++++++++++++++++++++++++++++++++
>>>
>>> Could we just reuse the gsi.c and not introduce a new
>>> file, probably we can change the gsi.c to irqdomain.c
>>> or something similar, then reuse the code in gsi.c.
>>
>> I was thinking just that after we chatted off-list.
>
> Yes, that's a fair point.
>
>> I might revisit and see what I come up with given that we already have
>> a device argument and we could pass the IRQ source there.
>
> I agree with the approach taken by this patch, I do not like much
> passing around struct acpi_resource_source *source (in particular
> the dummy struct) I do not think it is needed, I will comment on
> the code.

thanks for your time to have a look:)

>
> Hopefully there is not any buggy FW out there that does use the
> resource source inappropriately otherwise we will notice on x86/ia64
> (ie you can't blame FW if it breaks the kernel) but I suspect the
> only way to find out is by trying, the patch has to go through Rafael's
> review anyway before getting there so it is fine.

I think we can avoid that by not touching the logic that x86/ia64
already used, but only adding interrupt producer/consumer function.

Thanks
Hanjun

^ permalink raw reply

* [PATCH v6 3/9] drm/hisilicon/hibmc: Add support for frame buffer
From: Sean Paul @ 2016-11-11 13:27 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <5825C496.5070905@huawei.com>

On Fri, Nov 11, 2016 at 8:16 AM, Rongrong Zou <zourongrong@huawei.com> wrote:
> ? 2016/11/11 2:30, Sean Paul ??:
>>
>> On Fri, Oct 28, 2016 at 3:27 AM, Rongrong Zou <zourongrong@gmail.com>
>> wrote:
>>>
>>> Add support for fbdev and kms fb management.
>>>
>>> Signed-off-by: Rongrong Zou <zourongrong@gmail.com>
>>> ---
>>>   drivers/gpu/drm/hisilicon/hibmc/Makefile          |   2 +-
>>>   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c   |  17 ++
>>>   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h   |  24 ++
>>>   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c | 255
>>> ++++++++++++++++++++++
>>>   drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c       |  66 ++++++
>>>   5 files changed, 363 insertions(+), 1 deletion(-)
>>>   create mode 100644 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
>>>
>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/Makefile
>>> b/drivers/gpu/drm/hisilicon/hibmc/Makefile
>>> index d5c40b8..810a37e 100644
>>> --- a/drivers/gpu/drm/hisilicon/hibmc/Makefile
>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/Makefile
>>> @@ -1,5 +1,5 @@
>>>   ccflags-y := -Iinclude/drm
>>> -hibmc-drm-y := hibmc_drm_drv.o hibmc_drm_power.o hibmc_ttm.o
>>> +hibmc-drm-y := hibmc_drm_drv.o hibmc_drm_fbdev.o hibmc_drm_power.o
>>> hibmc_ttm.o
>>>
>>>   obj-$(CONFIG_DRM_HISI_HIBMC)   +=hibmc-drm.o
>>>   #obj-y += hibmc-drm.o
>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>>> b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>>> index 81f4301..5ac7a7e 100644
>>> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>>> @@ -66,11 +66,23 @@ static void hibmc_disable_vblank(struct drm_device
>>> *dev, unsigned int pipe)
>>>
>>>   static int hibmc_pm_suspend(struct device *dev)
>>>   {
>>> +       struct pci_dev *pdev = to_pci_dev(dev);
>>> +       struct drm_device *drm_dev = pci_get_drvdata(pdev);
>>> +       struct hibmc_drm_device *hidev = drm_dev->dev_private;
>>> +
>>> +       drm_fb_helper_set_suspend_unlocked(&hidev->fbdev->helper, 1);
>>> +
>>
>>
>> We have atomic helpers for suspend/resume now. Please use those.
>
>
> drm_atomic_helper_suspend/resume()?
>

Correct

>
>>
>>>          return 0;
>>>   }
>>>
>>>   static int hibmc_pm_resume(struct device *dev)
>>>   {
>>> +       struct pci_dev *pdev = to_pci_dev(dev);
>>> +       struct drm_device *drm_dev = pci_get_drvdata(pdev);
>>> +       struct hibmc_drm_device *hidev = drm_dev->dev_private;
>>> +
>>> +       drm_fb_helper_set_suspend_unlocked(&hidev->fbdev->helper, 0);
>>> +
>>>          return 0;
>>>   }
>>>
>>> @@ -170,6 +182,7 @@ static int hibmc_unload(struct drm_device *dev)
>>>   {
>>>          struct hibmc_drm_device *hidev = dev->dev_private;
>>>
>>> +       hibmc_fbdev_fini(hidev);
>>>          hibmc_mm_fini(hidev);
>>>          hibmc_hw_fini(hidev);
>>>          dev->dev_private = NULL;
>>> @@ -195,6 +208,10 @@ static int hibmc_load(struct drm_device *dev,
>>> unsigned long flags)
>>>          if (ret)
>>>                  goto err;
>>>
>>> +       ret = hibmc_fbdev_init(hidev);
>>> +       if (ret)
>>> +               goto err;
>>> +
>>>          return 0;
>>>
>>>   err:
>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>>> b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>>> index db8d80e..a40e9a7 100644
>>> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>>> @@ -20,9 +20,22 @@
>>>   #define HIBMC_DRM_DRV_H
>>>
>>>   #include <drm/drmP.h>
>>> +#include <drm/drm_fb_helper.h>
>>>   #include <drm/ttm/ttm_bo_driver.h>
>>>   #include <drm/drm_gem.h>
>>>
>>> +struct hibmc_framebuffer {
>>> +       struct drm_framebuffer fb;
>>> +       struct drm_gem_object *obj;
>>> +       bool is_fbdev_fb;
>>> +};
>>> +
>>> +struct hibmc_fbdev {
>>> +       struct drm_fb_helper helper;
>>> +       struct hibmc_framebuffer *fb;
>>> +       int size;
>>> +};
>>> +
>>>   struct hibmc_drm_device {
>>>          /* hw */
>>>          void __iomem   *mmio;
>>> @@ -41,9 +54,13 @@ struct hibmc_drm_device {
>>>                  bool initialized;
>>>          } ttm;
>>>
>>> +       /* fbdev */
>>> +       struct hibmc_fbdev *fbdev;
>>>          bool mm_inited;
>>>   };
>>>
>>> +#define to_hibmc_framebuffer(x) container_of(x, struct
>>> hibmc_framebuffer, fb)
>>> +
>>>   struct hibmc_bo {
>>>          struct ttm_buffer_object bo;
>>>          struct ttm_placement placement;
>>> @@ -65,8 +82,15 @@ static inline struct hibmc_bo *gem_to_hibmc_bo(struct
>>> drm_gem_object *gem)
>>>
>>>   #define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
>>>
>>> +int hibmc_fbdev_init(struct hibmc_drm_device *hidev);
>>> +void hibmc_fbdev_fini(struct hibmc_drm_device *hidev);
>>> +
>>>   int hibmc_gem_create(struct drm_device *dev, u32 size, bool iskernel,
>>>                       struct drm_gem_object **obj);
>>> +struct hibmc_framebuffer *
>>> +hibmc_framebuffer_init(struct drm_device *dev,
>>> +                      const struct drm_mode_fb_cmd2 *mode_cmd,
>>> +                      struct drm_gem_object *obj);
>>>
>>>   int hibmc_mm_init(struct hibmc_drm_device *hibmc);
>>>   void hibmc_mm_fini(struct hibmc_drm_device *hibmc);
>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
>>> b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
>>> new file mode 100644
>>> index 0000000..630124b
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
>>> @@ -0,0 +1,255 @@
>>> +/* Hisilicon Hibmc SoC drm driver
>>> + *
>>> + * Based on the bochs drm driver.
>>> + *
>>> + * Copyright (c) 2016 Huawei Limited.
>>> + *
>>> + * Author:
>>> + *     Rongrong Zou <zourongrong@huawei.com>
>>> + *     Rongrong Zou <zourongrong@gmail.com>
>>> + *     Jianhua Li <lijianhua@huawei.com>
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License as published by
>>> + * the Free Software Foundation; either version 2 of the License, or
>>> + * (at your option) any later version.
>>> + *
>>> + */
>>> +
>>> +#include <drm/drm_crtc.h>
>>> +#include <drm/drm_crtc_helper.h>
>>> +#include <drm/drm_fb_helper.h>
>>> +
>>> +#include "hibmc_drm_drv.h"
>>> +
>>> +/*
>>> ---------------------------------------------------------------------- */
>>
>>
>> Please remove
>
>
> will do that, thanks.
>
>
>>
>>> +
>>> +static int hibmcfb_create_object(
>>> +                               struct hibmc_drm_device *hidev,
>>> +                               const struct drm_mode_fb_cmd2 *mode_cmd,
>>> +                               struct drm_gem_object **gobj_p)
>>> +{
>>> +       struct drm_gem_object *gobj;
>>> +       struct drm_device *dev = hidev->dev;
>>> +       u32 size;
>>> +       int ret = 0;
>>> +
>>> +       size = mode_cmd->pitches[0] * mode_cmd->height;
>>> +       ret = hibmc_gem_create(dev, size, true, &gobj);
>>> +       if (ret)
>>> +               return ret;
>>> +
>>> +       *gobj_p = gobj;
>>> +       return ret;
>>> +}
>>> +
>>> +static struct fb_ops hibmc_drm_fb_ops = {
>>> +       .owner = THIS_MODULE,
>>> +       .fb_check_var = drm_fb_helper_check_var,
>>> +       .fb_set_par = drm_fb_helper_set_par,
>>> +       .fb_fillrect = drm_fb_helper_sys_fillrect,
>>> +       .fb_copyarea = drm_fb_helper_sys_copyarea,
>>> +       .fb_imageblit = drm_fb_helper_sys_imageblit,
>>> +       .fb_pan_display = drm_fb_helper_pan_display,
>>> +       .fb_blank = drm_fb_helper_blank,
>>> +       .fb_setcmap = drm_fb_helper_setcmap,
>>> +};
>>> +
>>> +static int hibmc_drm_fb_create(struct drm_fb_helper *helper,
>>> +                              struct drm_fb_helper_surface_size *sizes)
>>> +{
>>> +       struct hibmc_fbdev *hi_fbdev =
>>> +               container_of(helper, struct hibmc_fbdev, helper);
>>> +       struct hibmc_drm_device *hidev =
>>> +               (struct hibmc_drm_device *)helper->dev->dev_private;
>>> +       struct fb_info *info;
>>> +       struct hibmc_framebuffer *hibmc_fb;
>>> +       struct drm_framebuffer *fb;
>>> +       struct drm_mode_fb_cmd2 mode_cmd;
>>> +       struct drm_gem_object *gobj = NULL;
>>> +       int ret = 0;
>>> +       size_t size;
>>> +       unsigned int bytes_per_pixel;
>>> +       struct hibmc_bo *bo = NULL;
>>> +
>>> +       DRM_DEBUG_DRIVER("surface width(%d), height(%d) and bpp(%d)\n",
>>> +                        sizes->surface_width, sizes->surface_height,
>>> +                        sizes->surface_bpp);
>>> +       sizes->surface_depth = 32;
>>> +
>>> +       bytes_per_pixel = DIV_ROUND_UP(sizes->surface_bpp, 8);
>>> +
>>> +       mode_cmd.width = sizes->surface_width;
>>> +       mode_cmd.height = sizes->surface_height;
>>> +       mode_cmd.pitches[0] = mode_cmd.width * bytes_per_pixel;
>>> +       mode_cmd.pixel_format =
>>> drm_mode_legacy_fb_format(sizes->surface_bpp,
>>> +
>>> sizes->surface_depth);
>>> +
>>> +       size = roundup(mode_cmd.pitches[0] * mode_cmd.height, PAGE_SIZE);
>>
>>
>> It's somewhat curious that you used ALIGN in the previous patch and
>> roundup here. But anyways, I think PAGE_ALIGN would be the most
>> appropriate thing to do here.
>
>
> agreed, thanks.
>
>>
>>> +
>>> +       ret = hibmcfb_create_object(hidev, &mode_cmd, &gobj);
>>> +       if (ret) {
>>> +               DRM_ERROR("failed to create fbcon backing object %d\r\n",
>>
>> ret);
>>
>> \r, yikes!!!
>
>
> will delete it, thanks.
>
>>
>>> +               return -ENOMEM;
>>> +       }
>>> +
>>> +       bo = gem_to_hibmc_bo(gobj);
>>> +
>>> +       ret = ttm_bo_reserve(&bo->bo, true, false, NULL);
>>> +       if (ret)
>>
>>
>> Print error here?
>
>
> will do.
>
>>
>> How about cleaning up gobj?
>
>
> you are right,
>
>>
>>> +               return ret;
>>> +
>>> +       ret = hibmc_bo_pin(bo, TTM_PL_FLAG_VRAM, NULL);
>>> +       if (ret) {
>>> +               DRM_ERROR("failed to pin fbcon\n");
>>
>>
>> Print ret
>>
>> ttm_bo_unreserve? It seems like you're missing clean-up in all of the
>> error paths in this function. Can you please make sure everything is
>> tidied up?
>
>
> ok, thanks.
>
>>
>>> +               return ret;
>>> +       }
>>> +
>>> +       ret = ttm_bo_kmap(&bo->bo, 0, bo->bo.num_pages, &bo->kmap);
>>> +
>>
>>
>> nit: extra space
>
>
> do you mean extra line?
>

yes, apologies.

>>
>>> +       if (ret) {
>>> +               DRM_ERROR("failed to kmap fbcon\n");
>>
>>
>> Print ret
>
>
> ok, thanks.
>
>>
>>> +               ttm_bo_unreserve(&bo->bo);
>>> +               return ret;
>>> +       }
>>> +
>>> +       ttm_bo_unreserve(&bo->bo);
>>
>>
>> Move this between ttm_bo_kmap and if (ret), then remove it from inside
>> the conditional.
>
>
> This is fine with me, thanks.
>
>>
>>> +
>>> +       info = drm_fb_helper_alloc_fbi(helper);
>>> +       if (IS_ERR(info))
>>
>>
>> Print error
>
>
> ok, thanks.
>
>>
>>> +               return PTR_ERR(info);
>>> +
>>> +       info->par = hi_fbdev;
>>> +
>>> +       hibmc_fb = hibmc_framebuffer_init(hidev->dev, &mode_cmd, gobj);
>>> +       if (IS_ERR(hibmc_fb)) {
>>> +               drm_fb_helper_release_fbi(helper);
>>> +               return PTR_ERR(hibmc_fb);
>>> +       }
>>> +
>>> +       hi_fbdev->fb = hibmc_fb;
>>> +       hidev->fbdev->size = size;
>>> +       fb = &hibmc_fb->fb;
>>
>>
>> The fb variable isn't necessary, and really, the hibmc_fb doesn't
>> really help either, it just masks what's actually happening IMO.
>
>
> i will clean the two variables.
>
>
>>
>>> +       if (!fb) {
>>> +               DRM_INFO("fb is NULL\n");
>>> +               return -EINVAL;
>>> +       }
>>> +
>>> +       hi_fbdev->helper.fb = fb;
>>> +
>>> +       strcpy(info->fix.id, "hibmcdrmfb");
>>> +
>>> +       info->flags = FBINFO_DEFAULT;
>>> +       info->fbops = &hibmc_drm_fb_ops;
>>> +
>>> +       drm_fb_helper_fill_fix(info, fb->pitches[0], fb->depth);
>>> +       drm_fb_helper_fill_var(info, &hidev->fbdev->helper,
>>> sizes->fb_width,
>>> +                              sizes->fb_height);
>>> +
>>> +       info->screen_base = bo->kmap.virtual;
>>> +       info->screen_size = size;
>>> +
>>> +       info->fix.smem_start = bo->bo.mem.bus.offset +
>>> bo->bo.mem.bus.base;
>>> +       info->fix.smem_len = size;
>>> +
>>> +       return 0;
>>> +}
>>> +
>>> +static void hibmc_fbdev_destroy(struct hibmc_fbdev *fbdev)
>>> +{
>>> +       struct hibmc_framebuffer *gfb = fbdev->fb;
>>> +       struct drm_fb_helper *fbh = &fbdev->helper;
>>> +
>>> +       DRM_DEBUG_DRIVER("hibmc_fbdev_destroy\n");
>>
>>
>> Not useful
>
>
> ok, will remove.
>
>
>>
>>> +
>>> +       drm_fb_helper_unregister_fbi(fbh);
>>> +       drm_fb_helper_release_fbi(fbh);
>>> +
>>> +       drm_fb_helper_fini(fbh);
>>> +
>>> +       if (gfb)
>>> +               drm_framebuffer_unreference(&gfb->fb);
>>> +
>>> +       kfree(fbdev);
>>> +}
>>> +
>>> +static const struct drm_fb_helper_funcs hibmc_fbdev_helper_funcs = {
>>> +       .fb_probe = hibmc_drm_fb_create,
>>> +};
>>> +
>>> +int hibmc_fbdev_init(struct hibmc_drm_device *hidev)
>>> +{
>>> +       int ret;
>>> +       struct fb_var_screeninfo *var;
>>> +       struct fb_fix_screeninfo *fix;
>>> +       struct hibmc_fbdev *hifbdev;
>>> +
>>> +       hifbdev = kzalloc(sizeof(*hifbdev), GFP_KERNEL);
>>
>>
>> devm_kzalloc?
>
>
> sounds good, so there need no kfree at hibmc_fbdev_destroy(),
> thanks.
>

yep, exactly

>>
>>> +       if (!hifbdev)
>>> +               return -ENOMEM;
>>> +
>>> +       hidev->fbdev = hifbdev;
>>> +       drm_fb_helper_prepare(hidev->dev, &hifbdev->helper,
>>> +                             &hibmc_fbdev_helper_funcs);
>>> +
>>> +       /* Now just one crtc and one channel */
>>> +       ret = drm_fb_helper_init(hidev->dev,
>>> +                                &hifbdev->helper, 1, 1);
>>> +
>>
>>
>> nit: extra line
>
>
> ok, thanks.
>
>
>>
>>> +       if (ret)
>>> +               return ret;
>>> +
>>> +       ret = drm_fb_helper_single_add_all_connectors(&hifbdev->helper);
>>> +       if (ret)
>>> +               goto fini;
>>> +
>>> +       ret = drm_fb_helper_initial_config(&hifbdev->helper, 16);
>>> +       if (ret)
>>> +               goto fini;
>>> +
>>> +       var = &hifbdev->helper.fbdev->var;
>>> +       fix = &hifbdev->helper.fbdev->fix;
>>> +
>>> +       DRM_DEBUG("Member of info->var is :\n"
>>> +                "xres=%d\n"
>>> +                "yres=%d\n"
>>> +                "xres_virtual=%d\n"
>>> +                "yres_virtual=%d\n"
>>> +                "xoffset=%d\n"
>>> +                "yoffset=%d\n"
>>> +                "bits_per_pixel=%d\n"
>>> +                "...\n", var->xres, var->yres, var->xres_virtual,
>>> +                var->yres_virtual, var->xoffset, var->yoffset,
>>> +                var->bits_per_pixel);
>>> +       DRM_DEBUG("Member of info->fix is :\n"
>>> +                "smem_start=%lx\n"
>>> +                "smem_len=%d\n"
>>> +                "type=%d\n"
>>> +                "type_aux=%d\n"
>>> +                "visual=%d\n"
>>> +                "xpanstep=%d\n"
>>> +                "ypanstep=%d\n"
>>> +                "ywrapstep=%d\n"
>>> +                "line_length=%d\n"
>>> +                "accel=%d\n"
>>> +                "capabilities=%d\n"
>>> +                "...\n", fix->smem_start, fix->smem_len, fix->type,
>>> +                fix->type_aux, fix->visual, fix->xpanstep,
>>> +                fix->ypanstep, fix->ywrapstep, fix->line_length,
>>> +                fix->accel, fix->capabilities);
>>
>>
>> You've been using DRM_DEBUG_DRIVER elsewhere, you should probably use
>> it here, too.
>
>
> ok, thanks.
>
>
>>
>>> +
>>> +       return 0;
>>> +
>>> +fini:
>>> +       drm_fb_helper_fini(&hifbdev->helper);
>>> +       return ret;
>>> +}
>>> +
>>> +void hibmc_fbdev_fini(struct hibmc_drm_device *hidev)
>>> +{
>>> +       if (!hidev->fbdev)
>>> +               return;
>>> +
>>> +       hibmc_fbdev_destroy(hidev->fbdev);
>>> +       hidev->fbdev = NULL;
>>> +}
>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>>> b/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>>> index 0802ebd..9822f62 100644
>>> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>>> @@ -488,3 +488,69 @@ int hibmc_dumb_mmap_offset(struct drm_file *file,
>>> struct drm_device *dev,
>>>          drm_gem_object_unreference_unlocked(obj);
>>>          return 0;
>>>   }
>>> +
>>> +/*
>>> ---------------------------------------------------------------------- */
>>> +
>>
>>
>> Please remove
>
>
> will do.
>
>>
>>> +static void hibmc_user_framebuffer_destroy(struct drm_framebuffer *fb)
>>> +{
>>> +       struct hibmc_framebuffer *hibmc_fb = to_hibmc_framebuffer(fb);
>>> +
>>> +       drm_gem_object_unreference_unlocked(hibmc_fb->obj);
>>> +       drm_framebuffer_cleanup(fb);
>>> +       kfree(hibmc_fb);
>>> +}
>>> +
>>> +static const struct drm_framebuffer_funcs hibmc_fb_funcs = {
>>> +       .destroy = hibmc_user_framebuffer_destroy,
>>> +};
>>> +
>>> +struct hibmc_framebuffer *
>>> +hibmc_framebuffer_init(struct drm_device *dev,
>>> +                      const struct drm_mode_fb_cmd2 *mode_cmd,
>>> +                      struct drm_gem_object *obj)
>>> +{
>>> +       struct hibmc_framebuffer *hibmc_fb;
>>> +       int ret;
>>> +
>>> +       hibmc_fb = kzalloc(sizeof(*hibmc_fb), GFP_KERNEL);
>>> +       if (!hibmc_fb)
>>
>>
>> Print error
>
>
> ok, thanks.
>
> Regards,
> Rongrong.
>
>>
>>> +               return ERR_PTR(-ENOMEM);
>>> +
>>> +       drm_helper_mode_fill_fb_struct(&hibmc_fb->fb, mode_cmd);
>>> +       hibmc_fb->obj = obj;
>>> +       ret = drm_framebuffer_init(dev, &hibmc_fb->fb, &hibmc_fb_funcs);
>>> +       if (ret) {
>>> +               DRM_ERROR("drm_framebuffer_init failed: %d\n", ret);
>>> +               kfree(hibmc_fb);
>>> +               return ERR_PTR(ret);
>>> +       }
>>> +
>>> +       return hibmc_fb;
>>> +}
>>> +
>>> +static struct drm_framebuffer *
>>> +hibmc_user_framebuffer_create(struct drm_device *dev,
>>> +                             struct drm_file *filp,
>>> +                             const struct drm_mode_fb_cmd2 *mode_cmd)
>>> +{
>>> +       struct drm_gem_object *obj;
>>> +       struct hibmc_framebuffer *hibmc_fb;
>>> +
>>> +       DRM_DEBUG_DRIVER("%dx%d, format %c%c%c%c\n",
>>> +                        mode_cmd->width, mode_cmd->height,
>>> +                        (mode_cmd->pixel_format) & 0xff,
>>> +                        (mode_cmd->pixel_format >> 8)  & 0xff,
>>> +                        (mode_cmd->pixel_format >> 16) & 0xff,
>>> +                        (mode_cmd->pixel_format >> 24) & 0xff);
>>> +
>>> +       obj = drm_gem_object_lookup(filp, mode_cmd->handles[0]);
>>> +       if (!obj)
>>> +               return ERR_PTR(-ENOENT);
>>> +
>>> +       hibmc_fb = hibmc_framebuffer_init(dev, mode_cmd, obj);
>>> +       if (IS_ERR(hibmc_fb)) {
>>> +               drm_gem_object_unreference_unlocked(obj);
>>> +               return ERR_PTR((long)hibmc_fb);
>>> +       }
>>> +       return &hibmc_fb->fb;
>>> +}
>>> --
>>> 1.9.1
>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel at lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>
>> _______________________________________________
>> linuxarm mailing list
>> linuxarm at huawei.com
>> http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm
>>
>> .
>>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* [PATCH V6 2/3] ACPI: Add support for ResourceSource/IRQ domain mapping
From: Hanjun Guo @ 2016-11-11 13:26 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <60c1d53146c0aebd3a05095823229224@codeaurora.org>

On 11/10/2016 11:02 PM, agustinv at codeaurora.org wrote:
> Hey Hanjun,
>
> On 2016-11-09 21:36, Hanjun Guo wrote:
>> Hi Marc, Rafael, Lorenzo,
>>
>> Since we agreed to add a probe deferral if we failed to get irq
>> resources which mirroring the DT does (patch 1 in this patch set),
>> I think the last blocker to make things work both for Agustin and
>> me [1] is this patch, which makes the interrupt producer and consumer
>> work in ACPI, we have two different solution for one thing, we'd happy
>> to work together for one solution, could you give some suggestions
>> please?
>>
>> [1]:
>> https://mail-archive.com/linux-kernel at vger.kernel.org/msg1257419.html
>>
>> Agustin, I have some comments below.
>>
>> On 2016/10/29 4:48, Agustin Vega-Frias wrote:
>>> This allows irqchip drivers to associate an ACPI DSDT device to
>>> an IRQ domain and provides support for using the ResourceSource
>>> in Extended IRQ Resources to find the domain and map the IRQs
>>> specified on that domain.
>>>
>>> Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
>>> ---
>>>  drivers/acpi/Makefile    |   1 +
>>>  drivers/acpi/irqdomain.c | 119
>>> +++++++++++++++++++++++++++++++++++++++++++++++
>>
>> Could we just reuse the gsi.c and not introduce a new
>> file, probably we can change the gsi.c to irqdomain.c
>> or something similar, then reuse the code in gsi.c.
>
> I was thinking just that after we chatted off-list.

Great.

> I might revisit and see what I come up with given that we already have
> a device argument and we could pass the IRQ source there.

Sorry, I'm little confused here, why we "already have a device
argument"? in drivers/acpi/resource.c, it just use NULL as device
and we can't pass dev directly for the API is using now, could
you explain more?

Thanks
Hanjun

^ permalink raw reply

* [PATCH v6 2/9] drm/hisilicon/hibmc: Add video memory management
From: Sean Paul @ 2016-11-11 13:25 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <5825A8A3.5070409@huawei.com>

On Fri, Nov 11, 2016 at 6:16 AM, Rongrong Zou <zourongrong@huawei.com> wrote:
> ? 2016/11/11 1:35, Sean Paul ??:
>>
>> On Fri, Oct 28, 2016 at 3:27 AM, Rongrong Zou <zourongrong@gmail.com>
>> wrote:
>>>
>>> Hibmc have 32m video memory which can be accessed through PCIe by host,
>>> we use ttm to manage these memory.
>>>
>>> Signed-off-by: Rongrong Zou <zourongrong@gmail.com>
>>> ---
>>>   drivers/gpu/drm/hisilicon/hibmc/Kconfig         |   1 +
>>>   drivers/gpu/drm/hisilicon/hibmc/Makefile        |   2 +-
>>>   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c |  12 +
>>>   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h |  46 +++
>>>   drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c     | 490
>>> ++++++++++++++++++++++++
>>>   5 files changed, 550 insertions(+), 1 deletion(-)
>>>   create mode 100644 drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>>>
>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/Kconfig
>>> b/drivers/gpu/drm/hisilicon/hibmc/Kconfig
>>> index a9af90d..bcb8c18 100644
>>> --- a/drivers/gpu/drm/hisilicon/hibmc/Kconfig
>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/Kconfig
>>> @@ -1,6 +1,7 @@
>>>   config DRM_HISI_HIBMC
>>>          tristate "DRM Support for Hisilicon Hibmc"
>>>          depends on DRM && PCI
>>> +       select DRM_TTM
>>>
>>>          help
>>>            Choose this option if you have a Hisilicon Hibmc soc chipset.
>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/Makefile
>>> b/drivers/gpu/drm/hisilicon/hibmc/Makefile
>>> index 97cf4a0..d5c40b8 100644
>>> --- a/drivers/gpu/drm/hisilicon/hibmc/Makefile
>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/Makefile
>>> @@ -1,5 +1,5 @@
>>>   ccflags-y := -Iinclude/drm
>>> -hibmc-drm-y := hibmc_drm_drv.o hibmc_drm_power.o
>>> +hibmc-drm-y := hibmc_drm_drv.o hibmc_drm_power.o hibmc_ttm.o
>>>
>>>   obj-$(CONFIG_DRM_HISI_HIBMC)   +=hibmc-drm.o
>>>   #obj-y += hibmc-drm.o
>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>>> b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>>> index 4669d42..81f4301 100644
>>> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>>> @@ -31,6 +31,7 @@
>>>   #ifdef CONFIG_COMPAT
>>>          .compat_ioctl   = drm_compat_ioctl,
>>>   #endif
>>> +       .mmap           = hibmc_mmap,
>>>          .poll           = drm_poll,
>>>          .read           = drm_read,
>>>          .llseek         = no_llseek,
>>> @@ -46,6 +47,8 @@ static void hibmc_disable_vblank(struct drm_device
>>> *dev, unsigned int pipe)
>>>   }
>>>
>>>   static struct drm_driver hibmc_driver = {
>>> +       .driver_features        = DRIVER_GEM,
>>> +
>>
>>
>> nit: extra space
>>
>>>          .fops                   = &hibmc_fops,
>>>          .name                   = "hibmc",
>>>          .date                   = "20160828",
>>> @@ -55,6 +58,10 @@ static void hibmc_disable_vblank(struct drm_device
>>> *dev, unsigned int pipe)
>>>          .get_vblank_counter     = drm_vblank_no_hw_counter,
>>>          .enable_vblank          = hibmc_enable_vblank,
>>>          .disable_vblank         = hibmc_disable_vblank,
>>> +       .gem_free_object_unlocked = hibmc_gem_free_object,
>>> +       .dumb_create            = hibmc_dumb_create,
>>> +       .dumb_map_offset        = hibmc_dumb_mmap_offset,
>>> +       .dumb_destroy           = drm_gem_dumb_destroy,
>>>   };
>>>
>>>   static int hibmc_pm_suspend(struct device *dev)
>>> @@ -163,6 +170,7 @@ static int hibmc_unload(struct drm_device *dev)
>>>   {
>>>          struct hibmc_drm_device *hidev = dev->dev_private;
>>>
>>> +       hibmc_mm_fini(hidev);
>>>          hibmc_hw_fini(hidev);
>>>          dev->dev_private = NULL;
>>>          return 0;
>>> @@ -183,6 +191,10 @@ static int hibmc_load(struct drm_device *dev,
>>> unsigned long flags)
>>>          if (ret)
>>>                  goto err;
>>>
>>> +       ret = hibmc_mm_init(hidev);
>>> +       if (ret)
>>> +               goto err;
>>> +
>>>          return 0;
>>>
>>>   err:
>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>>> b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>>> index 0037341..db8d80e 100644
>>> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>>> @@ -20,6 +20,8 @@
>>>   #define HIBMC_DRM_DRV_H
>>>
>>>   #include <drm/drmP.h>
>>> +#include <drm/ttm/ttm_bo_driver.h>
>>> +#include <drm/drm_gem.h>
>>
>>
>> nit: alphabetize
>
>
> will fix it, thanks.
>
>>
>>>
>>>   struct hibmc_drm_device {
>>>          /* hw */
>>> @@ -30,6 +32,50 @@ struct hibmc_drm_device {
>>>
>>>          /* drm */
>>>          struct drm_device  *dev;
>>> +
>>> +       /* ttm */
>>> +       struct {
>>> +               struct drm_global_reference mem_global_ref;
>>> +               struct ttm_bo_global_ref bo_global_ref;
>>> +               struct ttm_bo_device bdev;
>>> +               bool initialized;
>>> +       } ttm;
>>
>>
>> I don't think you gain anything other than keystrokes from the substruct
>
>
> I'm sorry i didn't catch you, i looked at the all drivers used ttm such
> as ast/bochs/cirrus/mgag200/qxl/virtio_gpu, they all embedded the ttm
> substruct
> into the driver-private struct.
>
> so do you mean
> struct hibmc_drm_device {
>         /* hw */
>         void __iomem   *mmio;
>         void __iomem   *fb_map;
>         unsigned long  fb_base;
>         unsigned long  fb_size;
>
>         /* drm */
>         struct drm_device  *dev;
>         struct drm_plane plane;
>         struct drm_crtc crtc;
>         struct drm_encoder encoder;
>         struct drm_connector connector;
>         bool mode_config_initialized;
>
>         /* ttm */
>         struct drm_global_reference mem_global_ref;
>         struct ttm_bo_global_ref bo_global_ref;
>         struct ttm_bo_device bdev;
>         bool initialized;
>         ...
>         };
> ?

Yeah, that's what I was thinking

>
>>
>>> +
>>> +       bool mm_inited;
>>>   };
>>>
>>> +struct hibmc_bo {
>>> +       struct ttm_buffer_object bo;
>>> +       struct ttm_placement placement;
>>> +       struct ttm_bo_kmap_obj kmap;
>>> +       struct drm_gem_object gem;
>>> +       struct ttm_place placements[3];
>>> +       int pin_count;
>>> +};
>>> +
>>> +static inline struct hibmc_bo *hibmc_bo(struct ttm_buffer_object *bo)
>>> +{
>>> +       return container_of(bo, struct hibmc_bo, bo);
>>> +}
>>> +
>>> +static inline struct hibmc_bo *gem_to_hibmc_bo(struct drm_gem_object
>>> *gem)
>>> +{
>>> +       return container_of(gem, struct hibmc_bo, gem);
>>> +}
>>> +
>>> +#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
>>
>>
>> Hide this in ttm.c
>
>
> ok, will do that.
> thanks for pointing it out.
>
>
>>
>>> +
>>> +int hibmc_gem_create(struct drm_device *dev, u32 size, bool iskernel,
>>> +                    struct drm_gem_object **obj);
>>> +
>>> +int hibmc_mm_init(struct hibmc_drm_device *hibmc);
>>> +void hibmc_mm_fini(struct hibmc_drm_device *hibmc);
>>> +int hibmc_bo_pin(struct hibmc_bo *bo, u32 pl_flag, u64 *gpu_addr);
>>> +void hibmc_gem_free_object(struct drm_gem_object *obj);
>>> +int hibmc_dumb_create(struct drm_file *file, struct drm_device *dev,
>>> +                     struct drm_mode_create_dumb *args);
>>> +int hibmc_dumb_mmap_offset(struct drm_file *file, struct drm_device
>>> *dev,
>>> +                          u32 handle, u64 *offset);
>>> +int hibmc_mmap(struct file *filp, struct vm_area_struct *vma);
>>> +
>>>   #endif
>>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>>> b/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>>> new file mode 100644
>>> index 0000000..0802ebd
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>>> @@ -0,0 +1,490 @@
>>> +/* Hisilicon Hibmc SoC drm driver
>>> + *
>>> + * Based on the bochs drm driver.
>>> + *
>>> + * Copyright (c) 2016 Huawei Limited.
>>> + *
>>> + * Author:
>>> + *     Rongrong Zou <zourongrong@huawei.com>
>>> + *     Rongrong Zou <zourongrong@gmail.com>
>>> + *     Jianhua Li <lijianhua@huawei.com>
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License as published by
>>> + * the Free Software Foundation; either version 2 of the License, or
>>> + * (at your option) any later version.
>>> + *
>>> + */
>>> +
>>> +#include "hibmc_drm_drv.h"
>>> +#include <ttm/ttm_page_alloc.h>
>>> +#include <drm/drm_crtc_helper.h>
>>> +#include <drm/drm_atomic_helper.h>
>>> +
>>> +static inline struct hibmc_drm_device *
>>> +hibmc_bdev(struct ttm_bo_device *bd)
>>> +{
>>> +       return container_of(bd, struct hibmc_drm_device, ttm.bdev);
>>> +}
>>> +
>>> +static int
>>> +hibmc_ttm_mem_global_init(struct drm_global_reference *ref)
>>> +{
>>> +       return ttm_mem_global_init(ref->object);
>>> +}
>>> +
>>> +static void
>>> +hibmc_ttm_mem_global_release(struct drm_global_reference *ref)
>>> +{
>>> +       ttm_mem_global_release(ref->object);
>>> +}
>>> +
>>> +static int hibmc_ttm_global_init(struct hibmc_drm_device *hibmc)
>>> +{
>>> +       struct drm_global_reference *global_ref;
>>> +       int r;
>>
>>
>> nit: try not to use one character variable names unless it's for the
>> purpose of a loop (ie: i,j). You also use ret elsewhere in the driver,
>> so it'd be nice to remain consistent
>
>
> the whole file is delivered from bochs ttm, i didn't modify anything except
> some checkpatch warnings and the 'hibmc_' prefix. Unfortunately, some
> problems were delivered too.

Yeah, seems like it. Perhaps you can post patches to fix these issues
in the other drivers too :)

>
>>
>>> +
>>> +       global_ref = &hibmc->ttm.mem_global_ref;
>>
>>
>> I think using the global_ref local obfuscates what you're doing here.
>> It saves you 6 characters while typing, but adds a layer of
>> indirection for all future readers.
>>
>>> +       global_ref->global_type = DRM_GLOBAL_TTM_MEM;
>>> +       global_ref->size = sizeof(struct ttm_mem_global);
>>> +       global_ref->init = &hibmc_ttm_mem_global_init;
>>> +       global_ref->release = &hibmc_ttm_mem_global_release;
>>> +       r = drm_global_item_ref(global_ref);
>>> +       if (r != 0) {
>>
>>
>> nit: if (r)
>
>
> will fix it,
> thanks.
> BTW, i wonder why checkpatch.pl didn't report it.
>
>
>>
>>> +               DRM_ERROR("Failed setting up TTM memory accounting
>>> subsystem.\n"
>>> +                        );
>>
>>
>> Breaking up the line for one character is probably not worthwhile, and
>> you should really print the error. How about:
>>
>> DRM_ERROR("Could not get ref on ttm global ret=%d.\n", ret);
>
>
> i like your solution, thanks.
>
>
>>
>>
>>> +               return r;
>>> +       }
>>> +
>>> +       hibmc->ttm.bo_global_ref.mem_glob =
>>> +               hibmc->ttm.mem_global_ref.object;
>>> +       global_ref = &hibmc->ttm.bo_global_ref.ref;
>>> +       global_ref->global_type = DRM_GLOBAL_TTM_BO;
>>> +       global_ref->size = sizeof(struct ttm_bo_global);
>>> +       global_ref->init = &ttm_bo_global_init;
>>> +       global_ref->release = &ttm_bo_global_release;
>>> +       r = drm_global_item_ref(global_ref);
>>> +       if (r != 0) {
>>> +               DRM_ERROR("Failed setting up TTM BO subsystem.\n");
>>> +               drm_global_item_unref(&hibmc->ttm.mem_global_ref);
>>> +               return r;
>>> +       }
>>> +       return 0;
>>> +}
>>> +
>>> +static void
>>> +hibmc_ttm_global_release(struct hibmc_drm_device *hibmc)
>>> +{
>>> +       if (!hibmc->ttm.mem_global_ref.release)
>>
>>
>> Are you actually hitting this condition? This seems like it's papering
>> over something else.
>
>
> it was also delivered from others, i looked at the xxx_ttm_global_init
> function, 'mem_global_ref.release' is assigned unconditionally, so i
> think this condition never be hit, it may be hit when release twice,
> but this won't take place in my driver.
>

Yeah, that's what I was hoping for. So perhaps we can remove this?

>>
>>> +               return;
>>> +
>>> +       drm_global_item_unref(&hibmc->ttm.bo_global_ref.ref);
>>> +       drm_global_item_unref(&hibmc->ttm.mem_global_ref);
>>> +       hibmc->ttm.mem_global_ref.release = NULL;
>>> +}
>>> +
>>> +static void hibmc_bo_ttm_destroy(struct ttm_buffer_object *tbo)
>>> +{
>>> +       struct hibmc_bo *bo;
>>> +
>>> +       bo = container_of(tbo, struct hibmc_bo, bo);
>>
>>
>> nit: No need to split this into a separate line.
>
>
> agreed, thanks.
>
>>
>>> +
>>> +       drm_gem_object_release(&bo->gem);
>>> +       kfree(bo);
>>> +}
>>> +
>>> +static bool hibmc_ttm_bo_is_hibmc_bo(struct ttm_buffer_object *bo)
>>> +{
>>> +       if (bo->destroy == &hibmc_bo_ttm_destroy)
>>> +               return true;
>>> +       return false;
>>
>>
>> return bo->destroy == &hibmc_bo_ttm_destroy;
>
>
> looks better to me.
>
>
>>
>>> +}
>>> +
>>> +static int
>>> +hibmc_bo_init_mem_type(struct ttm_bo_device *bdev, u32 type,
>>> +                      struct ttm_mem_type_manager *man)
>>> +{
>>> +       switch (type) {
>>> +       case TTM_PL_SYSTEM:
>>> +               man->flags = TTM_MEMTYPE_FLAG_MAPPABLE;
>>> +               man->available_caching = TTM_PL_MASK_CACHING;
>>> +               man->default_caching = TTM_PL_FLAG_CACHED;
>>> +               break;
>>> +       case TTM_PL_VRAM:
>>> +               man->func = &ttm_bo_manager_func;
>>> +               man->flags = TTM_MEMTYPE_FLAG_FIXED |
>>> +                       TTM_MEMTYPE_FLAG_MAPPABLE;
>>> +               man->available_caching = TTM_PL_FLAG_UNCACHED |
>>> +                       TTM_PL_FLAG_WC;
>>> +               man->default_caching = TTM_PL_FLAG_WC;
>>> +               break;
>>> +       default:
>>> +               DRM_ERROR("Unsupported memory type %u\n", type);
>>> +               return -EINVAL;
>>> +       }
>>> +       return 0;
>>> +}
>>> +
>>> +void hibmc_ttm_placement(struct hibmc_bo *bo, int domain)
>>> +{
>>> +       u32 c = 0;
>>
>>
>> Can you please use a more descriptive name than 'c'?
>
>
> ok, will do that.
>
>>
>>> +       u32 i;
>>> +
>>> +       bo->placement.placement = bo->placements;
>>> +       bo->placement.busy_placement = bo->placements;
>>> +       if (domain & TTM_PL_FLAG_VRAM)
>>> +               bo->placements[c++].flags = TTM_PL_FLAG_WC |
>>> +               TTM_PL_FLAG_UNCACHED | TTM_PL_FLAG_VRAM;
>>
>>
>> nit: you're alignment is off here and below
>
>
> is it correct?
>
>         if (domain & TTM_PL_FLAG_VRAM)
>                 bo->placements[c++].flags = TTM_PL_FLAG_WC |
>                         TTM_PL_FLAG_UNCACHED | TTM_PL_FLAG_VRAM;
>         if (domain & TTM_PL_FLAG_SYSTEM)
>                 bo->placements[c++].flags = TTM_PL_MASK_CACHING |
>                         TTM_PL_FLAG_SYSTEM;
>         if (!c)
>                 bo->placements[c++].flags = TTM_PL_MASK_CACHING |
>                         TTM_PL_FLAG_SYSTEM;
>

Pretty much anything other than lining them up one under the other is better

>>
>>> +       if (domain & TTM_PL_FLAG_SYSTEM)
>>> +               bo->placements[c++].flags = TTM_PL_MASK_CACHING |
>>> +               TTM_PL_FLAG_SYSTEM;
>>> +       if (!c)
>>> +               bo->placements[c++].flags = TTM_PL_MASK_CACHING |
>>> +               TTM_PL_FLAG_SYSTEM;
>>> +
>>> +       bo->placement.num_placement = c;
>>> +       bo->placement.num_busy_placement = c;
>>> +       for (i = 0; i < c; ++i) {
>>
>>
>> nit: we tend towards post-increment in kernel
>
>
> agreed, thanks.
>
>
>>
>>> +               bo->placements[i].fpfn = 0;
>>> +               bo->placements[i].lpfn = 0;
>>> +       }
>>> +}
>>> +
>>> +static void
>>> +hibmc_bo_evict_flags(struct ttm_buffer_object *bo, struct ttm_placement
>>> *pl)
>>> +{
>>> +       struct hibmc_bo *hibmcbo = hibmc_bo(bo);
>>> +
>>> +       if (!hibmc_ttm_bo_is_hibmc_bo(bo))
>>> +               return;
>>> +
>>> +       hibmc_ttm_placement(hibmcbo, TTM_PL_FLAG_SYSTEM);
>>> +       *pl = hibmcbo->placement;
>>> +}
>>> +
>>> +static int hibmc_bo_verify_access(struct ttm_buffer_object *bo,
>>> +                                 struct file *filp)
>>> +{
>>> +       struct hibmc_bo *hibmcbo = hibmc_bo(bo);
>>> +
>>> +       return drm_vma_node_verify_access(&hibmcbo->gem.vma_node,
>>> +                                         filp->private_data);
>>> +}
>>> +
>>> +static int hibmc_ttm_io_mem_reserve(struct ttm_bo_device *bdev,
>>> +                                   struct ttm_mem_reg *mem)
>>> +{
>>> +       struct ttm_mem_type_manager *man = &bdev->man[mem->mem_type];
>>> +       struct hibmc_drm_device *hibmc = hibmc_bdev(bdev);
>>> +
>>> +       mem->bus.addr = NULL;
>>> +       mem->bus.offset = 0;
>>> +       mem->bus.size = mem->num_pages << PAGE_SHIFT;
>>> +       mem->bus.base = 0;
>>> +       mem->bus.is_iomem = false;
>>> +       if (!(man->flags & TTM_MEMTYPE_FLAG_MAPPABLE))
>>> +               return -EINVAL;
>>> +       switch (mem->mem_type) {
>>> +       case TTM_PL_SYSTEM:
>>> +               /* system memory */
>>> +               return 0;
>>> +       case TTM_PL_VRAM:
>>> +               mem->bus.offset = mem->start << PAGE_SHIFT;
>>> +               mem->bus.base = pci_resource_start(hibmc->dev->pdev, 0);
>>> +               mem->bus.is_iomem = true;
>>> +               break;
>>> +       default:
>>> +               return -EINVAL;
>>> +       }
>>> +       return 0;
>>> +}
>>> +
>>> +static void hibmc_ttm_io_mem_free(struct ttm_bo_device *bdev,
>>> +                                 struct ttm_mem_reg *mem)
>>> +{
>>> +}
>>
>>
>> No need to stub this, the caller does a NULL-check before invoking
>
>
> will delete it, thanks.
>
>>
>>> +
>>> +static void hibmc_ttm_backend_destroy(struct ttm_tt *tt)
>>> +{
>>> +       ttm_tt_fini(tt);
>>> +       kfree(tt);
>>> +}
>>> +
>>> +static struct ttm_backend_func hibmc_tt_backend_func = {
>>> +       .destroy = &hibmc_ttm_backend_destroy,
>>> +};
>>> +
>>> +static struct ttm_tt *hibmc_ttm_tt_create(struct ttm_bo_device *bdev,
>>> +                                         unsigned long size,
>>> +                                         u32 page_flags,
>>> +                                         struct page *dummy_read_page)
>>> +{
>>> +       struct ttm_tt *tt;
>>> +
>>> +       tt = kzalloc(sizeof(*tt), GFP_KERNEL);
>>> +       if (!tt)
>>
>>
>> Print error
>
>
> ok, will do that, thanks.
>
>>
>>> +               return NULL;
>>> +       tt->func = &hibmc_tt_backend_func;
>>> +       if (ttm_tt_init(tt, bdev, size, page_flags, dummy_read_page)) {
>>
>>
>> Here too?
>
>
> ditto
>
>
>>
>>> +               kfree(tt);
>>> +               return NULL;
>>> +       }
>>> +       return tt;
>>> +}
>>> +
>>> +static int hibmc_ttm_tt_populate(struct ttm_tt *ttm)
>>> +{
>>> +       return ttm_pool_populate(ttm);
>>> +}
>>> +
>>> +static void hibmc_ttm_tt_unpopulate(struct ttm_tt *ttm)
>>> +{
>>> +       ttm_pool_unpopulate(ttm);
>>> +}
>>> +
>>> +struct ttm_bo_driver hibmc_bo_driver = {
>>> +       .ttm_tt_create          = hibmc_ttm_tt_create,
>>> +       .ttm_tt_populate        = hibmc_ttm_tt_populate,
>>> +       .ttm_tt_unpopulate      = hibmc_ttm_tt_unpopulate,
>>> +       .init_mem_type          = hibmc_bo_init_mem_type,
>>> +       .evict_flags            = hibmc_bo_evict_flags,
>>> +       .move                   = NULL,
>>> +       .verify_access          = hibmc_bo_verify_access,
>>> +       .io_mem_reserve         = &hibmc_ttm_io_mem_reserve,
>>> +       .io_mem_free            = &hibmc_ttm_io_mem_free,
>>> +       .lru_tail               = &ttm_bo_default_lru_tail,
>>> +       .swap_lru_tail          = &ttm_bo_default_swap_lru_tail,
>>> +};
>>> +
>>> +int hibmc_mm_init(struct hibmc_drm_device *hibmc)
>>> +{
>>> +       int ret;
>>> +       struct drm_device *dev = hibmc->dev;
>>> +       struct ttm_bo_device *bdev = &hibmc->ttm.bdev;
>>> +
>>> +       ret = hibmc_ttm_global_init(hibmc);
>>> +       if (ret)
>>> +               return ret;
>>> +
>>> +       ret = ttm_bo_device_init(&hibmc->ttm.bdev,
>>> +                                hibmc->ttm.bo_global_ref.ref.object,
>>> +                                &hibmc_bo_driver,
>>> +                                dev->anon_inode->i_mapping,
>>> +                                DRM_FILE_PAGE_OFFSET,
>>> +                                true);
>>> +       if (ret) {
>>
>>
>> Call hibmc_ttm_global_release here?
>
>
> agreed, thanks for pointing it out.
>
>>
>>> +               DRM_ERROR("Error initialising bo driver; %d\n", ret);
>>> +               return ret;
>>> +       }
>>> +
>>> +       ret = ttm_bo_init_mm(bdev, TTM_PL_VRAM,
>>> +                            hibmc->fb_size >> PAGE_SHIFT);
>>> +       if (ret) {
>>
>>
>> Clean up here as well?
>
>
> ditto
>
>
>>
>>> +               DRM_ERROR("Failed ttm VRAM init: %d\n", ret);
>>> +               return ret;
>>> +       }
>>> +
>>> +       hibmc->mm_inited = true;
>>> +       return 0;
>>> +}
>>> +
>>> +void hibmc_mm_fini(struct hibmc_drm_device *hibmc)
>>> +{
>>> +       if (!hibmc->mm_inited)
>>> +               return;
>>> +
>>> +       ttm_bo_device_release(&hibmc->ttm.bdev);
>>> +       hibmc_ttm_global_release(hibmc);
>>> +       hibmc->mm_inited = false;
>>> +}
>>> +
>>> +int hibmc_bo_create(struct drm_device *dev, int size, int align,
>>> +                   u32 flags, struct hibmc_bo **phibmcbo)
>>> +{
>>> +       struct hibmc_drm_device *hibmc = dev->dev_private;
>>> +       struct hibmc_bo *hibmcbo;
>>> +       size_t acc_size;
>>> +       int ret;
>>> +
>>> +       hibmcbo = kzalloc(sizeof(*hibmcbo), GFP_KERNEL);
>>> +       if (!hibmcbo)
>>> +               return -ENOMEM;
>>> +
>>> +       ret = drm_gem_object_init(dev, &hibmcbo->gem, size);
>>> +       if (ret) {
>>> +               kfree(hibmcbo);
>>> +               return ret;
>>> +       }
>>> +
>>> +       hibmcbo->bo.bdev = &hibmc->ttm.bdev;
>>> +
>>> +       hibmc_ttm_placement(hibmcbo, TTM_PL_FLAG_VRAM |
>>> TTM_PL_FLAG_SYSTEM);
>>> +
>>> +       acc_size = ttm_bo_dma_acc_size(&hibmc->ttm.bdev, size,
>>> +                                      sizeof(struct hibmc_bo));
>>> +
>>> +       ret = ttm_bo_init(&hibmc->ttm.bdev, &hibmcbo->bo, size,
>>> +                         ttm_bo_type_device, &hibmcbo->placement,
>>> +                         align >> PAGE_SHIFT, false, NULL, acc_size,
>>> +                         NULL, NULL, hibmc_bo_ttm_destroy);
>>> +       if (ret)
>>
>>
>> Missing hibmcbo clean up here
>
>
> i looked at all other ttm drivers and all of them return directly when
> ttm_bo_init
> failed, however, i think it is better to clean up here, should i call
> hibmc_bo_unref(&hibmc_bo) here ?
>

Yeah, that should work (might want to test it, though ;)


>>
>>> +               return ret;
>>> +
>>> +       *phibmcbo = hibmcbo;
>>> +       return 0;
>>> +}
>>> +
>>> +static inline u64 hibmc_bo_gpu_offset(struct hibmc_bo *bo)
>>> +{
>>> +       return bo->bo.offset;
>>> +}
>>
>>
>> I don't think this function provides any value
>
>
> do you nean i use bo->bo.offset instead of calling hibmc_bo_gpu_offset()?
>

yes

>>
>>> +
>>> +int hibmc_bo_pin(struct hibmc_bo *bo, u32 pl_flag, u64 *gpu_addr)
>>> +{
>>> +       int i, ret;
>>> +
>>> +       if (bo->pin_count) {
>>> +               bo->pin_count++;
>>> +               if (gpu_addr)
>>> +                       *gpu_addr = hibmc_bo_gpu_offset(bo);
>>
>>
>> Are you missing a return here?
>
>
> Thanks for pointing it out!
>
>
>>
>>> +       }
>>> +
>>> +       hibmc_ttm_placement(bo, pl_flag);
>>> +       for (i = 0; i < bo->placement.num_placement; i++)
>>> +               bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
>>> +       ret = ttm_bo_validate(&bo->bo, &bo->placement, false, false);
>>> +       if (ret)
>>> +               return ret;
>>> +
>>> +       bo->pin_count = 1;
>>> +       if (gpu_addr)
>>> +               *gpu_addr = hibmc_bo_gpu_offset(bo);
>>> +       return 0;
>>> +}
>>> +
>>> +int hibmc_bo_push_sysram(struct hibmc_bo *bo)
>>> +{
>>> +       int i, ret;
>>> +
>>> +       if (!bo->pin_count) {
>>> +               DRM_ERROR("unpin bad %p\n", bo);
>>> +               return 0;
>>> +       }
>>> +       bo->pin_count--;
>>> +       if (bo->pin_count)
>>> +               return 0;
>>> +
>>> +       if (bo->kmap.virtual)
>>
>>
>> ttm_bo_kunmap already does this check so you don't have to
>
>
> agreed. will remove this condition.
>
>>
>>> +               ttm_bo_kunmap(&bo->kmap);
>>> +
>>> +       hibmc_ttm_placement(bo, TTM_PL_FLAG_SYSTEM);
>>> +       for (i = 0; i < bo->placement.num_placement ; i++)
>>> +               bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
>>> +
>>> +       ret = ttm_bo_validate(&bo->bo, &bo->placement, false, false);
>>> +       if (ret) {
>>> +               DRM_ERROR("pushing to VRAM failed\n");
>>
>>
>> Print ret
>
>
> ok, thanks.
>
>
>>
>>> +               return ret;
>>> +       }
>>> +       return 0;
>>> +}
>>> +
>>> +int hibmc_mmap(struct file *filp, struct vm_area_struct *vma)
>>> +{
>>> +       struct drm_file *file_priv;
>>> +       struct hibmc_drm_device *hibmc;
>>> +
>>> +       if (unlikely(vma->vm_pgoff < DRM_FILE_PAGE_OFFSET))
>>> +               return -EINVAL;
>>> +
>>> +       file_priv = filp->private_data;
>>> +       hibmc = file_priv->minor->dev->dev_private;
>>> +       return ttm_bo_mmap(filp, vma, &hibmc->ttm.bdev);
>>> +}
>>> +
>>> +int hibmc_gem_create(struct drm_device *dev, u32 size, bool iskernel,
>>> +                    struct drm_gem_object **obj)
>>> +{
>>> +       struct hibmc_bo *hibmcbo;
>>> +       int ret;
>>> +
>>> +       *obj = NULL;
>>> +
>>> +       size = PAGE_ALIGN(size);
>>> +       if (size == 0)
>>
>>
>> Print error
>
>
> ditto
>
>>
>>> +               return -EINVAL;
>>> +
>>> +       ret = hibmc_bo_create(dev, size, 0, 0, &hibmcbo);
>>> +       if (ret) {
>>> +               if (ret != -ERESTARTSYS)
>>> +                       DRM_ERROR("failed to allocate GEM object\n");
>>
>>
>> Print ret
>
>
> ditto
>
>>
>>> +               return ret;
>>> +       }
>>> +       *obj = &hibmcbo->gem;
>>> +       return 0;
>>> +}
>>> +
>>> +int hibmc_dumb_create(struct drm_file *file, struct drm_device *dev,
>>> +                     struct drm_mode_create_dumb *args)
>>> +{
>>> +       struct drm_gem_object *gobj;
>>> +       u32 handle;
>>> +       int ret;
>>> +
>>> +       args->pitch = ALIGN(args->width * ((args->bpp + 7) / 8), 16);
>>
>>
>> What's up with the bpp + 7 here? Perhaps you're looking for DIV_ROUND_UP?
>
>
> Yes, that sounds sane.
>

sane is what i usually aim for :)

Sean


>>
>>
>>> +       args->size = args->pitch * args->height;
>>> +
>>> +       ret = hibmc_gem_create(dev, args->size, false,
>>> +                              &gobj);
>>> +       if (ret)
>>> +               return ret;
>>> +
>>> +       ret = drm_gem_handle_create(file, gobj, &handle);
>>> +       drm_gem_object_unreference_unlocked(gobj);
>>> +       if (ret)
>>
>>
>> Print error here
>
>
> agreed.
>
>
>>
>>> +               return ret;
>>> +
>>> +       args->handle = handle;
>>> +       return 0;
>>> +}
>>> +
>>> +static void hibmc_bo_unref(struct hibmc_bo **bo)
>>> +{
>>> +       struct ttm_buffer_object *tbo;
>>> +
>>> +       if ((*bo) == NULL)
>>> +               return;
>>> +
>>> +       tbo = &((*bo)->bo);
>>> +       ttm_bo_unref(&tbo);
>>> +       *bo = NULL;
>>> +}
>>> +
>>> +void hibmc_gem_free_object(struct drm_gem_object *obj)
>>> +{
>>> +       struct hibmc_bo *hibmcbo = gem_to_hibmc_bo(obj);
>>> +
>>> +       hibmc_bo_unref(&hibmcbo);
>>> +}
>>> +
>>> +static u64 hibmc_bo_mmap_offset(struct hibmc_bo *bo)
>>> +{
>>> +       return drm_vma_node_offset_addr(&bo->bo.vma_node);
>>> +}
>>> +
>>> +int hibmc_dumb_mmap_offset(struct drm_file *file, struct drm_device
>>> *dev,
>>> +                          u32 handle, u64 *offset)
>>> +{
>>> +       struct drm_gem_object *obj;
>>> +       struct hibmc_bo *bo;
>>> +
>>> +       obj = drm_gem_object_lookup(file, handle);
>>> +       if (!obj)
>>> +               return -ENOENT;
>>> +
>>> +       bo = gem_to_hibmc_bo(obj);
>>> +       *offset = hibmc_bo_mmap_offset(bo);
>>> +
>>> +       drm_gem_object_unreference_unlocked(obj);
>>> +       return 0;
>>> +}
>
>
> Regards,
> Rongrong.
>
>>> --
>>> 1.9.1
>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel at lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>
>> _______________________________________________
>> linuxarm mailing list
>> linuxarm at huawei.com
>> http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm
>>
>> .
>>
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* [PATCH v6 3/9] drm/hisilicon/hibmc: Add support for frame buffer
From: Rongrong Zou @ 2016-11-11 13:16 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <CAOw6vbJweGkWzXJwNSm7LmhDhW1gLG8=wu4dBw+dL-cvaBvaCw@mail.gmail.com>

? 2016/11/11 2:30, Sean Paul ??:
> On Fri, Oct 28, 2016 at 3:27 AM, Rongrong Zou <zourongrong@gmail.com> wrote:
>> Add support for fbdev and kms fb management.
>>
>> Signed-off-by: Rongrong Zou <zourongrong@gmail.com>
>> ---
>>   drivers/gpu/drm/hisilicon/hibmc/Makefile          |   2 +-
>>   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c   |  17 ++
>>   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h   |  24 ++
>>   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c | 255 ++++++++++++++++++++++
>>   drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c       |  66 ++++++
>>   5 files changed, 363 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
>>
>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/Makefile b/drivers/gpu/drm/hisilicon/hibmc/Makefile
>> index d5c40b8..810a37e 100644
>> --- a/drivers/gpu/drm/hisilicon/hibmc/Makefile
>> +++ b/drivers/gpu/drm/hisilicon/hibmc/Makefile
>> @@ -1,5 +1,5 @@
>>   ccflags-y := -Iinclude/drm
>> -hibmc-drm-y := hibmc_drm_drv.o hibmc_drm_power.o hibmc_ttm.o
>> +hibmc-drm-y := hibmc_drm_drv.o hibmc_drm_fbdev.o hibmc_drm_power.o hibmc_ttm.o
>>
>>   obj-$(CONFIG_DRM_HISI_HIBMC)   +=hibmc-drm.o
>>   #obj-y += hibmc-drm.o
>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>> index 81f4301..5ac7a7e 100644
>> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
>> @@ -66,11 +66,23 @@ static void hibmc_disable_vblank(struct drm_device *dev, unsigned int pipe)
>>
>>   static int hibmc_pm_suspend(struct device *dev)
>>   {
>> +       struct pci_dev *pdev = to_pci_dev(dev);
>> +       struct drm_device *drm_dev = pci_get_drvdata(pdev);
>> +       struct hibmc_drm_device *hidev = drm_dev->dev_private;
>> +
>> +       drm_fb_helper_set_suspend_unlocked(&hidev->fbdev->helper, 1);
>> +
>
> We have atomic helpers for suspend/resume now. Please use those.

drm_atomic_helper_suspend/resume()?

>
>>          return 0;
>>   }
>>
>>   static int hibmc_pm_resume(struct device *dev)
>>   {
>> +       struct pci_dev *pdev = to_pci_dev(dev);
>> +       struct drm_device *drm_dev = pci_get_drvdata(pdev);
>> +       struct hibmc_drm_device *hidev = drm_dev->dev_private;
>> +
>> +       drm_fb_helper_set_suspend_unlocked(&hidev->fbdev->helper, 0);
>> +
>>          return 0;
>>   }
>>
>> @@ -170,6 +182,7 @@ static int hibmc_unload(struct drm_device *dev)
>>   {
>>          struct hibmc_drm_device *hidev = dev->dev_private;
>>
>> +       hibmc_fbdev_fini(hidev);
>>          hibmc_mm_fini(hidev);
>>          hibmc_hw_fini(hidev);
>>          dev->dev_private = NULL;
>> @@ -195,6 +208,10 @@ static int hibmc_load(struct drm_device *dev, unsigned long flags)
>>          if (ret)
>>                  goto err;
>>
>> +       ret = hibmc_fbdev_init(hidev);
>> +       if (ret)
>> +               goto err;
>> +
>>          return 0;
>>
>>   err:
>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>> index db8d80e..a40e9a7 100644
>> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h
>> @@ -20,9 +20,22 @@
>>   #define HIBMC_DRM_DRV_H
>>
>>   #include <drm/drmP.h>
>> +#include <drm/drm_fb_helper.h>
>>   #include <drm/ttm/ttm_bo_driver.h>
>>   #include <drm/drm_gem.h>
>>
>> +struct hibmc_framebuffer {
>> +       struct drm_framebuffer fb;
>> +       struct drm_gem_object *obj;
>> +       bool is_fbdev_fb;
>> +};
>> +
>> +struct hibmc_fbdev {
>> +       struct drm_fb_helper helper;
>> +       struct hibmc_framebuffer *fb;
>> +       int size;
>> +};
>> +
>>   struct hibmc_drm_device {
>>          /* hw */
>>          void __iomem   *mmio;
>> @@ -41,9 +54,13 @@ struct hibmc_drm_device {
>>                  bool initialized;
>>          } ttm;
>>
>> +       /* fbdev */
>> +       struct hibmc_fbdev *fbdev;
>>          bool mm_inited;
>>   };
>>
>> +#define to_hibmc_framebuffer(x) container_of(x, struct hibmc_framebuffer, fb)
>> +
>>   struct hibmc_bo {
>>          struct ttm_buffer_object bo;
>>          struct ttm_placement placement;
>> @@ -65,8 +82,15 @@ static inline struct hibmc_bo *gem_to_hibmc_bo(struct drm_gem_object *gem)
>>
>>   #define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
>>
>> +int hibmc_fbdev_init(struct hibmc_drm_device *hidev);
>> +void hibmc_fbdev_fini(struct hibmc_drm_device *hidev);
>> +
>>   int hibmc_gem_create(struct drm_device *dev, u32 size, bool iskernel,
>>                       struct drm_gem_object **obj);
>> +struct hibmc_framebuffer *
>> +hibmc_framebuffer_init(struct drm_device *dev,
>> +                      const struct drm_mode_fb_cmd2 *mode_cmd,
>> +                      struct drm_gem_object *obj);
>>
>>   int hibmc_mm_init(struct hibmc_drm_device *hibmc);
>>   void hibmc_mm_fini(struct hibmc_drm_device *hibmc);
>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
>> new file mode 100644
>> index 0000000..630124b
>> --- /dev/null
>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
>> @@ -0,0 +1,255 @@
>> +/* Hisilicon Hibmc SoC drm driver
>> + *
>> + * Based on the bochs drm driver.
>> + *
>> + * Copyright (c) 2016 Huawei Limited.
>> + *
>> + * Author:
>> + *     Rongrong Zou <zourongrong@huawei.com>
>> + *     Rongrong Zou <zourongrong@gmail.com>
>> + *     Jianhua Li <lijianhua@huawei.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + */
>> +
>> +#include <drm/drm_crtc.h>
>> +#include <drm/drm_crtc_helper.h>
>> +#include <drm/drm_fb_helper.h>
>> +
>> +#include "hibmc_drm_drv.h"
>> +
>> +/* ---------------------------------------------------------------------- */
>
> Please remove

will do that, thanks.

>
>> +
>> +static int hibmcfb_create_object(
>> +                               struct hibmc_drm_device *hidev,
>> +                               const struct drm_mode_fb_cmd2 *mode_cmd,
>> +                               struct drm_gem_object **gobj_p)
>> +{
>> +       struct drm_gem_object *gobj;
>> +       struct drm_device *dev = hidev->dev;
>> +       u32 size;
>> +       int ret = 0;
>> +
>> +       size = mode_cmd->pitches[0] * mode_cmd->height;
>> +       ret = hibmc_gem_create(dev, size, true, &gobj);
>> +       if (ret)
>> +               return ret;
>> +
>> +       *gobj_p = gobj;
>> +       return ret;
>> +}
>> +
>> +static struct fb_ops hibmc_drm_fb_ops = {
>> +       .owner = THIS_MODULE,
>> +       .fb_check_var = drm_fb_helper_check_var,
>> +       .fb_set_par = drm_fb_helper_set_par,
>> +       .fb_fillrect = drm_fb_helper_sys_fillrect,
>> +       .fb_copyarea = drm_fb_helper_sys_copyarea,
>> +       .fb_imageblit = drm_fb_helper_sys_imageblit,
>> +       .fb_pan_display = drm_fb_helper_pan_display,
>> +       .fb_blank = drm_fb_helper_blank,
>> +       .fb_setcmap = drm_fb_helper_setcmap,
>> +};
>> +
>> +static int hibmc_drm_fb_create(struct drm_fb_helper *helper,
>> +                              struct drm_fb_helper_surface_size *sizes)
>> +{
>> +       struct hibmc_fbdev *hi_fbdev =
>> +               container_of(helper, struct hibmc_fbdev, helper);
>> +       struct hibmc_drm_device *hidev =
>> +               (struct hibmc_drm_device *)helper->dev->dev_private;
>> +       struct fb_info *info;
>> +       struct hibmc_framebuffer *hibmc_fb;
>> +       struct drm_framebuffer *fb;
>> +       struct drm_mode_fb_cmd2 mode_cmd;
>> +       struct drm_gem_object *gobj = NULL;
>> +       int ret = 0;
>> +       size_t size;
>> +       unsigned int bytes_per_pixel;
>> +       struct hibmc_bo *bo = NULL;
>> +
>> +       DRM_DEBUG_DRIVER("surface width(%d), height(%d) and bpp(%d)\n",
>> +                        sizes->surface_width, sizes->surface_height,
>> +                        sizes->surface_bpp);
>> +       sizes->surface_depth = 32;
>> +
>> +       bytes_per_pixel = DIV_ROUND_UP(sizes->surface_bpp, 8);
>> +
>> +       mode_cmd.width = sizes->surface_width;
>> +       mode_cmd.height = sizes->surface_height;
>> +       mode_cmd.pitches[0] = mode_cmd.width * bytes_per_pixel;
>> +       mode_cmd.pixel_format = drm_mode_legacy_fb_format(sizes->surface_bpp,
>> +                                                         sizes->surface_depth);
>> +
>> +       size = roundup(mode_cmd.pitches[0] * mode_cmd.height, PAGE_SIZE);
>
> It's somewhat curious that you used ALIGN in the previous patch and
> roundup here. But anyways, I think PAGE_ALIGN would be the most
> appropriate thing to do here.

agreed, thanks.

>
>> +
>> +       ret = hibmcfb_create_object(hidev, &mode_cmd, &gobj);
>> +       if (ret) {
>> +               DRM_ERROR("failed to create fbcon backing object %d\r\n",
> ret);
>
> \r, yikes!!!

will delete it, thanks.

>
>> +               return -ENOMEM;
>> +       }
>> +
>> +       bo = gem_to_hibmc_bo(gobj);
>> +
>> +       ret = ttm_bo_reserve(&bo->bo, true, false, NULL);
>> +       if (ret)
>
> Print error here?

will do.

>
> How about cleaning up gobj?

you are right,

>
>> +               return ret;
>> +
>> +       ret = hibmc_bo_pin(bo, TTM_PL_FLAG_VRAM, NULL);
>> +       if (ret) {
>> +               DRM_ERROR("failed to pin fbcon\n");
>
> Print ret
>
> ttm_bo_unreserve? It seems like you're missing clean-up in all of the
> error paths in this function. Can you please make sure everything is
> tidied up?

ok, thanks.

>
>> +               return ret;
>> +       }
>> +
>> +       ret = ttm_bo_kmap(&bo->bo, 0, bo->bo.num_pages, &bo->kmap);
>> +
>
> nit: extra space

do you mean extra line?

>
>> +       if (ret) {
>> +               DRM_ERROR("failed to kmap fbcon\n");
>
> Print ret

ok, thanks.

>
>> +               ttm_bo_unreserve(&bo->bo);
>> +               return ret;
>> +       }
>> +
>> +       ttm_bo_unreserve(&bo->bo);
>
> Move this between ttm_bo_kmap and if (ret), then remove it from inside
> the conditional.

This is fine with me, thanks.

>
>> +
>> +       info = drm_fb_helper_alloc_fbi(helper);
>> +       if (IS_ERR(info))
>
> Print error

ok, thanks.

>
>> +               return PTR_ERR(info);
>> +
>> +       info->par = hi_fbdev;
>> +
>> +       hibmc_fb = hibmc_framebuffer_init(hidev->dev, &mode_cmd, gobj);
>> +       if (IS_ERR(hibmc_fb)) {
>> +               drm_fb_helper_release_fbi(helper);
>> +               return PTR_ERR(hibmc_fb);
>> +       }
>> +
>> +       hi_fbdev->fb = hibmc_fb;
>> +       hidev->fbdev->size = size;
>> +       fb = &hibmc_fb->fb;
>
> The fb variable isn't necessary, and really, the hibmc_fb doesn't
> really help either, it just masks what's actually happening IMO.

i will clean the two variables.

>
>> +       if (!fb) {
>> +               DRM_INFO("fb is NULL\n");
>> +               return -EINVAL;
>> +       }
>> +
>> +       hi_fbdev->helper.fb = fb;
>> +
>> +       strcpy(info->fix.id, "hibmcdrmfb");
>> +
>> +       info->flags = FBINFO_DEFAULT;
>> +       info->fbops = &hibmc_drm_fb_ops;
>> +
>> +       drm_fb_helper_fill_fix(info, fb->pitches[0], fb->depth);
>> +       drm_fb_helper_fill_var(info, &hidev->fbdev->helper, sizes->fb_width,
>> +                              sizes->fb_height);
>> +
>> +       info->screen_base = bo->kmap.virtual;
>> +       info->screen_size = size;
>> +
>> +       info->fix.smem_start = bo->bo.mem.bus.offset + bo->bo.mem.bus.base;
>> +       info->fix.smem_len = size;
>> +
>> +       return 0;
>> +}
>> +
>> +static void hibmc_fbdev_destroy(struct hibmc_fbdev *fbdev)
>> +{
>> +       struct hibmc_framebuffer *gfb = fbdev->fb;
>> +       struct drm_fb_helper *fbh = &fbdev->helper;
>> +
>> +       DRM_DEBUG_DRIVER("hibmc_fbdev_destroy\n");
>
> Not useful

ok, will remove.

>
>> +
>> +       drm_fb_helper_unregister_fbi(fbh);
>> +       drm_fb_helper_release_fbi(fbh);
>> +
>> +       drm_fb_helper_fini(fbh);
>> +
>> +       if (gfb)
>> +               drm_framebuffer_unreference(&gfb->fb);
>> +
>> +       kfree(fbdev);
>> +}
>> +
>> +static const struct drm_fb_helper_funcs hibmc_fbdev_helper_funcs = {
>> +       .fb_probe = hibmc_drm_fb_create,
>> +};
>> +
>> +int hibmc_fbdev_init(struct hibmc_drm_device *hidev)
>> +{
>> +       int ret;
>> +       struct fb_var_screeninfo *var;
>> +       struct fb_fix_screeninfo *fix;
>> +       struct hibmc_fbdev *hifbdev;
>> +
>> +       hifbdev = kzalloc(sizeof(*hifbdev), GFP_KERNEL);
>
> devm_kzalloc?

sounds good, so there need no kfree at hibmc_fbdev_destroy(),
thanks.

>
>> +       if (!hifbdev)
>> +               return -ENOMEM;
>> +
>> +       hidev->fbdev = hifbdev;
>> +       drm_fb_helper_prepare(hidev->dev, &hifbdev->helper,
>> +                             &hibmc_fbdev_helper_funcs);
>> +
>> +       /* Now just one crtc and one channel */
>> +       ret = drm_fb_helper_init(hidev->dev,
>> +                                &hifbdev->helper, 1, 1);
>> +
>
> nit: extra line

ok, thanks.

>
>> +       if (ret)
>> +               return ret;
>> +
>> +       ret = drm_fb_helper_single_add_all_connectors(&hifbdev->helper);
>> +       if (ret)
>> +               goto fini;
>> +
>> +       ret = drm_fb_helper_initial_config(&hifbdev->helper, 16);
>> +       if (ret)
>> +               goto fini;
>> +
>> +       var = &hifbdev->helper.fbdev->var;
>> +       fix = &hifbdev->helper.fbdev->fix;
>> +
>> +       DRM_DEBUG("Member of info->var is :\n"
>> +                "xres=%d\n"
>> +                "yres=%d\n"
>> +                "xres_virtual=%d\n"
>> +                "yres_virtual=%d\n"
>> +                "xoffset=%d\n"
>> +                "yoffset=%d\n"
>> +                "bits_per_pixel=%d\n"
>> +                "...\n", var->xres, var->yres, var->xres_virtual,
>> +                var->yres_virtual, var->xoffset, var->yoffset,
>> +                var->bits_per_pixel);
>> +       DRM_DEBUG("Member of info->fix is :\n"
>> +                "smem_start=%lx\n"
>> +                "smem_len=%d\n"
>> +                "type=%d\n"
>> +                "type_aux=%d\n"
>> +                "visual=%d\n"
>> +                "xpanstep=%d\n"
>> +                "ypanstep=%d\n"
>> +                "ywrapstep=%d\n"
>> +                "line_length=%d\n"
>> +                "accel=%d\n"
>> +                "capabilities=%d\n"
>> +                "...\n", fix->smem_start, fix->smem_len, fix->type,
>> +                fix->type_aux, fix->visual, fix->xpanstep,
>> +                fix->ypanstep, fix->ywrapstep, fix->line_length,
>> +                fix->accel, fix->capabilities);
>
> You've been using DRM_DEBUG_DRIVER elsewhere, you should probably use
> it here, too.

ok, thanks.

>
>> +
>> +       return 0;
>> +
>> +fini:
>> +       drm_fb_helper_fini(&hifbdev->helper);
>> +       return ret;
>> +}
>> +
>> +void hibmc_fbdev_fini(struct hibmc_drm_device *hidev)
>> +{
>> +       if (!hidev->fbdev)
>> +               return;
>> +
>> +       hibmc_fbdev_destroy(hidev->fbdev);
>> +       hidev->fbdev = NULL;
>> +}
>> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>> index 0802ebd..9822f62 100644
>> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
>> @@ -488,3 +488,69 @@ int hibmc_dumb_mmap_offset(struct drm_file *file, struct drm_device *dev,
>>          drm_gem_object_unreference_unlocked(obj);
>>          return 0;
>>   }
>> +
>> +/* ---------------------------------------------------------------------- */
>> +
>
> Please remove

will do.

>
>> +static void hibmc_user_framebuffer_destroy(struct drm_framebuffer *fb)
>> +{
>> +       struct hibmc_framebuffer *hibmc_fb = to_hibmc_framebuffer(fb);
>> +
>> +       drm_gem_object_unreference_unlocked(hibmc_fb->obj);
>> +       drm_framebuffer_cleanup(fb);
>> +       kfree(hibmc_fb);
>> +}
>> +
>> +static const struct drm_framebuffer_funcs hibmc_fb_funcs = {
>> +       .destroy = hibmc_user_framebuffer_destroy,
>> +};
>> +
>> +struct hibmc_framebuffer *
>> +hibmc_framebuffer_init(struct drm_device *dev,
>> +                      const struct drm_mode_fb_cmd2 *mode_cmd,
>> +                      struct drm_gem_object *obj)
>> +{
>> +       struct hibmc_framebuffer *hibmc_fb;
>> +       int ret;
>> +
>> +       hibmc_fb = kzalloc(sizeof(*hibmc_fb), GFP_KERNEL);
>> +       if (!hibmc_fb)
>
> Print error

ok, thanks.

Regards,
Rongrong.

>
>> +               return ERR_PTR(-ENOMEM);
>> +
>> +       drm_helper_mode_fill_fb_struct(&hibmc_fb->fb, mode_cmd);
>> +       hibmc_fb->obj = obj;
>> +       ret = drm_framebuffer_init(dev, &hibmc_fb->fb, &hibmc_fb_funcs);
>> +       if (ret) {
>> +               DRM_ERROR("drm_framebuffer_init failed: %d\n", ret);
>> +               kfree(hibmc_fb);
>> +               return ERR_PTR(ret);
>> +       }
>> +
>> +       return hibmc_fb;
>> +}
>> +
>> +static struct drm_framebuffer *
>> +hibmc_user_framebuffer_create(struct drm_device *dev,
>> +                             struct drm_file *filp,
>> +                             const struct drm_mode_fb_cmd2 *mode_cmd)
>> +{
>> +       struct drm_gem_object *obj;
>> +       struct hibmc_framebuffer *hibmc_fb;
>> +
>> +       DRM_DEBUG_DRIVER("%dx%d, format %c%c%c%c\n",
>> +                        mode_cmd->width, mode_cmd->height,
>> +                        (mode_cmd->pixel_format) & 0xff,
>> +                        (mode_cmd->pixel_format >> 8)  & 0xff,
>> +                        (mode_cmd->pixel_format >> 16) & 0xff,
>> +                        (mode_cmd->pixel_format >> 24) & 0xff);
>> +
>> +       obj = drm_gem_object_lookup(filp, mode_cmd->handles[0]);
>> +       if (!obj)
>> +               return ERR_PTR(-ENOENT);
>> +
>> +       hibmc_fb = hibmc_framebuffer_init(dev, mode_cmd, obj);
>> +       if (IS_ERR(hibmc_fb)) {
>> +               drm_gem_object_unreference_unlocked(obj);
>> +               return ERR_PTR((long)hibmc_fb);
>> +       }
>> +       return &hibmc_fb->fb;
>> +}
>> --
>> 1.9.1
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> _______________________________________________
> linuxarm mailing list
> linuxarm at huawei.com
> http://rnd-openeuler.huawei.com/mailman/listinfo/linuxarm
>
> .
>

^ permalink raw reply

* Applied "spi: spi-fsl-dspi: Add DMA support for Vybrid" to the spi tree
From: Mark Brown @ 2016-11-11 13:05 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <836dd6a84fa149cafdbb0b53e792d305febf6207.1478778329.git.maitysanchayan@gmail.com>

The patch

   spi: spi-fsl-dspi: Add DMA support for Vybrid

has been applied to the spi tree at

   git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From 90ba37033cb94207e97c4ced9be575770438213b Mon Sep 17 00:00:00 2001
From: Sanchayan Maity <maitysanchayan@gmail.com>
Date: Thu, 10 Nov 2016 17:49:15 +0530
Subject: [PATCH] spi: spi-fsl-dspi: Add DMA support for Vybrid

Add DMA support for Vybrid.

Signed-off-by: Sanchayan Maity <maitysanchayan@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
 drivers/spi/spi-fsl-dspi.c | 301 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 300 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 35c0dd945668..bc64700b514d 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -15,6 +15,8 @@
 
 #include <linux/clk.h>
 #include <linux/delay.h>
+#include <linux/dmaengine.h>
+#include <linux/dma-mapping.h>
 #include <linux/err.h>
 #include <linux/errno.h>
 #include <linux/interrupt.h>
@@ -40,6 +42,7 @@
 #define TRAN_STATE_WORD_ODD_NUM	0x04
 
 #define DSPI_FIFO_SIZE			4
+#define DSPI_DMA_BUFSIZE		(DSPI_FIFO_SIZE * 1024)
 
 #define SPI_MCR		0x00
 #define SPI_MCR_MASTER		(1 << 31)
@@ -71,6 +74,11 @@
 #define SPI_SR_EOQF		0x10000000
 #define SPI_SR_TCFQF		0x80000000
 
+#define SPI_RSER_TFFFE		BIT(25)
+#define SPI_RSER_TFFFD		BIT(24)
+#define SPI_RSER_RFDFE		BIT(17)
+#define SPI_RSER_RFDFD		BIT(16)
+
 #define SPI_RSER		0x30
 #define SPI_RSER_EOQFE		0x10000000
 #define SPI_RSER_TCFQE		0x80000000
@@ -108,6 +116,8 @@
 
 #define SPI_TCR_TCNT_MAX	0x10000
 
+#define DMA_COMPLETION_TIMEOUT	msecs_to_jiffies(3000)
+
 struct chip_data {
 	u32 mcr_val;
 	u32 ctar_val;
@@ -117,6 +127,7 @@ struct chip_data {
 enum dspi_trans_mode {
 	DSPI_EOQ_MODE = 0,
 	DSPI_TCFQ_MODE,
+	DSPI_DMA_MODE,
 };
 
 struct fsl_dspi_devtype_data {
@@ -125,7 +136,7 @@ struct fsl_dspi_devtype_data {
 };
 
 static const struct fsl_dspi_devtype_data vf610_data = {
-	.trans_mode = DSPI_EOQ_MODE,
+	.trans_mode = DSPI_DMA_MODE,
 	.max_clock_factor = 2,
 };
 
@@ -139,6 +150,22 @@ static const struct fsl_dspi_devtype_data ls2085a_data = {
 	.max_clock_factor = 8,
 };
 
+struct fsl_dspi_dma {
+	u32 curr_xfer_len;
+
+	u32 *tx_dma_buf;
+	struct dma_chan *chan_tx;
+	dma_addr_t tx_dma_phys;
+	struct completion cmd_tx_complete;
+	struct dma_async_tx_descriptor *tx_desc;
+
+	u32 *rx_dma_buf;
+	struct dma_chan *chan_rx;
+	dma_addr_t rx_dma_phys;
+	struct completion cmd_rx_complete;
+	struct dma_async_tx_descriptor *rx_desc;
+};
+
 struct fsl_dspi {
 	struct spi_master	*master;
 	struct platform_device	*pdev;
@@ -165,6 +192,7 @@ struct fsl_dspi {
 	u32			waitflags;
 
 	u32			spi_tcnt;
+	struct fsl_dspi_dma	*dma;
 };
 
 static inline int is_double_byte_mode(struct fsl_dspi *dspi)
@@ -176,6 +204,263 @@ static inline int is_double_byte_mode(struct fsl_dspi *dspi)
 	return ((val & SPI_FRAME_BITS_MASK) == SPI_FRAME_BITS(8)) ? 0 : 1;
 }
 
+static void dspi_tx_dma_callback(void *arg)
+{
+	struct fsl_dspi *dspi = arg;
+	struct fsl_dspi_dma *dma = dspi->dma;
+
+	complete(&dma->cmd_tx_complete);
+}
+
+static void dspi_rx_dma_callback(void *arg)
+{
+	struct fsl_dspi *dspi = arg;
+	struct fsl_dspi_dma *dma = dspi->dma;
+	int rx_word;
+	int i, len;
+	u16 d;
+
+	rx_word = is_double_byte_mode(dspi);
+
+	len = rx_word ? (dma->curr_xfer_len / 2) : dma->curr_xfer_len;
+
+	if (!(dspi->dataflags & TRAN_STATE_RX_VOID)) {
+		for (i = 0; i < len; i++) {
+			d = dspi->dma->rx_dma_buf[i];
+			rx_word ? (*(u16 *)dspi->rx = d) :
+						(*(u8 *)dspi->rx = d);
+			dspi->rx += rx_word + 1;
+		}
+	}
+
+	complete(&dma->cmd_rx_complete);
+}
+
+static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
+{
+	struct fsl_dspi_dma *dma = dspi->dma;
+	struct device *dev = &dspi->pdev->dev;
+	int time_left;
+	int tx_word;
+	int i, len;
+	u16 val;
+
+	tx_word = is_double_byte_mode(dspi);
+
+	len = tx_word ? (dma->curr_xfer_len / 2) : dma->curr_xfer_len;
+
+	for (i = 0; i < len - 1; i++) {
+		val = tx_word ? *(u16 *) dspi->tx : *(u8 *) dspi->tx;
+		dspi->dma->tx_dma_buf[i] =
+			SPI_PUSHR_TXDATA(val) | SPI_PUSHR_PCS(dspi->cs) |
+			SPI_PUSHR_CTAS(0) | SPI_PUSHR_CONT;
+		dspi->tx += tx_word + 1;
+	}
+
+	val = tx_word ? *(u16 *) dspi->tx : *(u8 *) dspi->tx;
+	dspi->dma->tx_dma_buf[i] = SPI_PUSHR_TXDATA(val) |
+					SPI_PUSHR_PCS(dspi->cs) |
+					SPI_PUSHR_CTAS(0);
+	dspi->tx += tx_word + 1;
+
+	dma->tx_desc = dmaengine_prep_slave_single(dma->chan_tx,
+					dma->tx_dma_phys,
+					DSPI_DMA_BUFSIZE, DMA_MEM_TO_DEV,
+					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+	if (!dma->tx_desc) {
+		dev_err(dev, "Not able to get desc for DMA xfer\n");
+		return -EIO;
+	}
+
+	dma->tx_desc->callback = dspi_tx_dma_callback;
+	dma->tx_desc->callback_param = dspi;
+	if (dma_submit_error(dmaengine_submit(dma->tx_desc))) {
+		dev_err(dev, "DMA submit failed\n");
+		return -EINVAL;
+	}
+
+	dma->rx_desc = dmaengine_prep_slave_single(dma->chan_rx,
+					dma->rx_dma_phys,
+					DSPI_DMA_BUFSIZE, DMA_DEV_TO_MEM,
+					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+	if (!dma->rx_desc) {
+		dev_err(dev, "Not able to get desc for DMA xfer\n");
+		return -EIO;
+	}
+
+	dma->rx_desc->callback = dspi_rx_dma_callback;
+	dma->rx_desc->callback_param = dspi;
+	if (dma_submit_error(dmaengine_submit(dma->rx_desc))) {
+		dev_err(dev, "DMA submit failed\n");
+		return -EINVAL;
+	}
+
+	reinit_completion(&dspi->dma->cmd_rx_complete);
+	reinit_completion(&dspi->dma->cmd_tx_complete);
+
+	dma_async_issue_pending(dma->chan_rx);
+	dma_async_issue_pending(dma->chan_tx);
+
+	time_left = wait_for_completion_timeout(&dspi->dma->cmd_tx_complete,
+					DMA_COMPLETION_TIMEOUT);
+	if (time_left == 0) {
+		dev_err(dev, "DMA tx timeout\n");
+		dmaengine_terminate_all(dma->chan_tx);
+		dmaengine_terminate_all(dma->chan_rx);
+		return -ETIMEDOUT;
+	}
+
+	time_left = wait_for_completion_timeout(&dspi->dma->cmd_rx_complete,
+					DMA_COMPLETION_TIMEOUT);
+	if (time_left == 0) {
+		dev_err(dev, "DMA rx timeout\n");
+		dmaengine_terminate_all(dma->chan_tx);
+		dmaengine_terminate_all(dma->chan_rx);
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static int dspi_dma_xfer(struct fsl_dspi *dspi)
+{
+	struct fsl_dspi_dma *dma = dspi->dma;
+	struct device *dev = &dspi->pdev->dev;
+	int curr_remaining_bytes;
+	int bytes_per_buffer;
+	int tx_word;
+	int ret = 0;
+
+	tx_word = is_double_byte_mode(dspi);
+	curr_remaining_bytes = dspi->len;
+	while (curr_remaining_bytes) {
+		/* Check if current transfer fits the DMA buffer */
+		dma->curr_xfer_len = curr_remaining_bytes;
+		bytes_per_buffer = DSPI_DMA_BUFSIZE /
+				(DSPI_FIFO_SIZE / (tx_word ? 2 : 1));
+		if (curr_remaining_bytes > bytes_per_buffer)
+			dma->curr_xfer_len = bytes_per_buffer;
+
+		ret = dspi_next_xfer_dma_submit(dspi);
+		if (ret) {
+			dev_err(dev, "DMA transfer failed\n");
+			goto exit;
+
+		} else {
+			curr_remaining_bytes -= dma->curr_xfer_len;
+			if (curr_remaining_bytes < 0)
+				curr_remaining_bytes = 0;
+			dspi->len = curr_remaining_bytes;
+		}
+	}
+
+exit:
+	return ret;
+}
+
+static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
+{
+	struct fsl_dspi_dma *dma;
+	struct dma_slave_config cfg;
+	struct device *dev = &dspi->pdev->dev;
+	int ret;
+
+	dma = devm_kzalloc(dev, sizeof(*dma), GFP_KERNEL);
+	if (!dma)
+		return -ENOMEM;
+
+	dma->chan_rx = dma_request_slave_channel(dev, "rx");
+	if (!dma->chan_rx) {
+		dev_err(dev, "rx dma channel not available\n");
+		ret = -ENODEV;
+		return ret;
+	}
+
+	dma->chan_tx = dma_request_slave_channel(dev, "tx");
+	if (!dma->chan_tx) {
+		dev_err(dev, "tx dma channel not available\n");
+		ret = -ENODEV;
+		goto err_tx_channel;
+	}
+
+	dma->tx_dma_buf = dma_alloc_coherent(dev, DSPI_DMA_BUFSIZE,
+					&dma->tx_dma_phys, GFP_KERNEL);
+	if (!dma->tx_dma_buf) {
+		ret = -ENOMEM;
+		goto err_tx_dma_buf;
+	}
+
+	dma->rx_dma_buf = dma_alloc_coherent(dev, DSPI_DMA_BUFSIZE,
+					&dma->rx_dma_phys, GFP_KERNEL);
+	if (!dma->rx_dma_buf) {
+		ret = -ENOMEM;
+		goto err_rx_dma_buf;
+	}
+
+	cfg.src_addr = phy_addr + SPI_POPR;
+	cfg.dst_addr = phy_addr + SPI_PUSHR;
+	cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
+	cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
+	cfg.src_maxburst = 1;
+	cfg.dst_maxburst = 1;
+
+	cfg.direction = DMA_DEV_TO_MEM;
+	ret = dmaengine_slave_config(dma->chan_rx, &cfg);
+	if (ret) {
+		dev_err(dev, "can't configure rx dma channel\n");
+		ret = -EINVAL;
+		goto err_slave_config;
+	}
+
+	cfg.direction = DMA_MEM_TO_DEV;
+	ret = dmaengine_slave_config(dma->chan_tx, &cfg);
+	if (ret) {
+		dev_err(dev, "can't configure tx dma channel\n");
+		ret = -EINVAL;
+		goto err_slave_config;
+	}
+
+	dspi->dma = dma;
+	init_completion(&dma->cmd_tx_complete);
+	init_completion(&dma->cmd_rx_complete);
+
+	return 0;
+
+err_slave_config:
+	devm_kfree(dev, dma->rx_dma_buf);
+err_rx_dma_buf:
+	devm_kfree(dev, dma->tx_dma_buf);
+err_tx_dma_buf:
+	dma_release_channel(dma->chan_tx);
+err_tx_channel:
+	dma_release_channel(dma->chan_rx);
+
+	devm_kfree(dev, dma);
+	dspi->dma = NULL;
+
+	return ret;
+}
+
+static void dspi_release_dma(struct fsl_dspi *dspi)
+{
+	struct fsl_dspi_dma *dma = dspi->dma;
+	struct device *dev = &dspi->pdev->dev;
+
+	if (dma) {
+		if (dma->chan_tx) {
+			dma_unmap_single(dev, dma->tx_dma_phys,
+					DSPI_DMA_BUFSIZE, DMA_TO_DEVICE);
+			dma_release_channel(dma->chan_tx);
+		}
+
+		if (dma->chan_rx) {
+			dma_unmap_single(dev, dma->rx_dma_phys,
+					DSPI_DMA_BUFSIZE, DMA_FROM_DEVICE);
+			dma_release_channel(dma->chan_rx);
+		}
+	}
+}
+
 static void hz_to_spi_baud(char *pbr, char *br, int speed_hz,
 		unsigned long clkrate)
 {
@@ -424,6 +709,12 @@ static int dspi_transfer_one_message(struct spi_master *master,
 			regmap_write(dspi->regmap, SPI_RSER, SPI_RSER_TCFQE);
 			dspi_tcfq_write(dspi);
 			break;
+		case DSPI_DMA_MODE:
+			regmap_write(dspi->regmap, SPI_RSER,
+				SPI_RSER_TFFFE | SPI_RSER_TFFFD |
+				SPI_RSER_RFDFE | SPI_RSER_RFDFD);
+			status = dspi_dma_xfer(dspi);
+			goto out;
 		default:
 			dev_err(&dspi->pdev->dev, "unsupported trans_mode %u\n",
 				trans_mode);
@@ -733,6 +1024,13 @@ static int dspi_probe(struct platform_device *pdev)
 	if (ret)
 		goto out_master_put;
 
+	if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) {
+		if (dspi_request_dma(dspi, res->start)) {
+			dev_err(&pdev->dev, "can't get dma channels\n");
+			goto out_clk_put;
+		}
+	}
+
 	master->max_speed_hz =
 		clk_get_rate(dspi->clk) / dspi->devtype_data->max_clock_factor;
 
@@ -761,6 +1059,7 @@ static int dspi_remove(struct platform_device *pdev)
 	struct fsl_dspi *dspi = spi_master_get_devdata(master);
 
 	/* Disconnect from the SPI framework */
+	dspi_release_dma(dspi);
 	clk_disable_unprepare(dspi->clk);
 	spi_unregister_master(dspi->master);
 
-- 
2.10.2

^ permalink raw reply related

* [PATCH] arm64: dts: Add ARM PMU node for exynos7
From: Alim Akhtar @ 2016-11-11 13:01 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <4922ada9-2eb2-f0cf-e4b1-a74f4397f97c@arm.com>

Hi Robin,

On 11/10/2016 07:07 PM, Robin Murphy wrote:
> Hi Alim,
>
> On 10/11/16 03:30, Alim Akhtar wrote:
>> This patch adds ARM Performance Monitor Unit dt node for exynos7.
>> PMU provides various statistics on the operation of the CPU and
>> memory system at runtime, which are very useful when debugging or
>> profiling code. This enables the same.
>>
>> Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
>> ---
>>   arch/arm64/boot/dts/exynos/exynos7.dtsi |    8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/arch/arm64/boot/dts/exynos/exynos7.dtsi b/arch/arm64/boot/dts/exynos/exynos7.dtsi
>> index e0d0d01..53ce4be 100644
>> --- a/arch/arm64/boot/dts/exynos/exynos7.dtsi
>> +++ b/arch/arm64/boot/dts/exynos/exynos7.dtsi
>> @@ -472,6 +472,14 @@
>>   			status = "disabled";
>>   		};
>>
>> +		arm-pmu {
>> +			compatible = "arm,cortex-a57-pmu", "arm,armv8-pmuv3";
>> +			interrupts = <GIC_SPI 56 IRQ_TYPE_LEVEL_HIGH>,
>> +				     <GIC_SPI 57 IRQ_TYPE_LEVEL_HIGH>,
>> +				     <GIC_SPI 58 IRQ_TYPE_LEVEL_HIGH>,
>> +				     <GIC_SPI 59 IRQ_TYPE_LEVEL_HIGH>;
>
> Per Documentation/devicetree/bindings/arm/pmu.txt there should also be
> an "interrupt-affinity" property describing which SPI belongs to which core.
>
Thanks for review, will resend after adding "interrupt-affinity" property.

> Robin.
>
>> +		};
>> +
>>   		timer {
>>   			compatible = "arm,armv8-timer";
>>   			interrupts = <GIC_PPI 13
>>
>
>
>
>

^ permalink raw reply

* [PATCH v7 00/16] ACPI IORT ARM SMMU support
From: Hanjun Guo @ 2016-11-11 12:44 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20161109141948.19244-1-lorenzo.pieralisi@arm.com>

On 11/09/2016 10:19 PM, Lorenzo Pieralisi wrote:
> This patch series is v7 of a previous posting:
>
> https://lkml.org/lkml/2016/10/18/506
>
> v6 -> v7
> 	- Rebased against v4.9-rc4
> 	- Fixed IORT probing on ACPI systems with missing IORT table
> 	- Fixed SMMUv1/v2 global interrupt detection
> 	- Updated iommu_ops firmware look-up

Although no major update for the new version, I tested
this patchset again on hisilicon d03, and the platform
device works as before with SMMUv3.

Thanks
Hanjun

^ permalink raw reply

* [PATCH v2 2/4] dt-bindings: Add TI SCI PM Domains
From: Ulf Hansson @ 2016-11-11 12:34 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <90e91588-d2fc-19be-5a28-63c801a8d061@ti.com>

On 10 November 2016 at 20:56, Dave Gerlach <d-gerlach@ti.com> wrote:
> Rob, Ulf, Jon,
>
> On 10/27/2016 08:15 AM, Dave Gerlach wrote:
>>
>> +Jon
>> On 10/26/2016 04:59 PM, Rob Herring wrote:
>>>
>>> On Mon, Oct 24, 2016 at 12:00 PM, Kevin Hilman <khilman@baylibre.com>
>>> wrote:
>>>>
>>>> Dave Gerlach <d-gerlach@ti.com> writes:
>>>>
>>>>> Hi,
>>>>> On 10/21/2016 01:48 PM, Kevin Hilman wrote:
>>>>>>
>>>>>> Dave Gerlach <d-gerlach@ti.com> writes:
>>>>>>
>>>>>>> Add a generic power domain implementation, TI SCI PM Domains, that
>>>>>>> will hook into the genpd framework and allow the TI SCI protocol to
>>>>>>> control device power states.
>>>>>>>
>>>>>>> Also, provide macros representing each device index as understood
>>>>>>> by TI SCI to be used in the device node power-domain references.
>>>>>>> These are identifiers for the K2G devices managed by the PMMC.
>>>>>>>
>>>>>>> Signed-off-by: Nishanth Menon <nm@ti.com>
>>>>>>> Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
>>>>>>> ---
>>>>>>>  .../devicetree/bindings/soc/ti/sci-pm-domain.txt   | 54
>>>>>>> +++++++++++++
>>>>>>>  MAINTAINERS                                        |  2 +
>>>>>>>  include/dt-bindings/genpd/k2g.h                    | 90
>>>>>>> ++++++++++++++++++++++
>>>>>>>  3 files changed, 146 insertions(+)
>>>>>>>  create mode 100644
>>>>>>> Documentation/devicetree/bindings/soc/ti/sci-pm-domain.txt
>>>>>>>  create mode 100644 include/dt-bindings/genpd/k2g.h
>>>>>>>
>>>>>>> diff --git
>>>>>>> a/Documentation/devicetree/bindings/soc/ti/sci-pm-domain.txt
>>>>>>> b/Documentation/devicetree/bindings/soc/ti/sci-pm-domain.txt
>>>>>>> new file mode 100644
>>>>>>> index 000000000000..32f38a349656
>>>>>>> --- /dev/null
>>>>>>> +++ b/Documentation/devicetree/bindings/soc/ti/sci-pm-domain.txt
>>>>>>> @@ -0,0 +1,54 @@
>>>>>>> +Texas Instruments TI-SCI Generic Power Domain
>>>>>>> +---------------------------------------------
>>>>>>> +
>>>>>>> +Some TI SoCs contain a system controller (like the PMMC, etc...)
>>>>>>> that is
>>>>>>> +responsible for controlling the state of the IPs that are present.
>>>>>>> +Communication between the host processor running an OS and the
>>>>>>> system
>>>>>>> +controller happens through a protocol known as TI-SCI [1]. This pm
>>>>>>> domain
>>>>>>> +implementation plugs into the generic pm domain framework and makes
>>>>>>> use of
>>>>>>> +the TI SCI protocol power on and off each device when needed.
>>>>>>> +
>>>>>>> +[1] Documentation/devicetree/bindings/arm/keystone/ti,sci.txt
>>>>>>> +
>>>>>>> +PM Domain Node
>>>>>>> +==============
>>>>>>> +The PM domain node represents the global PM domain managed by the
>>>>>>> PMMC,
>>>>>>> +which in this case is the single implementation as documented by the
>>>>>>> generic
>>>>>>> +PM domain bindings in
>>>>>>> Documentation/devicetree/bindings/power/power_domain.txt.
>>>>>>> +
>>>>>>> +Required Properties:
>>>>>>> +--------------------
>>>>>>> +- compatible: should be "ti,sci-pm-domain"
>>>>>>> +- #power-domain-cells: Must be 0.
>>>>>>> +- ti,sci: Phandle to the TI SCI device to use for managing the
>>>>>>> devices.
>>>>>>>
>>>>>>> +Example:
>>>>>>> +--------------------
>>>>>>> +k2g_pds: k2g_pds {
>>>>>>
>>>>>>
>>>>>> should use generic name like "power-contoller", e.g. k2g_pds:
>>>>>> power-controller
>>>>>
>>>>>
>>>>> Ok, that makes more sense.
>>>>>
>>>>>>
>>>>>>> +        compatible = "ti,sci-pm-domain";
>>>>>>> +        #power-domain-cells = <0>;
>>>>>>> +        ti,sci = <&pmmc>;
>>>>>>> +};
>>>>>>> +
>>>>>>> +PM Domain Consumers
>>>>>>> +===================
>>>>>>> +Hardware blocks that require SCI control over their state must
>>>>>>> provide
>>>>>>> +a reference to the sci-pm-domain they are part of and a unique
>>>>>>> device
>>>>>>> +specific ID that identifies the device.
>>>>>>> +
>>>>>>> +Required Properties:
>>>>>>> +--------------------
>>>>>>> +- power-domains: phandle pointing to the corresponding PM domain
>>>>>>> node.
>>>>>>> +- ti,sci-id: index representing the device id to be passed oevr SCI
>>>>>>> to
>>>>>>> +        be used for device control.
>>>>>>
>>>>>>
>>>>>> This ID doesn't look right.
>>>>>>
>>>>>> Why not use #power-domain-cells = <1> and pass the index in the DT?
>>>>>> ...
>>>
>>>
>>> Exactly. ti,sci-id is a NAK for me.
>>
>>
>> I was told not to use the onecell during v1 discussion. I agree this would
>> be
>> ideal but I cannot due to what the bindings represent, the phandle
>> parameter is
>> an index into a list of genpds, whereas we need an actual ID number we can
>> use
>> and I do not have the ability to get that from the phandle.
>>
>> @Ulf/Jon, is there any hope of bringing back custom xlate functions for
>> genpd
>> providers? I don't have a good background on why it was even removed. I
>> can
>> maintain a single genpd for all devices but I need a way to parse this ID,
>> whether it's from a separate property or a phandle. It is locked now to
>> indexing
>> into a list of genpds but I need additional per device information for
>> devices
>> bound to a genpd and I need either a custom parameter or the ability to
>> parse
>> the phandle myself.
>>
>
> Any comments here? The meaning of the phandle onecell is fixed in the genpd
> framework so I'm not sure how we want to move forward with this, I need to
> pass a power domain ID to the genpd driver, and if this shouldn't be a new
> property I'm not sure what direction we should take.
>
> Regards,
> Dave
>
>
>>>
>>>>>>
>>>>>>> +See dt-bindings/genpd/k2g.h for the list of valid identifiers for
>>>>>>> k2g.
>>>>>>> +
>>>>>>> +Example:
>>>>>>> +--------------------
>>>>>>> +uart0: serial at 02530c00 {
>>>>>>> +   compatible = "ns16550a";
>>>>>>> +   ...
>>>>>>> +   power-domains = <&k2g_pds>;
>>>>>>> +   ti,sci-id = <K2G_DEV_UART0>;
>>>>>>
>>>>>>
>>>>>> ... like this:
>>>>>>
>>>>>>      power-domains = <&k2g_pds K2G_DEV_UART0>;
>>>>>
>>>>>
>>>>> That's how I did it in version one actually. I was able to define my
>>>>> own xlate function to parse the phandle and get that index, but Ulf
>>>>> pointed me to this series by Jon Hunter [1] that simplified genpd
>>>>> providers and dropped the concept of adding your own xlate. This locks
>>>>> the onecell approach to using a fixed static array of genpds that get
>>>>> indexed into (without passing the index to the provider, just the
>>>>> genpd that's looked up), which doesn't fit our usecase, as we don't
>>>>> want a 1 to 1 genpd to device mapping based on the comments provided
>>>>> in v1. Now we just use the genpd device attach/detach hooks to parse
>>>>> the sci-id and then use it in the genpd device start/stop hooks.
>>>
>>>
>>> I have no idea what any of this means. All sounds like driver
>>> architecture, not anything to do with bindings.
>>
>>
>> This was a response to Kevin, not part of binding description.
>>
>>>
>>>>
>>>> Ah, right.  I remember now.  This approach allows you to use a single
>>>> genpd as discussed earlier.
>>>>
>>>> Makes sense now, suggestion retracted.
>>>
>>>
>>> IIRC, the bindings in Jon's case had a node for each domain and didn't
>>> need any additional property.
>>
>>
>> Yes but we only have one domain and index into it, not into a list of
>> domains,

Exactly. And this my main point as well. We are not talking about a
domain property but a device property.

>> so the additional property is solving a different problem.

Yes.

Perhaps you could try to elaborate about what the TI SCI ID really
represents for the device, as to help Rob understand the bigger
picture?

To me, the TI SCI ID, is similar to a "conid" for any another "device
resource" (like clock, pinctrl, regulator etc) which we can describe
in DT and assign to a device node. The only difference here, is that
we don't have common API to fetch the resource (like clk_get(),
regulator_get()), but instead we fetches the device's resource from
SoC specific code, via genpd's device ->attach() callback.

Hope that helps.

Kind regards
Uffe

^ permalink raw reply

* [PATCH v3] spi: spi-fsl-dspi: Add DMA support for Vybrid
From: Mark Brown @ 2016-11-11 12:20 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <836dd6a84fa149cafdbb0b53e792d305febf6207.1478778329.git.maitysanchayan@gmail.com>

On Thu, Nov 10, 2016 at 05:49:15PM +0530, Sanchayan Maity wrote:

A couple of small things, please send followup patches fixing them.

> +	rx_word = is_double_byte_mode(dspi);
> +
> +	len = rx_word ? (dma->curr_xfer_len / 2) : dma->curr_xfer_len;

Please use normal if statements, they're much easier to read.

> +err_slave_config:
> +	devm_kfree(dev, dma->rx_dma_buf);
> +err_rx_dma_buf:
> +	devm_kfree(dev, dma->tx_dma_buf);

You really shouldn't need to explicitly free things like this if you're
using devm_, especially in the error path from the probe function like
this where a failure is just going to result in the device failing to
instantiate so you won't have the allocation sitting around unused for
any length of time.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 455 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20161111/6447812f/attachment.sig>

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox