Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [RFC PATCH 6/6] arm/xen: Assign xen-virtio DMA ops for virtio devices in Xen guests
From: Oleksandr @ 2022-04-20  9:00 UTC (permalink / raw)
  To: Stefano Stabellini, Juergen Gross
  Cc: Christoph Hellwig, xen-devel, linux-kernel, linux-arm-kernel,
	Oleksandr Tyshchenko, Boris Ostrovsky, Julien Grall,
	Michael S. Tsirkin
In-Reply-To: <alpine.DEB.2.22.394.2204191717020.915916@ubuntu-linux-20-04-desktop>


Hello Stefano, Juergen


On 20.04.22 03:23, Stefano Stabellini wrote:
> On Tue, 19 Apr 2022, Oleksandr wrote:
>> On 19.04.22 17:48, Juergen Gross wrote:
>>> On 19.04.22 14:17, Oleksandr wrote:
>>>> Hello Stefano, Juergen
>>>>
>>>>
>>>> On 18.04.22 22:11, Stefano Stabellini wrote:
>>>>> On Mon, 18 Apr 2022, Oleksandr wrote:
>>>>>> On 16.04.22 09:07, Christoph Hellwig wrote:
>>>>>>
>>>>>> Hello Christoph
>>>>>>
>>>>>>> On Fri, Apr 15, 2022 at 03:02:45PM -0700, Stefano Stabellini wrote:
>>>>>>>> This makes sense overall. Considering that the swiotlb-xen case
>>>>>>>> and the
>>>>>>>> virtio case are mutually exclusive, I would write it like this:
>>>>>>> Curious question:  Why can't the same grant scheme also be used for
>>>>>>> non-virtio devices?  I really hate having virtio hooks in the arch
>>>>>>> dma code.  Why can't Xen just say in DT/ACPI that grants can be used
>>>>>>> for a given device?
>>>>> [...]
>>>>>
>>>>>> This patch series tries to make things work with "virtio" devices in
>>>>>> Xen
>>>>>> system without introducing any modifications to code under
>>>>>> drivers/virtio.
>>>>> Actually, I think Christoph has a point.
>>>>>
>>>>> There is nothing inherently virtio specific in this patch series or in
>>>>> the "xen,dev-domid" device tree binding.
>>>>
>>>> Although the main intention of this series was to enable using virtio
>>>> devices in Xen guests, I agree that nothing in new DMA ops layer
>>>> (xen-virtio.c) is virtio specific (at least at the moment). Regarding the
>>>> whole patch series I am not quite sure, as it uses
>>>> arch_has_restricted_virtio_memory_access(). >
>>>>>    Assuming a given device is
>>>>> emulated by a Xen backend, it could be used with grants as well.
>>>>>
>>>>> For instance, we could provide an emulated e1000 NIC with a
>>>>> "xen,dev-domid" property in device tree. Linux could use grants with it
>>>>> and the backend could map the grants. It would work the same way as
>>>>> virtio-net/block/etc. Passthrough devices wouldn't have the
>>>>> "xen,dev-domid" property, so no problems.
>>>>>
>>>>> So I think we could easily generalize this work and expand it to any
>>>>> device. We just need to hook on the "xen,dev-domid" device tree
>>>>> property.
>>>>>
>>>>> I think it is just a matter of:
>>>>> - remove the "virtio,mmio" check from xen_is_virtio_device
>>>>> - rename xen_is_virtio_device to something more generic, like
>>>>>     xen_is_grants_device
>>> xen_is_grants_dma_device, please. Normal Xen PV devices are covered by
>>> grants, too, and I'd like to avoid the confusion arising from this.
>>
>> yes, this definitely makes sense as we need to distinguish
>>
>>
>>>
>>>>> - rename xen_virtio_setup_dma_ops to something more generic, like
>>>>>     xen_grants_setup_dma_ops
>>>>>
>>>>> And that's pretty much it.
>>>> + likely renaming everything in that patch series not to mention virtio
>>>> (mostly related to xen-virtio.c internals).
>>>>
>>>>
>>>> Stefano, thank you for clarifying Christoph's point.
>>>>
>>>> Well, I am not against going this direction. Could we please make a
>>>> decision on this? @Juergen, what is your opinion?
>>> Yes, why not.
>>
>> ok, thank you for confirming.
>>
>>
>>>
>>> Maybe rename xen-virtio.c to grant-dma.c?
>>
>> Personally I don't mind.
>>
>>
>>> I'd keep the XEN_VIRTIO related config option, as this will be the normal
>>> use
>>> case. grant-dma.c should be covered by a new hidden config option
>>> XEN_GRANT_DMA
>>> selected by XEN_VIRTIO.
>>
>> I got it, ok
>>
>>
>>>
>>> CONFIG_XEN_VIRTIO should still guard
>>> xen_has_restricted_virtio_memory_access().
>>
>> ok
>>
>>
>> So a few questions to clarify:
>>
>> 1. What is the best place to keep "xen,dev-domid" binding's description now? I
>> think that proposed in current series place
>> (Documentation/devicetree/bindings/virtio/) is not good fit now.
> I would probably add it to the existing
> Documentation/devicetree/bindings/arm/xen.txt.
>
>
>> 2. I assume the logic in the current patch will remain the same, I mean we
>> will still assign Xen grant DMA ops from xen_setup_dma_ops() here?
> Yes I think so


Stefano, thank you for clarifying!


Regarding new naming scheme...

As there is an existing Kconfig option XEN_GRANT_DMA_ALLOC used for 
different purpose, we need to clarify naming scheme here a bit to avoid 
possible confusion.

For example, I am happy with proposed by Juergen ...

... Kconfig option: XEN_GRANT_DMA_OPS

and

... file: grant-dma-ops.c


-- 
Regards,

Oleksandr Tyshchenko


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [next] arm: boot failed - PC is at cpu_ca15_set_pte_ext
From: Naresh Kamboju @ 2022-04-20  8:55 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Linux ARM, open list, Linux-Next Mailing List, lkft-triage,
	Stephen Rothwell, Arnd Bergmann, Ard Biesheuvel, Andrew Morton,
	max.krummenacher, Shawn Guo, Stefano Stabellini,
	Christoph Hellwig, Konrad Rzeszutek Wilk, Eric W. Biederman
In-Reply-To: <Yl8GInPZyl2PqK7D@shell.armlinux.org.uk>

On Wed, 20 Apr 2022 at 00:28, Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Tue, Apr 19, 2022 at 04:28:52PM +0530, Naresh Kamboju wrote:
> > Linux next 20220419 boot failed on arm architecture qemu_arm and BeagleBoard
> > x15 device.
>
> Was the immediately previous linux-next behaving correctly?

This crash started happening from the next-20220413 tag.

- Naresh

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [PATCH v3 3/4] staging: media: Add support for the Allwinner A31 ISP
From: Dan Carpenter @ 2022-04-20  8:53 UTC (permalink / raw)
  To: Paul Kocialkowski
  Cc: linux-media, devicetree, linux-arm-kernel, linux-sunxi,
	linux-kernel, linux-staging, Mauro Carvalho Chehab, Rob Herring,
	Krzysztof Kozlowski, Chen-Yu Tsai, Jernej Skrabec, Samuel Holland,
	Sakari Ailus, Hans Verkuil, Laurent Pinchart, Maxime Ripard,
	Thomas Petazzoni
In-Reply-To: <20220420085132.GC2951@kadam>

On Wed, Apr 20, 2022 at 11:51:33AM +0300, Dan Carpenter wrote:
> On Wed, Apr 20, 2022 at 09:50:42AM +0200, Paul Kocialkowski wrote:
> > Hi Dan,
> > 
> > On Wed 20 Apr 22, 10:42, Dan Carpenter wrote:
> > > I ran Smatch on this patch.
> > 
> > Thanks for doing this!
> > 
> > > On Fri, Apr 15, 2022 at 05:37:07PM +0200, Paul Kocialkowski wrote:
> > > > +void sun6i_isp_capture_configure(struct sun6i_isp_device *isp_dev)
> > > > +{
> > > > +	unsigned int width, height;
> > > > +	unsigned int stride_luma, stride_chroma = 0;
> > > > +	unsigned int stride_luma_div4, stride_chroma_div4;
> > > > +	const struct sun6i_isp_capture_format *format;
> > > > +	const struct v4l2_format_info *info;
> > > > +	u32 pixelformat;
> > > > +
> > > > +	sun6i_isp_capture_dimensions(isp_dev, &width, &height);
> > > > +	sun6i_isp_capture_format(isp_dev, &pixelformat);
> > > > +
> > > > +	format = sun6i_isp_capture_format_find(pixelformat);
> > > > +	if (WARN_ON(!format))
> > > > +		return;
> > > > +
> > > > +	sun6i_isp_load_write(isp_dev, SUN6I_ISP_MCH_SIZE_CFG_REG,
> > > > +			     SUN6I_ISP_MCH_SIZE_CFG_WIDTH(width) |
> > > > +			     SUN6I_ISP_MCH_SIZE_CFG_HEIGHT(height));
> > > > +
> > > > +	info = v4l2_format_info(pixelformat);
> > > > +	if (WARN_ON(!info))
> > > > +		return;
> > > > +
> > > > +	stride_luma = width * info->bpp[0];
> > > > +	stride_luma_div4 = DIV_ROUND_UP(stride_luma, 4);
> > > > +
> > > > +	if (info->comp_planes > 1) {
> > > > +		stride_chroma = width * info->bpp[1] / info->hdiv;
> > > > +		stride_chroma_div4 = DIV_ROUND_UP(stride_chroma, 4);
> > > 
> > > stride_chroma_div4 is not intialized on the else path.
> > 
> > One could say it's not an issue to put an uninitialized value in this situation
> > since the hardware won't be taking it into account but I'll initialize the value
> > early in the next iteration.
> > 
> 
> My understanding is that it will trigger a KASAN warning at run time.

Or is it KMEMsan?  One of those.  The one that complains about
uninitialized memory.

regards,
dan carpenter


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [next] arm: boot failed - PC is at cpu_ca15_set_pte_ext
From: Naresh Kamboju @ 2022-04-20  8:54 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux ARM, open list, Linux-Next Mailing List, lkft-triage,
	Stephen Rothwell, Russell King - ARM Linux, Arnd Bergmann,
	Andrew Morton, max.krummenacher, Shawn Guo, Stefano Stabellini,
	Christoph Hellwig, Konrad Rzeszutek Wilk, Eric W. Biederman
In-Reply-To: <CAMj1kXFKzi14UCoiDOMwS5jyNz61_UzxGXm+ke0EWEt4nn6E1g@mail.gmail.com>

On Wed, 20 Apr 2022 at 13:01, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Tue, 19 Apr 2022 at 12:59, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> >
> > Linux next 20220419 boot failed on arm architecture qemu_arm and BeagleBoard
> > x15 device.
> >
> > kernel crash log from x15:
> > -----------------
> > [    6.866516] 8<--- cut here ---
> > [    6.869598] Unable to handle kernel paging request at virtual
> > address f000e62c
> > [    6.876861] [f000e62c] *pgd=82935811, *pte=00000000, *ppte=00000000
> > [    6.883209] Internal error: Oops: 807 [#3] SMP ARM
> > [    6.888000] Modules linked in:
> > [    6.891082] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G      D W
> >   5.18.0-rc3-next-20220419 #1
> > [    6.899993] Hardware name: Generic DRA74X (Flattened Device Tree)
> > [    6.906127] PC is at cpu_ca15_set_pte_ext+0x4c/0x58
> > [    6.911041] LR is at handle_mm_fault+0x60c/0xed0
> > [    6.915679] pc : [<c031f26c>]    lr : [<c04cfeb8>]    psr: 40000013
> > [    6.921966] sp : f000dde8  ip : f000de44  fp : a0000013
> > [    6.927215] r10: 00000000  r9 : 00000000  r8 : c1e95194
> > [    6.932464] r7 : c3c95000  r6 : befffff1  r5 : 00000081  r4 : c29d8000
> > [    6.939025] r3 : 00000000  r2 : 00000000  r1 : 00000040  r0 : f000de2c
> > [    6.945587] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> > [    6.952758] Control: 10c5387d  Table: 8020406a  DAC: 00000051
> > [    6.958526] Register r0 information: 2-page vmalloc region starting
> > at 0xf000c000 allocated at kernel_clone+0x94/0x3b0
> > [    6.969299] Register r1 information: non-paged memory
> > [    6.974365] Register r2 information: NULL pointer
> > [    6.979095] Register r3 information: NULL pointer
> > [    6.983825] Register r4 information: slab task_struct start
> > c29d8000 pointer offset 0
> > [    6.991729] Register r5 information: non-paged memory
> > [    6.996795] Register r6 information: non-paged memory
> > [    7.001861] Register r7 information: slab vm_area_struct start
> > c3c95000 pointer offset 0
> > [    7.010009] Register r8 information: non-slab/vmalloc memory
> > [    7.015716] Register r9 information: NULL pointer
> > [    7.020446] Register r10 information: NULL pointer
> > [    7.025238] Register r11 information: non-paged memory
> > [    7.030426] Register r12 information: 2-page vmalloc region
> > starting at 0xf000c000 allocated at kernel_clone+0x94/0x3b0
> > [    7.041259] Process swapper/0 (pid: 1, stack limit = 0xfaff0077)
> > [    7.047302] Stack: (0xf000dde8 to 0xf000e000)
> > [    7.051696] dde0:                   c29d8000 00000cc0 c20a1108
> > c2065fa0 c1e09f50 b6db6db7
> > [    7.059906] de00: c195bf0c 17c0f572 c29d8000 c3c95000 00000cc0
> > 000befff befff000 befffff1
> > [    7.068115] de20: 00000081 c3c3afb8 c3c3afb8 00000000 00000000
> > 00000000 00000000 00000000
> > [    7.076324] de40: 00000000 17c0f572 befff000 c3c95000 00002017
> > befffff1 00002017 00002fb8
> > [    7.084564] de60: c2d04000 00000081 c29d8000 c04c6790 c20d01d4
> > 00000000 00000001 c20ce440
> > [    7.092773] de80: c1e10bcc fffff000 00000000 c2a45680 eeb33cc0
> > c29d8000 00000000 c2d04000
> > [    7.100982] dea0: befffff1 f000df18 00000000 00002017 c20661a0
> > c04c77e8 f000df18 00000000
> > [    7.109222] dec0: 00000000 c1d95c40 00000002 c20661e0 00000000
> > 00000001 00000000 c04c7ad0
> > [    7.117431] dee0: 00000011 c2d02a00 00000001 befffff1 c29d8000
> > 00000000 00000011 c2a30010
> > [    7.125640] df00: c29d8000 c0524c24 f000df18 00000000 00000000
> > 2cd9e000 c1d95c40 17c0f572
> > [    7.133850] df20: 00000000 c2d02a00 0000000b 00000ffc 00000000
> > befffff1 00000000 c0524f74
> > [    7.142089] df40: c1e0e394 c2d02a00 c209a71c 38e38e39 c29d8000
> > bee00008 c2d02a00 c2a30000
> > [    7.150299] df60: c1e0e394 c1e0e420 00000000 00000000 00000000
> > c05266bc c209a000 c1944c60
> > [    7.158508] df80: 00000000 00000000 00000000 c129d2b4 c209a000
> > c1e0e394 00000000 c12b5600
> > [    7.166748] dfa0: 00000000 c12b5518 00000000 c0300168 00000000
> > 00000000 00000000 00000000
> > [    7.174957] dfc0: 00000000 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [    7.183166] dfe0: 00000000 00000000 00000000 00000000 00000013
> > 00000000 00000000 00000000
> > [    7.191406] Code: 13110001 12211b02 13110b02 03a03000 (e5a03800)
>
> This decodes to
>
>    0: 13110001 tstne r1, #1
>    4: 12211b02 eorne r1, r1, #2048 ; 0x800
>    8: 13110b02 tstne r1, #2048 ; 0x800
>    c: 03a03000 moveq r3, #0
>   10:* e5a03800 str r3, [r0, #2048]! ; 0x800 <-- trapping instruction
>
> and R0 points into the stack. So we are updating a PTE that is located
> on the stack rather than in a page table somewhere, which seems very
> odd. However, this could be a latent bug that got uncovered by the
> VMAP stacks changes.
>
> Unfortunately, the vmlinux.xz file I downloaded from the link below
> seems to be different from the one that produced the crash, given that
> the LR address of c04cfeb8 does not seem to correspond with
> handle_mm_fault+0x60c/0xed0.
> Can you please double check the artifacts?

You can find the vmlinux.xz for the trace log I have pasted.

vmlinux.xz : https://builds.tuxbuild.com/280TS8MuM6sYWk5aUtrvWIw0RQ7/vmlinux.xz
artifact-location: https://builds.tuxbuild.com/280TS8MuM6sYWk5aUtrvWIw0RQ7

- Naresh

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [PATCH 2/2] arm64: dts: broadcom: align SPI NOR node name with dtschema
From: Krzysztof Kozlowski @ 2022-04-20  8:52 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Scott Branden, Krzysztof Kozlowski, Ray Jui, devicetree,
	linux-kernel, Rob Herring, linux-arm-kernel,
	bcm-kernel-feedback-list
In-Reply-To: <20220407185710.2576287-1-f.fainelli@gmail.com>

On 07/04/2022 20:57, Florian Fainelli wrote:
> On Thu,  7 Apr 2022 16:32:11 +0200, Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> wrote:
>> The node names should be generic and SPI NOR dtschema expects "flash".
>>
>> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
>> ---
> 
> Applied to https://github.com/Broadcom/stblinux/commits/devicetree-arm64/next, thanks!

Thanks Florian. It seems that patch is still not in linux-next. Is your
tree included in the linux-next?

Best regards,
Krzysztof

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [PATCH v3 3/4] staging: media: Add support for the Allwinner A31 ISP
From: Dan Carpenter @ 2022-04-20  8:51 UTC (permalink / raw)
  To: Paul Kocialkowski
  Cc: linux-media, devicetree, linux-arm-kernel, linux-sunxi,
	linux-kernel, linux-staging, Mauro Carvalho Chehab, Rob Herring,
	Krzysztof Kozlowski, Chen-Yu Tsai, Jernej Skrabec, Samuel Holland,
	Sakari Ailus, Hans Verkuil, Laurent Pinchart, Maxime Ripard,
	Thomas Petazzoni
In-Reply-To: <Yl+7UrQFyLvfKRdG@aptenodytes>

On Wed, Apr 20, 2022 at 09:50:42AM +0200, Paul Kocialkowski wrote:
> Hi Dan,
> 
> On Wed 20 Apr 22, 10:42, Dan Carpenter wrote:
> > I ran Smatch on this patch.
> 
> Thanks for doing this!
> 
> > On Fri, Apr 15, 2022 at 05:37:07PM +0200, Paul Kocialkowski wrote:
> > > +void sun6i_isp_capture_configure(struct sun6i_isp_device *isp_dev)
> > > +{
> > > +	unsigned int width, height;
> > > +	unsigned int stride_luma, stride_chroma = 0;
> > > +	unsigned int stride_luma_div4, stride_chroma_div4;
> > > +	const struct sun6i_isp_capture_format *format;
> > > +	const struct v4l2_format_info *info;
> > > +	u32 pixelformat;
> > > +
> > > +	sun6i_isp_capture_dimensions(isp_dev, &width, &height);
> > > +	sun6i_isp_capture_format(isp_dev, &pixelformat);
> > > +
> > > +	format = sun6i_isp_capture_format_find(pixelformat);
> > > +	if (WARN_ON(!format))
> > > +		return;
> > > +
> > > +	sun6i_isp_load_write(isp_dev, SUN6I_ISP_MCH_SIZE_CFG_REG,
> > > +			     SUN6I_ISP_MCH_SIZE_CFG_WIDTH(width) |
> > > +			     SUN6I_ISP_MCH_SIZE_CFG_HEIGHT(height));
> > > +
> > > +	info = v4l2_format_info(pixelformat);
> > > +	if (WARN_ON(!info))
> > > +		return;
> > > +
> > > +	stride_luma = width * info->bpp[0];
> > > +	stride_luma_div4 = DIV_ROUND_UP(stride_luma, 4);
> > > +
> > > +	if (info->comp_planes > 1) {
> > > +		stride_chroma = width * info->bpp[1] / info->hdiv;
> > > +		stride_chroma_div4 = DIV_ROUND_UP(stride_chroma, 4);
> > 
> > stride_chroma_div4 is not intialized on the else path.
> 
> One could say it's not an issue to put an uninitialized value in this situation
> since the hardware won't be taking it into account but I'll initialize the value
> early in the next iteration.
> 

My understanding is that it will trigger a KASAN warning at run time.

regards,
dan carpenter


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [PATCH v5 5/5] perf mem: Support mem_lvl_num in c2c command
From: Leo Yan @ 2022-04-20  8:48 UTC (permalink / raw)
  To: German Gomez
  Cc: Ali Saidi, linux-kernel, linux-perf-users, linux-arm-kernel, acme,
	benh, Nick.Forrington, alexander.shishkin, andrew.kilroy,
	james.clark, john.garry, jolsa, kjain, lihuafei1, mark.rutland,
	mathieu.poirier, mingo, namhyung, peterz, will
In-Reply-To: <3ad0128a-1b3d-37a7-81f6-fd597b565b18@arm.com>

On Mon, Apr 11, 2022 at 11:04:28AM +0100, German Gomez wrote:
> 
> On 08/04/2022 20:53, Ali Saidi wrote:
> > In addition to summarizing data encoded in mem_lvl also support data
> > encoded in mem_lvl_num.
> >
> > Since other architectures don't seem to populate the mem_lvl_num field
> > here there shouldn't be a change in functionality.
> >
> > Signed-off-by: Ali Saidi <alisaidi@amazon.com>
> > ---
> >  tools/perf/util/mem-events.c | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
> > index ed0ab838bcc5..e5e405185498 100644
> > --- a/tools/perf/util/mem-events.c
> > +++ b/tools/perf/util/mem-events.c
> > @@ -485,6 +485,7 @@ int c2c_decode_stats(struct c2c_stats *stats, struct mem_info *mi)
> >  	u64 daddr  = mi->daddr.addr;
> >  	u64 op     = data_src->mem_op;
> >  	u64 lvl    = data_src->mem_lvl;
> > +	u64 lnum   = data_src->mem_lvl_num;
> >  	u64 snoop  = data_src->mem_snoop;
> >  	u64 lock   = data_src->mem_lock;
> >  	u64 blk    = data_src->mem_blk;
> > @@ -527,16 +528,18 @@ do {				\
> >  			if (lvl & P(LVL, UNC)) stats->ld_uncache++;
> >  			if (lvl & P(LVL, IO))  stats->ld_io++;
> >  			if (lvl & P(LVL, LFB)) stats->ld_fbhit++;
> 
> Just for completion, can we also handle LFB as it seems to be being set
> in "/arch/x86/events/intel/ds.c"? (Sorry I missed this in the v4)

With fixing LFB issue pointed by German, the change looks good to me:

Reviewed-by: Leo Yan <leo.yan@linaro.org>

It would be appreciate if x86 or PowerPC maintainers could take a look
for this patch.  Thanks!

Leo

> > -			if (lvl & P(LVL, L1 )) stats->ld_l1hit++;
> > -			if (lvl & P(LVL, L2 )) stats->ld_l2hit++;
> > -			if (lvl & P(LVL, L3 )) {
> > +			if (lvl & P(LVL, L1) || lnum == P(LVLNUM, L1))
> > +				stats->ld_l1hit++;
> > +			if (lvl & P(LVL, L2) || lnum == P(LVLNUM, L2))
> > +				stats->ld_l2hit++;
> > +			if (lvl & P(LVL, L3) || lnum == P(LVLNUM, L3)) {
> >  				if (snoop & P(SNOOP, HITM))
> >  					HITM_INC(lcl_hitm);
> >  				else
> >  					stats->ld_llchit++;
> >  			}
> >  
> > -			if (lvl & P(LVL, LOC_RAM)) {
> > +			if (lvl & P(LVL, LOC_RAM) || lnum == P(LVLNUM, RAM)) {
> >  				stats->lcl_dram++;
> >  				if (snoop & P(SNOOP, HIT))
> >  					stats->ld_shared++;

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [PATCH v5 4/5] perf arm-spe: Use SPE data source for neoverse cores
From: Leo Yan @ 2022-04-20  8:42 UTC (permalink / raw)
  To: Ali Saidi
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, german.gomez,
	acme, benh, Nick.Forrington, alexander.shishkin, andrew.kilroy,
	james.clark, john.garry, jolsa, kjain, lihuafei1, mark.rutland,
	mathieu.poirier, mingo, namhyung, peterz, will
In-Reply-To: <20220408195344.32764-5-alisaidi@amazon.com>

On Fri, Apr 08, 2022 at 07:53:43PM +0000, Ali Saidi wrote:
> When synthesizing data from SPE, augment the type with source information
> for Arm Neoverse cores. The field is IMPLDEF but the Neoverse cores all use
> the same encoding. I can't find encoding information for any other SPE
> implementations to unify their choices with Arm's thus that is left for
> future work.
> 
> This change populates the mem_lvl_num for Neoverse cores instead of the
> deprecated mem_lvl namespace.
> 
> Signed-off-by: Ali Saidi <alisaidi@amazon.com>
> ---
>  .../util/arm-spe-decoder/arm-spe-decoder.c    |   1 +
>  .../util/arm-spe-decoder/arm-spe-decoder.h    |  12 ++
>  tools/perf/util/arm-spe.c                     | 127 ++++++++++++++++--
>  3 files changed, 126 insertions(+), 14 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index 5e390a1a79ab..091987dd3966 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -220,6 +220,7 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
>  
>  			break;
>  		case ARM_SPE_DATA_SOURCE:
> +			decoder->record.source = payload;
>  			break;
>  		case ARM_SPE_BAD:
>  			break;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> index 69b31084d6be..46a61df1145b 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> @@ -29,6 +29,17 @@ enum arm_spe_op_type {
>  	ARM_SPE_ST		= 1 << 1,
>  };
>  
> +enum arm_spe_neoverse_data_source {
> +	ARM_SPE_NV_L1D		 = 0x0,
> +	ARM_SPE_NV_L2		 = 0x8,
> +	ARM_SPE_NV_PEER_CORE	 = 0x9,
> +	ARM_SPE_NV_LOCAL_CLUSTER = 0xa,
> +	ARM_SPE_NV_SYS_CACHE	 = 0xb,
> +	ARM_SPE_NV_PEER_CLUSTER	 = 0xc,
> +	ARM_SPE_NV_REMOTE	 = 0xd,
> +	ARM_SPE_NV_DRAM		 = 0xe,
> +};
> +
>  struct arm_spe_record {
>  	enum arm_spe_sample_type type;
>  	int err;
> @@ -40,6 +51,7 @@ struct arm_spe_record {
>  	u64 virt_addr;
>  	u64 phys_addr;
>  	u64 context_id;
> +	u16 source;
>  };
>  
>  struct arm_spe_insn;
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index d2b64e3f588b..a20285cf98e3 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -34,6 +34,7 @@
>  #include "arm-spe-decoder/arm-spe-decoder.h"
>  #include "arm-spe-decoder/arm-spe-pkt-decoder.h"
>  
> +#include "../../arch/arm64/include/asm/cputype.h"
>  #define MAX_TIMESTAMP (~0ULL)
>  
>  struct arm_spe {
> @@ -45,6 +46,7 @@ struct arm_spe {
>  	struct perf_session		*session;
>  	struct machine			*machine;
>  	u32				pmu_type;
> +	u64				midr;
>  
>  	struct perf_tsc_conversion	tc;
>  
> @@ -399,33 +401,127 @@ static bool arm_spe__is_memory_event(enum arm_spe_sample_type type)
>  	return false;
>  }
>  
> -static u64 arm_spe__synth_data_source(const struct arm_spe_record *record)
> +static const struct midr_range neoverse_spe[] = {
> +	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
> +	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
> +	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
> +	{},
> +};
> +
> +
> +static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *record,
> +						union perf_mem_data_src *data_src)
>  {
> -	union perf_mem_data_src	data_src = { 0 };
> +	/*
> +	 * Even though four levels of cache hierarchy are possible, no known
> +	 * production Neoverse systems currently include more than three levels
> +	 * so for the time being we assume three exist. If a production system
> +	 * is built with four the this function would have to be changed to
> +	 * detect the number of levels for reporting.
> +	 */
>  
> -	if (record->op == ARM_SPE_LD)
> -		data_src.mem_op = PERF_MEM_OP_LOAD;
> -	else
> -		data_src.mem_op = PERF_MEM_OP_STORE;
> +	/*
> +	 * We have no data on the hit level or data source for stores in the
> +	 * Neoverse SPE records.
> +	 */
> +	if (record->op & ARM_SPE_ST) {
> +		data_src->mem_lvl = PERF_MEM_LVL_NA;
> +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_ANY_CACHE;
> +		data_src->mem_snoop = PERF_MEM_SNOOP_NA;
> +		return;
> +	}
> +
> +

Redundant new line.

> +	switch (record->source) {
> +	case ARM_SPE_NV_L1D:
> +		data_src->mem_lvl = PERF_MEM_LVL_HIT;
> +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_L1;
> +		data_src->mem_snoop = PERF_MEM_SNOOP_NONE;
> +		break;
> +	case ARM_SPE_NV_L2:
> +		data_src->mem_lvl = PERF_MEM_LVL_HIT;
> +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_L2;
> +		data_src->mem_snoop = PERF_MEM_SNOOP_NONE;
> +		break;
> +	case ARM_SPE_NV_PEER_CORE:
> +		data_src->mem_lvl = PERF_MEM_LVL_HIT;
> +		data_src->mem_snoopx = PERF_MEM_SNOOPX_PEER;
> +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_L2;
> +		break;
> +	/*
> +	 * We don't know if this is L1, L2 but we do know it was a cache-2-cache
> +	 * transfer, so set SNOOPX_PEER
> +	 */
> +	case ARM_SPE_NV_LOCAL_CLUSTER:
> +	case ARM_SPE_NV_PEER_CLUSTER:
> +		data_src->mem_lvl = PERF_MEM_LVL_HIT;
> +		data_src->mem_snoopx = PERF_MEM_SNOOPX_PEER;

As a side topic, it's better to use a new patch to dump snooping flag
PERF_MEM_SNOOPX_PEER, some code like below:

diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index f8f234251f92..66d44280a4ea 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -410,6 +410,11 @@ static const char * const snoop_access[] = {
 	"HitM",
 };
 
+static const char * const snoopx_access[] = {
+	"Fwd",
+	"Peer",
+};
+
 int perf_mem__snp_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
 {
 	size_t i, l = 0;
@@ -430,13 +435,18 @@ int perf_mem__snp_scnprintf(char *out, size_t sz, struct mem_info *mem_info)
 		}
 		l += scnprintf(out + l, sz - l, snoop_access[i]);
 	}
-	if (mem_info &&
-	     (mem_info->data_src.mem_snoopx & PERF_MEM_SNOOPX_FWD)) {
+
+	if (mem_info)
+		m = mem_info->data_src.mem_snoopx;
+
+	for (i = 0; m && i < ARRAY_SIZE(snoopx_access); i++, m >>= 1) {
+		if (!(m & 0x1))
+			continue;
 		if (l) {
 			strcat(out, " or ");
 			l += 4;
 		}
-		l += scnprintf(out + l, sz - l, "Fwd");
+		l += scnprintf(out + l, sz - l, snoopx_access[i]);
 	}
 

> +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_L3;
> +		break;
> +	/*
> +	 * System cache is assumed to be L3
> +	 */
> +	case ARM_SPE_NV_SYS_CACHE:
> +		data_src->mem_lvl = PERF_MEM_LVL_HIT;
> +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_L3;
> +		data_src->mem_snoop = PERF_MEM_SNOOP_HIT;
> +		break;
> +	/*
> +	 * We don't know what level it hit in, except it came from the other
> +	 * socket
> +	 */
> +	case ARM_SPE_NV_REMOTE:
> +		data_src->mem_snoop = PERF_MEM_SNOOP_HIT;
> +		data_src->mem_remote = PERF_MEM_REMOTE_REMOTE;
> +		data_src->mem_snoop = PERF_MEM_SNOOP_NA;
> +		break;
> +	case ARM_SPE_NV_DRAM:
> +		data_src->mem_lvl = PERF_MEM_LVL_HIT;
> +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_RAM;
> +		data_src->mem_snoop = PERF_MEM_SNOOP_NONE;
> +		break;
> +	default:
> +		break;
> +	}
> +}
>
> +static void arm_spe__synth_data_source_generic(const struct arm_spe_record *record,
> +						union perf_mem_data_src *data_src)
> +{
>  	if (record->type & (ARM_SPE_LLC_ACCESS | ARM_SPE_LLC_MISS)) {
> -		data_src.mem_lvl = PERF_MEM_LVL_L3;
> +		data_src->mem_lvl = PERF_MEM_LVL_L3;
>  
>  		if (record->type & ARM_SPE_LLC_MISS)
> -			data_src.mem_lvl |= PERF_MEM_LVL_MISS;
> +			data_src->mem_lvl |= PERF_MEM_LVL_MISS;
>  		else
> -			data_src.mem_lvl |= PERF_MEM_LVL_HIT;
> +			data_src->mem_lvl |= PERF_MEM_LVL_HIT;
>  	} else if (record->type & (ARM_SPE_L1D_ACCESS | ARM_SPE_L1D_MISS)) {
> -		data_src.mem_lvl = PERF_MEM_LVL_L1;
> +		data_src->mem_lvl = PERF_MEM_LVL_L1;
>  
>  		if (record->type & ARM_SPE_L1D_MISS)
> -			data_src.mem_lvl |= PERF_MEM_LVL_MISS;
> +			data_src->mem_lvl |= PERF_MEM_LVL_MISS;
>  		else
> -			data_src.mem_lvl |= PERF_MEM_LVL_HIT;
> +			data_src->mem_lvl |= PERF_MEM_LVL_HIT;
>  	}
>  
>  	if (record->type & ARM_SPE_REMOTE_ACCESS)
> -		data_src.mem_lvl |= PERF_MEM_LVL_REM_CCE1;
> +		data_src->mem_lvl |= PERF_MEM_LVL_REM_CCE1;
> +}
> +
> +static u64 arm_spe__synth_data_source(const struct arm_spe_record *record, u64 midr)
> +{
> +	union perf_mem_data_src	data_src = { 0 };
> +	bool is_neoverse = is_midr_in_range(midr, neoverse_spe);
> +
> +	if (record->op & ARM_SPE_LD)
> +		data_src.mem_op = PERF_MEM_OP_LOAD;
> +	else
> +		data_src.mem_op = PERF_MEM_OP_STORE;
> +
> +	if (is_neoverse)
> +		arm_spe__synth_data_source_neoverse(record, &data_src);
> +	else
> +		arm_spe__synth_data_source_generic(record, &data_src);
>  
>  	if (record->type & (ARM_SPE_TLB_ACCESS | ARM_SPE_TLB_MISS)) {
>  		data_src.mem_dtlb = PERF_MEM_TLB_WK;
> @@ -446,7 +542,7 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
>  	u64 data_src;
>  	int err;
>  
> -	data_src = arm_spe__synth_data_source(record);
> +	data_src = arm_spe__synth_data_source(record, spe->midr);
>  
>  	if (spe->sample_flc) {
>  		if (record->type & ARM_SPE_L1D_MISS) {
> @@ -1183,6 +1279,8 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
>  	struct perf_record_auxtrace_info *auxtrace_info = &event->auxtrace_info;
>  	size_t min_sz = sizeof(u64) * ARM_SPE_AUXTRACE_PRIV_MAX;
>  	struct perf_record_time_conv *tc = &session->time_conv;
> +	const char *cpuid = perf_env__cpuid(session->evlist->env);
> +	u64 midr = strtol(cpuid, NULL, 16);
>  	struct arm_spe *spe;
>  	int err;
>  
> @@ -1202,6 +1300,7 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
>  	spe->machine = &session->machines.host; /* No kvm support */
>  	spe->auxtrace_type = auxtrace_info->type;
>  	spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE];
> +	spe->midr = midr;

Except the redundant line, the patch is good for me and I tested it at
my side:

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>

>  
>  	spe->timeless_decoding = arm_spe__is_timeless_decoding(spe);
>  
> -- 
> 2.32.0
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related

* Re: [PATCH v5 4/5] perf arm-spe: Use SPE data source for neoverse cores
From: Leo Yan @ 2022-04-20  8:30 UTC (permalink / raw)
  To: Ali Saidi
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, german.gomez,
	acme, benh, Nick.Forrington, alexander.shishkin, andrew.kilroy,
	james.clark, john.garry, jolsa, kjain, lihuafei1, mark.rutland,
	mathieu.poirier, mingo, namhyung, peterz, will
In-Reply-To: <20220408195344.32764-5-alisaidi@amazon.com>

On Fri, Apr 08, 2022 at 07:53:43PM +0000, Ali Saidi wrote:
> When synthesizing data from SPE, augment the type with source information
> for Arm Neoverse cores. The field is IMPLDEF but the Neoverse cores all use
> the same encoding. I can't find encoding information for any other SPE
> implementations to unify their choices with Arm's thus that is left for
> future work.
> 
> This change populates the mem_lvl_num for Neoverse cores instead of the
> deprecated mem_lvl namespace.
> 
> Signed-off-by: Ali Saidi <alisaidi@amazon.com>
> ---
>  .../util/arm-spe-decoder/arm-spe-decoder.c    |   1 +
>  .../util/arm-spe-decoder/arm-spe-decoder.h    |  12 ++
>  tools/perf/util/arm-spe.c                     | 127 ++++++++++++++++--
>  3 files changed, 126 insertions(+), 14 deletions(-)
> 
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> index 5e390a1a79ab..091987dd3966 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
> @@ -220,6 +220,7 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
>  
>  			break;
>  		case ARM_SPE_DATA_SOURCE:
> +			decoder->record.source = payload;
>  			break;
>  		case ARM_SPE_BAD:
>  			break;
> diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> index 69b31084d6be..46a61df1145b 100644
> --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
> @@ -29,6 +29,17 @@ enum arm_spe_op_type {
>  	ARM_SPE_ST		= 1 << 1,
>  };
>  
> +enum arm_spe_neoverse_data_source {
> +	ARM_SPE_NV_L1D		 = 0x0,
> +	ARM_SPE_NV_L2		 = 0x8,
> +	ARM_SPE_NV_PEER_CORE	 = 0x9,
> +	ARM_SPE_NV_LOCAL_CLUSTER = 0xa,
> +	ARM_SPE_NV_SYS_CACHE	 = 0xb,
> +	ARM_SPE_NV_PEER_CLUSTER	 = 0xc,
> +	ARM_SPE_NV_REMOTE	 = 0xd,
> +	ARM_SPE_NV_DRAM		 = 0xe,
> +};
> +
>  struct arm_spe_record {
>  	enum arm_spe_sample_type type;
>  	int err;
> @@ -40,6 +51,7 @@ struct arm_spe_record {
>  	u64 virt_addr;
>  	u64 phys_addr;
>  	u64 context_id;
> +	u16 source;
>  };
>  
>  struct arm_spe_insn;
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index d2b64e3f588b..a20285cf98e3 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -34,6 +34,7 @@
>  #include "arm-spe-decoder/arm-spe-decoder.h"
>  #include "arm-spe-decoder/arm-spe-pkt-decoder.h"
>  
> +#include "../../arch/arm64/include/asm/cputype.h"
>  #define MAX_TIMESTAMP (~0ULL)
>  
>  struct arm_spe {
> @@ -45,6 +46,7 @@ struct arm_spe {
>  	struct perf_session		*session;
>  	struct machine			*machine;
>  	u32				pmu_type;
> +	u64				midr;
>  
>  	struct perf_tsc_conversion	tc;
>  
> @@ -399,33 +401,127 @@ static bool arm_spe__is_memory_event(enum arm_spe_sample_type type)
>  	return false;
>  }
>  
> -static u64 arm_spe__synth_data_source(const struct arm_spe_record *record)
> +static const struct midr_range neoverse_spe[] = {
> +	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
> +	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
> +	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
> +	{},
> +};
> +
> +
> +static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *record,
> +						union perf_mem_data_src *data_src)
>  {
> -	union perf_mem_data_src	data_src = { 0 };
> +	/*
> +	 * Even though four levels of cache hierarchy are possible, no known
> +	 * production Neoverse systems currently include more than three levels
> +	 * so for the time being we assume three exist. If a production system
> +	 * is built with four the this function would have to be changed to
> +	 * detect the number of levels for reporting.
> +	 */
>  
> -	if (record->op == ARM_SPE_LD)
> -		data_src.mem_op = PERF_MEM_OP_LOAD;
> -	else
> -		data_src.mem_op = PERF_MEM_OP_STORE;
> +	/*
> +	 * We have no data on the hit level or data source for stores in the
> +	 * Neoverse SPE records.
> +	 */
> +	if (record->op & ARM_SPE_ST) {
> +		data_src->mem_lvl = PERF_MEM_LVL_NA;
> +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_ANY_CACHE;
> +		data_src->mem_snoop = PERF_MEM_SNOOP_NA;
> +		return;
> +	}

For the store operation, I found we need to use more strictly criteria
to check memory operations, otherwise, we might wrongly synthesize
memory sample even for other types of operations.

To fix the issue, I think we need to add below patch; if this is okay
for you, please consider to include it in the next patch set version.

Thanks,
Leo

From 7f8499d4f44b400d217c01d42059f00e8a1697b0 Mon Sep 17 00:00:00 2001
From: Leo Yan <leo.yan@linaro.org>
Date: Wed, 20 Apr 2022 15:46:21 +0800
Subject: [PATCH] perf arm-spe: Don't set data source if it's not a memory
 operation

Except memory load and store operations, Arm SPE records also can
support other operation types, bug when set the data source field the
current code assumes a record is a either load operation or store
operation, this leads to wrongly synthesize memory samples.

This patch strictly checks the record operation type, it only sets data
source only for the operation types ARM_SPE_LD and ARM_SPE_ST,
otherwise, returns zero for data source.  Therefore, we can synthesize
memory samples only when data source is a non-zero value, the function
arm_spe__is_memory_event() is useless and removed.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index d2b64e3f588b..76251825c01d 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -387,26 +387,16 @@ static int arm_spe__synth_instruction_sample(struct arm_spe_queue *speq,
 	return arm_spe_deliver_synth_event(spe, speq, event, &sample);
 }
 
-#define SPE_MEM_TYPE	(ARM_SPE_L1D_ACCESS | ARM_SPE_L1D_MISS | \
-			 ARM_SPE_LLC_ACCESS | ARM_SPE_LLC_MISS | \
-			 ARM_SPE_REMOTE_ACCESS)
-
-static bool arm_spe__is_memory_event(enum arm_spe_sample_type type)
-{
-	if (type & SPE_MEM_TYPE)
-		return true;
-
-	return false;
-}
-
 static u64 arm_spe__synth_data_source(const struct arm_spe_record *record)
 {
 	union perf_mem_data_src	data_src = { 0 };
 
 	if (record->op == ARM_SPE_LD)
 		data_src.mem_op = PERF_MEM_OP_LOAD;
-	else
+	else if (record->op & ARM_SPE_ST)
 		data_src.mem_op = PERF_MEM_OP_STORE;
+	else
+		return 0;
 
 	if (record->type & (ARM_SPE_LLC_ACCESS | ARM_SPE_LLC_MISS)) {
 		data_src.mem_lvl = PERF_MEM_LVL_L3;
@@ -510,7 +500,11 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 			return err;
 	}
 
-	if (spe->sample_memory && arm_spe__is_memory_event(record->type)) {
+	/*
+	 * When data_src is zero it means the record is not a memory operation,
+	 * skip to synthesize memory sample for this case.
+	 */
+	if (spe->sample_memory && data_src) {
 		err = arm_spe__synth_mem_sample(speq, spe->memory_id, data_src);
 		if (err)
 			return err;
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related

* Re: [PATCH v5 3/5] perf tools: sync addition of PERF_MEM_SNOOPX_PEER
From: Leo Yan @ 2022-04-20  8:23 UTC (permalink / raw)
  To: German Gomez
  Cc: Ali Saidi, linux-kernel, linux-perf-users, linux-arm-kernel, acme,
	benh, Nick.Forrington, alexander.shishkin, andrew.kilroy,
	james.clark, john.garry, jolsa, kjain, lihuafei1, mark.rutland,
	mathieu.poirier, mingo, namhyung, peterz, will
In-Reply-To: <47ec29fe-75ea-62ef-447b-1164569616de@arm.com>

On Mon, Apr 11, 2022 at 03:35:52PM +0100, German Gomez wrote:
> 
> On 11/04/2022 11:26, German Gomez wrote:
> > On 08/04/2022 20:53, Ali Saidi wrote:
> >> Add a flag to the perf mem data struct to signal that a request caused a
> >> cache-to-cache transfer of a line from a peer of the requestor and
> >> wasn't sourced from a lower cache level.  The line being moved from one
> >> peer cache to another has latency and performance implications. On Arm64
> >> Neoverse systems the data source can indicate a cache-to-cache transfer
> >> but not if the line is dirty or clean, so instead of overloading HITM
> >> define a new flag that indicates this type of transfer.
> > I think it's fine to merge this and the previous commit rather than have
> > two commits with the same msg.
> >
> 
> I take it back. It has been pointed out to me that having the separation
> is normal/ok. So my comment doesn't apply.

Yeah, it's good that we split these two patches since it's easier
for maintainer to pick up separately :)

Thanks,
Leo

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [PATCH v5 2/5] perf: Add SNOOP_PEER flag to perf mem data struct
From: Leo Yan @ 2022-04-20  8:20 UTC (permalink / raw)
  To: Ali Saidi
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, german.gomez,
	acme, benh, Nick.Forrington, alexander.shishkin, andrew.kilroy,
	james.clark, john.garry, jolsa, kjain, lihuafei1, mark.rutland,
	mathieu.poirier, mingo, namhyung, peterz, will
In-Reply-To: <20220408195344.32764-3-alisaidi@amazon.com>

On Fri, Apr 08, 2022 at 07:53:41PM +0000, Ali Saidi wrote:
> Add a flag to the perf mem data struct to signal that a request caused a
> cache-to-cache transfer of a line from a peer of the requestor and
> wasn't sourced from a lower cache level.  The line being moved from one
> peer cache to another has latency and performance implications. On Arm64
> Neoverse systems the data source can indicate a cache-to-cache transfer
> but not if the line is dirty or clean, so instead of overloading HITM
> define a new flag that indicates this type of transfer.
> 
> Signed-off-by: Ali Saidi <alisaidi@amazon.com>

The patch looks good to me:
Reviewed-by: Leo Yan <leo.yan@linaro.org>

Sine this is a common flag, it's better if x86 or PowerPC maintainers
could take a look for this new snooping type.  Thanks!

Leo

> ---
>  include/uapi/linux/perf_event.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 82858b697c05..c9e58c79f3e5 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -1308,7 +1308,7 @@ union perf_mem_data_src {
>  #define PERF_MEM_SNOOP_SHIFT	19
>  
>  #define PERF_MEM_SNOOPX_FWD	0x01 /* forward */
> -/* 1 free */
> +#define PERF_MEM_SNOOPX_PEER	0x02 /* xfer from peer */
>  #define PERF_MEM_SNOOPX_SHIFT  38
>  
>  /* locked instruction */
> -- 
> 2.32.0
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [PATCH v4 1/2] mtd: rawnand: meson: discard the common MMC sub clock framework
From: Liang Yang @ 2022-04-20  8:19 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: linux-mtd, Rob Herring, Richard Weinberger, Vignesh Raghavendra,
	Jerome Brunet, Neil Armstrong, Martin Blumenstingl, Kevin Hilman,
	Jianxin Pan, Victor Wan, XianWei Zhao, Kelvin Zhang, BiChao Zheng,
	YongHui Yu, linux-arm-kernel, linux-amlogic, linux-kernel,
	devicetree
In-Reply-To: <20220420092912.10ce66ec@xps13>

Hi Miquel,

On 2022/4/20 15:29, Miquel Raynal wrote:
> [ EXTERNAL EMAIL ]
> 
> Hi Liang,
> 
> liang.yang@amlogic.com wrote on Wed, 20 Apr 2022 13:44:32 +0800:
> 
>> Hi Miquel,
>>
>> On 2022/4/19 23:25, Miquel Raynal wrote:
>>> [ EXTERNAL EMAIL ]
>>>
>>> Hello,
>>>
>>> liang.yang@amlogic.com wrote on Tue, 19 Apr 2022 17:17:48 +0800:
>>>    
>>>> Hello Miquel,
>>>>
>>>> On 2022/4/19 16:26, Miquel Raynal wrote:
>>>>> [ EXTERNAL EMAIL ]
>>>>>
>>>>> Hello,
>>>>>
>>>>> liang.yang@amlogic.com wrote on Mon, 18 Apr 2022 11:40:10 +0800:
>>>>>     >>>> Hi Miquel,
>>>>>>
>>>>>> i have some confusion when i prepare the patches. for DT compatibility, it falls back to the old DT when failed to get resource by the new DT, but there is some points:
>>>>>> a. old DT depends on MMC sub clock driver, but it never be merged, so it can't work.
>>>>>
>>>>> I don't get what you mean here, sorry. I believe there is a new way to
>>>>> describe this clock but grabbing the one from the MMC still works, does
>>>>> not it?
>>>>>     >>
>>>> No, it doesn't. after the NFC driver using the MMC sub clock framework was merged into the mainline of kernel, we didn't continue to submit the series of patches about MMC sub clock after v9. when i found that, we made a discussion to decide whether to recover the series of patches about MMC sub clock framework, finally, see the description from cover letter, we plan to abandon it and adopt the new clock scheme in this series of patches.
>>>
>>> I am not sure to follow. Is the current code completely broken? I
>>> believe it is not, so I don't understand your issue.
>>
>> i think only the code about the clock is completely broken.
>>
>>>
>>> Can you please summarize the situation?
>>
>> Yes. the current NFC clock implementation depends on the following series of patches [https://lore.kernel.org/all/20220121074508.42168-5-liang.yang@amlogic.com], which we call "Meson MMC Sub Clock Controller Driver".
>> when i was preparing the NFC patchset at that time, we discussed how the clock should be implemented base on the special clock framework for NFC and EMMC port. then we decided to implement a driver "Meson MMC Sub Clock Controller Driver". so another people begin to prepare "Meson MMC Sub Clock Controller Driver", but submitted it by different patchset.
>> finally, now the meson NFC patchset is accepted and merged, but "Meson MMC Sub Clock Controller Driver" patchset is not. also we decide to abandon the patset "Meson MMC Sub Clock Controller Driver" and implement the new clock design in this series.
> 
> Ok thanks for the summary and the link with the discussion with Jerome
> and Neil, it's informative.
> 
> So in the end, we are not really breaking anything here as this NAND
> controller driver never worked in the first place? Or is it only one of
> the two compatibles which is not working?

i think no one can work now. i am preparing the newer clock framework in 
this series.

> 
> If this never worked then please do the binding changes (in the first
> patch of your series) and then do the necessary changes in the code. If
> this worked with at least one of the two compatibles, then you have to
> create dedicated helpers, one for each, in order to grab the clocks
> differently and not break anybody.

ok, i am changing the bindings and code in this series. thanks for your 
explanation.

> 
>>
>>>    
>>>>
>>>> Thanks.
>>>>   
>>>>>> b. if it falls back to the old DT, beside the regmap lookup below, it seems that we have to preserve the code of the old clock setting in nfc_clk_init().
>>>>>
>>>>> Yes, probably.
>>>>>     >>>> do we still need to avoid break DT compatibility?
>>>>>
>>>>> We should try our best to avoid breaking the DT, yes.
>>>>>     >>>>
>>>>>> Thanks.
>>>>>>
>>>>>> On 2022/4/11 10:40, Liang Yang wrote:
>>>>>>>>>          nfc->dev = dev;
>>>>>>>>> -    res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>>>>>>>>> -    nfc->reg_base = devm_ioremap_resource(dev, res);
>>>>>>>>> +    nfc->reg_base = devm_platform_ioremap_resource_byname(pdev, "nfc");
>>>>>>>>
>>>>>>>> This change seems unrelated.
>>>>>>>
>>>>>>> To be consistent with the following > devm_platform_ioremap_resource_byname(pdev, "emmc"). do you mean that we > don't need it?>
>>>>>>>>>          if (IS_ERR(nfc->reg_base))
>>>>>>>>>              return PTR_ERR(nfc->reg_base);
>>>>>>>>> -    nfc->reg_clk =
>>>>>>>>> -        syscon_regmap_lookup_by_phandle(dev->of_node,
>>>>>>>>> -                        "amlogic,mmc-syscon");
>>>>>>>>> -    if (IS_ERR(nfc->reg_clk)) {
>>>>>>>>> -        dev_err(dev, "Failed to lookup clock base\n");
>>>>>>>>> -        return PTR_ERR(nfc->reg_clk);
>>>>>>>>> -    }
>>>>>>>>> +    nfc->sd_emmc_clock = devm_platform_ioremap_resource_byname(pdev, >>> "emmc");
>>>>>>>>> +    if (IS_ERR(nfc->sd_emmc_clock))
>>>>>>>>> +        return PTR_ERR(nfc->sd_emmc_clock);
>>>>>>>>
>>>>>>>> While I agree this is much better than the previous solution, we cannot
>>>>>>>> break DT compatibility, so you need to try getting the emmc clock, but
>>>>>>>> if it fails you should fallback to the regmap lookup.
>>>>>>>
>>>>>>> ok, i will fix it next version. thanks.
>>>>>>>      >>>>   >>>>>        irq = platform_get_irq(pdev, 0);
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Miquèl
>>>>>
>>>>> .
>>>
>>>
>>> Thanks,
>>> Miquèl
>>>
>>> .
> 
> 
> Thanks,
> Miquèl
> 
> .

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [PATCH v2 2/5] media: platform: visconti: Add Toshiba Visconti Video Input Interface driver headers
From: Hans Verkuil @ 2022-04-20  8:16 UTC (permalink / raw)
  To: Yuji Ishikawa, Mauro Carvalho Chehab, Nobuhiro Iwamatsu
  Cc: linux-media, linux-arm-kernel, linux-kernel
In-Reply-To: <20220414053528.31460-3-yuji2.ishikawa@toshiba.co.jp>

On 14/04/2022 07:35, Yuji Ishikawa wrote:
> Add support to Video Input Interface on Toshiba Visconti Video Input Interface driver.
> The Video Input Interface includes CSI2 receiver, frame grabber and image signal processor.
> Headers in this commit provide definitions of data-structure and hardware registers.
> 
> Signed-off-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
> Reviewed-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
> ---
> v1 -> v2:
>   - moved driver headers to this patch; to decrease patch size
> ---
>  drivers/media/platform/visconti/hwd_viif.h    |  834 +++++
>  .../platform/visconti/hwd_viif_internal.h     |  361 +++
>  .../media/platform/visconti/hwd_viif_reg.h    | 2802 +++++++++++++++++
>  drivers/media/platform/visconti/viif.h        |  134 +
>  include/uapi/linux/visconti_viif.h            |  356 +++
>  5 files changed, 4487 insertions(+)
>  create mode 100644 drivers/media/platform/visconti/hwd_viif.h
>  create mode 100644 drivers/media/platform/visconti/hwd_viif_internal.h
>  create mode 100644 drivers/media/platform/visconti/hwd_viif_reg.h
>  create mode 100644 drivers/media/platform/visconti/viif.h
>  create mode 100644 include/uapi/linux/visconti_viif.h

<snip>

> diff --git a/include/uapi/linux/visconti_viif.h b/include/uapi/linux/visconti_viif.h
> new file mode 100644
> index 000000000..a235b4d7c
> --- /dev/null
> +++ b/include/uapi/linux/visconti_viif.h
> @@ -0,0 +1,356 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/* Toshiba Visconti Video Capture Support
> + *
> + * (C) Copyright 2022 TOSHIBA CORPORATION
> + * (C) Copyright 2022 Toshiba Electronic Devices & Storage Corporation
> + */
> +
> +#ifndef __UAPI_VISCONTI_VIIF_H_
> +#define __UAPI_VISCONTI_VIIF_H_
> +
> +#include <linux/types.h>
> +#include <linux/videodev2.h>
> +
> +/* Private IPCTLs */

Typo: IPCTLs -> IOCTLs

> +#define VIDIOC_VIIF_MAIN_SET_RAWPACK_MODE                                      \
> +	_IOW('V', BASE_VIDIOC_PRIVATE + 1, uint32_t)
> +#define VIDIOC_VIIF_L2_SET_UNDIST                                              \
> +	_IOW('V', BASE_VIDIOC_PRIVATE + 21, struct viif_l2_undist_config)
> +#define VIDIOC_VIIF_L2_SET_ROI                                                 \
> +	_IOW('V', BASE_VIDIOC_PRIVATE + 22, struct viif_l2_roi_config)
> +#define VIDIOC_VIIF_L2_SET_GAMMA                                               \
> +	_IOW('V', BASE_VIDIOC_PRIVATE + 23, struct viif_l2_gamma_config)
> +#define VIDIOC_VIIF_L2_SET_CROP                                                \
> +	_IOW('V', BASE_VIDIOC_PRIVATE + 24, struct viif_l2_crop_config)
> +#define VIDIOC_VIIF_CSI2RX_SET_MBUS_FMT                                        \
> +	_IOW('V', BASE_VIDIOC_PRIVATE + 25, uint32_t)
> +#define VIDIOC_VIIF_CSI2RX_GET_CALIBRATION_STATUS                              \
> +	_IOR('V', BASE_VIDIOC_PRIVATE + 26,                                    \
> +	     struct viif_csi2rx_dphy_calibration_status)
> +#define VIDIOC_VIIF_CSI2RX_GET_ERR_STATUS                                      \
> +	_IOR('V', BASE_VIDIOC_PRIVATE + 27, struct viif_csi2rx_err_status)
> +#define VIDIOC_VIIF_ISP_GET_LAST_CAPTURE_STATUS                                \
> +	_IOR('V', BASE_VIDIOC_PRIVATE + 28, struct viif_isp_capture_status)

We really don't want to introduce private ioctls in a public API. It's hard to
maintain, and you would need very good reasons to go this route.

A better choice is either using compound controls, as is used by stateless codecs:

https://linuxtv.org/downloads/v4l-dvb-apis-new/userspace-api/v4l/ext-ctrls-codec-stateless.html

or streaming metadata, as is done by the rkisp1 driver:

https://linuxtv.org/downloads/v4l-dvb-apis-new/admin-guide/rkisp1.html

In both cases you have to make sure that the data layout is the same regardless
of whether you run on a 32 bit or 64 bit OS. I.e. if the kernel is 64 bit (arm64)
but the application is compiled for 32 bit, then you don't want to have to
convert between the two layouts. The pahole utility is very helpful for checking
this.

Actually, the same issue is present when using private ioctls.

> +
> +/* Enable/Disable flag */
> +#define VIIF_DISABLE (0U)
> +#define VIIF_ENABLE  (1U)
> +
> +/**
> + * enum viif_rawpack_mode - RAW pack mode for ioctl(VIDIOC_VIIF_MAIN_SET_RAWPACK_MODE)
> + *
> + * @VIIF_RAWPACK_DISABLE: RAW pack disable
> + * @VIIF_RAWPACK_MSBFIRST: RAW pack enable (MSB First)
> + * @VIIF_RAWPACK_LSBFIRST: RAW pack enable (LSB First)
> + */
> +enum viif_rawpack_mode {
> +	VIIF_RAWPACK_DISABLE = 0,
> +	VIIF_RAWPACK_MSBFIRST = 2,
> +	VIIF_RAWPACK_LSBFIRST = 3,
> +};
> +
> +/* L2ISP undistortion mode */
> +enum viif_l2_undist_mode {
> +	VIIF_L2_UNDIST_POLY = 0, /* polynomial mode */
> +	VIIF_L2_UNDIST_GRID = 1, /* grid table mode */
> +	VIIF_L2_UNDIST_POLY_TO_GRID = 2, /* polynomial, then grid table mode */
> +	VIIF_L2_UNDIST_GRID_TO_POLY = 3, /* grid table, then polynomial mode */
> +};
> +
> +/**
> + * struct viif_l2_undist - L2ISP UNDIST parameters
> + * for &struct viif_l2_undist_config
> + * @through_mode: 1:enable or 0:disable through mode of undistortion
> + * @roi_mode: :ref:`L2ISP undistortion mode <L2ISP_undistortion_mode>`
> + * @sensor_crop_ofs_h: Horizontal start position of sensor crop area[pixel]
> + *                     [-4296..4296], accuracy: 1/2
> + * @sensor_crop_ofs_v: Vertical start position of sensor crop area[line]
> + *                     [-2360..2360], accuracy: 1/2
> + * @norm_scale: Normalization coefficient for distance from center
> + *              [0..1677721], accuracy: 1/33554432
> + * @valid_r_norm2_poly: Setting target area for polynomial correction
> + *                      [0..0x3FFFFFF], accuracy: 1/33554432
> + * @valid_r_norm2_grid: Setting target area for grid table correction
> + *                      [0..0x3FFFFFF], accuracy: 1/33554432
> + * @roi_write_area_delta: Error adjustment value of forward function and
> + *                        inverse function for pixel position calculation
> + *                        [0..0x7FF], accuracy: 1/1024
> + * @poly_write_g_coef: 10th-order polynomial coefficient for G write pixel position calculation
> + *                     [-2147352576..2147352576], accuracy: 1/131072
> + * @poly_read_b_coef: 10th-order polynomial coefficient for B read pixel position calculation
> + *                    [-2147352576..2147352576], accuracy: 1/131072
> + * @poly_read_g_coef: 10th-order polynomial coefficient for G read pixel position calculation
> + *                    [-2147352576..2147352576], accuracy: 1/131072
> + * @poly_read_r_coef: 10th-order polynomial coefficient for R read pixel position calculation
> + *                    [-2147352576..2147352576], accuracy: 1/131072
> + * @grid_node_num_h: Number of horizontal grids [16..64]
> + * @grid_node_num_v: Number of vertical grids [16..64]
> + * @grid_patch_hsize_inv: Inverse pixel size between horizontal grids
> + *                        [0..0x7FFFFF], accuracy: 1/8388608
> + * @grid_patch_vsize_inv: Inverse pixel size between vertical grids
> + *                        [0..0x7FFFFF], accuracy: 1/8388608
> + */
> +struct viif_l2_undist {
> +	uint32_t through_mode;
> +	uint32_t roi_mode;
> +	int32_t sensor_crop_ofs_h;
> +	int32_t sensor_crop_ofs_v;
> +	uint32_t norm_scale;
> +	uint32_t valid_r_norm2_poly;
> +	uint32_t valid_r_norm2_grid;
> +	uint32_t roi_write_area_delta;
> +	int32_t poly_write_g_coef[11];
> +	int32_t poly_read_b_coef[11];
> +	int32_t poly_read_g_coef[11];
> +	int32_t poly_read_r_coef[11];
> +	uint32_t grid_node_num_h;
> +	uint32_t grid_node_num_v;
> +	uint32_t grid_patch_hsize_inv;
> +	uint32_t grid_patch_vsize_inv;
> +};
> +/**
> + * struct viif_l2_undist_config - L2ISP UNDIST parameters
> + * for :ref:`VIDIOC_VIIF_L2_SET_UNDIST`
> + * @param: &struct viif_l2_undist
> + * @write_g: Write for G Grid table address.
> + *           Table is not transferred if a NULL pointer is set
> + * @read_b: Read for B Grid table address.
> + *          Table is not transferred if a NULL pointer is set
> + * @read_g: Read for G Grid table address.
> + *          Table is not transferred if a NULL pointer is set
> + * @read_r: Read for R Grid table address.
> + *          Table is not transferred if a NULL pointer is set
> + * @size: Table size [byte]. Range: [1024..8192] or 0.
> + *        Should be set to "grid_node_num_h * grid_node_num_v * 4".
> + *        Refer to &struct viif_l2_undist.
> + *        Should set 0 in case NULL is set for all tables.
> + *        Should set size other than 0 in case If other is set in more than one table.
> + *
> + * Application should make sure that the table data is based on HW specification
> + * since this driver does not check the contents of specified grid table.
> + */
> +struct viif_l2_undist_config {
> +	struct viif_l2_undist param;
> +	uint32_t *write_g;
> +	uint32_t *read_b;
> +	uint32_t *read_g;
> +	uint32_t *read_r;

Pointers in a public API are possibly but it really complicates the code.

When using controls this can be done by placing these tables in separate
controls using the upcoming 'Dynamic Array' support.

Patches adding support for that are part of this series:

https://lore.kernel.org/linux-media/20220407152940.738159-1-benjamin.gaignard@collabora.com/T/#t

> +	uint32_t size;
> +};
> +
> +/**
> + * struct viif_l2_roi_config - L2ISP ROI parameters
> + * for :ref:`VIDIOC_VIIF_L2_SET_ROI`
> + * @roi_scale: Scale value for each ROI [32768..131072], accuracy: 1/65536
> + * @roi_scale_inv: Inverse scale value for each ROI [32768..131072], accuracy: 1/65536
> + * @corrected_wo_scale_hsize: Corrected image width for each ROI [pixel] [128..8190]
> + * @corrected_wo_scale_vsize: Corrected image height for each ROI [line] [128..4094]
> + * @corrected_hsize: Corrected and scaled image width for each ROI [pixel] [128..8190]
> + * @corrected_vsize: Corrected and scaled image height for each ROI [line] [128..4094]
> + */
> +struct viif_l2_roi_config {
> +	uint32_t roi_scale;
> +	uint32_t roi_scale_inv;
> +	uint32_t corrected_wo_scale_hsize;
> +	uint32_t corrected_wo_scale_vsize;
> +	uint32_t corrected_hsize;
> +	uint32_t corrected_vsize;
> +};
> +
> +/** enum viif_gamma_mode - Gamma correction mode
> + *
> + * @VIIF_GAMMA_COMPRESSED: compressed table mode
> + * @VIIF_GAMMA_LINEAR: liner table mode

liner -> linear

> + */
> +enum viif_gamma_mode {
> +	VIIF_GAMMA_COMPRESSED = 0,
> +	VIIF_GAMMA_LINEAR = 1,
> +};
> +
> +/**
> + * struct viif_l2_gamma_config - L2ISP gamma correction parameters
> + * for :ref:`VIDIOC_VIIF_L2_SET_GAMMA`
> + * @enable: 1:Enable, 0:Disable settings of L2ISP gamma correction control
> + * @vsplit: Line switching position of first table and second table [line] [0..4094].
> + *          Should set 0 in case 0 is set to @enable
> + * @mode: :ref:`Gamma correction mode <Gamma_correction_mode>`.
> + *        Should set VIIF_GAMMA_COMPRESSED in case 0 is set to @enable
> + * @table: Table address.
> + *         Gamma table is not transferred if a NULL pointer is set to table.
> + *         The size of each table is fixed to 512 bytes.
> + *         [0]: G/Y(1st table), [1]: G/Y(2nd table), [2]: B/U(1st table)
> + *         [3]: B/U(2nd table), [4]: R/V(1st table), [5]: R/V(2nd table)
> + */
> +struct viif_l2_gamma_config {
> +	uint32_t enable;
> +	uint32_t vsplit;
> +	uint32_t mode;
> +	uint16_t *table[6];
> +};
> +
> +/**
> + * struct viif_l2_crop_config - L2ISP Cropping parameters
> + * for :ref:`VIDIOC_VIIF_L2_SET_CROP`
> + * @x: X coordinate position
> + *     (with the upper left corner of the image as the origin)[pixel] [0..8062]
> + * @y: Y coordinate position
> + *     (with the upper left corner of the image as the origin)[Line] [0..3966]
> + * @w: Image width[pixel] [128..8190]
> + * @h: Image height[pixel] [128..4094]
> + */
> +struct viif_l2_crop_config {
> +	uint32_t x;
> +	uint32_t y;
> +	uint32_t w;
> +	uint32_t h;
> +};
> +
> +/**
> + * enum viif_csi2_cal_status - CSI2RX calibration status
> + *
> + * @VIIF_CSI2_CAL_NOT_DONE: Calibration not complete
> + * @VIIF_CSI2_CAL_SUCCESS: Calibration success
> + * @VIIF_CSI2_CAL_FAIL: Calibration failed
> + */
> +enum viif_csi2_cal_status {
> +	VIIF_CSI2_CAL_NOT_DONE = 0,
> +	VIIF_CSI2_CAL_SUCCESS = 1,
> +	VIIF_CSI2_CAL_FAIL = 2,
> +};
> +
> +/**
> + * struct viif_csi2rx_dphy_calibration_status - CSI2-RX D-PHY Calibration
> + * information for :ref:`VIDIOC_VIIF_CSI2RX_GET_CALIBRATION_STATUS`
> + * @term_cal_with_rext: Result of termination calibration with rext
> + * @clock_lane_offset_cal: Result of offset calibration of clock lane
> + * @data_lane0_offset_cal: Result of offset calibration of data lane0
> + * @data_lane1_offset_cal: Result of offset calibration of data lane1
> + * @data_lane2_offset_cal: Result of offset calibration of data lane2
> + * @data_lane3_offset_cal: Result of offset calibration of data lane3
> + * @data_lane0_ddl_tuning_cal: Result of digital delay line tuning calibration of data lane0
> + * @data_lane1_ddl_tuning_cal: Result of digital delay line tuning calibration of data lane1
> + * @data_lane2_ddl_tuning_cal: Result of digital delay line tuning calibration of data lane2
> + * @data_lane3_ddl_tuning_cal: Result of digital delay line tuning calibration of data lane3
> + *
> + * Refer to :ref:`CSI2-RX calibration status <CSI2RX_calibration_status>`
> + * for the definitions of each member
> + */
> +struct viif_csi2rx_dphy_calibration_status {
> +	uint32_t term_cal_with_rext;
> +	uint32_t clock_lane_offset_cal;
> +	uint32_t data_lane0_offset_cal;
> +	uint32_t data_lane1_offset_cal;
> +	uint32_t data_lane2_offset_cal;
> +	uint32_t data_lane3_offset_cal;
> +	uint32_t data_lane0_ddl_tuning_cal;
> +	uint32_t data_lane1_ddl_tuning_cal;
> +	uint32_t data_lane2_ddl_tuning_cal;
> +	uint32_t data_lane3_ddl_tuning_cal;
> +};
> +
> +/**
> + * struct viif_csi2rx_err_status - CSI2RX Error status parameters
> + * for :ref:`VIDIOC_VIIF_CSI2RX_GET_ERR_STATUS`
> + * @err_phy_fatal: D-PHY FATAL error.
> + *                 bit[3]: Start of transmission error on DATA Lane3.
> + *                 bit[2]: Start of transmission error on DATA Lane2.
> + *                 bit[1]: Start of transmission error on DATA Lane1.
> + *                 bit[0]: Start of transmission error on DATA Lane0.
> + * @err_pkt_fatal: Packet FATAL error.
> + *                 bit[16]: Header ECC contains 2 errors, unrecoverable.
> + *                 bit[3]: Checksum error detected on virtual channel 3.
> + *                 bit[2]: Checksum error detected on virtual channel 2.
> + *                 bit[1]: Checksum error detected on virtual channel 1.
> + *                 bit[0]: Checksum error detected on virtual channel 0.
> + * @err_frame_fatal: Frame FATAL error.
> + *                   bit[19]: Last received Frame, in virtual channel 3, has at least one CRC error.
> + *                   bit[18]: Last received Frame, in virtual channel 2, has at least one CRC error.
> + *                   bit[17]: Last received Frame, in virtual channel 1, has at least one CRC error.
> + *                   bit[16]: Last received Frame, in virtual channel 0, has at least one CRC error.
> + *                   bit[11]: Incorrect Frame Sequence detected in virtual channel 3.
> + *                   bit[10]: Incorrect Frame Sequence detected in virtual channel 2.
> + *                   bit[9]: Incorrect Frame Sequence detected in virtual channel 1.
> + *                   bit[8]: Incorrect Frame Sequence detected in virtual channel 0.
> + *                   bit[3]: Error matching Frame Start with Frame End for virtual channel 3.
> + *                   bit[2]: Error matching Frame Start with Frame End for virtual channel 2.
> + *                   bit[1]: Error matching Frame Start with Frame End for virtual channel 1.
> + *                   bit[0]: Error matching Frame Start with Frame End for virtual channel 0.
> + * @err_phy: D-PHY error.
> + *           bit[19]: Escape Entry Error on Data Lane 3.
> + *           bit[18]: Escape Entry Error on Data Lane 2.
> + *           bit[17]: Escape Entry Error on Data Lane 1.
> + *           bit[16]: Escape Entry Error on Data Lane 0.
> + *           bit[3]: Start of Transmission Error on Data Lane 3 (synchronization can still be achieved).
> + *           bit[2]: Start of Transmission Error on Data Lane 2 (synchronization can still be achieved).
> + *           bit[1]: Start of Transmission Error on Data Lane 1 (synchronization can still be achieved).
> + *           bit[0]: Start of Transmission Error on Data Lane 0 (synchronization can still be achieved).
> + * @err_pkt: Packet error.
> + *           bit[19]: Header Error detected and corrected on virtual channel 3.
> + *           bit[18]: Header Error detected and corrected on virtual channel 2.
> + *           bit[17]: Header Error detected and corrected on virtual channel 1.
> + *           bit[16]: Header Error detected and corrected on virtual channel 0.
> + *           bit[3]: Unrecognized or unimplemented data type detected in virtual channel 3.
> + *           bit[2]: Unrecognized or unimplemented data type detected in virtual channel 2.
> + *           bit[1]: Unrecognized or unimplemented data type detected in virtual channel 1.
> + *           bit[0]: Unrecognized or unimplemented data type detected in virtual channel 0.
> + * @err_line: Line error.
> + *            bit[23]: Error in the sequence of lines for vc7 and dt7.
> + *            bit[22]: Error in the sequence of lines for vc6 and dt6.
> + *            bit[21]: Error in the sequence of lines for vc5 and dt5.
> + *            bit[20]: Error in the sequence of lines for vc4 and dt4.
> + *            bit[19]: Error in the sequence of lines for vc3 and dt3.
> + *            bit[18]: Error in the sequence of lines for vc2 and dt2.
> + *            bit[17]: Error in the sequence of lines for vc1 and dt1.
> + *            bit[16]: Error in the sequence of lines for vc0 and dt0.
> + *            bit[7]: Error matching Line Start with Line End for vc7 and dt7.
> + *            bit[6]: Error matching Line Start with Line End for vc6 and dt6.
> + *            bit[5]: Error matching Line Start with Line End for vc5 and dt5.
> + *            bit[4]: Error matching Line Start with Line End for vc4 and dt4.
> + *            bit[3]: Error matching Line Start with Line End for vc3 and dt3.
> + *            bit[2]: Error matching Line Start with Line End for vc2 and dt2.
> + *            bit[1]: Error matching Line Start with Line End for vc1 and dt1.
> + *            bit[0]: Error matching Line Start with Line End for vc0 and dt0.
> + */
> +struct viif_csi2rx_err_status {
> +	uint32_t err_phy_fatal;
> +	uint32_t err_pkt_fatal;
> +	uint32_t err_frame_fatal;
> +	uint32_t err_phy;
> +	uint32_t err_pkt;
> +	uint32_t err_line;
> +};
> +
> +/**
> + * struct viif_l1_info - L1ISP AWB information
> + * for &struct viif_isp_capture_status
> + * @awb_ave_u: U average value of AWB adjustment [pixel]
> + * @awb_ave_v: V average value of AWB adjustment [pixel]
> + * @awb_accumulated_pixel: Accumulated pixel count of AWB adjustment
> + * @awb_gain_r: R gain used in the next frame of AWB adjustment
> + * @awb_gain_g: G gain used in the next frame of AWB adjustment
> + * @awb_gain_b: B gain used in the next frame of AWB adjustment
> + * @awb_status_u: U convergence state of AWB adjustment
> + *                (true: converged, false: not-converged)
> + * @awb_status_v: V convergence state of AWB adjustment
> + *                (true: converged, false: not-converged)
> + */
> +struct viif_l1_info {
> +	uint32_t awb_ave_u;
> +	uint32_t awb_ave_v;
> +	uint32_t awb_accumulated_pixel;
> +	uint32_t awb_gain_r;
> +	uint32_t awb_gain_g;
> +	uint32_t awb_gain_b;
> +	bool awb_status_u;
> +	bool awb_status_v;

bool is not allowed in a userspace API. Use __u32 or something like that instead.

Regards,

	Hans

> +};
> +/**
> + * struct viif_isp_capture_status - L1ISP capture information
> + * for :ref:`VIDIOC_VIIF_ISP_GET_LAST_CAPTURE_STATUS`
> + * @l1_info: L1ISP AWB information. Refer to &struct viif_l1_info
> + */
> +struct viif_isp_capture_status {
> +	struct viif_l1_info l1_info;
> +};
> +
> +#endif /* __UAPI_VISCONTI_VIIF_H_ */


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* [PATCH 5/5] ARM: dts: i.MX51: digi-connectcore-jsk: Use usb-nop-xceiv usbphy for USB1
From: Alexander Shiyan @ 2022-04-20  8:12 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Shawn Guo, Alexander Shiyan
In-Reply-To: <20220420081206.65083-1-eagle.alexander923@gmail.com>

Add node describing the USB1 usbphy.

Signed-off-by: Alexander Shiyan <eagle.alexander923@gmail.com>
---
 arch/arm/boot/dts/imx51-digi-connectcore-jsk.dts | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/imx51-digi-connectcore-jsk.dts b/arch/arm/boot/dts/imx51-digi-connectcore-jsk.dts
index aab8d6f137c3..10cae7c3a879 100644
--- a/arch/arm/boot/dts/imx51-digi-connectcore-jsk.dts
+++ b/arch/arm/boot/dts/imx51-digi-connectcore-jsk.dts
@@ -13,6 +13,13 @@ / {
 	chosen {
 		stdout-path = &uart1;
 	};
+
+	usbphy1: usbphy1 {
+		compatible = "usb-nop-xceiv";
+		clocks = <&clks IMX5_CLK_USB_PHY_GATE>;
+		clock-names = "main_clk";
+		#phy-cells = <0>;
+	};
 };
 
 &esdhc1 {
@@ -63,6 +70,7 @@ &usbotg {
 &usbh1 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&pinctrl_usbh1>;
+	fsl,usbphy = <&usbphy1>;
 	dr_mode = "host";
 	phy_type = "ulpi";
 	disable-over-current;
-- 
2.32.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related

* [PATCH 4/5] ARM: dts: i.MX51: digi-connectcore-som: Setup usbotg vbus-supply
From: Alexander Shiyan @ 2022-04-20  8:12 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Shawn Guo, Alexander Shiyan
In-Reply-To: <20220420081206.65083-1-eagle.alexander923@gmail.com>

The VBUS on usbotg is connected to the PMIC SWBST, let's reflect
that in the bindings.

Signed-off-by: Alexander Shiyan <eagle.alexander923@gmail.com>
---
 arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi b/arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi
index 04a47200fc73..f0809a16a2ce 100644
--- a/arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi
+++ b/arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi
@@ -184,6 +184,7 @@ &nfc {
 &usbotg {
 	phy_type = "utmi_wide";
 	disable-over-current;
+	vbus-supply = <&swbst_reg>;
 	/* Device role is not known, keep status disabled */
 };
 
-- 
2.32.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related

* [PATCH 3/5] ARM: dts: i.MX51: digi-connectcore-som: Update PMIC voltages
From: Alexander Shiyan @ 2022-04-20  8:12 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Shawn Guo, Alexander Shiyan
In-Reply-To: <20220420081206.65083-1-eagle.alexander923@gmail.com>

This patch updates the PMIC voltages according to the module's
datasheet to match both commercial and industrial variants of the module.

Signed-off-by: Alexander Shiyan <eagle.alexander923@gmail.com>
---
 arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi b/arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi
index 20e89cc4c838..04a47200fc73 100644
--- a/arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi
+++ b/arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi
@@ -34,22 +34,22 @@ pmic: mc13892@0 {
 
 		regulators {
 			sw1_reg: sw1 {
-				regulator-min-microvolt = <1000000>;
+				regulator-min-microvolt = <1050000>;
 				regulator-max-microvolt = <1100000>;
 				regulator-boot-on;
 				regulator-always-on;
 			};
 
 			sw2_reg: sw2 {
-				regulator-min-microvolt = <1225000>;
-				regulator-max-microvolt = <1225000>;
+				regulator-min-microvolt = <1175000>;
+				regulator-max-microvolt = <1275000>;
 				regulator-boot-on;
 				regulator-always-on;
 			};
 
 			sw3_reg: sw3 {
-				regulator-min-microvolt = <1200000>;
-				regulator-max-microvolt = <1200000>;
+				regulator-min-microvolt = <1150000>;
+				regulator-max-microvolt = <1350000>;
 				regulator-boot-on;
 				regulator-always-on;
 			};
-- 
2.32.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related

* [PATCH 2/5] ARM: dts: i.MX51: digi-connectcore-som: Remove unused regulators
From: Alexander Shiyan @ 2022-04-20  8:12 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Shawn Guo, Alexander Shiyan
In-Reply-To: <20220420081206.65083-1-eagle.alexander923@gmail.com>

VGEN1, VGEN2 and GPO1 regulators are not used on SOM.
Let's remove these entries.

Signed-off-by: Alexander Shiyan <eagle.alexander923@gmail.com>
---
 arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi b/arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi
index 7d4970417dce..20e89cc4c838 100644
--- a/arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi
+++ b/arch/arm/boot/dts/imx51-digi-connectcore-som.dtsi
@@ -102,18 +102,6 @@ vcam_reg: vcam {
 				regulator-always-on;
 			};
 
-			vgen1_reg: vgen1 {
-				regulator-min-microvolt = <1200000>;
-				regulator-max-microvolt = <1200000>;
-				regulator-always-on;
-			};
-
-			vgen2_reg: vgen2 {
-				regulator-min-microvolt = <3150000>;
-				regulator-max-microvolt = <3150000>;
-				regulator-always-on;
-			};
-
 			vgen3_reg: vgen3 {
 				regulator-min-microvolt = <1800000>;
 				regulator-max-microvolt = <1800000>;
@@ -124,8 +112,6 @@ vusb_reg: vusb {
 				regulator-always-on;
 			};
 
-			gpo1_reg: gpo1 { };
-
 			gpo2_reg: gpo2 { };
 
 			gpo3_reg: gpo3 { };
-- 
2.32.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related

* [PATCH 1/5] ARM: dts: imx51: Add generic DMA bindings for UART nodes
From: Alexander Shiyan @ 2022-04-20  8:12 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Shawn Guo, Alexander Shiyan

Updates UART nodes to adopt generic DMA bindings.

Signed-off-by: Alexander Shiyan <eagle.alexander923@gmail.com>
---
 arch/arm/boot/dts/imx51.dtsi | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/boot/dts/imx51.dtsi b/arch/arm/boot/dts/imx51.dtsi
index 56c8d87864c3..1e20a6639e42 100644
--- a/arch/arm/boot/dts/imx51.dtsi
+++ b/arch/arm/boot/dts/imx51.dtsi
@@ -215,6 +215,8 @@ uart3: serial@7000c000 {
 					clocks = <&clks IMX5_CLK_UART3_IPG_GATE>,
 						 <&clks IMX5_CLK_UART3_PER_GATE>;
 					clock-names = "ipg", "per";
+					dmas = <&sdma 43 5 1>, <&sdma 44 5 2>;
+					dma-names = "rx", "tx";
 					status = "disabled";
 				};
 
@@ -426,6 +428,8 @@ uart1: serial@73fbc000 {
 				clocks = <&clks IMX5_CLK_UART1_IPG_GATE>,
 					 <&clks IMX5_CLK_UART1_PER_GATE>;
 				clock-names = "ipg", "per";
+				dmas = <&sdma 18 4 1>, <&sdma 19 4 2>;
+				dma-names = "rx", "tx";
 				status = "disabled";
 			};
 
@@ -436,6 +440,8 @@ uart2: serial@73fc0000 {
 				clocks = <&clks IMX5_CLK_UART2_IPG_GATE>,
 					 <&clks IMX5_CLK_UART2_PER_GATE>;
 				clock-names = "ipg", "per";
+				dmas = <&sdma 16 4 1>, <&sdma 17 4 2>;
+				dma-names = "rx", "tx";
 				status = "disabled";
 			};
 
-- 
2.32.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related

* Re: [PATCH] iommu/arm-smmu-v3: Fix size calculation in arm_smmu_mm_invalidate_range()
From: Jean-Philippe Brucker @ 2022-04-20  8:06 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: jgg, will, robin.murphy, joro, jacob.jun.pan, baolu.lu,
	fenghua.yu, rikard.falkeborn, linux-arm-kernel, iommu,
	linux-kernel, stable
In-Reply-To: <20220419210158.21320-1-nicolinc@nvidia.com>

On Tue, Apr 19, 2022 at 02:01:58PM -0700, Nicolin Chen wrote:
> The arm_smmu_mm_invalidate_range function is designed to be called
> by mm core for Shared Virtual Addressing purpose between IOMMU and
> CPU MMU. However, the ways of two subsystems defining their "end"
> addresses are slightly different. IOMMU defines its "end" address
> using the last address of an address range, while mm core defines
> that using the following address of an address range:
> 
> 	include/linux/mm_types.h:
> 		unsigned long vm_end;
> 		/* The first byte after our end address ...
> 
> This mismatch resulted in an incorrect calculation for size so it
> failed to be page-size aligned. Further, it caused a dead loop at
> "while (iova < end)" check in __arm_smmu_tlb_inv_range function.
> 
> This patch fixes the issue by doing the calculation correctly.
> 
> Fixes: 2f7e8c553e98d ("iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops")
> Cc: stable@vger.kernel.org
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

Thanks for the fix, I guess we didn't catch this earlier because our test
platforms didn't support range invalidation, so __arm_smmu_tlb_inv_range()
would always use PAGE_SIZE as increment.

Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> index 22ddd05bbdcd..c623dae1e115 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
> @@ -183,7 +183,14 @@ static void arm_smmu_mm_invalidate_range(struct mmu_notifier *mn,
>  {
>  	struct arm_smmu_mmu_notifier *smmu_mn = mn_to_smmu(mn);
>  	struct arm_smmu_domain *smmu_domain = smmu_mn->domain;
> -	size_t size = end - start + 1;
> +	size_t size;
> +
> +	/*
> +	 * The mm_types defines vm_end as the first byte after the end address,
> +	 * different from IOMMU subsystem using the last address of an address
> +	 * range. So do a simple translation here by calculating size correctly.
> +	 */
> +	size = end - start;
>  
>  	if (!(smmu_domain->smmu->features & ARM_SMMU_FEAT_BTM))
>  		arm_smmu_tlb_inv_range_asid(start, size, smmu_mn->cd->asid,
> -- 
> 2.17.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [PATCH] arm64: head: Fix cache inconsistency of the identity-mapped region
From: Ard Biesheuvel @ 2022-04-20  8:05 UTC (permalink / raw)
  To: Shanker Donthineni
  Cc: Catalin Marinas, Will Deacon, Marc Zyngier, Mark Rutland,
	Linux ARM, Linux Kernel Mailing List, Vikram Sethi,
	Thierry Reding, Anshuman Khandual
In-Reply-To: <20220415170504.3781878-1-sdonthineni@nvidia.com>

On Fri, 15 Apr 2022 at 19:05, Shanker Donthineni <sdonthineni@nvidia.com> wrote:
>
> The secondary cores boot is stuck due to data abort while executing the
> instruction 'ldr x8, =__secondary_switched'. The RELA value of this
> instruction was updated by a primary boot core from __relocate_kernel()
> but those memory updates are not visible to CPUs after calling
> switch_to_vhe() causing problem.
>
> The cacheable/shareable attributes of the identity-mapped regions are
> different while CPU executing in EL1 (MMU enabled) and for a short period
> of time in hyp-stub (EL2-MMU disabled). As per the ARM-ARM specification
> (DDI0487G_b), this is not allowed.
>
> G5.10.3 Cache maintenance requirement:
>  "If the change affects the cacheability attributes of the area of memory,
>  including any change between Write-Through and Write-Back attributes,
>  software must ensure that any cached copies of affected locations are
>  removed from the caches, typically by cleaning and invalidating the
>  locations from the levels of cache that might hold copies of the locations
>  affected by the attribute change."
>
> Clean+invalidate the identity-mapped region till PoC before switching to
> VHE world to fix the cache inconsistency.
>
> Problem analysis with disassembly (vmlinux):
>  1) Both __primary_switch() and enter_vhe() are part of the identity region
>  2) RELA entries and enter_vhe() are sharing the same cache line fff800010970480
>  3) Memory ffff800010970484-ffff800010970498 is updated with EL1-MMU enabled
>  4) CPU fetches intrsuctions of enter_vhe() with EL2-MMU disabled
>    - Non-coherent access causing the cache line fff800010970480 drop
>  5) Secondary core executes 'ldr x8, __secondary_switched'
>    - Getting data abort because of the incorrect value at ffff800010970488
>
> ffff800010970418 <__primary_switch>:
> ffff800010970418:  d503245f  bti  c
> ffff80001097041c:  aa0003f3  mov  x19, x0
> ffff800010970420:  d5381014  mrs  x20, sctlr_el1
> ffff800010970424:  90003c81  adrp x1, ffff800011100000 <init_pg_dir>
> ffff800010970428:  97ffffc4  bl   ffff800010970338 <__enable_mmu>
> ffff80001097042c:  97ffffe8  bl   ffff8000109703cc <__relocate_kernel>
> ffff800010970430:  58000308  ldr  x8, ffff800010970490 <__primary_switch+0x78>
> ffff800010970434:  90ffb480  adrp x0, ffff800010000000 <_text>
> ffff800010970438:  d63f0100  blr  x8
> ffff80001097043c:  d5033fdf  isb
> ffff800010970440:  d5181014  msr  sctlr_el1, x20
> ffff800010970444:  d5033fdf  isb
> ffff800010970448:  940f7efe  bl   ffff800010d50040 <__create_page_tables>
> ffff80001097044c:  d508871f  tlbi vmalle1
> ffff800010970450:  d503379f  dsb  nsh
> ffff800010970454:  d5033fdf  isb
> ffff800010970458:  d5181013  msr  sctlr_el1, x19
> ffff80001097045c:  d5033fdf  isb
> ffff800010970460:  d508751f  ic   iallu
> ffff800010970464:  d503379f  dsb  nsh
> ffff800010970468:  d5033fdf  isb
> ffff80001097046c:  97ffffd8  bl   ffff8000109703cc <__relocate_kernel>
> ffff800010970470:  58000108  ldr  x8, ffff800010970490 <__primary_switch+0x78>
> ffff800010970474:  90ffb480  adrp x0, ffff800010000000 <_text>
> ffff800010970478:  d61f0100  br   x8
> ffff80001097047c:  00df10c8  .word   0x00df10c8
> ffff800010970480:  000dfba8  .word   0x000dfba8
>         ...
> ffff800010970498:  d51cd041  msr  tpidr_el2, x1
> ffff80001097049c:  d503201f  nop
>
> ffff8000109704a0 <enter_vhe>:
> ffff8000109704a0:  d508871f  tlbi vmalle1
> ffff8000109704a4:  d503379f  dsb  nsh
> ffff8000109704a8:  d5033fdf  isb
> ffff8000109704ac:  d53d1000  mrs  x0, sctlr_el12
> ffff8000109704b0:  d5181000  msr  sctlr_el1, x0
> ffff8000109704b4:  d5033fdf  isb
> ffff8000109704b8:  d508751f  ic   iallu
> ffff8000109704bc:  d503379f  dsb  nsh
> ffff8000109704c0:  d5033fdf  isb
> ffff8000109704c4:  d2a60a00  mov  x0, #0x30500000
> ffff8000109704c8:  f2810000  movk x0, #0x800
> ffff8000109704cc:  d51d1000  msr  sctlr_el12, x0
> ffff8000109704d0:  aa1f03e0  mov  x0, xzr
> ffff8000109704d4:  d69f03e0  eret
>
> ffff800010961850 <mutate_to_vhe>:
> ffff800010961850: d53c1001   mrs  x1, sctlr_el2
> ffff800010961854: 370001c1   tbnz w1, #0, ffff80001096188c <mutate_to_vhe+0x3c>
> ffff800010961858: d5380721   mrs  x1, id_aa64mmfr1_el1
> ...
> ffff80001096190c: aa010000   orr  x0, x0, x1
> ffff800010961910: d5184000   msr  spsr_el1, x0
> ffff800010961914: 14003ae3   b    ffff8000109704a0 <enter_vhe>
>
> ffff800010970270 <secondary_startup>:
> ffff800010970270: d503245f  bti  c
> ffff800010970274: 97dab23a  bl   ffff80001001cb5c <switch_to_vhe>
> ffff800010970278: 94000049  bl   ffff80001097039c <__cpu_secondary_check52bitva>
> ffff80001097027c: 94000145  bl   ffff800010970790 <__cpu_setup>
> ffff800010970280: 90001e81  adrp x1, ffff800010d40000 <swapper_pg_dir>
> ffff800010970284: 9400002d  bl   ffff800010970338 <__enable_mmu>
> ffff800010970288: 58001008  ldr  x8, ffff800010970488 <__primary_switch+0x70>
> ffff80001097028c: d61f0100  br   x8
>
> Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
> ---
>  arch/arm64/kernel/head.S | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 6a98f1a38c29a..b5786163697bb 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -462,6 +462,16 @@ SYM_FUNC_START_LOCAL(__primary_switched)
>         ldp     x29, x30, [sp], #16             // we must enable KASLR, return
>         ret                                     // to __primary_switch()
>  0:
> +#endif
> +#ifdef CONFIG_RELOCATABLE
> +       /*
> +        * Since the RELA entries of the identity-mapped region are updated
> +        * with MMU enabled, clean and invalidate those entries to avoid
> +        * cache inconsistency while accessing with MMU disabled in hyp-stub.
> +        */
> +       adrp    x0, __idmap_text_start
> +       adr_l   x1, __idmap_text_end
> +       bl      dcache_clean_inval_poc
>  #endif
>         bl      switch_to_vhe                   // Prefer VHE if possible
>         ldp     x29, x30, [sp], #16

Thanks for the elaborate report.

I'd prefer to fix this by moving the literal out of the ID map
entirely. Does the below also fix your issue?

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 6a98f1a38c29..97134d6f78ff 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -639,10 +639,15 @@ SYM_FUNC_START_LOCAL(secondary_startup)
        bl      __cpu_setup                     // initialise processor
        adrp    x1, swapper_pg_dir
        bl      __enable_mmu
-       ldr     x8, =__secondary_switched
+       ldr_l   x8, .L__secondary_switched
        br      x8
 SYM_FUNC_END(secondary_startup)

+       .pushsection ".rodata", "a", %progbits
+.L__secondary_switched:
+       .quad   __secondary_switched
+       .popsection
+
 SYM_FUNC_START_LOCAL(__secondary_switched)
        adr_l   x5, vectors
        msr     vbar_el1, x5

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related

* Re: [RESEND PATCH] soc: mediatek: cmdq: Use mailbox rx_callback instead of cmdq_task_cb
From: Jason-JH Lin @ 2022-04-20  8:02 UTC (permalink / raw)
  To: Chun-Kuang Hu, Matthias Brugger, linux-arm-kernel, linux-mediatek,
	linux-kernel
  Cc: roy-cw.yeh, moudy.ho, nancy.lin, singo.chang
In-Reply-To: <1650102868-26219-1-git-send-email-chunkuang.hu@kernel.org>

Hi Chun-Kuang,

Reviewed-by: jason-jh.lin <jason-jh.lin@mediatek.com>
and
Tested-by: jason-jh.lin <jason-jh.lin@mediatek.com>


Hi Roy, Moudy,

Please aware of this modification for MDP on-going patch.

You can refer to this patch on DRM:

https://patchwork.kernel.org/project/linux-mediatek/patch/20211028101912.4624-2-jason-jh.lin@mediatek.com/


Thanks.

Regard,
Jason-JH.Lin

On Sat, 2022-04-16 at 17:54 +0800, Chun-Kuang Hu wrote:
> rx_callback is a standard mailbox callback mechanism and could cover
> the
> function of proprietary cmdq_task_cb, so use the standard one instead
> of
> the proprietary one. Client has changed to use the standard callback
> machanism and sync dma buffer in client driver, so remove the
> proprietary
> callback in cmdq helper.
> 
> Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
> ---
>  drivers/soc/mediatek/mtk-cmdq-helper.c | 25 +-----------------------
> -
>  include/linux/soc/mediatek/mtk-cmdq.h  |  5 +----
>  2 files changed, 2 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/soc/mediatek/mtk-cmdq-helper.c
> b/drivers/soc/mediatek/mtk-cmdq-helper.c
> index 3c8e421..c1837a4 100644
> --- a/drivers/soc/mediatek/mtk-cmdq-helper.c
> +++ b/drivers/soc/mediatek/mtk-cmdq-helper.c
> @@ -425,34 +425,11 @@ int cmdq_pkt_finalize(struct cmdq_pkt *pkt)
>  }
>  EXPORT_SYMBOL(cmdq_pkt_finalize);
>  
> -static void cmdq_pkt_flush_async_cb(struct cmdq_cb_data data)
> -{
> -	struct cmdq_pkt *pkt = (struct cmdq_pkt *)data.data;
> -	struct cmdq_task_cb *cb = &pkt->cb;
> -	struct cmdq_client *client = (struct cmdq_client *)pkt->cl;
> -
> -	dma_sync_single_for_cpu(client->chan->mbox->dev, pkt->pa_base,
> -				pkt->cmd_buf_size, DMA_TO_DEVICE);
> -	if (cb->cb) {
> -		data.data = cb->data;
> -		cb->cb(data);
> -	}
> -}
> -
> -int cmdq_pkt_flush_async(struct cmdq_pkt *pkt, cmdq_async_flush_cb
> cb,
> -			 void *data)
> +int cmdq_pkt_flush_async(struct cmdq_pkt *pkt)
>  {
>  	int err;
>  	struct cmdq_client *client = (struct cmdq_client *)pkt->cl;
>  
> -	pkt->cb.cb = cb;
> -	pkt->cb.data = data;
> -	pkt->async_cb.cb = cmdq_pkt_flush_async_cb;
> -	pkt->async_cb.data = pkt;
> -
> -	dma_sync_single_for_device(client->chan->mbox->dev, pkt-
> >pa_base,
> -				   pkt->cmd_buf_size, DMA_TO_DEVICE);
> -
>  	err = mbox_send_message(client->chan, pkt);
>  	if (err < 0)
>  		return err;
> diff --git a/include/linux/soc/mediatek/mtk-cmdq.h
> b/include/linux/soc/mediatek/mtk-cmdq.h
> index ac6b5f3..2b498f4 100644
> --- a/include/linux/soc/mediatek/mtk-cmdq.h
> +++ b/include/linux/soc/mediatek/mtk-cmdq.h
> @@ -268,8 +268,6 @@ int cmdq_pkt_finalize(struct cmdq_pkt *pkt);
>   * cmdq_pkt_flush_async() - trigger CMDQ to asynchronously execute
> the CMDQ
>   *                          packet and call back at the end of done
> packet
>   * @pkt:	the CMDQ packet
> - * @cb:		called at the end of done packet
> - * @data:	this data will pass back to cb
>   *
>   * Return: 0 for success; else the error code is returned
>   *
> @@ -277,7 +275,6 @@ int cmdq_pkt_finalize(struct cmdq_pkt *pkt);
>   * at the end of done packet. Note that this is an ASYNC function.
> When the
>   * function returned, it may or may not be finished.
>   */
> -int cmdq_pkt_flush_async(struct cmdq_pkt *pkt, cmdq_async_flush_cb
> cb,
> -			 void *data);
> +int cmdq_pkt_flush_async(struct cmdq_pkt *pkt);
>  
>  #endif	/* __MTK_CMDQ_H__ */
-- 
Jason-JH Lin <jason-jh.lin@mediatek.com>


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [PATCH v2 1/5] dt-bindings: media: platform: visconti: Add Toshiba Visconti Video Input Interface bindings
From: Hans Verkuil @ 2022-04-20  7:56 UTC (permalink / raw)
  To: Yuji Ishikawa, Mauro Carvalho Chehab, Nobuhiro Iwamatsu
  Cc: linux-media, linux-arm-kernel, linux-kernel
In-Reply-To: <20220414053528.31460-2-yuji2.ishikawa@toshiba.co.jp>

On 14/04/2022 07:35, Yuji Ishikawa wrote:
> Adds the Device Tree binding documentation that allows to describe
> the Video Input Interface found in Toshiba Visconti SoCs.
> 
> Signed-off-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
> Reviewed-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
> ---
>  .../bindings/media/toshiba,visconti-viif.yaml | 103 ++++++++++++++++++
>  1 file changed, 103 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/media/toshiba,visconti-viif.yaml
> 
> diff --git a/Documentation/devicetree/bindings/media/toshiba,visconti-viif.yaml b/Documentation/devicetree/bindings/media/toshiba,visconti-viif.yaml
> new file mode 100644

You need to CC this series to devicetree@vger.kernel.org so that the device tree reviewers
can take a look at this.

Regards,

	Hans

> index 000000000..848ea5019
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/media/toshiba,visconti-viif.yaml
> @@ -0,0 +1,103 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/media/toshiba,visconti-viif.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Toshiba Visconti5 SoC Video Input Interface Device Tree Bindings
> +
> +maintainers:
> +  - Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
> +
> +description: |
> +  Toshiba Visconti5 SoC Video Input Interface (VIIF) receives MIPI CSI2 video stream,
> +  processes the stream with embedded image signal processor (L1ISP, L2ISP), then stores pictures to main memory.
> +
> +properties:
> +  compatible:
> +    const: toshiba,visconti-viif
> +
> +  reg:
> +    items:
> +      - description: registers for capture control
> +      - description: registers for CSI2 receiver control
> +
> +  interrupts:
> +    items:
> +      - description: Sync Interrupt
> +      - description: Status (Error) Interrupt
> +      - description: CSI2 Receiver Interrupt
> +      - description: L1ISP Interrupt
> +
> +  index:
> +    enum: [0, 1]
> +
> +  port:
> +    $ref: /schemas/graph.yaml#/$defs/port-base
> +    unevaluatedProperties: false
> +    description: Input port node, single endpoint describing the CSI-2 transmitter.
> +
> +    properties:
> +      endpoint:
> +        $ref: video-interfaces.yaml#
> +        unevaluatedProperties: false
> +
> +        properties:
> +          data-lanes:
> +            description: VIIF supports 2 or 4 data lines
> +            items:
> +              minItems: 1
> +              maxItems: 4
> +              items:
> +                - const: 1
> +                - const: 2
> +                - const: 3
> +                - const: 4
> +          clock-lanes:
> +            description: VIIF supports 1 clock line
> +            const: 0
> +
> +required:
> +  - compatible
> +  - reg
> +  - interrupts
> +  - port
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +    #include <dt-bindings/interrupt-controller/arm-gic.h>
> +    #include <dt-bindings/interrupt-controller/irq.h>
> +
> +    soc {
> +        #address-cells = <2>;
> +        #size-cells = <2>;
> +
> +        viif0: viif@1c000000 {
> +            compatible = "toshiba,visconti-viif";
> +            reg = <0 0x1c000000 0 0x6000>,
> +                  <0 0x1c008000 0 0x400>;
> +            interrupts = <GIC_SPI 64 IRQ_TYPE_LEVEL_HIGH>,
> +                         <GIC_SPI 67 IRQ_TYPE_LEVEL_HIGH>,
> +                         <GIC_SPI 73 IRQ_TYPE_LEVEL_HIGH>,
> +                         <GIC_SPI 76 IRQ_TYPE_LEVEL_HIGH>;
> +            index = <0>;
> +            status = "disabled";
> +
> +            port {
> +                #address-cells = <1>;
> +                #size-cells = <0>;
> +
> +                csi_in0: endpoint {
> +                    remote-endpoint = <&imx219_out0>;
> +                    bus-type = <4>;
> +                    data-lanes = <1 2>;
> +                    clock-lanes = <0>;
> +                    clock-noncontinuous;
> +                    link-frequencies = /bits/ 64 <456000000>;
> +                };
> +            };
> +        };
> +    };
> +


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* RE: [PATCH 1/2] ACPI/AEST: Initial AEST driver
From: ishii.shuuichir @ 2022-04-20  7:54 UTC (permalink / raw)
  To: 'Tyler Baicar', 'Tyler Baicar',
	patches@amperecomputing.com, abdulhamid@os.amperecomputing.com,
	darren@os.amperecomputing.com, catalin.marinas@arm.com,
	will@kernel.org, maz@kernel.org, james.morse@arm.com,
	alexandru.elisei@arm.com, suzuki.poulose@arm.com,
	lorenzo.pieralisi@arm.com, guohanjun@huawei.com,
	sudeep.holla@arm.com, rafael@kernel.org, lenb@kernel.org,
	tony.luck@intel.com, bp@alien8.de, mark.rutland@arm.com,
	anshuman.khandual@arm.com, vincenzo.frascino@arm.com,
	tabba@google.com, marcan@marcan.st, keescook@chromium.org,
	jthierry@redhat.com, masahiroy@kernel.org,
	samitolvanen@google.com, john.garry@huawei.com,
	daniel.lezcano@linaro.org, gor@linux.ibm.com,
	zhangshaokun@hisilicon.com, tmricht@linux.ibm.com,
	dchinner@redhat.com, tglx@linutronix.de,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.cs.columbia.edu, linux-acpi@vger.kernel.org,
	linux-edac@vger.kernel.org, Vineeth.Pillai@microsoft.com
  Cc: ishii.shuuichir@fujitsu.com
In-Reply-To: <9330bbfb-d016-0283-a5ed-e2f4d5446759@amperemail.onmicrosoft.com>

Hi, Tyler.

When do you plan to post the v2 patch series?
Please let me know if you don't mind.

Best regards.

> -----Original Message-----
> From: Tyler Baicar <baicar@amperemail.onmicrosoft.com>
> Sent: Friday, December 17, 2021 8:33 AM
> To: Ishii, Shuuichirou/石井 周一郎 <ishii.shuuichir@fujitsu.com>; 'Tyler Baicar'
> <baicar@os.amperecomputing.com>; patches@amperecomputing.com;
> abdulhamid@os.amperecomputing.com; darren@os.amperecomputing.com;
> catalin.marinas@arm.com; will@kernel.org; maz@kernel.org;
> james.morse@arm.com; alexandru.elisei@arm.com; suzuki.poulose@arm.com;
> lorenzo.pieralisi@arm.com; guohanjun@huawei.com; sudeep.holla@arm.com;
> rafael@kernel.org; lenb@kernel.org; tony.luck@intel.com; bp@alien8.de;
> mark.rutland@arm.com; anshuman.khandual@arm.com;
> vincenzo.frascino@arm.com; tabba@google.com; marcan@marcan.st;
> keescook@chromium.org; jthierry@redhat.com; masahiroy@kernel.org;
> samitolvanen@google.com; john.garry@huawei.com; daniel.lezcano@linaro.org;
> gor@linux.ibm.com; zhangshaokun@hisilicon.com; tmricht@linux.ibm.com;
> dchinner@redhat.com; tglx@linutronix.de; linux-kernel@vger.kernel.org;
> linux-arm-kernel@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-acpi@vger.kernel.org; linux-edac@vger.kernel.org;
> Vineeth.Pillai@microsoft.com
> Subject: Re: [PATCH 1/2] ACPI/AEST: Initial AEST driver
> 
> Hi Shuuichirou,
> 
> Thank you for your feedback!
> 
> On 12/9/2021 3:10 AM, ishii.shuuichir@fujitsu.com wrote:
> > Hi, Tyler.
> >
> > We would like to make a few comments.
> >
> >> -----Original Message-----
> >> From: Tyler Baicar <baicar@os.amperecomputing.com>
> >> Sent: Thursday, November 25, 2021 2:07 AM
> >> To: patches@amperecomputing.com; abdulhamid@os.amperecomputing.com;
> >> darren@os.amperecomputing.com; catalin.marinas@arm.com;
> >> will@kernel.org; maz@kernel.org; james.morse@arm.com;
> >> alexandru.elisei@arm.com; suzuki.poulose@arm.com;
> >> lorenzo.pieralisi@arm.com; guohanjun@huawei.com;
> >> sudeep.holla@arm.com; rafael@kernel.org; lenb@kernel.org;
> >> tony.luck@intel.com; bp@alien8.de; mark.rutland@arm.com;
> >> anshuman.khandual@arm.com; vincenzo.frascino@arm.com;
> >> tabba@google.com; marcan@marcan.st; keescook@chromium.org;
> >> jthierry@redhat.com; masahiroy@kernel.org; samitolvanen@google.com;
> >> john.garry@huawei.com; daniel.lezcano@linaro.org; gor@linux.ibm.com;
> >> zhangshaokun@hisilicon.com; tmricht@linux.ibm.com;
> >> dchinner@redhat.com; tglx@linutronix.de;
> >> linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> >> kvmarm@lists.cs.columbia.edu; linux-acpi@vger.kernel.org;
> >> linux-edac@vger.kernel.org; Ishii, Shuuichirou/石井
> >> 周一郎 <ishii.shuuichir@fujitsu.com>; Vineeth.Pillai@microsoft.com
> >> Cc: Tyler Baicar <baicar@os.amperecomputing.com>
> >> Subject: [PATCH 1/2] ACPI/AEST: Initial AEST driver
> >>
> >> Add support for parsing the ARM Error Source Table and basic handling
> >> of errors reported through both memory mapped and system register
> interfaces.
> >>
> >> Assume system register interfaces are only registered with private
> >> peripheral interrupts (PPIs); otherwise there is no guarantee the
> >> core handling the error is the core which took the error and has the
> >> syndrome info in its system registers.
> >>
> >> Add logging for all detected errors and trigger a kernel panic if
> >> there is any uncorrected error present.
> >>
> >> Signed-off-by: Tyler Baicar <baicar@os.amperecomputing.com>
> >> ---
> > [...]
> >
> >> +static int __init aest_init_node(struct acpi_aest_hdr *node) {
> >> +	union acpi_aest_processor_data *proc_data;
> >> +	union aest_node_spec *node_spec;
> >> +	struct aest_node_data *data;
> >> +	int ret;
> >> +
> >> +	data = kzalloc(sizeof(struct aest_node_data), GFP_KERNEL);
> >> +	if (!data)
> >> +		return -ENOMEM;
> >> +
> >> +	data->node_type = node->type;
> >> +
> >> +	node_spec = ACPI_ADD_PTR(union aest_node_spec, node,
> >> node->node_specific_offset);
> >> +
> >> +	switch (node->type) {
> >> +	case ACPI_AEST_PROCESSOR_ERROR_NODE:
> >> +		memcpy(&data->data, node_spec, sizeof(struct
> >> acpi_aest_processor));
> >> +		break;
> >> +	case ACPI_AEST_MEMORY_ERROR_NODE:
> >> +		memcpy(&data->data, node_spec, sizeof(struct
> >> acpi_aest_memory));
> >> +		break;
> >> +	case ACPI_AEST_SMMU_ERROR_NODE:
> >> +		memcpy(&data->data, node_spec, sizeof(struct
> >> acpi_aest_smmu));
> >> +		break;
> >> +	case ACPI_AEST_VENDOR_ERROR_NODE:
> >> +		memcpy(&data->data, node_spec, sizeof(struct
> >> acpi_aest_vendor));
> >> +		break;
> >> +	case ACPI_AEST_GIC_ERROR_NODE:
> >> +		memcpy(&data->data, node_spec, sizeof(struct
> >> acpi_aest_gic));
> >> +		break;
> >> +	default:
> >> +		kfree(data);
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	if (node->type == ACPI_AEST_PROCESSOR_ERROR_NODE) {
> >> +		proc_data = ACPI_ADD_PTR(union acpi_aest_processor_data,
> >> node_spec,
> >> +					 sizeof(acpi_aest_processor));
> >> +
> >> +		switch (data->data.processor.resource_type) {
> >> +		case ACPI_AEST_CACHE_RESOURCE:
> >> +			memcpy(&data->proc_data, proc_data,
> >> +			       sizeof(struct acpi_aest_processor_cache));
> >> +			break;
> >> +		case ACPI_AEST_TLB_RESOURCE:
> >> +			memcpy(&data->proc_data, proc_data,
> >> +			       sizeof(struct acpi_aest_processor_tlb));
> >> +			break;
> >> +		case ACPI_AEST_GENERIC_RESOURCE:
> >> +			memcpy(&data->proc_data, proc_data,
> >> +			       sizeof(struct acpi_aest_processor_generic));
> >> +			break;
> >> +		}
> >> +	}
> >> +
> >> +	ret = aest_init_interface(node, data);
> >> +	if (ret) {
> >> +		kfree(data);
> >> +		return ret;
> >> +	}
> >> +
> >> +	return aest_init_interrupts(node, data);
> > If aest_init_interrupts() failed, is it necessary to release the data
> > pointer acquired by kzalloc?
> aest_init_interrupts() returns an error if any of the interrupts in the interrupt list
> fails, but it's possible that some interrupts in the list registered successfully. So
> we attempt to keep chugging along in that scenario because some interrupts may
> be enabled and registered with the interface successfully. If any interrupt
> registration fails, there will be a print notifying that there was a failure when
> initializing that node.
> >> +}
> >> +
> >> +static void aest_count_ppi(struct acpi_aest_hdr *node)
> >> +{
> >> +	struct acpi_aest_node_interrupt *interrupt;
> >> +	int i;
> >> +
> >> +	interrupt = ACPI_ADD_PTR(struct acpi_aest_node_interrupt, node,
> >> +				 node->node_interrupt_offset);
> >> +
> >> +	for (i = 0; i < node->node_interrupt_count; i++, interrupt++) {
> >> +		if (interrupt->gsiv >= 16 && interrupt->gsiv < 32)
> >> +			num_ppi++;
> >> +	}
> >> +}
> >> +
> >> +static int aest_starting_cpu(unsigned int cpu)
> >> +{
> >> +	int i;
> >> +
> >> +	for (i = 0; i < num_ppi; i++)
> >> +		enable_percpu_irq(ppi_irqs[i], IRQ_TYPE_NONE);
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static int aest_dying_cpu(unsigned int cpu)
> >> +{
> > Wouldn't it be better to execute disable_percpu_irq(), which is paired
> > with enable_percpu_irq(), in aest_dying_cpu()?
> 
> Good point. I will add that in the next version.
> 
> Thanks,
> 
> Tyler

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [PATCH v2 0/5] Visconti: Add Toshiba Visconti Video Input Interface driver
From: Hans Verkuil @ 2022-04-20  7:54 UTC (permalink / raw)
  To: Yuji Ishikawa, Mauro Carvalho Chehab, Nobuhiro Iwamatsu,
	Laurent Pinchart
  Cc: linux-media, linux-arm-kernel, linux-kernel
In-Reply-To: <20220414053528.31460-1-yuji2.ishikawa@toshiba.co.jp>

Hi Yuji,

On 14/04/2022 07:35, Yuji Ishikawa wrote:
> This series is the Video Input Interface driver for Toshiba's ARM SoC, Visconti[0].
> This provides DT binding documentation, device driver, MAINTAINER fiels.
> 
> Best regards,
> Yuji
> 
> [0]: https://toshiba.semicon-storage.com/ap-en/semiconductor/product/image-recognition-processors-visconti.html
> 
> 
>   dt-bindings: media: platform: visconti: Add Toshiba Visconti Video Input Interface bindings
>     v1 -> v2:
>       - No update
> 
>   media: platform: visconti: Add Toshiba Visconti Video Input Interface driver headers
>     v1 -> v2:
>       - moved driver headers to an individual patch
> 
>   media: platform: visconti: Add Toshiba Visconti Video Input Interface driver body
>     v1 -> v2:
>       - moved driver sources to an individual patch
>    
>   media: platform: visconti: Add Toshiba VIIF image signal processor driver
>     v1 -> v2:
>       - moved image signal processor driver to an individual patch
> 
>   MAINTAINERS: Add entries for Toshiba Visconti Video Input Interface
>     v1 -> v2:
>       - No update
> 
> Change in V2:
>   - moved files into individual patches to decrease patch size

Thank you for your patch series.

However, there are quite a few things that need more work. I'll make some
high level guidelines here, and go into a bit more detail in some of the
patches.

First of all, run your patches through 'scripts/checkpatch.pl --strict' and
fix the many warnings, errors and checks. Use common sense, sometimes a
check or warning isn't actually valid, but the vast majority of what
checkpatch spits out appears reasonable.

Another thing I noticed is code like this:

+		if (param->r_cr_in_offset > HWD_VIIF_CSC_MAX_OFFSET)
+			return -EINVAL;
+
+		if (param->g_y_in_offset > HWD_VIIF_CSC_MAX_OFFSET)
+			return -EINVAL;
+
+		if (param->b_cb_in_offset > HWD_VIIF_CSC_MAX_OFFSET)
+			return -EINVAL;
+
+		if (param->r_cr_out_offset > HWD_VIIF_CSC_MAX_OFFSET)
+			return -EINVAL;
+
+		if (param->g_y_out_offset > HWD_VIIF_CSC_MAX_OFFSET)
+			return -EINVAL;
+
+		if (param->b_cb_out_offset > HWD_VIIF_CSC_MAX_OFFSET)
+			return -EINVAL;

This can easily be combined into a single if:

		if (param->r_cr_in_offset > HWD_VIIF_CSC_MAX_OFFSET ||
		    param->g_y_in_offset > HWD_VIIF_CSC_MAX_OFFSET ||
		    param->b_cb_in_offset > HWD_VIIF_CSC_MAX_OFFSET ||
		    param->r_cr_out_offset > HWD_VIIF_CSC_MAX_OFFSET ||
		    param->g_y_out_offset > HWD_VIIF_CSC_MAX_OFFSET ||
		    param->b_cb_out_offset > HWD_VIIF_CSC_MAX_OFFSET)
			return -EINVAL;

Easier to read and a lot shorter.

Another thing to avoid is mixing lower and upper case in function names.
A lot of functions have this prefix: 'hwd_VIIF_'. Just change that to
'hwd_viif_': that's much easier on the eyes.

I also see a fair amount of code that is indented very far to the right.
Often due to constructs like this:

	if (test) {
		// lots of code
	}
	return ret;

Which can be changed to:

	if (!test)
		return ret;
	// lots of code
	return ret;

The same can also happen in a for/while loop where you can just 'continue'
instead of 'return'.

This makes the code easier to read and review.

It doesn't look like this driver uses the media controller API. This is
probably something you want to look into, esp. in combination with libcamera
support (https://libcamera.org/). I've added Laurent to this, since he's
the expert on this.

Regards,

	Hans

> 
> Yuji Ishikawa (5):
>   dt-bindings: media: platform: visconti: Add Toshiba Visconti Video
>     Input Interface bindings
>   media: platform: visconti: Add Toshiba Visconti Video Input Interface
>     driver headers
>   media: platform: visconti: Add Toshiba Visconti Video Input Interface
>     driver body
>   media: platform: visconti: Add Toshiba VIIF image signal processor
>     driver
>   MAINTAINERS: Add entries for Toshiba Visconti Video Input Interface
> 
>  .../bindings/media/toshiba,visconti-viif.yaml |  103 +
>  MAINTAINERS                                   |    2 +
>  drivers/media/platform/Kconfig                |    2 +
>  drivers/media/platform/Makefile               |    4 +
>  drivers/media/platform/visconti/Kconfig       |    9 +
>  drivers/media/platform/visconti/Makefile      |    9 +
>  drivers/media/platform/visconti/hwd_viif.c    | 2233 ++++++++++
>  drivers/media/platform/visconti/hwd_viif.h    | 1776 ++++++++
>  .../media/platform/visconti/hwd_viif_csi2rx.c |  767 ++++
>  .../platform/visconti/hwd_viif_internal.h     |  361 ++
>  .../media/platform/visconti/hwd_viif_l1isp.c  | 3769 +++++++++++++++++
>  .../media/platform/visconti/hwd_viif_reg.h    | 2802 ++++++++++++
>  drivers/media/platform/visconti/viif.c        | 2384 +++++++++++
>  drivers/media/platform/visconti/viif.h        |  134 +
>  include/uapi/linux/visconti_viif.h            | 1683 ++++++++
>  15 files changed, 16038 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/media/toshiba,visconti-viif.yaml
>  create mode 100644 drivers/media/platform/visconti/Kconfig
>  create mode 100644 drivers/media/platform/visconti/Makefile
>  create mode 100644 drivers/media/platform/visconti/hwd_viif.c
>  create mode 100644 drivers/media/platform/visconti/hwd_viif.h
>  create mode 100644 drivers/media/platform/visconti/hwd_viif_csi2rx.c
>  create mode 100644 drivers/media/platform/visconti/hwd_viif_internal.h
>  create mode 100644 drivers/media/platform/visconti/hwd_viif_l1isp.c
>  create mode 100644 drivers/media/platform/visconti/hwd_viif_reg.h
>  create mode 100644 drivers/media/platform/visconti/viif.c
>  create mode 100644 drivers/media/platform/visconti/viif.h
>  create mode 100644 include/uapi/linux/visconti_viif.h
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* [PATCH v4 6/6] drivers: remoteproc: Add Xilinx r5 remoteproc driver
From: Tanmay Shah @ 2022-04-20  7:44 UTC (permalink / raw)
  To: mathieu.poirier, bjorn.andersson
  Cc: michal.simek, robh+dt, bill.mills, linux-remoteproc, devicetree,
	linux-kernel, linux-arm-kernel
In-Reply-To: <20220420074450.2034092-1-tanmay.shah@xilinx.com>

This driver enables r5f dual core Real time Processing Unit subsystem
available on Xilinx Zynq Ultrascale MPSoC Platform. RPU subsystem
(cluster) can be configured in different modes e.g. split mode in which
two r5f cores work independent of each other and lock-step mode in which
both r5f cores execute same code clock-for-clock and notify if the
result is different.

The Xilinx r5 Remoteproc Driver boots the RPU cores via calls to the Xilinx
Platform Management Unit that handles the R5 configuration, memory access
and R5 lifecycle management. The interface to this manager is done in this
driver via zynqmp_pm_* function calls.

Signed-off-by: Ben Levinsky <ben.levinsky@xilinx.com>
Signed-off-by: Tanmay Shah <tanmay.shah@xilinx.com>
---

Changes in v4:
  - Remove redundant header files
  - use dev_err_probe() to report errors during probe
  - Fix missing check on error code returned by zynqmp_r5_add_rproc_core()
  - Fix memory leaks all over the driver when resource allocation fails for any core
  - make cluster mode check only at one place
  - remove redundant initialization of variable
  - remove redundant use of of_node_put() 
  - Fix Comment format problem
  - Assign offset of zynqmp_tcm_banks instead of duplicating it
  - Add tcm and memory regions rproc carveouts during prepare instead of parse_fw
  - Remove rproc_mem_entry object from r5_core
  - Use put_device() and rproc_del() APIs to fix memory leaks
  - Replace pr_* with dev_*. This was missed in v3, fix now.
  - Use "GPL" instead of "GPL v2" in MODULE_LICENSE macro. This was suggested by checkpatch script.

Changes in v3:
  - Fix checkpatch script indentation warning
  - Remove unused variable from xilinx remoteproc driver
  - use C style comments, i.e /*...*/
  - Remove redundant debug information which can be derived using /proc/device-tree
  - Fix multilined comment format
  - s/"final fot TCM"/"final for TCM"
  - Function devm_kzalloc() does not return an code on error, just NULL.
    Remove redundant error check for this function throughout the driver.
  - Fix RPU mode configuration and add documentation accordingly
  - Get rid of the indentations to match function documentation style with rest of the driver
  - Fix memory leak by only using r5_rproc->priv and not replace it with new instance
  - Use 'i' for the outer loop and 'j' for the inner one as per convention
  - Remove redundant error and NULL checks throughout the driver
  - Use devm_kcalloc() when more than one element is required
  - Add memory-regions carveouts during driver probe instead of parse_fw call
    This removes redundant copy of reserved_mem object in r5_core structure.
  - Fix memory leak by using of_node_put()
  - Fix indentation of tcm_mem_map function args
  - Remove redundant init of variables
  - Initialize tcm bank size variable for lockstep mode
  - Replace u32 with phys_addr_t for variable stroing memory bank address
  - Add documentation of TCM behavior in lockstep mode
  - Use dev_get_drvdata instead of platform driver API
  - Remove info level messages
  - Fix checkpatch.pl warnings
  - Add documentation for the Xilinx r5f platform to understand driver design

 drivers/remoteproc/Kconfig              |   12 +
 drivers/remoteproc/Makefile             |    1 +
 drivers/remoteproc/xlnx_r5_remoteproc.c | 1045 +++++++++++++++++++++++
 3 files changed, 1058 insertions(+)
 create mode 100644 drivers/remoteproc/xlnx_r5_remoteproc.c

diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
index 166019786653..5637a71c0677 100644
--- a/drivers/remoteproc/Kconfig
+++ b/drivers/remoteproc/Kconfig
@@ -352,6 +352,18 @@ config TI_K3_R5_REMOTEPROC
 	  It's safe to say N here if you're not interested in utilizing
 	  a slave processor.
 
+config XLNX_R5_REMOTEPROC
+	tristate "Xilinx R5 remoteproc support"
+	depends on PM && ARCH_ZYNQMP
+	depends on ZYNQMP_FIRMWARE
+	select RPMSG_VIRTIO
+	select ZYNQMP_IPI_MBOX
+	help
+	  Say y or m here to support Xilinx R5 remote processors via the remote
+	  processor framework.
+
+	  It's safe to say N if not interested in using RPU r5f cores.
+
 endif # REMOTEPROC
 
 endmenu
diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
index 5478c7cb9e07..91314a9b43ce 100644
--- a/drivers/remoteproc/Makefile
+++ b/drivers/remoteproc/Makefile
@@ -38,3 +38,4 @@ obj-$(CONFIG_ST_SLIM_REMOTEPROC)	+= st_slim_rproc.o
 obj-$(CONFIG_STM32_RPROC)		+= stm32_rproc.o
 obj-$(CONFIG_TI_K3_DSP_REMOTEPROC)	+= ti_k3_dsp_remoteproc.o
 obj-$(CONFIG_TI_K3_R5_REMOTEPROC)	+= ti_k3_r5_remoteproc.o
+obj-$(CONFIG_XLNX_R5_REMOTEPROC)	+= xlnx_r5_remoteproc.o
diff --git a/drivers/remoteproc/xlnx_r5_remoteproc.c b/drivers/remoteproc/xlnx_r5_remoteproc.c
new file mode 100644
index 000000000000..394b3469463c
--- /dev/null
+++ b/drivers/remoteproc/xlnx_r5_remoteproc.c
@@ -0,0 +1,1045 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ZynqMP R5 Remote Processor driver
+ *
+ */
+
+#include <dt-bindings/power/xlnx-zynqmp-power.h>
+#include <linux/dma-mapping.h>
+#include <linux/firmware/xlnx-zynqmp.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_platform.h>
+#include <linux/of_reserved_mem.h>
+#include <linux/platform_device.h>
+#include <linux/remoteproc.h>
+#include <linux/slab.h>
+
+#include "remoteproc_internal.h"
+
+/* settings for RPU cluster mode */
+enum zynqmp_r5_cluster_mode {
+	SPLIT_MODE = 0, /* When cores run as separate processor */
+	LOCKSTEP_MODE = 1, /* cores execute same code in lockstep,clk-for-clk */
+	SINGLE_CPU_MODE = 2, /* core0 is held in reset and only core1 runs */
+};
+
+/**
+ * struct mem_bank_data - Memory Bank description
+ *
+ * @addr: Start address of memory bank
+ * @size: Size of Memory bank
+ * @pm_domain_id: Power-domains id of memory bank for firmware to turn on/off
+ * @bank_name: name of the bank for remoteproc framework
+ */
+struct mem_bank_data {
+	phys_addr_t addr;
+	size_t size;
+	u32 pm_domain_id;
+	char *bank_name;
+};
+
+static const struct mem_bank_data zynqmp_tcm_banks[] = {
+	{0xffe00000UL, 0x10000UL, PD_R5_0_ATCM, "atcm0"}, /* TCM 64KB each */
+	{0xffe20000UL, 0x10000UL, PD_R5_0_BTCM, "btcm0"},
+	{0xffe90000UL, 0x10000UL, PD_R5_1_ATCM, "atcm1"},
+	{0xffeb0000UL, 0x10000UL, PD_R5_1_BTCM, "btcm1"},
+};
+
+/**
+ * struct zynqmp_r5_core - ZynqMP R5 core structure
+ *
+ * @dev: device of RPU instance
+ * @np: device node of RPU instance
+ * @tcm_bank_count: number TCM banks accessible to this RPU
+ * @tcm_banks: array of each TCM bank data
+ * @rmem_count: Number of reserved mem regions
+ * @rmem: reserved memory region nodes from device tree
+ * @rproc: rproc handle
+ * @pm_domain_id: RPU CPU power domain id
+ */
+struct zynqmp_r5_core {
+	struct device *dev;
+	struct device_node *np;
+	int tcm_bank_count;
+	struct mem_bank_data **tcm_banks;
+	int rmem_count;
+	struct reserved_mem **rmem;
+	struct rproc *rproc;
+	u32 pm_domain_id;
+};
+
+/**
+ * struct zynqmp_r5_cluster - ZynqMP R5 cluster structure
+ *
+ * @dev: r5f subsystem cluster device node
+ * @mode: cluster mode of type zynqmp_r5_cluster_mode
+ * @core_count: number of r5 cores used for this cluster mode
+ * @r5_cores: Array of pointers pointing to r5 core
+ */
+struct zynqmp_r5_cluster {
+	struct device *dev;
+	enum  zynqmp_r5_cluster_mode mode;
+	int core_count;
+	struct zynqmp_r5_core **r5_cores;
+};
+
+/*
+ * zynqmp_r5_set_mode - set RPU operation mode
+ *
+ * set RPU operation mode
+ *
+ * Return: 0 for success, negative value for failure
+ */
+static int zynqmp_r5_set_mode(struct zynqmp_r5_core *r5_core,
+			      enum rpu_oper_mode fw_reg_val,
+			      enum rpu_tcm_comb tcm_mode)
+{
+	int ret;
+
+	ret = zynqmp_pm_set_rpu_mode(r5_core->pm_domain_id, fw_reg_val);
+	if (ret < 0) {
+		dev_err(r5_core->dev, "failed to set RPU mode\n");
+		return ret;
+	}
+
+	ret = zynqmp_pm_set_tcm_config(r5_core->pm_domain_id, tcm_mode);
+	if (ret < 0)
+		dev_err(r5_core->dev, "failed to configure TCM\n");
+
+	return ret;
+}
+
+/*
+ * zynqmp_r5_rproc_start
+ * @rproc: single R5 core's corresponding rproc instance
+ *
+ * Start R5 Core from designated boot address.
+ *
+ * return 0 on success, otherwise non-zero value on failure
+ */
+static int zynqmp_r5_rproc_start(struct rproc *rproc)
+{
+	struct zynqmp_r5_core *r5_core = rproc->priv;
+	enum rpu_boot_mem bootmem;
+	int ret;
+
+	/*
+	 * The exception vector pointers (EVP) refer to the base-address of
+	 * exception vectors (for reset, IRQ, FIQ, etc). The reset-vector
+	 * starts at the base-address and subsequent vectors are on 4-byte
+	 * boundaries.
+	 *
+	 * Exception vectors can start either from 0x0000_0000 (LOVEC) or
+	 * from 0xFFFF_0000 (HIVEC) which is mapped in the OCM (On-Chip Memory)
+	 *
+	 * Usually firmware will put Exception vectors at LOVEC.
+	 *
+	 * It is not recommend that you change the exception vector.
+	 * Changing the EVP to HIVEC will result in increased interrupt latency
+	 * and jitter. Also, if the OCM is secured and the Cortex-R5F processor
+	 * is non-secured, then the Cortex-R5F processor cannot access the
+	 * HIVEC exception vectors in the OCM.
+	 */
+	bootmem = (rproc->bootaddr >= 0xFFFC0000) ?
+		   PM_RPU_BOOTMEM_HIVEC : PM_RPU_BOOTMEM_LOVEC;
+
+	dev_dbg(r5_core->dev, "RPU boot addr 0x%llx from %s.", rproc->bootaddr,
+		bootmem == PM_RPU_BOOTMEM_HIVEC ? "OCM" : "TCM");
+
+	ret = zynqmp_pm_request_wake(r5_core->pm_domain_id, 1,
+				     bootmem, ZYNQMP_PM_REQUEST_ACK_NO);
+	if (ret)
+		dev_err(r5_core->dev,
+			"failed to start RPU = 0x%x\n", r5_core->pm_domain_id);
+	return ret;
+}
+
+/*
+ * zynqmp_r5_rproc_stop
+ * @rproc: single R5 core's corresponding rproc instance
+ *
+ * Power down  R5 Core.
+ *
+ * return 0 on success, otherwise non-zero value on failure
+ */
+static int zynqmp_r5_rproc_stop(struct rproc *rproc)
+{
+	struct zynqmp_r5_core *r5_core = rproc->priv;
+	int ret;
+
+	ret = zynqmp_pm_force_pwrdwn(r5_core->pm_domain_id,
+				     ZYNQMP_PM_REQUEST_ACK_BLOCKING);
+	if (ret)
+		dev_err(r5_core->dev, "failed to stop remoteproc RPU %d\n", ret);
+
+	return ret;
+}
+
+/*
+ * zynqmp_r5_mem_region_map
+ * @rproc: single R5 core's corresponding rproc instance
+ * @mem: mem entry to map
+ *
+ * Callback to map va for memory-region's carveout.
+ *
+ * return 0 on success, otherwise non-zero value on failure
+ */
+static int zynqmp_r5_mem_region_map(struct rproc *rproc,
+				    struct rproc_mem_entry *mem)
+{
+	void __iomem *va;
+
+	va = ioremap_wc(mem->dma, mem->len);
+	if (IS_ERR_OR_NULL(va))
+		return -ENOMEM;
+
+	mem->va = (void *)va;
+
+	return 0;
+}
+
+/*
+ * zynqmp_r5_rproc_mem_unmap
+ * @rproc: single R5 core's corresponding rproc instance
+ * @mem: mem entry to unmap
+ *
+ * Unmap memory-region carveout
+ *
+ * return 0 on success, otherwise non-zero value on failure
+ */
+static int zynqmp_r5_mem_region_unmap(struct rproc *rproc,
+				      struct rproc_mem_entry *mem)
+{
+	iounmap((void __iomem *)mem->va);
+	return 0;
+}
+
+/*
+ * add_mem_regions
+ * @r5_core: single R5 core's corresponding zynqmp_r5_core type instance
+ * @rmem: reserved mem region parsed from dt node
+ *
+ * Construct rproc mem carveouts from carveout provided in
+ * memory-region property
+ *
+ * return 0 on success, otherwise non-zero value on failure
+ */
+static int add_mem_regions_carveout(struct rproc *rproc)
+{
+	struct zynqmp_r5_core *r5_core;
+	struct reserved_mem *rmem;
+	struct rproc_mem_entry **rproc_mem;
+	int i, num_mem_regions;
+
+	r5_core = (struct zynqmp_r5_core *)rproc->priv;
+	num_mem_regions = r5_core->rmem_count;
+
+	/* memory regions not defined */
+	if (num_mem_regions < 1)
+		return 0;
+
+	rproc_mem = kcalloc(num_mem_regions,
+			    sizeof(struct rproc_mem_entry *), GFP_KERNEL);
+	if (!rproc_mem)
+		return -ENOMEM;
+
+	for (i = 0; i < num_mem_regions; i++) {
+		rmem = r5_core->rmem[i];
+
+		/* Register associated reserved memory regions */
+		rproc_mem[i] = rproc_mem_entry_init(&rproc->dev, NULL,
+						    (dma_addr_t)rmem->base,
+						    rmem->size, rmem->base,
+						    zynqmp_r5_mem_region_map,
+						    zynqmp_r5_mem_region_unmap,
+						    rmem->name);
+		if (!rproc_mem[i])
+			goto release_mem_regions;
+
+		dev_dbg(&rproc->dev, "reserved mem carveout %s addr=%llx, size=0x%llx",
+			rmem->name, rmem->base, rmem->size);
+	}
+
+	/*
+	 * Add carveouts only if all rproc mem enties are
+	 * successfully initialized
+	 */
+	for (i = 0; i < num_mem_regions; i++)
+		rproc_add_carveout(rproc, rproc_mem[i]);
+
+	kfree(rproc_mem);
+	return 0;
+
+release_mem_regions:
+	for (i--; i > -1; i--)
+		kfree(rproc_mem[i]);
+	kfree(rproc_mem);
+	return -ENOMEM;
+}
+
+/*
+ * zynqmp_r5_rproc_mem_unmap
+ * @rproc: single R5 core's corresponding rproc instance
+ * @mem: mem entry to unmap
+ *
+ * Unmap TCM banks when powering down R5 core.
+ *
+ * return 0 on success, otherwise non-zero value on failure
+ */
+static int tcm_mem_unmap(struct rproc *rproc, struct rproc_mem_entry *mem)
+{
+	iounmap((void __iomem *)mem->va);
+
+	return 0;
+}
+
+/*
+ * tcm_mem_map
+ * @rproc: single R5 core's corresponding rproc instance
+ * @mem: mem entry to initialize the va and da fields of
+ *
+ * Given TCM bank entry, this callback will set device address for R5
+ * running on TCM and also setup virtual address for TCM bank
+ * remoteproc carveout.
+ *
+ * return 0 on success, otherwise non-zero value on failure
+ */
+static int tcm_mem_map(struct rproc *rproc,
+		       struct rproc_mem_entry *mem)
+{
+	void __iomem *va;
+
+	va = ioremap_wc(mem->dma, mem->len);
+	if (IS_ERR_OR_NULL(va))
+		return -ENOMEM;
+
+	/* Update memory entry va */
+	mem->va = (void *)va;
+
+	/* clear TCMs */
+	memset_io(va, 0, mem->len);
+
+	/*
+	 * The R5s expect their TCM banks to be at address 0x0 and 0x2000,
+	 * while on the Linux side they are at 0xffexxxxx.
+	 *
+	 * Zero out the high 12 bits of the address. This will give
+	 * expected values for TCM Banks 0A and 0B (0x0 and 0x20000).
+	 */
+	mem->da &= 0x000fffff;
+
+	/*
+	 * TCM Banks 1A and 1B still have to be translated.
+	 *
+	 * Below handle these two banks' absolute addresses (0xffe90000 and
+	 * 0xffeb0000) and convert to the expected relative addresses
+	 * (0x0 and 0x20000).
+	 */
+	if (mem->da == 0x90000 || mem->da == 0xB0000)
+		mem->da -= 0x90000;
+
+	/* if translated TCM bank address is not valid report error */
+	if (mem->da != 0x0 && mem->da != 0x20000) {
+		dev_err(&rproc->dev, "invalid TCM address: %x\n", mem->da);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int add_tcm_carveout_split_mode(struct rproc *rproc)
+{
+	int i, num_banks, ret;
+	struct rproc_mem_entry **rproc_mem;
+	u32 pm_domain_id;
+	phys_addr_t bank_addr;
+	size_t bank_size;
+	char *bank_name;
+	struct device *dev;
+	struct zynqmp_r5_core *r5_core;
+
+	r5_core = (struct zynqmp_r5_core *)rproc->priv;
+
+	dev = r5_core->dev;
+
+	/* go through zynqmp banks for r5 node */
+	num_banks = r5_core->tcm_bank_count;
+	if (num_banks <= 0) {
+		dev_err(dev, "need to specify TCM banks\n");
+		return -EINVAL;
+	}
+
+	rproc_mem = kcalloc(num_banks,
+			    sizeof(struct rproc_mem_entry *), GFP_KERNEL);
+	if (!rproc_mem)
+		return -ENOMEM;
+
+	/*
+	 * Power-on Each 64KB TCM,
+	 * register its address space, map and unmap functions
+	 * and add carveouts accordingly
+	 */
+	for (i = 0; i < num_banks; i++) {
+		bank_addr = r5_core->tcm_banks[i]->addr;
+		bank_name = r5_core->tcm_banks[i]->bank_name;
+		bank_size = r5_core->tcm_banks[i]->size;
+		pm_domain_id = r5_core->tcm_banks[i]->pm_domain_id;
+
+		ret = zynqmp_pm_request_node(pm_domain_id,
+					     ZYNQMP_PM_CAPABILITY_ACCESS, 0,
+					     ZYNQMP_PM_REQUEST_ACK_BLOCKING);
+		if (ret < 0) {
+			dev_err(dev, "failed to turn on TCM 0x%x", pm_domain_id);
+			goto release_tcm_split;
+		}
+
+		dev_dbg(dev, "TCM carveout split mode %s addr=%llx, size=0x%lx",
+			bank_name, bank_addr, bank_size);
+
+		rproc_mem[i] = rproc_mem_entry_init(dev, NULL, bank_addr,
+						    bank_size, bank_addr,
+						    tcm_mem_map, tcm_mem_unmap,
+						    bank_name);
+		if (!rproc_mem[i]) {
+			ret = -ENOMEM;
+			goto release_tcm_split;
+		}
+	}
+
+	/*
+	 * Add carveouts only if all rproc mem enties are
+	 * successfully initialized
+	 */
+	for (i = 0; i < num_banks; i++)
+		rproc_add_carveout(rproc, rproc_mem[i]);
+
+	kfree(rproc_mem);
+	return 0;
+
+release_tcm_split:
+	/* If failed, Turn off all TCM banks turned on before */
+	for (i--; i > -1; i--) {
+		pm_domain_id = r5_core->tcm_banks[i]->pm_domain_id;
+		zynqmp_pm_release_node(pm_domain_id);
+		kfree(rproc_mem[i]);
+	}
+	kfree(rproc_mem);
+	return ret;
+}
+
+static int add_tcm_carveout_lockstep_mode(struct rproc *rproc)
+{
+	int i, num_banks, ret;
+	struct rproc_mem_entry *rproc_mem;
+	u32 pm_domain_id;
+	phys_addr_t bank_addr;
+	size_t bank_size = 0;
+	char *bank_name;
+	struct device *dev;
+	struct zynqmp_r5_core *r5_core;
+
+	r5_core = (struct zynqmp_r5_core *)rproc->priv;
+	dev = r5_core->dev;
+
+	/* Go through zynqmp banks for r5 node */
+	num_banks = r5_core->tcm_bank_count;
+	if (num_banks <= 0) {
+		dev_err(dev, "need to specify TCM banks\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * In lockstep mode, TCM is contiguous memory block
+	 * However, each TCM block still needs to be enabled individually.
+	 * So, Enable each TCM block individually, but add their size
+	 * to create contiguous memory region.
+	 */
+	bank_addr = r5_core->tcm_banks[0]->addr;
+	bank_name = r5_core->tcm_banks[0]->bank_name;
+
+	for (i = 0; i < num_banks; i++) {
+		bank_size += r5_core->tcm_banks[i]->size;
+		pm_domain_id = r5_core->tcm_banks[i]->pm_domain_id;
+
+		/* Turn on each TCM bank individually */
+		ret = zynqmp_pm_request_node(pm_domain_id,
+					     ZYNQMP_PM_CAPABILITY_ACCESS, 0,
+					     ZYNQMP_PM_REQUEST_ACK_BLOCKING);
+		if (ret < 0) {
+			dev_err(dev, "failed to turn on TCM 0x%x", pm_domain_id);
+			goto release_tcm_lockstep;
+		}
+	}
+
+	dev_dbg(dev, "TCM add carveout lockstep mode %s addr=0x%llx, size=0x%lx",
+		bank_name, bank_addr, bank_size);
+
+	/* Register TCM address range, TCM map and unmap functions */
+	rproc_mem = rproc_mem_entry_init(dev, NULL, bank_addr,
+					 bank_size, bank_addr,
+					 tcm_mem_map, tcm_mem_unmap,
+					 bank_name);
+	if (!rproc_mem) {
+		ret = -ENOMEM;
+		goto release_tcm_lockstep;
+	}
+
+	/* If registration is success, add carveouts */
+	rproc_add_carveout(rproc, rproc_mem);
+
+	return 0;
+
+release_tcm_lockstep:
+	/* If failed, Turn off all TCM banks turned on before */
+	for (i--; i > -1; i--) {
+		pm_domain_id = r5_core->tcm_banks[i]->pm_domain_id;
+		zynqmp_pm_release_node(pm_domain_id);
+	}
+	return ret;
+}
+
+/*
+ * add_tcm_banks()
+ * @rproc: single R5 core's corresponding rproc instance
+ *
+ * Given R5 node in remoteproc instance
+ * allocate remoteproc carveout for TCM memory
+ * needed for firmware to be loaded
+ *
+ * return 0 on success, otherwise non-zero value on failure
+ */
+static int add_tcm_banks(struct rproc *rproc)
+{
+	struct device *dev;
+	struct zynqmp_r5_cluster *cluster;
+	struct zynqmp_r5_core *r5_core;
+
+	r5_core = (struct zynqmp_r5_core *)rproc->priv;
+	if (!r5_core)
+		return -EINVAL;
+
+	dev = r5_core->dev;
+
+	cluster = dev_get_drvdata(dev->parent);
+	if (!cluster) {
+		dev_err(dev->parent, "Invalid driver data\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * In lockstep mode TCM banks are one contiguous memory region of 256Kb
+	 * In split mode, each TCM bank is 64Kb and not contiguous.
+	 * We add memory carveouts accordingly.
+	 */
+	if (cluster->mode == SPLIT_MODE)
+		return add_tcm_carveout_split_mode(rproc);
+	else if (cluster->mode == LOCKSTEP_MODE)
+		return add_tcm_carveout_lockstep_mode(rproc);
+
+	dev_err(cluster->dev, "invalid cluster mode\n");
+	return -EINVAL;
+}
+
+/*
+ * zynqmp_r5_parse_fw()
+ * @rproc: single R5 core's corresponding rproc instance
+ * @fw: ptr to firmware to be loaded onto r5 core
+ *
+ * When loading firmware, ensure the necessary carveouts are in remoteproc
+ *
+ * return 0 on success, otherwise non-zero value on failure
+ */
+static int zynqmp_r5_parse_fw(struct rproc *rproc, const struct firmware *fw)
+{
+	int ret;
+
+	ret = rproc_elf_load_rsc_table(rproc, fw);
+	if (ret == -EINVAL) {
+		/*
+		 * resource table only required for IPC.
+		 * if not present, this is not necessarily an error;
+		 * for example, loading r5 hello world application
+		 * so simply inform user and keep going.
+		 */
+		dev_info(&rproc->dev, "no resource table found.\n");
+		ret = 0;
+	}
+	return ret;
+}
+
+static int zynqmp_r5_rproc_prepare(struct rproc *rproc)
+{
+	int ret = 0;
+
+	ret = add_tcm_banks(rproc);
+	if (ret) {
+		dev_err(&rproc->dev, "failed to get TCM banks, err %d\n", ret);
+		return ret;
+	}
+
+	ret = add_mem_regions_carveout(rproc);
+	if (ret)
+		dev_warn(&rproc->dev, "failed to get reserve mem regions %d\n",
+			 ret);
+
+	return 0;
+}
+
+static int zynqmp_r5_rproc_unprepare(struct rproc *rproc)
+{
+	struct zynqmp_r5_core *r5_core;
+	int i;
+	u32 pm_domain_id;
+
+	r5_core = (struct zynqmp_r5_core *)rproc->priv;
+
+	for (i = 0; i < r5_core->tcm_bank_count; i++) {
+		pm_domain_id = r5_core->tcm_banks[i]->pm_domain_id;
+		if (zynqmp_pm_release_node(pm_domain_id))
+			dev_warn(r5_core->dev,
+				 "can't turn off TCM bank 0x%x", pm_domain_id);
+	}
+
+	return 0;
+}
+
+static const struct rproc_ops zynqmp_r5_rproc_ops = {
+	.prepare	= zynqmp_r5_rproc_prepare,
+	.unprepare	= zynqmp_r5_rproc_unprepare,
+	.start		= zynqmp_r5_rproc_start,
+	.stop		= zynqmp_r5_rproc_stop,
+	.load		= rproc_elf_load_segments,
+	.parse_fw	= zynqmp_r5_parse_fw,
+	.find_loaded_rsc_table = rproc_elf_find_loaded_rsc_table,
+	.sanity_check	= rproc_elf_sanity_check,
+	.get_boot_addr	= rproc_elf_get_boot_addr,
+};
+
+/**
+ * zynqmp_r5_add_rproc_core()
+ * Probes ZynqMP R5 processor device node
+ * this is called for each individual R5 core
+ *
+ * @cdev: Device node of each r5 core
+ *
+ * Return: zynqmp_r5_core object for success, error pointer in case of error.
+ */
+static struct zynqmp_r5_core *zynqmp_r5_add_rproc_core(struct device *cdev)
+{
+	int ret;
+	struct rproc *r5_rproc;
+	struct zynqmp_r5_core *r5_core;
+
+	/* Set up DMA mask */
+	ret = dma_set_coherent_mask(cdev, DMA_BIT_MASK(32));
+	if (ret)
+		return ERR_PTR(ret);
+
+	/* Allocate remoteproc instance */
+	r5_rproc = devm_rproc_alloc(cdev, dev_name(cdev),
+				    &zynqmp_r5_rproc_ops,
+				    NULL, sizeof(struct zynqmp_r5_core));
+	if (!r5_rproc) {
+		dev_err(cdev, "failed to allocate memory for rproc instance\n");
+		return ERR_PTR(-ENOMEM);
+	}
+
+	r5_rproc->auto_boot = false;
+	r5_core = (struct zynqmp_r5_core *)r5_rproc->priv;
+	r5_core->dev = cdev;
+	r5_core->np = dev_of_node(cdev);
+	if (!r5_core->np) {
+		dev_err(cdev, "can't get device node for r5 core\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	/* Add R5 remoteproc core */
+	ret = rproc_add(r5_rproc);
+	if (ret) {
+		dev_err(cdev, "failed to add r5 remoteproc\n");
+		return ERR_PTR(ret);
+	}
+
+	r5_core->rproc = r5_rproc;
+	return r5_core;
+}
+
+/**
+ * zynqmp_r5_get_tcm_node()
+ * Ideally this function should parse tcm node and store information
+ * in r5_core instance. We will use hardcoded TCM information from
+ * driver for now in this function.
+ *
+ * @cluster: pointer to zynqmp_r5_cluster type object
+ *
+ * Return: 0 for success and error code for failure.
+ */
+static int zynqmp_r5_get_tcm_node(struct zynqmp_r5_cluster *cluster)
+{
+	int tcm_bank_count, tcm_node;
+	int i, j;
+	struct zynqmp_r5_core *r5_core;
+	struct device *dev = cluster->dev;
+
+	/*
+	 * ToDo: Use predefined TCM address space values from driver until
+	 * system-dt spec is not final for TCM
+	 */
+	tcm_bank_count = ARRAY_SIZE(zynqmp_tcm_banks);
+
+	/* count per core tcm banks */
+	tcm_bank_count = tcm_bank_count / cluster->core_count;
+
+	/*
+	 * r5 core 0 will use all of TCM banks in lockstep mode.
+	 * In split mode, r5 core0 will use 128k and r5 core1 will use another
+	 * 128k. Assign TCM banks to each core accordingly
+	 */
+	tcm_node = 0;
+	for (i = 0; i < cluster->core_count; i++) {
+		r5_core = cluster->r5_cores[i];
+		r5_core->tcm_banks = devm_kcalloc(dev, tcm_bank_count,
+						  sizeof(struct mem_bank_data *),
+						  GFP_KERNEL);
+		if (!r5_core->tcm_banks)
+			return -ENOMEM;
+
+		for (j = 0; j < tcm_bank_count; j++) {
+			/*
+			 * Use pre-defined TCM reg values.
+			 * Eventually this should be replaced by values
+			 * parsed from dts.
+			 */
+			r5_core->tcm_banks[j] =
+				(struct mem_bank_data *)&zynqmp_tcm_banks[tcm_node];
+			tcm_node++;
+		}
+
+		r5_core->tcm_bank_count = tcm_bank_count;
+	}
+
+	return 0;
+}
+
+/**
+ * zynqmp_r5_get_mem_region_node()
+ * parse memory-region property from dt node and add
+ * memory region carveouts
+ *
+ * @r5_core: pointer to zynqmp_r5_core type object
+ *
+ * Return: 0 for success and error code for failure.
+ */
+static int zynqmp_r5_get_mem_region_node(struct zynqmp_r5_core *r5_core)
+{
+	int res_mem_count, i;
+	struct device *dev;
+	struct device_node *np, *rmem_np;
+	struct reserved_mem **rmem;
+
+	dev = r5_core->dev;
+
+	np = r5_core->np;
+
+	res_mem_count = of_property_count_elems_of_size(np, "memory-region",
+							sizeof(phandle));
+	if (res_mem_count <= 0) {
+		dev_warn(dev, "failed to get memory-region property %d\n",
+			 res_mem_count);
+		return -EINVAL;
+	}
+
+	rmem = devm_kcalloc(dev, res_mem_count,
+			    sizeof(struct reserved_mem *), GFP_KERNEL);
+	if (!rmem)
+		return -ENOMEM;
+
+	for (i = 0; i < res_mem_count; i++) {
+		rmem_np = of_parse_phandle(np, "memory-region", i);
+		if (!rmem_np)
+			goto release_rmem;
+
+		rmem[i] = of_reserved_mem_lookup(rmem_np);
+		if (!rmem[i]) {
+			of_node_put(rmem_np);
+			goto release_rmem;
+		}
+
+		of_node_put(rmem_np);
+	}
+
+	r5_core->rmem_count = res_mem_count;
+	r5_core->rmem = rmem;
+	return 0;
+
+release_rmem:
+	for (i--; i > -1; i--)
+		kfree(rmem[i]);
+	devm_kfree(dev, rmem);
+	return -ENOMEM;
+}
+
+/*
+ * zynqmp_r5_core_init()
+ * Create and initialize zynqmp_r5_core type object
+ *
+ * @cluster: pointer to zynqmp_r5_cluster type object
+ *
+ * Return: 0 for success and error code for failure.
+ */
+static int zynqmp_r5_core_init(struct zynqmp_r5_cluster *cluster,
+			       enum rpu_oper_mode fw_reg_val, int tcm_mode)
+{
+	int ret, i;
+	struct zynqmp_r5_core *r5_core;
+	struct device *dev = cluster->dev;
+
+	ret = zynqmp_r5_get_tcm_node(cluster);
+	if (ret < 0) {
+		dev_err(dev, "can't get tcm node, err %d\n", ret);
+		return ret;
+	}
+
+	for (i = 0; i < cluster->core_count; i++) {
+		r5_core = cluster->r5_cores[i];
+
+		ret = zynqmp_r5_get_mem_region_node(r5_core);
+		if (ret)
+			dev_warn(dev, "memory-region prop failed %d\n", ret);
+
+		/* Initialize r5 cores with power-domains parsed from dts */
+		ret = of_property_read_u32_index(r5_core->np, "power-domains",
+						 1, &r5_core->pm_domain_id);
+		if (ret) {
+			dev_err(dev, "failed to get power-domains property\n");
+			return ret;
+		}
+
+		ret = zynqmp_r5_set_mode(r5_core, fw_reg_val, tcm_mode);
+		if (ret) {
+			dev_err(dev, "failed to set r5 cluster mode %d, err %d\n",
+				cluster->mode, ret);
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * zynqmp_r5_cluster_init()
+ * Create and initialize zynqmp_r5_cluster type object
+ *
+ * @cluster: pointer to zynqmp_r5_cluster type object
+ *
+ * Return: 0 for success and error code for failure.
+ */
+static int zynqmp_r5_cluster_init(struct zynqmp_r5_cluster *cluster)
+{
+	struct device *dev = cluster->dev;
+	struct device_node *dev_node = dev_of_node(dev);
+	struct device_node *child;
+	struct platform_device *child_pdev;
+	int core_count, ret, i;
+	enum zynqmp_r5_cluster_mode cluster_mode = LOCKSTEP_MODE;
+	struct zynqmp_r5_core **r5_cores;
+	enum rpu_tcm_comb tcm_mode;
+	enum rpu_oper_mode fw_reg_val;
+
+	ret = of_property_read_u32(dev_node, "xlnx,cluster-mode", &cluster_mode);
+
+	/*
+	 * on success returns 0, if not defined then returns -EINVAL,
+	 * In that case, default is LOCKSTEP mode
+	 */
+	if (ret != -EINVAL && ret != 0) {
+		dev_err(dev, "Invalid xlnx,cluster-mode property\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * For now driver only supports split mode and lockstep mode.
+	 * fail driver probe if either of that is not set in dts.
+	 */
+	if (cluster_mode == LOCKSTEP_MODE) {
+		tcm_mode = PM_RPU_TCM_COMB;
+		fw_reg_val = PM_RPU_MODE_LOCKSTEP;
+	} else if (cluster_mode == SPLIT_MODE) {
+		tcm_mode = PM_RPU_TCM_SPLIT;
+		fw_reg_val = PM_RPU_MODE_SPLIT;
+	} else {
+		dev_err(dev, "driver does not support cluster mode %d\n", cluster_mode);
+		return -EINVAL;
+	}
+
+	/*
+	 * Number of cores is decided by number of child nodes of
+	 * r5f subsystem node in dts. If Split mode is used in dts
+	 * 2 child nodes are expected.
+	 * In lockstep mode if two child nodes are available,
+	 * only use first child node and consider it as core0
+	 * and ignore core1 dt node.
+	 */
+	core_count = of_get_available_child_count(dev_node);
+	if (core_count <= 0) {
+		dev_err(dev, "Invalid number of r5 cores %d", core_count);
+		return -EINVAL;
+	} else if (cluster_mode == SPLIT_MODE && core_count != 2) {
+		dev_err(dev, "Invalid number of r5 cores for split mode\n");
+		return -EINVAL;
+	} else if (cluster_mode == LOCKSTEP_MODE && core_count == 2) {
+		dev_warn(dev, "Only r5 core0 will be used\n");
+		core_count = 1;
+	}
+
+	r5_cores = kcalloc(core_count,
+			   sizeof(struct zynqmp_r5_core *), GFP_KERNEL);
+	if (!r5_cores)
+		return -ENOMEM;
+
+	i = 0;
+	for_each_available_child_of_node(dev_node, child) {
+		child_pdev = of_find_device_by_node(child);
+		if (!child_pdev) {
+			of_node_put(child);
+			ret = -ENODEV;
+			goto release_r5_cores;
+		}
+
+		/* create and add remoteproc instance of type struct rproc */
+		r5_cores[i] = zynqmp_r5_add_rproc_core(&child_pdev->dev);
+		if (IS_ERR(r5_cores[i])) {
+			of_node_put(child);
+			ret = PTR_ERR(r5_cores[i]);
+			goto release_r5_cores;
+		}
+
+		i++;
+
+		/*
+		 * If two child nodes are available in dts in lockstep mode,
+		 * then ignore second child node.
+		 */
+		if (i == core_count) {
+			of_node_put(child);
+			break;
+		}
+	}
+
+	cluster->mode = cluster_mode;
+	cluster->core_count = core_count;
+	cluster->r5_cores = r5_cores;
+
+	ret = zynqmp_r5_core_init(cluster, fw_reg_val, tcm_mode);
+	if (ret < 0) {
+		dev_err(dev, "failed to init r5 core err %d\n", ret);
+		cluster->core_count = 0;
+		cluster->r5_cores = NULL;
+		goto release_r5_cores;
+	}
+
+	return 0;
+
+release_r5_cores:
+	for (i--; i > -1; i--) {
+		put_device(r5_cores[i]->dev);
+		rproc_del(r5_cores[i]->rproc);
+	}
+	kfree(r5_cores);
+	return ret;
+}
+
+static void zynqmp_r5_core_exit(struct zynqmp_r5_core *r5_core)
+{
+	/* release r5_core device */
+	put_device(r5_core->dev);
+
+	rproc_del(r5_core->rproc);
+}
+
+static void zynqmp_r5_cluster_exit(void *data)
+{
+	struct platform_device *pdev = (struct platform_device *)data;
+	struct zynqmp_r5_cluster *cluster;
+	int i;
+
+	cluster = (struct zynqmp_r5_cluster *)platform_get_drvdata(pdev);
+	if (!cluster)
+		return;
+
+	for (i = 0; i < cluster->core_count; i++) {
+		zynqmp_r5_core_exit(cluster->r5_cores[i]);
+		cluster->r5_cores[i] = NULL;
+	}
+
+	kfree(cluster->r5_cores);
+	kfree(cluster);
+	platform_set_drvdata(pdev, NULL);
+}
+
+/*
+ * zynqmp_r5_remoteproc_probe()
+ *
+ * @pdev: domain platform device for R5 cluster
+ *
+ * called when driver is probed, for each R5 core specified in DT,
+ * setup as needed to do remoteproc-related operations
+ *
+ * Return: 0 for success, negative value for failure.
+ */
+static int zynqmp_r5_remoteproc_probe(struct platform_device *pdev)
+{
+	int ret;
+	struct zynqmp_r5_cluster *cluster;
+	struct device *dev = &pdev->dev;
+
+	cluster = kzalloc(sizeof(*cluster), GFP_KERNEL);
+	if (!cluster)
+		return -ENOMEM;
+
+	cluster->dev = dev;
+
+	ret = devm_of_platform_populate(dev);
+	if (ret) {
+		dev_err_probe(dev, ret, "failed to populate platform dev\n");
+		kfree(cluster);
+		return ret;
+	}
+
+	/* wire in so each core can be cleaned up at driver remove */
+	platform_set_drvdata(pdev, cluster);
+
+	ret = zynqmp_r5_cluster_init(cluster);
+	if (ret) {
+		zynqmp_r5_cluster_exit(pdev);
+		dev_err_probe(dev, ret, "Invalid r5f subsystem device tree\n");
+		return ret;
+	}
+
+	ret = devm_add_action_or_reset(dev, zynqmp_r5_cluster_exit, pdev);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+/* Match table for OF platform binding */
+static const struct of_device_id zynqmp_r5_remoteproc_match[] = {
+	{ .compatible = "xlnx,zynqmp-r5fss", },
+	{ /* end of list */ },
+};
+MODULE_DEVICE_TABLE(of, zynqmp_r5_remoteproc_match);
+
+static struct platform_driver zynqmp_r5_remoteproc_driver = {
+	.probe = zynqmp_r5_remoteproc_probe,
+	.driver = {
+		.name = "zynqmp_r5_remoteproc",
+		.of_match_table = zynqmp_r5_remoteproc_match,
+	},
+};
+module_platform_driver(zynqmp_r5_remoteproc_driver);
+
+MODULE_DESCRIPTION("Xilinx R5F remote processor driver");
+MODULE_AUTHOR("Xilinx Inc.");
+MODULE_LICENSE("GPL");
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox