linux-fpga.vger.kernel.org archive mirror
* AMD/Xilinx Zynq FPGA manager stopped to work after 6.16.0-g283564a43383
@ 2025-08-04 13:48 ` Pavel Pisa
  2025-08-05 10:01   ` Marek Szyprowski
  0 siblings, 1 reply; 7+ messages in thread
From: Pavel Pisa @ 2025-08-04 13:48 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: linux-fpga, Michal Simek, Jason Gunthorpe, Xu Yilun, Pavel Hronek,
	Jiri Novak, Ondrej Ille

Hello Marek and others,

we run daily tests of the SocketCAN stack on mainline and RT kernels
with our CTU CAN FD IP core

   https://canbus.pages.fel.cvut.cz/#can-bus-channels-mutual-latency-testing

It seems that the setup broke after the mainline kernel version
from 2025-07-29, 6.16.0-g283564a43383. Only one commit in drivers/fpga
has been identified since the last working version

  zynq_fpga: use sgtable-based scatterlist wrappers

The last working mainline kernel version is recorded in the
latest graph data

  https://canbus.pages.fel.cvut.cz/can-latester/inspect.html

We use the out-of-tree dtbocfg module to initiate the update
via a devicetree overlay, but the FPGA manager is the mainline one
and everything has worked correctly for years. The log messages:

[  104.934323] dtbocfg: loading out-of-tree module taints kernel.
[  104.940681] dtbocfg: 0.1.0
[  104.943543] dtbocfg: OK
[  105.022979] fpga_manager fpga0: writing system.bit.bin to Xilinx Zynq FPGA Manager
[  105.097721] fpga_manager fpga0: Unable to DMA map (TO_DEVICE)
[  105.103562] fpga_manager fpga0: Error while writing image data to FPGA
[  105.110485] fpga_region region0: failed to load FPGA image
[  105.116059] OF: overlay: overlay changeset pre-apply notifier error -12, target: /fpga-full
[  105.124499] dtbocfg_overlay_item_create: Failed to apply overlay (ret_val=-12)
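
A note for readers decoding the log above: the value -12 in the last two lines is a negated errno value, and 12 is ENOMEM on Linux. The small user-space sketch below turns such a return value into text; `describe_ret()` is a hypothetical helper introduced only for this illustration and is not part of dtbocfg or the kernel.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: kernel-style functions return a negative errno
 * on failure, so negate the value before asking libc for its description.
 */
const char *describe_ret(int ret)
{
	return strerror(-ret);
}
```

Calling `describe_ret(-12)` yields the C library's description of ENOMEM, the same code the driver returns on the "Unable to DMA map" path.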

The overlay source is available here

  https://gitlab.fel.cvut.cz/canbus/zynq/zynq-can-sja1000-top/-/blob/can-bench-2x-xcan-4x-ctu/scripts/dts/bitstream+dt.dts

The base zynq-7000.dtsi could be a little dated, but I have not
found any related change except a name/alias change in mainline

-       fpga_full: fpga-full {
+       fpga_full: fpga-region {

but months of correct operation have been recorded after
this change. I am aware that there has also been a change of the
name alias

-       amba: amba {
+       amba: axi {

in the sources but again, it has not been a problem.

It seems that mapping fails in

        priv->dma_nelms =
            dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE, 0);

The last tested failing mainline kernel version is from today's
midnight build

   6.16.0-ge991acf1bce7

Do you have any idea what the problem could be?

Do you have any suggestion what to test?

Best wishes

                Pavel

                Pavel Pisa
    phone:      +420 603531357
    e-mail:     pisa@cmp.felk.cvut.cz
    Department of Control Engineering FEE CVUT
    Karlovo namesti 13, 121 35, Prague 2
    university: http://control.fel.cvut.cz/
    personal:   http://cmp.felk.cvut.cz/~pisa
    social:     https://social.kernel.org/ppisa
    projects:   https://www.openhub.net/accounts/ppisa
    CAN related:http://canbus.pages.fel.cvut.cz/
    RISC-V education: https://comparch.edu.cvut.cz/
    Open Technologies Research Education and Exchange Services
    https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home


* Re: AMD/Xilinx Zynq FPGA manager stopped to work after 6.16.0-g283564a43383
  2025-08-04 13:48 ` AMD/Xilinx Zynq FPGA manager stopped to work after 6.16.0-g283564a43383 Pavel Pisa
@ 2025-08-05 10:01   ` Marek Szyprowski
  2025-08-05 17:12     ` Xu Yilun
  0 siblings, 1 reply; 7+ messages in thread
From: Marek Szyprowski @ 2025-08-05 10:01 UTC (permalink / raw)
  To: Pavel Pisa
  Cc: linux-fpga, Michal Simek, Jason Gunthorpe, Xu Yilun, Pavel Hronek,
	Jiri Novak, Ondrej Ille

On 04.08.2025 15:48, Pavel Pisa wrote:
> Hello Marek and others,
>
> we are running daily tests of SocketCAN stack on mainline and RT kernel
> with our CTU CAN FD IP core
>
>     https://canbus.pages.fel.cvut.cz/#can-bus-channels-mutual-latency-testing
>
> It seems that the setup is broken after the mainline kernel version
> from 2025-07-29, 6.16.0-g283564a43383. There is only one commit
> identified in drivers/fpga after the last working version
>
>    zynq_fpga: use sgtable-based scatterlist wrappers
>
> The last working mainline kernel version is recorded in the
> last graph data
>
>    https://canbus.pages.fel.cvut.cz/can-latester/inspect.html
>
> We use the dtbocfg out of tree module to initiate update
> by devicetree overlay but fpga manager is mainline one
> and all worked for years correctly. The log messages
>
> [  104.934323] dtbocfg: loading out-of-tree module taints kernel.
> [  104.940681] dtbocfg: 0.1.0
> [  104.943543] dtbocfg: OK
> [  105.022979] fpga_manager fpga0: writing system.bit.bin to Xilinx Zynq FPGA Manager
> [  105.097721] fpga_manager fpga0: Unable to DMA map (TO_DEVICE)
> [  105.103562] fpga_manager fpga0: Error while writing image data to FPGA
> [  105.110485] fpga_region region0: failed to load FPGA image
> [  105.116059] OF: overlay: overlay changeset pre-apply notifier error -12, target: /fpga-full
> [  105.124499] dtbocfg_overlay_item_create: Failed to apply overlay (ret_val=-12)
>
> The overlay source is available there
>
>    https://gitlab.fel.cvut.cz/canbus/zynq/zynq-can-sja1000-top/-/blob/can-bench-2x-xcan-4x-ctu/scripts/dts/bitstream+dt.dts
>
> The base zynq-7000.dtsi could be a little dated, but I have not
> found related change except name/alias change in mainline
>
> -       fpga_full: fpga-full {
> +       fpga_full: fpga-region {
>
> but there are recorded months of correct operation after
> this change. I am aware that there has been change of name
> alias
>
> -       amba: amba {
> +       amba: axi {
>
> in the sources but again, it has not been problem.
>
> It seems that mapping fails in
>
>          priv->dma_nelms =
>              dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE, 0);
>
> The last tested failing mainline kernel version is from this
> midnight
>
>     6.16.0-ge991acf1bce7
>
> Do you have some idea what could be a problem?

Well, my fault. I forgot that dma_map_sgtable() returns only an error
code, or zero on success, not the number of mapped segments. It looks
like the easiest way to fix this issue is to revert my commit
37e00703228a ("zynq_fpga: use sgtable-based scatterlist wrappers"). I'm
sorry for this issue.

> Do you have suggestion what to test?
>
Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



* Re: AMD/Xilinx Zynq FPGA manager stopped to work after 6.16.0-g283564a43383
  2025-08-05 10:01   ` Marek Szyprowski
@ 2025-08-05 17:12     ` Xu Yilun
  2025-08-05 18:52       ` Jason Gunthorpe
  2025-08-06  5:22       ` Marek Szyprowski
  0 siblings, 2 replies; 7+ messages in thread
From: Xu Yilun @ 2025-08-05 17:12 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Pavel Pisa, linux-fpga, Michal Simek, Jason Gunthorpe, Xu Yilun,
	Pavel Hronek, Jiri Novak, Ondrej Ille

> Well, my fault. I forgot that dma_map_sgtable() returns only the error 
> code or zero on success, not the number of mapped segments. It looks 
> that the easiest way to fix this issue is to revert my commit 
> 37e00703228a ("zynq_fpga: use sgtable-based scatterlist wrappers"). I'm 

Instead of reverting, can we fix like this?

---

diff --git a/drivers/fpga/zynq-fpga.c b/drivers/fpga/zynq-fpga.c
index 0be0d569589d..b7629a0e4813 100644
--- a/drivers/fpga/zynq-fpga.c
+++ b/drivers/fpga/zynq-fpga.c
@@ -405,12 +405,12 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr, struct sg_table *sgt)
                }
        }

-       priv->dma_nelms =
-           dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE, 0);
-       if (priv->dma_nelms == 0) {
+       err = dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE, 0);
+       if (err) {
                dev_err(&mgr->dev, "Unable to DMA map (TO_DEVICE)\n");
-               return -ENOMEM;
+               return err;
        }
+       priv->dma_nelms = sgt->nents;

        /* enable clock */
        err = clk_enable(priv->clk);

Thanks,
Yilun
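
For readers following the thread, the difference between the removed and the added lines can be sketched in user space. This is only an illustration under stated assumptions, not the driver code: `mock_sg_table`, `mock_dma_map_sgtable()`, `write_buggy()` and `write_fixed()` are hypothetical stand-ins, and the mock imitates nothing more than the return convention Marek describes (zero on success, a negative error code on failure, with the mapped count left in `nents`). The 1:1 mapping and the `fail` flag are likewise assumptions made for the sketch.

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel's struct sg_table (illustration only). */
struct mock_sg_table {
	unsigned int nents;      /* entries after DMA mapping, set by the map call */
	unsigned int orig_nents; /* entries before mapping */
};

/*
 * Mock that imitates only the return convention discussed in the thread:
 * 0 on success (count stored in sgt->nents), negative errno on failure.
 */
int mock_dma_map_sgtable(struct mock_sg_table *sgt, int fail)
{
	if (fail)
		return -EINVAL;
	sgt->nents = sgt->orig_nents; /* assumed 1:1 mapping for this sketch */
	return 0;
}

/* Pattern removed by the patch: the return value is treated as a count. */
int write_buggy(struct mock_sg_table *sgt, unsigned int *dma_nelms)
{
	*dma_nelms = (unsigned int)mock_dma_map_sgtable(sgt, 0);
	if (*dma_nelms == 0)
		return -ENOMEM; /* a successful mapping (0) is reported as failure */
	return 0;
}

/* Pattern added by the patch: check the error, read the count from nents. */
int write_fixed(struct mock_sg_table *sgt, unsigned int *dma_nelms, int fail)
{
	int err = mock_dma_map_sgtable(sgt, fail);

	if (err)
		return err; /* propagate the real error code */
	*dma_nelms = sgt->nents;
	return 0;
}
```

With a table of three entries, `write_buggy()` returns -ENOMEM even though the mock mapping succeeded, which matches the -12 in the reported log, while `write_fixed()` returns 0 and stores 3 in the counter.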


* Re: AMD/Xilinx Zynq FPGA manager stopped to work after 6.16.0-g283564a43383
  2025-08-05 17:12     ` Xu Yilun
@ 2025-08-05 18:52       ` Jason Gunthorpe
  2025-08-05 19:39         ` Pavel Pisa
  2025-08-06  5:22       ` Marek Szyprowski
  1 sibling, 1 reply; 7+ messages in thread
From: Jason Gunthorpe @ 2025-08-05 18:52 UTC (permalink / raw)
  To: Xu Yilun
  Cc: Marek Szyprowski, Pavel Pisa, linux-fpga, Michal Simek, Xu Yilun,
	Pavel Hronek, Jiri Novak, Ondrej Ille

On Wed, Aug 06, 2025 at 01:12:46AM +0800, Xu Yilun wrote:
> @@ -405,12 +405,12 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr, struct sg_table *sgt)
>                 }
>         }
> 
> -       priv->dma_nelms =
> -           dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE, 0);
> -       if (priv->dma_nelms == 0) {
> +       err = dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE, 0);
> +       if (err) {
>                 dev_err(&mgr->dev, "Unable to DMA map (TO_DEVICE)\n");
> -               return -ENOMEM;
> +               return err;
>         }
> +       priv->dma_nelms = sgt->nents;

That looks pretty good; Marek is certainly right that the original
had a bug.

Jason


* Re: AMD/Xilinx Zynq FPGA manager stopped to work after 6.16.0-g283564a43383
  2025-08-05 18:52       ` Jason Gunthorpe
@ 2025-08-05 19:39         ` Pavel Pisa
  2025-08-06  7:11           ` Xu Yilun
  0 siblings, 1 reply; 7+ messages in thread
From: Pavel Pisa @ 2025-08-05 19:39 UTC (permalink / raw)
  To: Jason Gunthorpe, Xu Yilun, Marek Szyprowski
  Cc: linux-fpga, Michal Simek, Xu Yilun, Pavel Hronek, Jiri Novak,
	Ondrej Ille

Hello all

On Tuesday 05 of August 2025 20:52:35 Jason Gunthorpe wrote:
> On Wed, Aug 06, 2025 at 01:12:46AM +0800, Xu Yilun wrote:
> > @@ -405,12 +405,12 @@ static int zynq_fpga_ops_write(struct fpga_manager
> > *mgr, struct sg_table *sgt) }
> >         }
> >
> > -       priv->dma_nelms =
> > -           dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE, 0);
> > -       if (priv->dma_nelms == 0) {
> > +       err = dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE, 0);
> > +       if (err) {
> >                 dev_err(&mgr->dev, "Unable to DMA map (TO_DEVICE)\n");
> > -               return -ENOMEM;
> > +               return err;
> >         }
> > +       priv->dma_nelms = sgt->nents;
>
> That looks pretty good, Marek is certainly right the original had a
> bug.
>
> Jason

Thanks for the fast correction proposal. I have tested the change
on our kernel build, the current mainline version from the midnight
build, rebuilt now with the patch

  Linux mzapo 6.16.0+ #2 SMP Tue Aug  5 20:59:36 CEST 2025 armv7l GNU/Linux

and it works correctly: DTBOCFG and the FPGA manager are able to load
the CTU CAN FD IP core design, then the driver is loaded, its instances
are set up according to the device tree overlay, and the driver detects
the CAN controllers implemented in the FPGA. I have not run the whole
CAN test sequence, but I do not expect problems there. You can add my

Reported-by: Pavel Pisa <pisa@fel.cvut.cz>
Tested-by: Pavel Pisa <pisa@fel.cvut.cz>

Do you have any idea how fast the change can propagate
into mainline? I would set up automatic patching with
the fix if the testing setup stays broken for a longer time.
If the fix can get into mainline in days or a week, then I would
not spend time on that.

Best wishes,

                Pavel

                Pavel Pisa
    phone:      +420 603531357
    e-mail:     pisa@cmp.felk.cvut.cz
    Department of Control Engineering FEE CVUT
    Karlovo namesti 13, 121 35, Prague 2
    university: http://control.fel.cvut.cz/
    personal:   http://cmp.felk.cvut.cz/~pisa
    social:     https://social.kernel.org/ppisa
    projects:   https://www.openhub.net/accounts/ppisa
    CAN related:http://canbus.pages.fel.cvut.cz/
    RISC-V education: https://comparch.edu.cvut.cz/
    Open Technologies Research Education and Exchange Services
    https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home


* Re: AMD/Xilinx Zynq FPGA manager stopped to work after 6.16.0-g283564a43383
  2025-08-05 17:12     ` Xu Yilun
  2025-08-05 18:52       ` Jason Gunthorpe
@ 2025-08-06  5:22       ` Marek Szyprowski
  1 sibling, 0 replies; 7+ messages in thread
From: Marek Szyprowski @ 2025-08-06  5:22 UTC (permalink / raw)
  To: Xu Yilun
  Cc: Pavel Pisa, linux-fpga, Michal Simek, Jason Gunthorpe, Xu Yilun,
	Pavel Hronek, Jiri Novak, Ondrej Ille

On 05.08.2025 19:12, Xu Yilun wrote:
>> Well, my fault. I forgot that dma_map_sgtable() returns only the error
>> code or zero on success, not the number of mapped segments. It looks
>> that the easiest way to fix this issue is to revert my commit
>> 37e00703228a ("zynq_fpga: use sgtable-based scatterlist wrappers"). I'm
> Instead of reverting, can we fix like this?
>
> ---
>
> diff --git a/drivers/fpga/zynq-fpga.c b/drivers/fpga/zynq-fpga.c
> index 0be0d569589d..b7629a0e4813 100644
> --- a/drivers/fpga/zynq-fpga.c
> +++ b/drivers/fpga/zynq-fpga.c
> @@ -405,12 +405,12 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr, struct sg_table *sgt)
>                  }
>          }
>
> -       priv->dma_nelms =
> -           dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE, 0);
> -       if (priv->dma_nelms == 0) {
> +       err = dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE, 0);
> +       if (err) {
>                  dev_err(&mgr->dev, "Unable to DMA map (TO_DEVICE)\n");
> -               return -ENOMEM;
> +               return err;
>          }
> +       priv->dma_nelms = sgt->nents;
>
>          /* enable clock */
>          err = clk_enable(priv->clk);
>
Yes, this is a proper fix for the current code.

Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



* Re: AMD/Xilinx Zynq FPGA manager stopped to work after 6.16.0-g283564a43383
  2025-08-05 19:39         ` Pavel Pisa
@ 2025-08-06  7:11           ` Xu Yilun
  0 siblings, 0 replies; 7+ messages in thread
From: Xu Yilun @ 2025-08-06  7:11 UTC (permalink / raw)
  To: Pavel Pisa
  Cc: Jason Gunthorpe, Marek Szyprowski, linux-fpga, Michal Simek,
	Xu Yilun, Pavel Hronek, Jiri Novak, Ondrej Ille

On Tue, Aug 05, 2025 at 09:39:59PM +0200, Pavel Pisa wrote:
> Hello all
> 
> On Tuesday 05 of August 2025 20:52:35 Jason Gunthorpe wrote:
> > On Wed, Aug 06, 2025 at 01:12:46AM +0800, Xu Yilun wrote:
> > > @@ -405,12 +405,12 @@ static int zynq_fpga_ops_write(struct fpga_manager
> > > *mgr, struct sg_table *sgt) }
> > >         }
> > >
> > > -       priv->dma_nelms =
> > > -           dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE, 0);
> > > -       if (priv->dma_nelms == 0) {
> > > +       err = dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE, 0);
> > > +       if (err) {
> > >                 dev_err(&mgr->dev, "Unable to DMA map (TO_DEVICE)\n");
> > > -               return -ENOMEM;
> > > +               return err;
> > >         }
> > > +       priv->dma_nelms = sgt->nents;
> >
> > That looks pretty good, Marek is certainly right the original had a
> > bug.
> >
> > Jason
> 
> Thanks for the fast correction proposal. I have tested the change
> at our kernel build, actual mainline version form midnight
> rebuild with patch now
> 
>   Linux mzapo 6.16.0+ #2 SMP Tue Aug  5 20:59:36 CEST 2025 armv7l GNU/Linux
> 
> and it works correctly, DTBOCFG and FPGA manager is able to load
> CTU CAN FD IP core design and then driver is loaded and its instances
> are setup according to the device tree overlay and driver detects
> CAN controllers implemented in FPGA. I have not run whole CAN test
> sequence but I do not expect problems there. You can add my
> 
> Reported-by: Pavel Pisa <pisa@fel.cvut.cz>
> Tested-by: Pavel Pisa <pisa@fel.cvut.cz>
> 
> Do you have some idea how fast can the change propagate
> into mainline? I would setup automatic patching with

I've sent the patch to Greg, let's see how to fix it.

https://lore.kernel.org/linux-fpga/20250806070605.1920909-1-yilun.xu@linux.intel.com/

Thanks,
Yilun

> the fix if the testing setup is broken for some longer time.
> If the fix can get in mainline in days or week then I would
> spent time on that.
> 


