* Re: [PATCH] dmaengine: xgene-dma: fix spelling mistake "descripto" -> "descriptor"
From: Vinod Koul @ 2019-04-26 11:27 UTC (permalink / raw)
To: Colin King; +Cc: Dan Williams, dmaengine, kernel-janitors, linux-kernel
In-Reply-To: <20190415153028.24849-1-colin.king@canonical.com>
On 15-04-19, 16:30, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
>
> There is a spelling mistake in a chan_dbg message, fix it.
Applied, thanks
--
~Vinod
^ permalink raw reply
* dmaengine: xgene-dma: fix spelling mistake "descripto" -> "descriptor"
From: Vinod Koul @ 2019-04-26 11:27 UTC (permalink / raw)
To: Colin King; +Cc: Dan Williams, dmaengine, kernel-janitors, linux-kernel
On 15-04-19, 16:30, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
>
> There is a spelling mistake in a chan_dbg message, fix it.
Applied, thanks
^ permalink raw reply
* Re: [PATCH v1] dmaengine: idma64: Move driver name to the header
From: Vinod Koul @ 2019-04-26 11:25 UTC (permalink / raw)
To: Andy Shevchenko; +Cc: dmaengine, Lee Jones, linux-kernel
In-Reply-To: <20190409150219.15094-1-andriy.shevchenko@linux.intel.com>
On 09-04-19, 18:02, Andy Shevchenko wrote:
> There are two drivers that are relying on the iDMA 64-bit driver name
> to match. Instead of duplicating string in both of them, dedicate
> a header file and share it between users.
Applied, thanks
--
~Vinod
^ permalink raw reply
* [v1] dmaengine: idma64: Move driver name to the header
From: Vinod Koul @ 2019-04-26 11:25 UTC (permalink / raw)
To: Andy Shevchenko; +Cc: dmaengine, Lee Jones, linux-kernel
On 09-04-19, 18:02, Andy Shevchenko wrote:
> There are two drivers that are relying on the iDMA 64-bit driver name
> to match. Instead of duplicating string in both of them, dedicate
> a header file and share it between users.
Applied, thanks
^ permalink raw reply
* Re: [PATCH] dmaengine: bcm2835: Drop duplicate capability setting.
From: Vinod Koul @ 2019-04-26 11:23 UTC (permalink / raw)
To: Michal Suchanek
Cc: bcm-kernel-feedback-list, dmaengine, linux-rpi-kernel,
linux-arm-kernel, linux-kernel, Dan Williams, Eric Anholt,
Stefan Wahren, Florian Fainelli, Ray Jui, Scott Branden
In-Reply-To: <20190404172503.12155-1-msuchanek@suse.de>
On 04-04-19, 19:25, Michal Suchanek wrote:
> Signed-off-by: Michal Suchanek <msuchanek@suse.de>
> ---
> drivers/dma/bcm2835-dma.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
> index ec8a291d62ba..e38b19dd2895 100644
> --- a/drivers/dma/bcm2835-dma.c
> +++ b/drivers/dma/bcm2835-dma.c
> @@ -891,7 +891,6 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
> dma_cap_set(DMA_SLAVE, od->ddev.cap_mask);
> dma_cap_set(DMA_PRIVATE, od->ddev.cap_mask);
> dma_cap_set(DMA_CYCLIC, od->ddev.cap_mask);
> - dma_cap_set(DMA_SLAVE, od->ddev.cap_mask);
> dma_cap_set(DMA_MEMCPY, od->ddev.cap_mask);
> od->ddev.device_alloc_chan_resources = bcm2835_dma_alloc_chan_resources;
> od->ddev.device_free_chan_resources = bcm2835_dma_free_chan_resources;
Applied, thanks
--
~Vinod
^ permalink raw reply
* dmaengine: bcm2835: Drop duplicate capability setting.
From: Vinod Koul @ 2019-04-26 11:23 UTC (permalink / raw)
To: Michal Suchanek
Cc: bcm-kernel-feedback-list, dmaengine, linux-rpi-kernel,
linux-arm-kernel, linux-kernel, Dan Williams, Eric Anholt,
Stefan Wahren, Florian Fainelli, Ray Jui, Scott Branden
On 04-04-19, 19:25, Michal Suchanek wrote:
> Signed-off-by: Michal Suchanek <msuchanek@suse.de>
> ---
> drivers/dma/bcm2835-dma.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
> index ec8a291d62ba..e38b19dd2895 100644
> --- a/drivers/dma/bcm2835-dma.c
> +++ b/drivers/dma/bcm2835-dma.c
> @@ -891,7 +891,6 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
> dma_cap_set(DMA_SLAVE, od->ddev.cap_mask);
> dma_cap_set(DMA_PRIVATE, od->ddev.cap_mask);
> dma_cap_set(DMA_CYCLIC, od->ddev.cap_mask);
> - dma_cap_set(DMA_SLAVE, od->ddev.cap_mask);
> dma_cap_set(DMA_MEMCPY, od->ddev.cap_mask);
> od->ddev.device_alloc_chan_resources = bcm2835_dma_alloc_chan_resources;
> od->ddev.device_free_chan_resources = bcm2835_dma_free_chan_resources;
Applied, thanks
^ permalink raw reply
* Re: [PATCH] dmaengine: pl330: _stop: clear interrupt status
From: Vinod Koul @ 2019-04-26 11:21 UTC (permalink / raw)
To: Sugar Zhang
Cc: heiko, broonie, linux-rockchip, Dan Williams, dmaengine,
linux-kernel
In-Reply-To: <1554289582-43695-1-git-send-email-sugar.zhang@rock-chips.com>
On 03-04-19, 19:06, Sugar Zhang wrote:
> This patch kill instructs the DMAC to immediately terminate
> execution of a thread. and then clear the interrupt status,
> at last, stop generating interrupts for DMA_SEV. to guarantee
> the next dma start is clean. otherwise, one interrupt maybe leave
> to next start and make some mistake.
Applied, thanks
--
~Vinod
^ permalink raw reply
* dmaengine: pl330: _stop: clear interrupt status
From: Vinod Koul @ 2019-04-26 11:21 UTC (permalink / raw)
To: Sugar Zhang
Cc: heiko, broonie, linux-rockchip, Dan Williams, dmaengine,
linux-kernel
On 03-04-19, 19:06, Sugar Zhang wrote:
> This patch kill instructs the DMAC to immediately terminate
> execution of a thread. and then clear the interrupt status,
> at last, stop generating interrupts for DMA_SEV. to guarantee
> the next dma start is clean. otherwise, one interrupt maybe leave
> to next start and make some mistake.
Applied, thanks
^ permalink raw reply
* Re: [PATCH v1] dmaengine: tegra: Use relaxed versions of readl/writel
From: Jon Hunter @ 2019-04-26 11:13 UTC (permalink / raw)
To: Dmitry Osipenko, Laxman Dewangan, Vinod Koul, Thierry Reding
Cc: dmaengine, linux-tegra, linux-kernel
In-Reply-To: <4a315b63-bc71-3c3e-f1ae-8638bcf4033d@gmail.com>
On 26/04/2019 11:45, Dmitry Osipenko wrote:
> 26.04.2019 12:52, Jon Hunter пишет:
>>
>> On 25/04/2019 00:17, Dmitry Osipenko wrote:
>>> The readl/writel functions are inserting memory barrier in order to
>>> ensure that memory stores are completed. On Tegra20 and Tegra30 this
>>> results in L2 cache syncing which isn't a cheapest operation. The
>>> tegra20-apb-dma driver doesn't need to synchronize generic memory
>>> accesses, hence use the relaxed versions of the functions.
>>
>> Do you mean device-io accesses here as this is not generic memory?
>
> Yes. The IOMEM accesses within are always ordered and uncached, while
> generic memory accesses are out-of-order and cached.
>
>> Although there may not be any issues with this change, I think I need a
>> bit more convincing that we should do this given that we have had it
>> this way for sometime and I would not like to see us introduce any
>> regressions as this point without being 100% certain we would not.
>> Ideally, if I had some good extensive tests I could run to hammer the
>> DMA for all configurations with different combinations of channels
>> running simultaneously then we could test this, but right now I don't :-(
>>
>> Have you ...
>> 1. Tested both cyclic and scatter-gather transfers?
>> 2. Stress tested simultaneous transfers with various different
>> configurations?
>> 3. Quantified the actual performance benefit of this change so we can
>> understand how much of a performance boost this offers?
>
> Actually I found a case where this change causes a problem, I'm seeing
> I2C transfer timeout for touchscreen and it breaks the touch input.
> Indeed, I haven't tested this patch very well.
>
> And the fix is this:
>
> @@ -1592,6 +1592,8 @@ static int tegra_dma_runtime_suspend(struct device
> *dev)
> TEGRA_APBDMA_CHAN_WCOUNT);
> }
>
> + dsb();
> +
> clk_disable_unprepare(tdma->dma_clk);
>
> return 0;
>
>
> Apparently the problem is that CLK/DMA (PPSB/APB) accesses are
> incoherent and CPU disables clock before writes are reaching DMA controller.
>
> I'd say that cyclic and scatter-gather transfers are now tested. I also
> made some more testing of simultaneous transfers.
>
> Quantifying performance probably won't be easy to make as the DMA
> read/writes are not on any kind of code's hot-path.
So why make the change?
> Jon, are you still insisting about to drop this patch or you will be
> fine with the v2 that will have the dsb() in place?
If we can't quantify the performance gain, then it is difficult to
justify the change. I would also be concerned if that is the only place
we need an explicit dsb.
Cheers
Jon
--
nvpublic
^ permalink raw reply
* [v1] dmaengine: tegra: Use relaxed versions of readl/writel
From: Jon Hunter @ 2019-04-26 11:13 UTC (permalink / raw)
To: Dmitry Osipenko, Laxman Dewangan, Vinod Koul, Thierry Reding
Cc: dmaengine, linux-tegra, linux-kernel
On 26/04/2019 11:45, Dmitry Osipenko wrote:
> 26.04.2019 12:52, Jon Hunter пишет:
>>
>> On 25/04/2019 00:17, Dmitry Osipenko wrote:
>>> The readl/writel functions are inserting memory barrier in order to
>>> ensure that memory stores are completed. On Tegra20 and Tegra30 this
>>> results in L2 cache syncing which isn't a cheapest operation. The
>>> tegra20-apb-dma driver doesn't need to synchronize generic memory
>>> accesses, hence use the relaxed versions of the functions.
>>
>> Do you mean device-io accesses here as this is not generic memory?
>
> Yes. The IOMEM accesses within are always ordered and uncached, while
> generic memory accesses are out-of-order and cached.
>
>> Although there may not be any issues with this change, I think I need a
>> bit more convincing that we should do this given that we have had it
>> this way for sometime and I would not like to see us introduce any
>> regressions as this point without being 100% certain we would not.
>> Ideally, if I had some good extensive tests I could run to hammer the
>> DMA for all configurations with different combinations of channels
>> running simultaneously then we could test this, but right now I don't :-(
>>
>> Have you ...
>> 1. Tested both cyclic and scatter-gather transfers?
>> 2. Stress tested simultaneous transfers with various different
>> configurations?
>> 3. Quantified the actual performance benefit of this change so we can
>> understand how much of a performance boost this offers?
>
> Actually I found a case where this change causes a problem, I'm seeing
> I2C transfer timeout for touchscreen and it breaks the touch input.
> Indeed, I haven't tested this patch very well.
>
> And the fix is this:
>
> @@ -1592,6 +1592,8 @@ static int tegra_dma_runtime_suspend(struct device
> *dev)
> TEGRA_APBDMA_CHAN_WCOUNT);
> }
>
> + dsb();
> +
> clk_disable_unprepare(tdma->dma_clk);
>
> return 0;
>
>
> Apparently the problem is that CLK/DMA (PPSB/APB) accesses are
> incoherent and CPU disables clock before writes are reaching DMA controller.
>
> I'd say that cyclic and scatter-gather transfers are now tested. I also
> made some more testing of simultaneous transfers.
>
> Quantifying performance probably won't be easy to make as the DMA
> read/writes are not on any kind of code's hot-path.
So why make the change?
> Jon, are you still insisting about to drop this patch or you will be
> fine with the v2 that will have the dsb() in place?
If we can't quantify the performance gain, then it is difficult to
justify the change. I would also be concerned if that is the only place
we need an explicit dsb.
Cheers
Jon
^ permalink raw reply
* Re: [PATCH v1] dmaengine: tegra: Use relaxed versions of readl/writel
From: Dmitry Osipenko @ 2019-04-26 10:45 UTC (permalink / raw)
To: Jon Hunter, Laxman Dewangan, Vinod Koul, Thierry Reding
Cc: dmaengine, linux-tegra, linux-kernel
In-Reply-To: <d789f1bc-44d1-abf6-a046-7cf835416e8f@nvidia.com>
26.04.2019 12:52, Jon Hunter пишет:
>
> On 25/04/2019 00:17, Dmitry Osipenko wrote:
>> The readl/writel functions are inserting memory barrier in order to
>> ensure that memory stores are completed. On Tegra20 and Tegra30 this
>> results in L2 cache syncing which isn't a cheapest operation. The
>> tegra20-apb-dma driver doesn't need to synchronize generic memory
>> accesses, hence use the relaxed versions of the functions.
>
> Do you mean device-io accesses here as this is not generic memory?
Yes. The IOMEM accesses within are always ordered and uncached, while
generic memory accesses are out-of-order and cached.
> Although there may not be any issues with this change, I think I need a
> bit more convincing that we should do this given that we have had it
> this way for sometime and I would not like to see us introduce any
> regressions as this point without being 100% certain we would not.
> Ideally, if I had some good extensive tests I could run to hammer the
> DMA for all configurations with different combinations of channels
> running simultaneously then we could test this, but right now I don't :-(
>
> Have you ...
> 1. Tested both cyclic and scatter-gather transfers?
> 2. Stress tested simultaneous transfers with various different
> configurations?
> 3. Quantified the actual performance benefit of this change so we can
> understand how much of a performance boost this offers?
Actually I found a case where this change causes a problem, I'm seeing
I2C transfer timeout for touchscreen and it breaks the touch input.
Indeed, I haven't tested this patch very well.
And the fix is this:
@@ -1592,6 +1592,8 @@ static int tegra_dma_runtime_suspend(struct device
*dev)
TEGRA_APBDMA_CHAN_WCOUNT);
}
+ dsb();
+
clk_disable_unprepare(tdma->dma_clk);
return 0;
Apparently the problem is that CLK/DMA (PPSB/APB) accesses are
incoherent and CPU disables clock before writes are reaching DMA controller.
I'd say that cyclic and scatter-gather transfers are now tested. I also
made some more testing of simultaneous transfers.
Quantifying performance probably won't be easy to make as the DMA
read/writes are not on any kind of code's hot-path.
Jon, are you still insisting about to drop this patch or you will be
fine with the v2 that will have the dsb() in place?
^ permalink raw reply
* [v1] dmaengine: tegra: Use relaxed versions of readl/writel
From: Dmitry Osipenko @ 2019-04-26 10:45 UTC (permalink / raw)
To: Jon Hunter, Laxman Dewangan, Vinod Koul, Thierry Reding
Cc: dmaengine, linux-tegra, linux-kernel
26.04.2019 12:52, Jon Hunter пишет:
>
> On 25/04/2019 00:17, Dmitry Osipenko wrote:
>> The readl/writel functions are inserting memory barrier in order to
>> ensure that memory stores are completed. On Tegra20 and Tegra30 this
>> results in L2 cache syncing which isn't a cheapest operation. The
>> tegra20-apb-dma driver doesn't need to synchronize generic memory
>> accesses, hence use the relaxed versions of the functions.
>
> Do you mean device-io accesses here as this is not generic memory?
Yes. The IOMEM accesses within are always ordered and uncached, while
generic memory accesses are out-of-order and cached.
> Although there may not be any issues with this change, I think I need a
> bit more convincing that we should do this given that we have had it
> this way for sometime and I would not like to see us introduce any
> regressions as this point without being 100% certain we would not.
> Ideally, if I had some good extensive tests I could run to hammer the
> DMA for all configurations with different combinations of channels
> running simultaneously then we could test this, but right now I don't :-(
>
> Have you ...
> 1. Tested both cyclic and scatter-gather transfers?
> 2. Stress tested simultaneous transfers with various different
> configurations?
> 3. Quantified the actual performance benefit of this change so we can
> understand how much of a performance boost this offers?
Actually I found a case where this change causes a problem, I'm seeing
I2C transfer timeout for touchscreen and it breaks the touch input.
Indeed, I haven't tested this patch very well.
And the fix is this:
@@ -1592,6 +1592,8 @@ static int tegra_dma_runtime_suspend(struct device
*dev)
TEGRA_APBDMA_CHAN_WCOUNT);
}
+ dsb();
+
clk_disable_unprepare(tdma->dma_clk);
return 0;
Apparently the problem is that CLK/DMA (PPSB/APB) accesses are
incoherent and CPU disables clock before writes are reaching DMA controller.
I'd say that cyclic and scatter-gather transfers are now tested. I also
made some more testing of simultaneous transfers.
Quantifying performance probably won't be easy to make as the DMA
read/writes are not on any kind of code's hot-path.
Jon, are you still insisting about to drop this patch or you will be
fine with the v2 that will have the dsb() in place?
^ permalink raw reply
* Re: [EXT] Re: [PATCH v2 06/15] spi: imx: fix ERR009165
From: Robin Gong @ 2019-04-26 9:53 UTC (permalink / raw)
To: robh+dt@kernel.org, u.kleine-koenig@pengutronix.de,
festevam@gmail.com, plyatov@gmail.com, dan.j.williams@intel.com,
broonie@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com,
will.deacon@arm.com, shawnguo@kernel.org, l.stach@pengutronix.de,
s.hauer@pengutronix.de
Cc: linux-spi@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, dmaengine@vger.kernel.org,
dl-linux-imx, kernel@pengutronix.de, devicetree@vger.kernel.org
In-Reply-To: <1556271442.2584.29.camel@pengutronix.de>
On 2019-04-26 at 11:37 +0200, Lucas Stach wrote:
> Am Freitag, den 26.04.2019, 09:22 +0000 schrieb Robin Gong:
> >
> > On 2019-04-26 at 11:07 +0200, Lucas Stach wrote:
> > >
> > >
> > > Am Freitag, den 26.04.2019, 08:05 +0000 schrieb Robin Gong:
> > > >
> > > > static void mx51_setup_wml(struct spi_imx_data *spi_imx)
> > > > {
> > > > + u32 tx_wml = 0;
> > > > +
> > > With a wml of 0 you might set the maxburst of the TX dma channel
> > > to
> > > fifosize to minimize the performance impact of this workaround.
> > >
> > > Regards,
> > > Lucas
> > Unfortunately, this is a MUST part of errata which cause
> > performance
> > drop.
> I'm not talking about changing the WML, but when the WML must be 0
> you
> can safely increase the DMA burst size without overflowing the FIFO,
> which might recover some of of the performance loss.
>
> Regards,
> Lucas
Good point, will add it in V3, thanks.
^ permalink raw reply
* [v2,06/15] spi: imx: fix ERR009165
From: Robin Gong @ 2019-04-26 9:53 UTC (permalink / raw)
To: robh+dt@kernel.org, u.kleine-koenig@pengutronix.de,
festevam@gmail.com, plyatov@gmail.com, dan.j.williams@intel.com,
broonie@kernel.org, mark.rutland@arm.com, catalin.marinas@arm.com,
will.deacon@arm.com, shawnguo@kernel.org, l.stach@pengutronix.de,
s.hauer@pengutronix.de
Cc: linux-spi@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, dmaengine@vger.kernel.org,
dl-linux-imx, kernel@pengutronix.de, devicetree@vger.kernel.org
On 2019-04-26 at 11:37 +0200, Lucas Stach wrote:
> Am Freitag, den 26.04.2019, 09:22 +0000 schrieb Robin Gong:
> >
> > On 2019-04-26 at 11:07 +0200, Lucas Stach wrote:
> > >
> > >
> > > Am Freitag, den 26.04.2019, 08:05 +0000 schrieb Robin Gong:
> > > >
> > > > static void mx51_setup_wml(struct spi_imx_data *spi_imx)
> > > > {
> > > > + u32 tx_wml = 0;
> > > > +
> > > With a wml of 0 you might set the maxburst of the TX dma channel
> > > to
> > > fifosize to minimize the performance impact of this workaround.
> > >
> > > Regards,
> > > Lucas
> > Unfortunately, this is a MUST part of errata which cause
> > performance
> > drop.
> I'm not talking about changing the WML, but when the WML must be 0
> you
> can safely increase the DMA burst size without overflowing the FIFO,
> which might recover some of of the performance loss.
>
> Regards,
> Lucas
Good point, will add it in V3, thanks.
^ permalink raw reply
* Re: [PATCH v1] dmaengine: tegra: Use relaxed versions of readl/writel
From: Jon Hunter @ 2019-04-26 9:52 UTC (permalink / raw)
To: Dmitry Osipenko, Laxman Dewangan, Vinod Koul, Thierry Reding
Cc: dmaengine, linux-tegra, linux-kernel
In-Reply-To: <20190424231708.21219-1-digetx@gmail.com>
On 25/04/2019 00:17, Dmitry Osipenko wrote:
> The readl/writel functions are inserting memory barrier in order to
> ensure that memory stores are completed. On Tegra20 and Tegra30 this
> results in L2 cache syncing which isn't a cheapest operation. The
> tegra20-apb-dma driver doesn't need to synchronize generic memory
> accesses, hence use the relaxed versions of the functions.
Do you mean device-io accesses here as this is not generic memory?
Although there may not be any issues with this change, I think I need a
bit more convincing that we should do this given that we have had it
this way for sometime and I would not like to see us introduce any
regressions as this point without being 100% certain we would not.
Ideally, if I had some good extensive tests I could run to hammer the
DMA for all configurations with different combinations of channels
running simultaneously then we could test this, but right now I don't :-(
Have you ...
1. Tested both cyclic and scatter-gather transfers?
2. Stress tested simultaneous transfers with various different
configurations?
3. Quantified the actual performance benefit of this change so we can
understand how much of a performance boost this offers?
Cheers
Jon
^ permalink raw reply
* [v1] dmaengine: tegra: Use relaxed versions of readl/writel
From: Jon Hunter @ 2019-04-26 9:52 UTC (permalink / raw)
To: Dmitry Osipenko, Laxman Dewangan, Vinod Koul, Thierry Reding
Cc: dmaengine, linux-tegra, linux-kernel
On 25/04/2019 00:17, Dmitry Osipenko wrote:
> The readl/writel functions are inserting memory barrier in order to
> ensure that memory stores are completed. On Tegra20 and Tegra30 this
> results in L2 cache syncing which isn't a cheapest operation. The
> tegra20-apb-dma driver doesn't need to synchronize generic memory
> accesses, hence use the relaxed versions of the functions.
Do you mean device-io accesses here as this is not generic memory?
Although there may not be any issues with this change, I think I need a
bit more convincing that we should do this given that we have had it
this way for sometime and I would not like to see us introduce any
regressions as this point without being 100% certain we would not.
Ideally, if I had some good extensive tests I could run to hammer the
DMA for all configurations with different combinations of channels
running simultaneously then we could test this, but right now I don't :-(
Have you ...
1. Tested both cyclic and scatter-gather transfers?
2. Stress tested simultaneous transfers with various different
configurations?
3. Quantified the actual performance benefit of this change so we can
understand how much of a performance boost this offers?
Cheers
Jon
^ permalink raw reply
* Re: [EXT] Re: [PATCH v2 06/15] spi: imx: fix ERR009165
From: Lucas Stach @ 2019-04-26 9:37 UTC (permalink / raw)
To: Robin Gong, robh+dt@kernel.org, u.kleine-koenig@pengutronix.de,
festevam@gmail.com, plyatov@gmail.com, broonie@kernel.org,
dan.j.williams@intel.com, mark.rutland@arm.com,
catalin.marinas@arm.com, will.deacon@arm.com, shawnguo@kernel.org,
s.hauer@pengutronix.de
Cc: linux-spi@vger.kernel.org, linux-kernel@vger.kernel.org,
dmaengine@vger.kernel.org, dl-linux-imx, kernel@pengutronix.de,
devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org
In-Reply-To: <1556270196.9298.16.camel@nxp.com>
Am Freitag, den 26.04.2019, 09:22 +0000 schrieb Robin Gong:
> On 2019-04-26 at 11:07 +0200, Lucas Stach wrote:
> >
> > Am Freitag, den 26.04.2019, 08:05 +0000 schrieb Robin Gong:
> > > static void mx51_setup_wml(struct spi_imx_data *spi_imx)
> > > {
> > > + u32 tx_wml = 0;
> > > +
> >
> > With a wml of 0 you might set the maxburst of the TX dma channel to
> > fifosize to minimize the performance impact of this workaround.
> >
> > Regards,
> > Lucas
>
> Unfortunately, this is a MUST part of errata which cause performance
> drop.
I'm not talking about changing the WML, but when the WML must be 0 you
can safely increase the DMA burst size without overflowing the FIFO,
which might recover some of of the performance loss.
Regards,
Lucas
^ permalink raw reply
* [v2,06/15] spi: imx: fix ERR009165
From: Lucas Stach @ 2019-04-26 9:37 UTC (permalink / raw)
To: Robin Gong, robh+dt@kernel.org, u.kleine-koenig@pengutronix.de,
festevam@gmail.com, plyatov@gmail.com, broonie@kernel.org,
dan.j.williams@intel.com, mark.rutland@arm.com,
catalin.marinas@arm.com, will.deacon@arm.com, shawnguo@kernel.org,
s.hauer@pengutronix.de
Cc: linux-spi@vger.kernel.org, linux-kernel@vger.kernel.org,
dmaengine@vger.kernel.org, dl-linux-imx, kernel@pengutronix.de,
devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Am Freitag, den 26.04.2019, 09:22 +0000 schrieb Robin Gong:
> On 2019-04-26 at 11:07 +0200, Lucas Stach wrote:
> >
> > Am Freitag, den 26.04.2019, 08:05 +0000 schrieb Robin Gong:
> > > static void mx51_setup_wml(struct spi_imx_data *spi_imx)
> > > {
> > > + u32 tx_wml = 0;
> > > +
> >
> > With a wml of 0 you might set the maxburst of the TX dma channel to
> > fifosize to minimize the performance impact of this workaround.
> >
> > Regards,
> > Lucas
>
> Unfortunately, this is a MUST part of errata which cause performance
> drop.
I'm not talking about changing the WML, but when the WML must be 0 you
can safely increase the DMA burst size without overflowing the FIFO,
which might recover some of of the performance loss.
Regards,
Lucas
^ permalink raw reply
* Re: [EXT] Re: [PATCH v2 07/15] spi: imx: remove ERR009165 workaround on i.mx6ul
From: Robin Gong @ 2019-04-26 9:32 UTC (permalink / raw)
To: robh+dt@kernel.org, u.kleine-koenig@pengutronix.de,
festevam@gmail.com, plyatov@gmail.com, broonie@kernel.org,
dan.j.williams@intel.com, mark.rutland@arm.com,
catalin.marinas@arm.com, will.deacon@arm.com, shawnguo@kernel.org,
l.stach@pengutronix.de, s.hauer@pengutronix.de
Cc: linux-spi@vger.kernel.org, linux-kernel@vger.kernel.org,
dmaengine@vger.kernel.org, dl-linux-imx, kernel@pengutronix.de,
devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org
In-Reply-To: <1556269796.2584.25.camel@pengutronix.de>
On 2019-04-26 at 11:09 +0200, Lucas Stach wrote:
>
> > static inline int is_imx53_ecspi(struct spi_imx_data *d)
> > @@ -585,9 +587,16 @@ static int mx51_ecspi_prepare_transfer(struct
> > spi_imx_data *spi_imx,
> > ctrl |= mx51_ecspi_clkdiv(spi_imx, t->speed_hz, &clk);
> > spi_imx->spi_bus_clk = clk;
> >
> > - /* ERR009165: work in XHC mode as PIO */
> > - if (spi_imx->usedma)
> > - ctrl &= ~MX51_ECSPI_CTRL_SMC;
> > + /*
> > + * ERR009165: work in XHC mode instead of SMC as PIO on the
> > chips
> > + * before i.mx6ul.
> > + */
> > + if (spi_imx->usedma) {
> > + if (spi_imx->devtype_data->devtype == IMX6UL_ECSPI)
> >
> > =, otherwise the workaround might be applied to later generations
> > of
> the core if more are added later.
>
> Regards,
> Lucas
Understood your point, but for now choose different compatible name
could apply this workaround or not. I prefer to leave it for next ecspi
IP upgrade if it really come in the future, otherwise that '>=' bring
a little bit confuse if there is no update for ecspi IP...
^ permalink raw reply
* [v2,07/15] spi: imx: remove ERR009165 workaround on i.mx6ul
From: Robin Gong @ 2019-04-26 9:32 UTC (permalink / raw)
To: robh+dt@kernel.org, u.kleine-koenig@pengutronix.de,
festevam@gmail.com, plyatov@gmail.com, broonie@kernel.org,
dan.j.williams@intel.com, mark.rutland@arm.com,
catalin.marinas@arm.com, will.deacon@arm.com, shawnguo@kernel.org,
l.stach@pengutronix.de, s.hauer@pengutronix.de
Cc: linux-spi@vger.kernel.org, linux-kernel@vger.kernel.org,
dmaengine@vger.kernel.org, dl-linux-imx, kernel@pengutronix.de,
devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org
On 2019-04-26 at 11:09 +0200, Lucas Stach wrote:
>
> > static inline int is_imx53_ecspi(struct spi_imx_data *d)
> > @@ -585,9 +587,16 @@ static int mx51_ecspi_prepare_transfer(struct
> > spi_imx_data *spi_imx,
> > ctrl |= mx51_ecspi_clkdiv(spi_imx, t->speed_hz, &clk);
> > spi_imx->spi_bus_clk = clk;
> >
> > - /* ERR009165: work in XHC mode as PIO */
> > - if (spi_imx->usedma)
> > - ctrl &= ~MX51_ECSPI_CTRL_SMC;
> > + /*
> > + * ERR009165: work in XHC mode instead of SMC as PIO on the
> > chips
> > + * before i.mx6ul.
> > + */
> > + if (spi_imx->usedma) {
> > + if (spi_imx->devtype_data->devtype == IMX6UL_ECSPI)
> >
> > =, otherwise the workaround might be applied to later generations
> > of
> the core if more are added later.
>
> Regards,
> Lucas
Understood your point, but for now choose different compatible name
could apply this workaround or not. I prefer to leave it for next ecspi
IP upgrade if it really come in the future, otherwise that '>=' bring
a little bit confuse if there is no update for ecspi IP...
^ permalink raw reply
* Re: [EXT] Re: [PATCH v2 06/15] spi: imx: fix ERR009165
From: Robin Gong @ 2019-04-26 9:22 UTC (permalink / raw)
To: robh+dt@kernel.org, u.kleine-koenig@pengutronix.de,
festevam@gmail.com, plyatov@gmail.com, broonie@kernel.org,
dan.j.williams@intel.com, mark.rutland@arm.com,
catalin.marinas@arm.com, will.deacon@arm.com, shawnguo@kernel.org,
l.stach@pengutronix.de, s.hauer@pengutronix.de
Cc: linux-spi@vger.kernel.org, linux-kernel@vger.kernel.org,
dmaengine@vger.kernel.org, dl-linux-imx, kernel@pengutronix.de,
devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org
In-Reply-To: <1556269635.2584.23.camel@pengutronix.de>
On 2019-04-26 at 11:07 +0200, Lucas Stach wrote:
>
> Am Freitag, den 26.04.2019, 08:05 +0000 schrieb Robin Gong:
> > static void mx51_setup_wml(struct spi_imx_data *spi_imx)
> > {
> > + u32 tx_wml = 0;
> > +
> With a wml of 0 you might set the maxburst of the TX dma channel to
> fifosize to minimize the performance impact of this workaround.
>
> Regards,
> Lucas
Unfortunately, this is a MUST part of errata which cause performance
drop.
^ permalink raw reply
* [v2,06/15] spi: imx: fix ERR009165
From: Robin Gong @ 2019-04-26 9:22 UTC (permalink / raw)
To: robh+dt@kernel.org, u.kleine-koenig@pengutronix.de,
festevam@gmail.com, plyatov@gmail.com, broonie@kernel.org,
dan.j.williams@intel.com, mark.rutland@arm.com,
catalin.marinas@arm.com, will.deacon@arm.com, shawnguo@kernel.org,
l.stach@pengutronix.de, s.hauer@pengutronix.de
Cc: linux-spi@vger.kernel.org, linux-kernel@vger.kernel.org,
dmaengine@vger.kernel.org, dl-linux-imx, kernel@pengutronix.de,
devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org
On 2019-04-26 at 11:07 +0200, Lucas Stach wrote:
>
> Am Freitag, den 26.04.2019, 08:05 +0000 schrieb Robin Gong:
> > static void mx51_setup_wml(struct spi_imx_data *spi_imx)
> > {
> > + u32 tx_wml = 0;
> > +
> With a wml of 0 you might set the maxburst of the TX dma channel to
> fifosize to minimize the performance impact of this workaround.
>
> Regards,
> Lucas
Unfortunately, this is a MUST part of errata which cause performance
drop.
^ permalink raw reply
* Re: [PATCH v2 07/15] spi: imx: remove ERR009165 workaround on i.mx6ul
From: Lucas Stach @ 2019-04-26 9:09 UTC (permalink / raw)
To: Robin Gong, broonie@kernel.org, shawnguo@kernel.org,
s.hauer@pengutronix.de, festevam@gmail.com, robh+dt@kernel.org,
mark.rutland@arm.com, u.kleine-koenig@pengutronix.de,
plyatov@gmail.com, dan.j.williams@intel.com,
catalin.marinas@arm.com, will.deacon@arm.com
Cc: devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-spi@vger.kernel.org, dl-linux-imx, kernel@pengutronix.de,
dmaengine@vger.kernel.org, linux-arm-kernel@lists.infradead.org
In-Reply-To: <1556265512-9130-8-git-send-email-yibin.gong@nxp.com>
Am Freitag, den 26.04.2019, 08:05 +0000 schrieb Robin Gong:
> ERR009165 fix on i.mx6ul and next chip, such as i.mx6ull/i.mx8mq/i.mx8mm.
> Remove workaround on those chips. Add new i.mx6ul type for that.
>
> Signed-off-by: Robin Gong <yibin.gong@nxp.com>
> ---
> drivers/spi/spi-imx.c | 39 +++++++++++++++++++++++++++++++++++----
> 1 file changed, 35 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/spi/spi-imx.c b/drivers/spi/spi-imx.c
> index eb56eac..2e5e978 100644
> --- a/drivers/spi/spi-imx.c
> +++ b/drivers/spi/spi-imx.c
> @@ -57,6 +57,7 @@ enum spi_imx_devtype {
> IMX35_CSPI, /* CSPI on all i.mx except above */
> IMX51_ECSPI, /* ECSPI on i.mx51 */
> IMX53_ECSPI, /* ECSPI on i.mx53 and later */
> + IMX6UL_ECSPI, /* ERR009165 fix from i.mx6ul */
> };
>
> struct spi_imx_data;
> @@ -128,7 +129,8 @@ static inline int is_imx35_cspi(struct spi_imx_data *d)
>
> static inline int is_imx51_ecspi(struct spi_imx_data *d)
> {
> - return d->devtype_data->devtype == IMX51_ECSPI;
> + return d->devtype_data->devtype == IMX51_ECSPI ||
> + d->devtype_data->devtype == IMX6UL_ECSPI;
> }
>
> static inline int is_imx53_ecspi(struct spi_imx_data *d)
> @@ -585,9 +587,16 @@ static int mx51_ecspi_prepare_transfer(struct spi_imx_data *spi_imx,
> ctrl |= mx51_ecspi_clkdiv(spi_imx, t->speed_hz, &clk);
> spi_imx->spi_bus_clk = clk;
>
> - /* ERR009165: work in XHC mode as PIO */
> - if (spi_imx->usedma)
> - ctrl &= ~MX51_ECSPI_CTRL_SMC;
> + /*
> + * ERR009165: work in XHC mode instead of SMC as PIO on the chips
> + * before i.mx6ul.
> + */
> + if (spi_imx->usedma) {
> + if (spi_imx->devtype_data->devtype == IMX6UL_ECSPI)
>=, otherwise the workaround might be applied to later generations of
the core if more are added later.
Regards,
Lucas
^ permalink raw reply
* [v2,07/15] spi: imx: remove ERR009165 workaround on i.mx6ul
From: Lucas Stach @ 2019-04-26 9:09 UTC (permalink / raw)
To: Robin Gong, broonie@kernel.org, shawnguo@kernel.org,
s.hauer@pengutronix.de, festevam@gmail.com, robh+dt@kernel.org,
mark.rutland@arm.com, u.kleine-koenig@pengutronix.de,
plyatov@gmail.com, dan.j.williams@intel.com,
catalin.marinas@arm.com, will.deacon@arm.com
Cc: devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-spi@vger.kernel.org, dl-linux-imx, kernel@pengutronix.de,
dmaengine@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Am Freitag, den 26.04.2019, 08:05 +0000 schrieb Robin Gong:
> ERR009165 fix on i.mx6ul and next chip, such as i.mx6ull/i.mx8mq/i.mx8mm.
> Remove workaround on those chips. Add new i.mx6ul type for that.
>
> Signed-off-by: Robin Gong <yibin.gong@nxp.com>
> ---
> drivers/spi/spi-imx.c | 39 +++++++++++++++++++++++++++++++++++----
> 1 file changed, 35 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/spi/spi-imx.c b/drivers/spi/spi-imx.c
> index eb56eac..2e5e978 100644
> --- a/drivers/spi/spi-imx.c
> +++ b/drivers/spi/spi-imx.c
> @@ -57,6 +57,7 @@ enum spi_imx_devtype {
> IMX35_CSPI, /* CSPI on all i.mx except above */
> IMX51_ECSPI, /* ECSPI on i.mx51 */
> IMX53_ECSPI, /* ECSPI on i.mx53 and later */
> + IMX6UL_ECSPI, /* ERR009165 fix from i.mx6ul */
> };
>
> struct spi_imx_data;
> @@ -128,7 +129,8 @@ static inline int is_imx35_cspi(struct spi_imx_data *d)
>
> static inline int is_imx51_ecspi(struct spi_imx_data *d)
> {
> - return d->devtype_data->devtype == IMX51_ECSPI;
> + return d->devtype_data->devtype == IMX51_ECSPI ||
> + d->devtype_data->devtype == IMX6UL_ECSPI;
> }
>
> static inline int is_imx53_ecspi(struct spi_imx_data *d)
> @@ -585,9 +587,16 @@ static int mx51_ecspi_prepare_transfer(struct spi_imx_data *spi_imx,
> ctrl |= mx51_ecspi_clkdiv(spi_imx, t->speed_hz, &clk);
> spi_imx->spi_bus_clk = clk;
>
> - /* ERR009165: work in XHC mode as PIO */
> - if (spi_imx->usedma)
> - ctrl &= ~MX51_ECSPI_CTRL_SMC;
> + /*
> + * ERR009165: work in XHC mode instead of SMC as PIO on the chips
> + * before i.mx6ul.
> + */
> + if (spi_imx->usedma) {
> + if (spi_imx->devtype_data->devtype == IMX6UL_ECSPI)
>=, otherwise the workaround might be applied to later generations of
the core if more are added later.
Regards,
Lucas
^ permalink raw reply
* Re: [PATCH v2 06/15] spi: imx: fix ERR009165
From: Lucas Stach @ 2019-04-26 9:07 UTC (permalink / raw)
To: Robin Gong, broonie@kernel.org, shawnguo@kernel.org,
s.hauer@pengutronix.de, festevam@gmail.com, robh+dt@kernel.org,
mark.rutland@arm.com, u.kleine-koenig@pengutronix.de,
plyatov@gmail.com, dan.j.williams@intel.com,
catalin.marinas@arm.com, will.deacon@arm.com
Cc: devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-spi@vger.kernel.org, dl-linux-imx, kernel@pengutronix.de,
dmaengine@vger.kernel.org, linux-arm-kernel@lists.infradead.org
In-Reply-To: <1556265512-9130-7-git-send-email-yibin.gong@nxp.com>
Am Freitag, den 26.04.2019, 08:05 +0000 schrieb Robin Gong:
> Change to XCH mode even in dma mode, please refer to the below
> errata:
> https://www.nxp.com/docs/en/errata/IMX6DQCE.pdf
>
> Signed-off-by: Robin Gong <yibin.gong@nxp.com>
> ---
> drivers/spi/spi-imx.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/spi/spi-imx.c b/drivers/spi/spi-imx.c
> index 09c9a1e..eb56eac 100644
> --- a/drivers/spi/spi-imx.c
> +++ b/drivers/spi/spi-imx.c
> @@ -585,8 +585,9 @@ static int mx51_ecspi_prepare_transfer(struct
> spi_imx_data *spi_imx,
> ctrl |= mx51_ecspi_clkdiv(spi_imx, t->speed_hz, &clk);
> spi_imx->spi_bus_clk = clk;
>
> + /* ERR009165: work in XHC mode as PIO */
> if (spi_imx->usedma)
> - ctrl |= MX51_ECSPI_CTRL_SMC;
> + ctrl &= ~MX51_ECSPI_CTRL_SMC;
>
> writel(ctrl, spi_imx->base + MX51_ECSPI_CTRL);
>
> @@ -612,12 +613,14 @@ static int mx51_ecspi_prepare_transfer(struct
> spi_imx_data *spi_imx,
>
> static void mx51_setup_wml(struct spi_imx_data *spi_imx)
> {
> + u32 tx_wml = 0;
> +
With a wml of 0 you might set the maxburst of the TX dma channel to
fifosize to minimize the performance impact of this workaround.
Regards,
Lucas
> /*
> * Configure the DMA register: setup the watermark
> * and enable DMA request.
> */
> writel(MX51_ECSPI_DMA_RX_WML(spi_imx->wml - 1) |
> - MX51_ECSPI_DMA_TX_WML(spi_imx->wml) |
> + MX51_ECSPI_DMA_TX_WML(tx_wml) |
> MX51_ECSPI_DMA_RXT_WML(spi_imx->wml) |
> MX51_ECSPI_DMA_TEDEN | MX51_ECSPI_DMA_RXDEN |
> MX51_ECSPI_DMA_RXTDEN, spi_imx->base +
> MX51_ECSPI_DMA);
> @@ -1265,10 +1268,6 @@ static int spi_imx_sdma_init(struct device
> *dev, struct spi_imx_data *spi_imx,
> {
> int ret;
>
> - /* use pio mode for i.mx6dl chip TKT238285 */
> - if (of_machine_is_compatible("fsl,imx6dl"))
> - return 0;
> -
> spi_imx->wml = spi_imx->devtype_data->fifo_size / 2;
>
> /* Prepare for TX DMA: */
> --
> 2.7.4
>
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox