From: Stefan Roese <sr-ynQEQJNshbs@public.gmane.org>
To: Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>
Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
Mark Brown <broonie-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Thomas Petazzoni
<thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>,
Gregory CLEMENT
<gregory.clement-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>,
Andrew Lunn <andrew-g2DYL2Zd6BY@public.gmane.org>,
linux-spi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH v5] spi: orion.c: Add direct access mode
Date: Wed, 20 Apr 2016 09:38:29 +0200 [thread overview]
Message-ID: <571731F5.2010809@denx.de> (raw)
In-Reply-To: <3956609.3ttDzXh866@wuerfel>
On 19.04.2016 15:41, Arnd Bergmann wrote:
> On Tuesday 19 April 2016 11:42:25 Stefan Roese wrote:
>> On 18.04.2016 15:57, Arnd Bergmann wrote:
>>> On Monday 18 April 2016 15:00:52 Stefan Roese wrote:
>>>> On 18.04.2016 13:32, Arnd Bergmann wrote:
>>> Ok, so I looked up the datasheet again and understood now that it
>>> actually does both ways: a) the direct read/write that I had always assumed
>>> it did with automatic cmd/addr/data generation, and b) the mode where
>>> you do each write transfer separately as implemented by your patch.
>>>
>>> Those two modes seem to be wildly different in their needs for address
>>> space:
>>>
>>> - For a) we definitely need the mapping to be the same size as the
>>> addressable storage of the SPI-NOR (or other device that fits
>>> into the cmd/addr/data pattern supported by the hardware)
>>>
>>> - For b) the address is ignored and at most 32 bytes are transfered,
>>> so whatever the smallest MBUS mapping window size is would certainly
>>> be enough.
>>>
>>> As a side-note, I see that you use a regular devm_ioremap_resource(),
>>> which I assume prevents the transfers from ever being more than
>>> a single 4-byte word, while burst mode with up to 32 bytes likely
>>> requires a write-combining mapping.
>>
>> Thanks for the hint. I've tested again with devm_ioremap_wc() and the
>> resulting SPI TX time is identical to the one using the driver with
>> the uncached (non write-combined) mapping. This is most likely a
>> result of the already fully saturated SPI bus speed being the
>> bottle-neck here.
>
> Ok
>
>> A bit more testing has shown, that using devm_ioremap_wc() leads
>> to sporadic failures in the FPGA image downloading though. Now
>> I'm unsure, if its really worth to add some cache flushing after
>> the memcpy() call here (flush_cache_vmap ?) or better keep the
>> uncached ioremap in place.
>
> The cache is not involved here, as devm_ioremap_wc is still uncached.
> However it's possible that memcpy has other issues here. If you
> don't gain anything from sending more than 32 bits at a time,
> writesl() might be a better alternative, but it requires the data
> to be 32-bit aligned in RAM. If the buffer is not aligned, the
> memcpy() might already cause problems in case the kernel
> implementation does not send the data in the correct order.
>
> memcpy_toio() could also be helpful here.
>
>>>>> This means that the 1MB window is probably a reasonable size (provided
>>>>> that the (most importantly) the SPI-NOR driver can guarantee that it
>>>>> never exceeds this).
>>>>
>>>> I have no problem switching from 1MiB to using 0xffffffff in the
>>>> SPI controller 'reg' node to allow even bigger windows in v6.
>>>
>>> How do you decide how much of the window to map though? Using 0xffffffff
>>> is nice because it better represents what the hardware looks like
>>> and doesn't limit you from having arbitrarily large mbus windows
>>> based on your needs, but you'd probably have to try mapping various
>>> sizes then to find the maximum supported one, or you'd have to do
>>> some ugly parsing of the parent node ranges.
>>
>> This is for the single-window mode, right? My comment above was
>> still referring to the original implementation. And yes, this
>> gets a bit complicated and is most likely not really necessary.
>> So I would prefer to keep the mapping fixed to a max. of 1MiB for
>> now.
>
> No, I also meant the separate windows here. Based on the discussion
> above, I think the driver can ideally just ioremap 4KB for devices
> other than SPI-NOR, and then you can list the 0xffffffff length.
>
>>> In this example, you have to configure the SPI controller to have 1MB per CS
>>> in single-window mode using the "SPI DecodeMode" and "SPI CS Adress Decode"
>>> registers. The mbus then contains two mappings, one 2MB window spanning
>>> CS3 and CS4 in a continuous CPU range, and one 64K window for a device at
>>> CS6, making it as small as possible to conserve CPU address space.
>>
>> Thanks. I've prepared a new version implementing this single-
>> window mode. Please let me know how you prefer the cache issue
>> above to be handled (uncached mapping, flush or ?) and I'll
>> update the patch accordingly.
>
> If a write-combining mapping doesn't help you here, just use the normal
> ioremap as before.
Okay, switching back to the normal ioremap.
> The single-window mode won't really work with SPI-NOR unless you make each
> window large enough to hold the largest possible flash (128MB with
> single-window, 256MB with double-window), but that size makes it harder
> to gain much from saving mbus windows when you trade them for gigantic
> CPU address space consumption.
>
> Therefore, we probably want the separate-window mode to allow the
> combination of large SPI flashes with other SPI devices in direct-access
> mode. We can also implement the single (and/or double) window mode
> in addition to that, but realistically we probably don't need that
> as all machines we support in the kernel today have at most one
> non-flash device.
Yes, this is also my feeling. While implementing this 0xffffffff
size in the SPI controller 'reg' DT node, I just stumbled over the
following problem: The resulting size of the area to map
(of_address_to_resource) is not the minimum of the 'soc' 'ranges'
entry and this 0xffffffff as I would have expected. But its always
0xffffffff. Do you know if this is general problem with this
approach or perhaps a problem / bug in the DT address probing /
translation?
Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
WARNING: multiple messages have this Message-ID (diff)
From: sr@denx.de (Stefan Roese)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5] spi: orion.c: Add direct access mode
Date: Wed, 20 Apr 2016 09:38:29 +0200 [thread overview]
Message-ID: <571731F5.2010809@denx.de> (raw)
In-Reply-To: <3956609.3ttDzXh866@wuerfel>
On 19.04.2016 15:41, Arnd Bergmann wrote:
> On Tuesday 19 April 2016 11:42:25 Stefan Roese wrote:
>> On 18.04.2016 15:57, Arnd Bergmann wrote:
>>> On Monday 18 April 2016 15:00:52 Stefan Roese wrote:
>>>> On 18.04.2016 13:32, Arnd Bergmann wrote:
>>> Ok, so I looked up the datasheet again and understood now that it
>>> actually does both ways: a) the direct read/write that I had always assumed
>>> it did with automatic cmd/addr/data generation, and b) the mode where
>>> you do each write transfer separately as implemented by your patch.
>>>
>>> Those two modes seem to be wildly different in their needs for address
>>> space:
>>>
>>> - For a) we definitely need the mapping to be the same size as the
>>> addressable storage of the SPI-NOR (or other device that fits
>>> into the cmd/addr/data pattern supported by the hardware)
>>>
>>> - For b) the address is ignored and at most 32 bytes are transfered,
>>> so whatever the smallest MBUS mapping window size is would certainly
>>> be enough.
>>>
>>> As a side-note, I see that you use a regular devm_ioremap_resource(),
>>> which I assume prevents the transfers from ever being more than
>>> a single 4-byte word, while burst mode with up to 32 bytes likely
>>> requires a write-combining mapping.
>>
>> Thanks for the hint. I've tested again with devm_ioremap_wc() and the
>> resulting SPI TX time is identical to the one using the driver with
>> the uncached (non write-combined) mapping. This is most likely a
>> result of the already fully saturated SPI bus speed being the
>> bottle-neck here.
>
> Ok
>
>> A bit more testing has shown, that using devm_ioremap_wc() leads
>> to sporadic failures in the FPGA image downloading though. Now
>> I'm unsure, if its really worth to add some cache flushing after
>> the memcpy() call here (flush_cache_vmap ?) or better keep the
>> uncached ioremap in place.
>
> The cache is not involved here, as devm_ioremap_wc is still uncached.
> However it's possible that memcpy has other issues here. If you
> don't gain anything from sending more than 32 bits at a time,
> writesl() might be a better alternative, but it requires the data
> to be 32-bit aligned in RAM. If the buffer is not aligned, the
> memcpy() might already cause problems in case the kernel
> implementation does not send the data in the correct order.
>
> memcpy_toio() could also be helpful here.
>
>>>>> This means that the 1MB window is probably a reasonable size (provided
>>>>> that the (most importantly) the SPI-NOR driver can guarantee that it
>>>>> never exceeds this).
>>>>
>>>> I have no problem switching from 1MiB to using 0xffffffff in the
>>>> SPI controller 'reg' node to allow even bigger windows in v6.
>>>
>>> How do you decide how much of the window to map though? Using 0xffffffff
>>> is nice because it better represents what the hardware looks like
>>> and doesn't limit you from having arbitrarily large mbus windows
>>> based on your needs, but you'd probably have to try mapping various
>>> sizes then to find the maximum supported one, or you'd have to do
>>> some ugly parsing of the parent node ranges.
>>
>> This is for the single-window mode, right? My comment above was
>> still referring to the original implementation. And yes, this
>> gets a bit complicated and is most likely not really necessary.
>> So I would prefer to keep the mapping fixed to a max. of 1MiB for
>> now.
>
> No, I also meant the separate windows here. Based on the discussion
> above, I think the driver can ideally just ioremap 4KB for devices
> other than SPI-NOR, and then you can list the 0xffffffff length.
>
>>> In this example, you have to configure the SPI controller to have 1MB per CS
>>> in single-window mode using the "SPI DecodeMode" and "SPI CS Adress Decode"
>>> registers. The mbus then contains two mappings, one 2MB window spanning
>>> CS3 and CS4 in a continuous CPU range, and one 64K window for a device at
>>> CS6, making it as small as possible to conserve CPU address space.
>>
>> Thanks. I've prepared a new version implementing this single-
>> window mode. Please let me know how you prefer the cache issue
>> above to be handled (uncached mapping, flush or ?) and I'll
>> update the patch accordingly.
>
> If a write-combining mapping doesn't help you here, just use the normal
> ioremap as before.
Okay, switching back to the normal ioremap.
> The single-window mode won't really work with SPI-NOR unless you make each
> window large enough to hold the largest possible flash (128MB with
> single-window, 256MB with double-window), but that size makes it harder
> to gain much from saving mbus windows when you trade them for gigantic
> CPU address space consumption.
>
> Therefore, we probably want the separate-window mode to allow the
> combination of large SPI flashes with other SPI devices in direct-access
> mode. We can also implement the single (and/or double) window mode
> in addition to that, but realistically we probably don't need that
> as all machines we support in the kernel today have at most one
> non-flash device.
Yes, this is also my feeling. While implementing this 0xffffffff
size in the SPI controller 'reg' DT node, I just stumbled over the
following problem: The resulting size of the area to map
(of_address_to_resource) is not the minimum of the 'soc' 'ranges'
entry and this 0xffffffff as I would have expected. But its always
0xffffffff. Do you know if this is general problem with this
approach or perhaps a problem / bug in the DT address probing /
translation?
Thanks,
Stefan
next prev parent reply other threads:[~2016-04-20 7:38 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-18 10:13 [PATCH v5] spi: orion.c: Add direct access mode Stefan Roese
2016-04-18 10:13 ` Stefan Roese
[not found] ` <1460974417-32375-1-git-send-email-sr-ynQEQJNshbs@public.gmane.org>
2016-04-18 10:51 ` Arnd Bergmann
2016-04-18 10:51 ` Arnd Bergmann
2016-04-18 11:04 ` Mark Brown
2016-04-18 11:04 ` Mark Brown
[not found] ` <20160418110415.GT3217-GFdadSzt00ze9xe1eoZjHA@public.gmane.org>
2016-04-18 11:32 ` Arnd Bergmann
2016-04-18 11:32 ` Arnd Bergmann
2016-04-18 13:00 ` Stefan Roese
2016-04-18 13:00 ` Stefan Roese
[not found] ` <5714DA84.2050505-ynQEQJNshbs@public.gmane.org>
2016-04-18 13:57 ` Arnd Bergmann
2016-04-18 13:57 ` Arnd Bergmann
2016-04-19 9:42 ` Stefan Roese
2016-04-19 9:42 ` Stefan Roese
[not found] ` <5715FD81.8090100-ynQEQJNshbs@public.gmane.org>
2016-04-19 13:41 ` Arnd Bergmann
2016-04-19 13:41 ` Arnd Bergmann
2016-04-20 7:38 ` Stefan Roese [this message]
2016-04-20 7:38 ` Stefan Roese
[not found] ` <571731F5.2010809-ynQEQJNshbs@public.gmane.org>
2016-04-20 8:11 ` Arnd Bergmann
2016-04-20 8:11 ` Arnd Bergmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=571731F5.2010809@denx.de \
--to=sr-ynqeqjnshbs@public.gmane.org \
--cc=andrew-g2DYL2Zd6BY@public.gmane.org \
--cc=arnd-r2nGTMty4D4@public.gmane.org \
--cc=broonie-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
--cc=gregory.clement-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org \
--cc=linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org \
--cc=linux-spi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=thomas.petazzoni-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.