From: Ulf Hansson <ulf.hansson@stericsson.com>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Stefan NILSSON9 <stefan.xk.nilsson@stericsson.com>,
"linux-mmc@vger.kernel.org" <linux-mmc@vger.kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
Lee Jones <lee.jones@linaro.org>
Subject: Re: [PATCH 1/4] mmc: mmci: Bugfix in pio read for small packets
Date: Fri, 7 Oct 2011 15:38:44 +0200 [thread overview]
Message-ID: <4E8F00E4.5010702@stericsson.com> (raw)
In-Reply-To: <4E895F6A.9070203@stericsson.com>
Stefan NILSSON9 wrote:
> Hi Russel,
>
> On 10/01/2011 06:09 PM, Russell King - ARM Linux wrote:
>> Does this even work? From the MMCI spec, I see no way for the MMCI
>> peripheral to know the size of the read/write on the APB bus.
>>
>> The APB bus signals the MMCI uses are:
>>
>> PCLK - APB bus clock
>> PRESETn - APB bus reset
>> PADDR[11:2] - APB bus address
>> PSEL - APB bus peripheral select
>> PENABLE - APB bus enable
>> PWRITE - APB bus write signal
>> PWDATA[31:0] - APB bus write data
>> PRDATA[31:0] - APB bus read data
>>
>> As you can see, nothing in that set indicates whether it's an 8-bit,
>> 16-bit or 32-bit access.
>
>> Moreover, if you read the MMCIFifoCnt register writeup:
>>
>> The MMCIFifoCnt register contains the remaining number of words to be
>> written to or read from the FIFO. The FIFO counter loads the value
>> from the data length register (see Data length register, MMCIDataLength
>> on page 3-11) when the Enable bit is set in the data control register.
>> If the data length is not word aligned (multiple of 4), the remaining
>> 1 to 3 bytes are regarded as a word.
>>
>> This suggests that we should be reading a 32-bit word and then storing
>> the relevant bytes from it.
>
> Hmm, that is true. I will try to rework the patch to skip the case and
> only do 32 bit reads. We do however need a patch since the current
> "count" calculation is wrong (more info below). But it does in fact work.
>
>> The other thing which concerns me is that the MMCI (ARM Ltd one at least)
>> only supports power-of-two block sizes. So requesting a transfer of a
>> single block with a block size of 3 bytes is not supported by the ARM Ltd
>> MMCI. (The way you end up with 1 to 3 bytes being received with ARM's
>> MMCI is if you're using a streaming transfer.)
>
> This is true for the standard ARM block. We have however done some
> modifications of the standard block and added support for SDIO which
> also implies streaming transfers which can be non power of 2. So with
> the current implementation:
>
> readsl(base + MMCIFIFO, ptr, count >> 2);
>
> when we get a request to read 3, 2 or 1 bytes (which our WLAN driver
> actually requests from time to time), this will result in reading 0
> words which is not correct. This was the original problem, but I might
> have "overdone" the solution a bit. I will upload a new patch which
> solves the original problem in a better way.
>
>> The last thing I don't like about this patch is that this code is in a
>> really hot path - one which is absolutely critical for things to work -
>> and the need for the condition to be dealt with is only at the end of a
>> transfer, not each time the FIFO needs emptying.
In this hot path we already do a read of the FIFOCNT register for every
loop in pio_read, won't this sometimes cause caches to flush and
similar, thus cost quite a lot - at least a lot more than executing a
switch/if sentence like Stefan added? Or do I miss something?
I were also thinking of a possible optimization of minimizing the total
numbers of reads of the FIFOCNT register in pio_read. Basically we can
make use of the RXFIFOHALFFULL irq/status to know when there is a
"burst" available in the FIFO. Do you think this will be feasible for
the ARM MMCI Pl18x IP as well? I mean the consequence of using
RXFIFOHALFFULL will be less numbers of IRQ raised and then when reading
data from the FIFO it will be done in larger chunks.
>>
>> Bear in mind that there are platforms with the ARM MMCI which must read
>> the data within a certain time or suffer overruns, and which have either
>> totally broken and useless DMA or no DMA capability at all (which are
>> the only platforms I have with a MMCI on.)
We could make use of the "likely" makro to make compiler optimizations
of the code, is that a way forward do you think?
>
> I will keep this in mind. We are fortunately blessed with a working DMA
> in our platform, but for small SDIO transfers there is no point in
> setting up a DMA job which is when we fall back to PIO mode.
>
> Best Regards
>
> Stefan Nilsson
>
WARNING: multiple messages have this Message-ID (diff)
From: ulf.hansson@stericsson.com (Ulf Hansson)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 1/4] mmc: mmci: Bugfix in pio read for small packets
Date: Fri, 7 Oct 2011 15:38:44 +0200 [thread overview]
Message-ID: <4E8F00E4.5010702@stericsson.com> (raw)
In-Reply-To: <4E895F6A.9070203@stericsson.com>
Stefan NILSSON9 wrote:
> Hi Russel,
>
> On 10/01/2011 06:09 PM, Russell King - ARM Linux wrote:
>> Does this even work? From the MMCI spec, I see no way for the MMCI
>> peripheral to know the size of the read/write on the APB bus.
>>
>> The APB bus signals the MMCI uses are:
>>
>> PCLK - APB bus clock
>> PRESETn - APB bus reset
>> PADDR[11:2] - APB bus address
>> PSEL - APB bus peripheral select
>> PENABLE - APB bus enable
>> PWRITE - APB bus write signal
>> PWDATA[31:0] - APB bus write data
>> PRDATA[31:0] - APB bus read data
>>
>> As you can see, nothing in that set indicates whether it's an 8-bit,
>> 16-bit or 32-bit access.
>
>> Moreover, if you read the MMCIFifoCnt register writeup:
>>
>> The MMCIFifoCnt register contains the remaining number of words to be
>> written to or read from the FIFO. The FIFO counter loads the value
>> from the data length register (see Data length register, MMCIDataLength
>> on page 3-11) when the Enable bit is set in the data control register.
>> If the data length is not word aligned (multiple of 4), the remaining
>> 1 to 3 bytes are regarded as a word.
>>
>> This suggests that we should be reading a 32-bit word and then storing
>> the relevant bytes from it.
>
> Hmm, that is true. I will try to rework the patch to skip the case and
> only do 32 bit reads. We do however need a patch since the current
> "count" calculation is wrong (more info below). But it does in fact work.
>
>> The other thing which concerns me is that the MMCI (ARM Ltd one at least)
>> only supports power-of-two block sizes. So requesting a transfer of a
>> single block with a block size of 3 bytes is not supported by the ARM Ltd
>> MMCI. (The way you end up with 1 to 3 bytes being received with ARM's
>> MMCI is if you're using a streaming transfer.)
>
> This is true for the standard ARM block. We have however done some
> modifications of the standard block and added support for SDIO which
> also implies streaming transfers which can be non power of 2. So with
> the current implementation:
>
> readsl(base + MMCIFIFO, ptr, count >> 2);
>
> when we get a request to read 3, 2 or 1 bytes (which our WLAN driver
> actually requests from time to time), this will result in reading 0
> words which is not correct. This was the original problem, but I might
> have "overdone" the solution a bit. I will upload a new patch which
> solves the original problem in a better way.
>
>> The last thing I don't like about this patch is that this code is in a
>> really hot path - one which is absolutely critical for things to work -
>> and the need for the condition to be dealt with is only at the end of a
>> transfer, not each time the FIFO needs emptying.
In this hot path we already do a read of the FIFOCNT register for every
loop in pio_read, won't this sometimes cause caches to flush and
similar, thus cost quite a lot - at least a lot more than executing a
switch/if sentence like Stefan added? Or do I miss something?
I were also thinking of a possible optimization of minimizing the total
numbers of reads of the FIFOCNT register in pio_read. Basically we can
make use of the RXFIFOHALFFULL irq/status to know when there is a
"burst" available in the FIFO. Do you think this will be feasible for
the ARM MMCI Pl18x IP as well? I mean the consequence of using
RXFIFOHALFFULL will be less numbers of IRQ raised and then when reading
data from the FIFO it will be done in larger chunks.
>>
>> Bear in mind that there are platforms with the ARM MMCI which must read
>> the data within a certain time or suffer overruns, and which have either
>> totally broken and useless DMA or no DMA capability at all (which are
>> the only platforms I have with a MMCI on.)
We could make use of the "likely" makro to make compiler optimizations
of the code, is that a way forward do you think?
>
> I will keep this in mind. We are fortunately blessed with a working DMA
> in our platform, but for small SDIO transfers there is no point in
> setting up a DMA job which is when we fall back to PIO mode.
>
> Best Regards
>
> Stefan Nilsson
>
next prev parent reply other threads:[~2011-10-07 13:39 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-09-27 7:46 [PATCH 1/4] mmc: mmci: Bugfix in pio read for small packets Ulf Hansson
2011-09-27 7:46 ` Ulf Hansson
2011-10-01 16:09 ` Russell King - ARM Linux
2011-10-01 16:09 ` Russell King - ARM Linux
2011-10-03 7:08 ` Stefan Nilsson XK
2011-10-03 7:08 ` Stefan Nilsson XK
2011-10-07 13:38 ` Ulf Hansson [this message]
2011-10-07 13:38 ` Ulf Hansson
2011-10-07 19:11 ` Russell King - ARM Linux
2011-10-07 19:11 ` Russell King - ARM Linux
2011-10-14 7:38 ` Stefan Nilsson XK
2011-10-14 7:38 ` Stefan Nilsson XK
2011-10-07 13:45 ` Ulf Hansson
2011-10-07 13:45 ` Ulf Hansson
2011-10-08 9:10 ` Russell King - ARM Linux
2011-10-08 9:10 ` Russell King - ARM Linux
2011-10-09 6:59 ` Linus Walleij
2011-10-09 6:59 ` Linus Walleij
2011-10-10 8:23 ` Ulf Hansson
2011-10-10 8:23 ` Ulf Hansson
2011-10-13 15:48 ` Russell King - ARM Linux
2011-10-13 15:48 ` Russell King - ARM Linux
2011-10-14 8:07 ` Ulf Hansson
2011-10-14 8:07 ` Ulf Hansson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4E8F00E4.5010702@stericsson.com \
--to=ulf.hansson@stericsson.com \
--cc=lee.jones@linaro.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-mmc@vger.kernel.org \
--cc=linux@arm.linux.org.uk \
--cc=stefan.xk.nilsson@stericsson.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.