From: Mason <slash.tmp@free.fr>
To: linux-serial@vger.kernel.org
Cc: LKML <linux-kernel@vger.kernel.org>,
Peter Hurley <peter@hurleysoftware.com>,
Mans Rullgard <mans@mansr.com>
Subject: Hardware spec prevents optimal performance in device driver
Date: Sat, 09 May 2015 12:22:43 +0200
Message-ID: <554DDFF3.5060906@free.fr>

Hello everyone,

I'm writing a device driver for a serial-ish kind of device.
I'm interested in the TX side of the problem. (I'm working on
an ARM Cortex-A9 system, by the way.)

There's a 16-byte TX FIFO. Data is queued to the FIFO by writing
{1,2,4} bytes to a TX{8,16,32} memory-mapped register.
Reading the TX_DEPTH register returns the current queue depth.
The TX_READY IRQ is asserted when (and only when) TX_DEPTH
transitions from 1 to 0.
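
To make these rules concrete, here is a minimal software model of the TX
side, written only from the description above. All names (tx_model,
tx_model_write, tx_model_send_byte) are hypothetical, and rejecting a write
that does not fit in the FIFO is an assumption; the spec quoted here does not
say what the hardware does on overflow.

```c
#include <stdbool.h>

#define TX_FIFO_SIZE 16u

/* Hypothetical model of the TX FIFO, for reasoning only (not driver code). */
struct tx_model {
    unsigned depth;      /* what a read of TX_DEPTH would return */
    unsigned ready_irqs; /* number of TX_READY assertions so far */
};

/*
 * Model a write of n bytes (1, 2 or 4) to TX8/TX16/TX32.
 * ASSUMPTION: a write that does not fit is refused; the real
 * overflow behaviour is not described in the spec.
 */
static bool tx_model_write(struct tx_model *m, unsigned n)
{
    if (n != 1 && n != 2 && n != 4)
        return false;
    if (m->depth + n > TX_FIFO_SIZE)
        return false;
    m->depth += n;
    return true;
}

/* Model the hardware shifting one byte out of the FIFO. */
static void tx_model_send_byte(struct tx_model *m)
{
    if (m->depth == 0)
        return;
    m->depth--;
    if (m->depth == 0)
        m->ready_irqs++; /* TX_READY: depth went from 1 to 0 */
}
```

In this model, TX_READY is counted exactly once per 1 -> 0 transition of the
depth, no matter how many bytes were queued before it.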
With this spec in mind, I don't see how it is possible to
attain optimal TX performance in the driver. There's a race
between the SW thread filling the queue and the HW thread
emptying it.

My first attempt went along these lines:

SW thread pseudo-code (blocking write)

    while (bytes_to_send > 16) {
        write 16 bytes to the queue /* NON ATOMIC */
        bytes_to_send -= 16;
        wait for semaphore
    }
    write the last bytes to the queue
    wait for semaphore

The simplest way to "write 16 bytes to the queue" is
a byte-access loop:

    for (i = 0; i < 16; ++i) write buf[i] to TX8

or, just slightly more complex:

    for (i = 0; i < 4; ++i) write buf[4*i .. 4*i+3] to TX32

But you see the problem: I write a byte, and then, for some
reason (a low clock frequency set by cpufreq, an interrupt), the CPU
takes a very long time to get to the next write, so TX_READY fires
before I even write the next byte.
In short, TX_READY could fire at any point while filling the queue.
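
The race can be played out with a small simulation. This is only a sketch:
the function name early_tx_ready and its parameters are hypothetical, and it
assumes the hardware sends nothing between writes except during one single
CPU stall, which is the simplest case that shows the problem.

```c
/*
 * Simulate filling the FIFO with n single-byte writes to TX8.
 * The CPU stalls once, right after it has written `stall_after`
 * bytes, and the hardware sends up to `sent_during_stall` bytes
 * during that stall.
 *
 * Returns how many times TX_READY fires before the fill is done,
 * i.e. how many "early" interrupts the driver would see.
 */
static unsigned early_tx_ready(unsigned n, unsigned stall_after,
                               unsigned sent_during_stall)
{
    unsigned depth = 0;
    unsigned early = 0;

    for (unsigned written = 1; written <= n; written++) {
        depth++; /* one write to TX8 */

        /* A stall after the last write is not a race: the fill is done. */
        if (written == stall_after && written < n) {
            for (unsigned s = 0; s < sent_during_stall && depth > 0; s++) {
                depth--;
                if (depth == 0)
                    early++; /* depth went 1 -> 0 mid-fill */
            }
        }
    }
    return early;
}
```

For example, early_tx_ready(16, 5, 5) returns 1: the FIFO drains completely
after the fifth byte, so TX_READY fires while 11 bytes are still waiting to
be queued. With early_tx_ready(16, 5, 4) the FIFO never reaches 0 during the
stall and no early interrupt is seen.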
In my opinion, the semantics of TX_READY are fuzzy. When I hit
the ISR, I just know that "the TX queue reached 0 at some point
in time" but the HW might still be working on sending some bytes.

It seems the best one can do is:

    while (bytes_to_send >= 4) {
        write 4 bytes to TX32 /* ATOMIC */
        bytes_to_send -= 4;
        wait for semaphore
    }
    while (bytes_to_send > 0) {
        write 1 byte to TX8 /* ATOMIC */
        bytes_to_send -= 1;
        wait for semaphore
    }

(This is ignoring the fact that the original buffer to
send may not be word-aligned; I will have to investigate
misaligned loads, or handle the first 0-3 bytes manually.)
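
Handling the first 0-3 bytes manually could look like the sketch below. It
only computes how a buffer splits into leading bytes (for TX8), aligned words
(for TX32) and trailing bytes (for TX8); the struct and function names are
hypothetical, and a 4-byte word size is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of how one buffer is split for transmission. */
struct tx_split {
    size_t head;  /* leading bytes sent one at a time until word-aligned */
    size_t words; /* aligned 32-bit words */
    size_t tail;  /* trailing bytes left after the last full word */
};

/* Split `len` bytes starting at `buf` around 4-byte alignment. */
static struct tx_split tx_split_buffer(const void *buf, size_t len)
{
    struct tx_split s;
    size_t misalign = (size_t)((uintptr_t)buf & 3u);

    s.head = (4u - misalign) & 3u; /* 0 if already aligned */
    if (s.head > len)
        s.head = len;              /* short buffer: all of it is "head" */
    s.words = (len - s.head) / 4u;
    s.tail = (len - s.head) % 4u;
    return s;
}
```

For a 10-byte buffer starting one byte past a word boundary, this gives
3 head bytes, 1 word and 3 tail bytes.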
In the solution proposed above, using atomic writes to
the device, I know that TX_READY signals "the work you
requested is now complete". But I have sacrificed
performance, as I will take an IRQ for every 4 bytes,
instead of one for every 16 bytes.
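
The cost can be put in numbers with a back-of-the-envelope count. The two
helpers below are hypothetical and rest on stated assumptions: the first
assumes the FIFO could be filled 16 bytes at a time with exactly one TX_READY
per fill (the ideal case, ignoring the race), the second assumes one TX_READY
per atomic write, with full words going to TX32 and only the 0-3 leftover
bytes going to TX8.

```c
#include <stddef.h>

/* Ideal case: one TX_READY per 16-byte FIFO fill (last fill may be short). */
static size_t irqs_fifo_fill(size_t len)
{
    return (len + 15) / 16;
}

/* Atomic writes: one TX_READY per TX32 word plus one per leftover TX8 byte. */
static size_t irqs_atomic_writes(size_t len)
{
    return len / 4 + len % 4;
}
```

For a 64-byte message this gives 4 interrupts in the ideal case against 16
with atomic writes, which is the factor of 4 mentioned above.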
Is this making any sense? Or am I completely mistaken?
Regards.