From: Shin-ichiro KAWASAKI <kawasaki@juno.dti.ne.jp>
To: "Edgar E. Iglesias" <edgar.iglesias@axis.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 2/3] sh: movca.l cancel by ocbi
Date: Mon, 12 Jan 2009 14:13:06 +0900 [thread overview]
Message-ID: <496AD162.1040808@juno.dti.ne.jp> (raw)
In-Reply-To: <20090111142221.GG26952@edgar.se.axis.com>
Thank you for your commmnts!
Edgar E. Iglesias wrote:
> On Sun, Jan 11, 2009 at 06:14:40PM +0900, Shin-ichiro KAWASAKI wrote:
>> Current sh4's "movca.l" instruction is implemented in a same way
>> as "mov.l". Then, write to memory cannot be canceled by "ocbi"
>> cache control instrunction. This makes text area broken on
>> cache flush by linux kernel.
>>
>> This patch delays "movca.l" execution and provide the chance to
>> cancel it for "ocbi".
>> # Thank you Edgar, for your advice!
>>
>> This patch does,
>> - on executing "movca.l", just records what and where movca
>> should store data.
>> - on executing "ocbi", find the corresponding record by movca,
>> and delete it to cancel.
>> - lets TCG produce "delayed_movca" instruction at the first
>> instruction which is neither "movca.l" nor "ocbi".
>> - on executing "delayed_movca", does the data store task,
>> according to the record.
>>
>
> Hello,
>
> I think your patch will catch the linux dflush sequence but I've got a
> few comments.
>
> There is a complication with the delayed stores and potential mmu
> exceptions that they might generate. I think with this approach
> you'll take those faults at the next instruction which might not
> even be a load/store. Not sure if that can cause problems for sh.
> Maybe you could avoid that by making movca a load + store and the
> ocbi a store with the original value loaded at movca. But you'll
> still get into trouble with exceptions that might break the
> movca/ocbi sequence.
That's an idea. Current implementation is "delay and do-or-cancel" way.
But your idea is "do and commit-or-rollback" way.
I'll try to implement it to avoid mmu fault by movca.
The mmu fault by movca will cause invoke exception handler, and break
movca/ocbi sequence. It will be a quite rare case, however, I'll consider
it for next version.
> I don't it's worth having a dynamically sized movca delay buffer.
> The movca insn has a restricted addressing mode making it a bit hard
> to use back-to-back without alu insns in between. IIU your code
> correctly you flush the buffer before every non movca/ocbi insn so I
> guess that for most realistic use-cases you'll only need very few
> (maybe only 1) entries in the delay buffer.
That is a point I wavered.
I investigated linux kernel and found following lines in
"__flush_dcache_segment_4way()" in "arch/sh/mm/cache-sh4.c".
a0 = base_addr;
a1 = a0 + way_incr;
a2 = a1 + way_incr;
a3 = a2 + way_incr;
a0e = base_addr + extent_per_way;
do {
asm volatile("ldc %0, sr" : : "r" (sr_with_bl));
asm volatile("movca.l r0, @%0\n\t"
"movca.l r0, @%1\n\t"
"movca.l r0, @%2\n\t"
"movca.l r0, @%3\n\t"
"ocbi @%0\n\t"
"ocbi @%1\n\t"
"ocbi @%2\n\t"
"ocbi @%3\n\t" : :
"r" (a0), "r" (a1), "r" (a2), "r" (a3));
Multiple movca.l insns executed consequently, and they access different lines.
At least, 4 entries are needed for it. This code shows kernel implementation
decides how many number of movcas are invoked consequently.
That's the reason why I introudcued dynamic size change.
>> Signed-off-by: Shin-ichiro KAWASAKI <kawasaki@juno.dti.ne.jp>
>>
>> Index: trunk/target-sh4/op_helper.c
>> ===================================================================
>> --- trunk/target-sh4/op_helper.c (revision 6133)
>> +++ trunk/target-sh4/op_helper.c (working copy)
>> @@ -20,6 +20,7 @@
>> #include <assert.h>
>> #include "exec.h"
>> #include "helper.h"
>> +void *qemu_mallocz(size_t size);
>>
>> #ifndef CONFIG_USER_ONLY
>>
>> @@ -604,3 +605,57 @@
>> d.ll = t0;
>> return float64_to_int32_round_to_zero(d.d, &env->fp_status);
>> }
>> +
>> +void helper_movca(uint32_t val, uint32_t addr)
>> +{
>> + delayed_movca_t *cur = &env->movca_list;
>> + delayed_movca_t *prev = NULL;
>> + while (cur) {
>> + if (!cur->valid) {
>
> Might be overkill but you can merge here, i.e:
>
> if (!cur->valid || cur->addr == addr) {
>
>
>> + cur->valid = 1;
>> + cur->value = val;
>> + cur->addr = addr;
>> + return;
>> + }
>> + prev = cur;
>> + cur = cur->next;
>> + }
>> +
>> + /* movca entry shortage. allocate it. */
>> + prev->next = cur = qemu_mallocz(sizeof(delayed_movca_t));
>> + if (cur == NULL) {
>> + printf("out of memory for delayed movca. @%08x\n", addr);
>> + return;
>> + }
>> + cur->valid = 1;
>> + cur->value = val;
>> + cur->addr = addr;
>> +}
>> +
>> +void helper_ocbi(uint32_t addr)
>> +{
>> + delayed_movca_t *cur = &env->movca_list;
>> +
>> + while (cur) {
>> + if (cur->valid && cur->addr == addr) { /* found! */
>> + cur->valid = 0;
>> + return;
>> + }
>> + cur = cur->next;
>> + }
>
> I think you can quite easy catch a few more movca use cases if you
> dont compare line offsets, i.e for sh4 you'd mask the addr & ~(32 - 1).
> Also, you should not return until you've scanned the entire movca delay
> buffer as there might be multiple stores to the same line that need to be
> ignored.
>
> example:
> addr &= ~31;
> while (cur) {
> if (cur->valid && (cur->addr & ~31) == addr)
> cur->valid = 0;
> cur = cur->next;
> }
That's a good advice. I'll try to implement it for next version.
Thanks again for your review!
Regards,
Shin-ichiro KAWASAKI
next prev parent reply other threads:[~2009-01-12 5:13 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-11 9:14 [Qemu-devel] [PATCH 2/3] sh: movca.l cancel by ocbi Shin-ichiro KAWASAKI
2009-01-11 9:16 ` [Qemu-devel] [PATCH 3/3] sh: SCI improvements Shin-ichiro KAWASAKI
2009-01-11 15:30 ` Jean-Christophe PLAGNIOL-VILLARD
2009-01-11 17:02 ` [Qemu-devel] [PATCH] sh/serial: allow cpu regs definition Jean-Christophe PLAGNIOL-VILLARD
2009-01-12 6:42 ` [Qemu-devel] [PATCH 3/3] sh: SCI improvements Jean-Christophe PLAGNIOL-VILLARD
2009-01-11 14:22 ` [Qemu-devel] [PATCH 2/3] sh: movca.l cancel by ocbi Edgar E. Iglesias
2009-01-12 5:13 ` Shin-ichiro KAWASAKI [this message]
2009-01-13 0:21 ` Shin-ichiro KAWASAKI
2009-01-14 18:46 ` Edgar E. Iglesias
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=496AD162.1040808@juno.dti.ne.jp \
--to=kawasaki@juno.dti.ne.jp \
--cc=edgar.iglesias@axis.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).