qemu-devel.nongnu.org archive mirror
From: "Alex Bennée" <alex.bennee@linaro.org>
To: Richard Henderson <rth@twiddle.net>
Cc: peter.maydell@linaro.org, pbonzini@redhat.com,
	edgar.iglesias@xilinx.com, cota@braap.org, qemu-devel@nongnu.org,
	Peter Crosthwaite <crosthwaite.peter@gmail.com>,
	"open list:ARM" <qemu-arm@nongnu.org>
Subject: Re: [Qemu-devel] [RFC DEBUG PATCH 3/3] translate-a64: fix lookup_tb_ptr hang (DEBUG!)
Date: Sat, 10 Jun 2017 09:51:26 +0100
Message-ID: <87vao4b4z5.fsf@linaro.org>
In-Reply-To: <fc351edb-7c08-c341-d8ee-85f6768e4931@twiddle.net>


Richard Henderson <rth@twiddle.net> writes:

> On 06/09/2017 10:01 AM, Alex Bennée wrote:
>> THIS IS A DEBUG PATCH DO NOT MERGE
>>
>> I include all the comments to show my working. I was trying to
>> isolate which instructions cause the problem. It turns out it is the
>> RET instruction. I don't understand why because AFAICT it is
>> pretty much a BR instruction.
>
> Yeah, same thing for Alpha.
>
> It has been my guess that not chaining through RET means that we get
> back to the main loop regularly and often, letting interrupts be
> recognized in a timely manner.
>
> I can't figure out why that would be, however, since interrupts
> *ought* to be setting icount_decr, and the TB to which we chain *is*
> checking that to return to the main loop.

Indeed - if that was broken a lot more stuff wouldn't work.
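
To make the mechanism under discussion concrete, here is a toy model of
chained blocks that each check an exit-request flag before running. It
is only a sketch, not QEMU code: ToyTB, ToyCPU, toy_run_chain and the
irq_at_step parameter are all hypothetical names, and the flag merely
stands in for the role icount_decr plays above.

```c
#include <stddef.h>

/* Toy model only: none of these names exist in QEMU. */
typedef struct ToyTB ToyTB;
struct ToyTB {
    ToyTB *next;    /* chained successor; NULL means "back to main loop" */
    int executed;   /* how many times this block's body ran */
};

typedef struct {
    volatile int exit_request;  /* stands in for the icount_decr flag */
} ToyCPU;

/*
 * Run chained blocks until one of them sees the exit request, the chain
 * ends, or max_steps bodies have run. irq_at_step simulates an interrupt
 * raised asynchronously once that many bodies have executed (pass -1 for
 * "never"). Returns the number of block bodies executed.
 */
int toy_run_chain(ToyCPU *cpu, ToyTB *tb, int irq_at_step, int max_steps)
{
    int steps = 0;

    while (tb != NULL && steps < max_steps) {
        if (steps == irq_at_step) {
            cpu->exit_request = 1;  /* the "interrupt" arrives */
        }
        /* Check done at the head of every block, chained or not. */
        if (cpu->exit_request) {
            break;                  /* return to the main loop */
        }
        tb->executed++;
        steps++;
        tb = tb->next;              /* follow the chain directly */
    }
    return steps;
}
```

In this model a two-block cycle still leaves the chain as soon as the
flag is set, which is why chaining through RET on its own should not
prevent interrupts from being recognized.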

> Since changing the timing affects the outcome (e.g. -d exec), it
> follows that this *must* be some sort of race condition.  But since
> this still happens with single-threaded mode, I can't imagine what
> sort of race condition it might be.

Apart from timer expiry I can't think what other interactions the other
threads have with the main TCG thread. I guess there is IO but my test
hangs way before the kernel starts poking the disk. Is there an
interaction between IRQs and QEMU's serial driver?

>
> More data points.  I removed the tb_htable_lookup, and that by itself
> is enough to fix Alpha booting.  But it doesn't help the aarch64
> kernel+image that I have.  Which does still boot with -d nochain
> (which, along with disabling goto_tb chaining, also disables all
> goto_ptr).

I wonder what is different about your aarch64 image and mine then?
Mine works with just the chaining for RET suppressed.

>
> Not really sure where to go from here.

I would agree with Emilio that we revert, but I can't quite shake the
feeling we are missing an underlying problem. Would just skipping the
htable lookup (but keeping the tb_jmp_cache) be an OK fix for now? Or
have we just been lucky that whatever mechanism causes the "hang" wasn't
being triggered before?
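
As a rough illustration of the "jump cache only" idea, the sketch below
consults a small direct-mapped cache and, on a miss, hands back the
epilogue pointer so control returns to the main loop instead of falling
through to a hash table lookup. It is a toy model under stated
assumptions, not the actual helper: ToyBlock, ToyCache, toy_hash and
toy_lookup_ptr are hypothetical names, and the hash is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names exist in QEMU. */
#define TOY_JMP_CACHE_BITS 4
#define TOY_JMP_CACHE_SIZE (1u << TOY_JMP_CACHE_BITS)

typedef struct {
    uint64_t pc;     /* guest address the block was translated for */
    uint32_t flags;  /* CPU state the translation depends on */
    void *code;      /* stands in for the host code pointer */
} ToyBlock;

typedef struct {
    ToyBlock *jmp_cache[TOY_JMP_CACHE_SIZE];
} ToyCache;

/* Arbitrary direct-mapped hash, chosen for the example only. */
static unsigned toy_hash(uint64_t pc)
{
    return (unsigned)(pc >> 2) & (TOY_JMP_CACHE_SIZE - 1);
}

/*
 * Return the cached block's code pointer on an exact match; otherwise
 * return the epilogue so the caller goes back to the main loop. There is
 * deliberately no slower hash table lookup on the miss path.
 */
void *toy_lookup_ptr(ToyCache *cache, uint64_t pc, uint32_t flags,
                     void *epilogue)
{
    ToyBlock *tb = cache->jmp_cache[toy_hash(pc)];

    if (tb != NULL && tb->pc == pc && tb->flags == flags) {
        return tb->code;
    }
    return epilogue;
}
```

The trade-off in this model is that every cache miss costs a trip
through the main loop, which is then responsible for the full lookup
and for refilling the cache.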

>
>
> r~


--
Alex Bennée


Thread overview: 21+ messages
2017-06-09 17:00 [Qemu-devel] [RFC DEBUG PATCH 0/3] debug patch for lookup-ptr hang Alex Bennée
2017-06-09 17:00 ` [Qemu-devel] [RFC DEBUG PATCH 1/3] vl: Fix broken thread=xxx option of the --accel parameter Alex Bennée
2017-06-09 17:00 ` [Qemu-devel] [RFC DEBUG PATCH 2/3] tcg-runtime: light re-factor of lookup_tb_ptr Alex Bennée
2017-06-09 17:01 ` [Qemu-devel] [RFC DEBUG PATCH 3/3] translate-a64: fix lookup_tb_ptr hang (DEBUG!) Alex Bennée
2017-06-10  2:29   ` Richard Henderson
2017-06-10  8:51     ` Alex Bennée [this message]
2017-06-10 16:59       ` Richard Henderson
2017-06-11  5:07         ` Emilio G. Cota
2017-06-12 10:31           ` Alex Bennée
2017-06-13 22:53           ` [Qemu-devel] [PATCH] target/aarch64: exit to main loop after handling MSR Emilio G. Cota
2017-06-13 23:01             ` no-reply
2017-06-14  4:48             ` Richard Henderson
2017-06-14 10:46               ` Paolo Bonzini
2017-06-14 11:45                 ` Alex Bennée
2017-06-14 12:02                   ` Paolo Bonzini
2017-06-14 12:14                     ` Alex Bennée
2017-06-14 12:16                       ` Paolo Bonzini
2017-06-14 12:35                         ` Alex Bennée
2017-06-14 12:43                           ` Paolo Bonzini
2017-06-14 10:38             ` Alex Bennée
2017-06-09 21:11 ` [Qemu-devel] [RFC DEBUG PATCH 0/3] debug patch for lookup-ptr hang no-reply
