From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: "Alex Bennée" <alex.bennee@linaro.org>,
"Sajjan Rao" <sajjanr@gmail.com>,
"Gregory Price" <gregory.price@memverge.com>,
"Dimitrios Palyvos" <dimitrios.palyvos@zptcorp.com>,
linux-cxl@vger.kernel.org, qemu-devel@nongnu.org,
richard.henderson@linaro.org
Subject: Re: Crash with CXL + TCG on 8.2: Was Re: qemu cxl memory expander shows numa_node -1
Date: Thu, 15 Feb 2024 15:04:33 +0000 [thread overview]
Message-ID: <20240215150433.00007a51@Huawei.com> (raw)
In-Reply-To: <CAFEAcA_a_AyQ=Epz3_+CheAT8Crsk9mOu894wbNW_FywamkZiw@mail.gmail.com>
On Thu, 1 Feb 2024 16:00:56 +0000
Peter Maydell <peter.maydell@linaro.org> wrote:
> On Thu, 1 Feb 2024 at 15:17, Alex Bennée <alex.bennee@linaro.org> wrote:
> >
> > Peter Maydell <peter.maydell@linaro.org> writes:
> > > So, that looks like:
> > > * we call cpu_tb_exec(), which executes some generated code
> > > * that generated code calls the lookup_tb_ptr helper to see
> > > if we have a generated TB already for the address we're going
> > > to execute next
> > > * lookup_tb_ptr probes the TLB to see if we know the host RAM
> > > address for the guest address
> > > * this results in a TLB walk for an instruction fetch
> > > * the page table descriptor load is to IO memory
> > > * io_prepare assumes it needs to do a TLB recompile, because
> > > can_do_io is clear
> > >
> > > I am not surprised that the corner case of "the guest put its
> > > page tables in an MMIO device" has not yet come up :-)
> > >
> > > I'm really not sure how the icount handling should interact
> > > with that...
> >
> > Its not just icount - we need to handle it for all modes now. That said
> > seeing as we are at the end of a block shouldn't can_do_io be set?
>
> The lookup_tb_ptr helper gets called from tcg_gen_goto_tb(),
> which happens earlier than the tb_stop callback (it can
> happen in the trans function for branch etc insns, for
> example).
>
> I think it should be OK to clear can_do_io at the start
> of the lookup_tb_ptr helper, something like:
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 977576ca143..7818537f318 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -396,6 +396,15 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
> uint64_t cs_base;
> uint32_t flags, cflags;
>
> + /*
> + * By definition we've just finished a TB, so I/O is OK.
> + * Avoid the possibility of calling cpu_io_recompile() if
> + * a page table walk triggered by tb_lookup() calling
> + * probe_access_internal() happens to touch an MMIO device.
> + * The next TB, if we chain to it, will clear the flag again.
> + */
> + cpu->neg.can_do_io = true;
> +
> cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>
> cflags = curr_cflags(cpu);
>
> -- PMM
Hi Peter,
I've included this in the series I just sent out:
https://lore.kernel.org/qemu-devel/20240215150133.2088-1-Jonathan.Cameron@huawei.com/T/#t
Could you add your Signed-off-by if you are happy doing so?
Jonathan
WARNING: multiple messages have this Message-ID (diff)
From: Jonathan Cameron via <qemu-devel@nongnu.org>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: "Alex Bennée" <alex.bennee@linaro.org>,
"Sajjan Rao" <sajjanr@gmail.com>,
"Gregory Price" <gregory.price@memverge.com>,
"Dimitrios Palyvos" <dimitrios.palyvos@zptcorp.com>,
linux-cxl@vger.kernel.org, qemu-devel@nongnu.org,
richard.henderson@linaro.org
Subject: Re: Crash with CXL + TCG on 8.2: Was Re: qemu cxl memory expander shows numa_node -1
Date: Thu, 15 Feb 2024 15:04:33 +0000 [thread overview]
Message-ID: <20240215150433.00007a51@Huawei.com> (raw)
In-Reply-To: <CAFEAcA_a_AyQ=Epz3_+CheAT8Crsk9mOu894wbNW_FywamkZiw@mail.gmail.com>
On Thu, 1 Feb 2024 16:00:56 +0000
Peter Maydell <peter.maydell@linaro.org> wrote:
> On Thu, 1 Feb 2024 at 15:17, Alex Bennée <alex.bennee@linaro.org> wrote:
> >
> > Peter Maydell <peter.maydell@linaro.org> writes:
> > > So, that looks like:
> > > * we call cpu_tb_exec(), which executes some generated code
> > > * that generated code calls the lookup_tb_ptr helper to see
> > > if we have a generated TB already for the address we're going
> > > to execute next
> > > * lookup_tb_ptr probes the TLB to see if we know the host RAM
> > > address for the guest address
> > > * this results in a TLB walk for an instruction fetch
> > > * the page table descriptor load is to IO memory
> > > * io_prepare assumes it needs to do a TLB recompile, because
> > > can_do_io is clear
> > >
> > > I am not surprised that the corner case of "the guest put its
> > > page tables in an MMIO device" has not yet come up :-)
> > >
> > > I'm really not sure how the icount handling should interact
> > > with that...
> >
> > Its not just icount - we need to handle it for all modes now. That said
> > seeing as we are at the end of a block shouldn't can_do_io be set?
>
> The lookup_tb_ptr helper gets called from tcg_gen_goto_tb(),
> which happens earlier than the tb_stop callback (it can
> happen in the trans function for branch etc insns, for
> example).
>
> I think it should be OK to clear can_do_io at the start
> of the lookup_tb_ptr helper, something like:
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 977576ca143..7818537f318 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -396,6 +396,15 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
> uint64_t cs_base;
> uint32_t flags, cflags;
>
> + /*
> + * By definition we've just finished a TB, so I/O is OK.
> + * Avoid the possibility of calling cpu_io_recompile() if
> + * a page table walk triggered by tb_lookup() calling
> + * probe_access_internal() happens to touch an MMIO device.
> + * The next TB, if we chain to it, will clear the flag again.
> + */
> + cpu->neg.can_do_io = true;
> +
> cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>
> cflags = curr_cflags(cpu);
>
> -- PMM
Hi Peter,
I've included this in the series I just sent out:
https://lore.kernel.org/qemu-devel/20240215150133.2088-1-Jonathan.Cameron@huawei.com/T/#t
Could you add your Signed-off-by if you are happy doing so?
Jonathan
next prev parent reply other threads:[~2024-02-15 15:04 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-18 9:38 qemu cxl memory expander shows numa_node -1 Sajjan Rao
2023-08-18 15:01 ` Dimitrios Palyvos
2023-08-21 10:00 ` Sajjan Rao
2023-08-21 10:53 ` Dimitrios Palyvos
2023-08-23 11:13 ` Sajjan Rao
2023-08-23 16:50 ` Jonathan Cameron
2023-08-24 6:26 ` Sajjan Rao
2024-01-25 8:15 ` Sajjan Rao
2024-01-26 12:39 ` Jonathan Cameron
2024-01-26 15:43 ` Gregory Price
2024-01-26 17:12 ` Jonathan Cameron
2024-01-30 8:20 ` Sajjan Rao
2024-02-01 13:04 ` Crash with CXL + TCG on 8.2: Was " Jonathan Cameron
2024-02-01 13:04 ` Jonathan Cameron via
2024-02-01 13:12 ` Peter Maydell
2024-02-01 14:01 ` Jonathan Cameron
2024-02-01 14:01 ` Jonathan Cameron via
2024-02-01 14:35 ` Peter Maydell
2024-02-01 15:17 ` Alex Bennée
2024-02-01 15:29 ` Jonathan Cameron
2024-02-01 15:29 ` Jonathan Cameron via
2024-02-01 16:00 ` Peter Maydell
2024-02-01 16:21 ` Jonathan Cameron
2024-02-01 16:21 ` Jonathan Cameron via
2024-02-01 16:45 ` Alex Bennée
2024-02-01 17:04 ` Gregory Price
2024-02-01 17:07 ` Peter Maydell
2024-02-01 17:29 ` Gregory Price
2024-02-01 17:08 ` Jonathan Cameron
2024-02-01 17:08 ` Jonathan Cameron via
2024-02-01 17:21 ` Peter Maydell
2024-02-01 17:41 ` Jonathan Cameron
2024-02-01 17:41 ` Jonathan Cameron via
2024-02-01 17:25 ` Alex Bennée
2024-02-01 18:04 ` Peter Maydell
2024-02-01 18:56 ` Gregory Price
2024-02-02 16:26 ` Jonathan Cameron
2024-02-02 16:26 ` Jonathan Cameron via
2024-02-02 16:33 ` Peter Maydell
2024-02-02 16:50 ` Gregory Price
2024-02-02 16:56 ` Peter Maydell
2024-02-07 17:34 ` Jonathan Cameron
2024-02-07 17:34 ` Jonathan Cameron via
2024-02-08 14:50 ` Jonathan Cameron
2024-02-08 14:50 ` Jonathan Cameron via
2024-02-15 15:29 ` Jonathan Cameron
2024-02-15 15:29 ` Jonathan Cameron via
2024-02-19 7:55 ` Mattias Nissler
2024-02-15 15:04 ` Jonathan Cameron [this message]
2024-02-15 15:04 ` Jonathan Cameron via
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240215150433.00007a51@Huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alex.bennee@linaro.org \
--cc=dimitrios.palyvos@zptcorp.com \
--cc=gregory.price@memverge.com \
--cc=linux-cxl@vger.kernel.org \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=richard.henderson@linaro.org \
--cc=sajjanr@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.