All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: Mahmoud Mandour <ma.mandourr@gmail.com>
Cc: "Emilio G. Cota" <cota@braap.org>,
	"open list:All patches CC here" <qemu-devel@nongnu.org>
Subject: Re: [PATCH 3/6] plugins/cache: Fixed a use-after-free bug with multithreaded usermode
Date: Mon, 19 Jul 2021 13:48:35 +0100	[thread overview]
Message-ID: <87o8aypiwu.fsf@linaro.org> (raw)
In-Reply-To: <CAD-LL6g8edEAjCVd0OZ3LVyEFidqM8p2KP9fikU2fLx2i0H9Kg@mail.gmail.com>


Mahmoud Mandour <ma.mandourr@gmail.com> writes:

> On Mon, Jul 19, 2021 at 1:08 PM Alex Bennée <alex.bennee@linaro.org> wrote:
>
>  Mahmoud Mandour <ma.mandourr@gmail.com> writes:
>
>  > On Mon, Jul 19, 2021 at 11:48 AM Alex Bennée <alex.bennee@linaro.org> wrote:
>  >
>  >  Mahmoud Mandour <ma.mandourr@gmail.com> writes:
>  >
>  >  > Since callbacks may be interleaved because of multithreaded execution,
>  >  > we should not make assumptions about `plugin_exit` either. The problem
>  >  > with `plugin_exit` is that it frees shared data structures (caches and
>  >  > `miss_ht` hash table). It should not be assumed that callbacks will not
>  >  > be called after it and hence use already-freed data structures.
>  >
>  >  What was your test case for this because I wonder if it would be worth
>  >  coming up with one for check-tcg? 
>  >
>  > I think just any ad-hoc multithreaded execution will evoke the race pretty much 
>  > consistently.
>
>  I haven't managed to trigger it with testthread but of course my
>  libcache is trying to to defend against it.
>
> https://pastebin.com/X4Xazemh
> That's a minimal program that reproduces the bug consistently for me (to my observation, a simple 
> program that creates a bunch of threads that immediately exit does not produce the bug as frequently)
>
> You can also make hotblocks produce a similar crash (but make sure to add a g_hash_table_destroy(hotblocks)
> at the end of plugin_exit.)
>

Thanks, I'll give that a try.

>  
>  >  
>  >  From what I remember the race is
>  >  in between preexit_cleanup and the eventual _exit/exit_group which nixes
>  >  all other child threads. Maybe we should be triggering a more graceful
>  >  unload here?
>  >
>  > I think so. This remedies the bug for this particular plugin and I think there
>  > would be a better solution of course. However, I just can't ever get plugin_exit
>  > callback to be called more than once so I think that's probably not the problem?
>  >
>  > The problem is that we *use* the data in translation/mem_access/exec callbacks
>  > after a plugin_exit call is already called (this can be easily verified by having a 
>  > boolean set to true once plugin_exit is called and then g_assert this boolean is 
>  > false in the callbacks)
>
>  We have mechanisms for safely unloading plugins during running so I
>  think we should be able to do something cleanly here. I'll cook up an
>  RFC.
>
> I'll be waiting for it. Note that as I think I mentioned in the cover letter, it's that plugin_uninstall
> is probably problematic since it unregisters callbacks but already-fired callbacks will continue executing.
> So calling plugin_uninstall at the end of plugin_exit does not relieve
> the bug...

That's OK - the plugin_uninstall contract explicitly says that callbacks
may occur until the callback is called.

>  
>  
>  >
>  >  > This is mitigated in this commit by synchronizing the call to
>  >  > `plugin_exit` through locking to ensure execlusive access to data
>  >  > structures, and NULL-ifying those data structures so that subsequent
>  >  > callbacks can check whether the data strucutres are freed, and if so,
>  >  > immediately exit.
>  >  >
>  >  > It's okay to immediately exit and don't account for those callbacks
>  >  > since they won't be accounted for anyway since `plugin_exit` is already
>  >  > called once and reported the statistics.
>  >  >
>  >  > Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com>
>  >  > ---
>  >  >  contrib/plugins/cache.c | 31 ++++++++++++++++++++++++++++++-
>  >  >  1 file changed, 30 insertions(+), 1 deletion(-)
>  >  >
>  >  > diff --git a/contrib/plugins/cache.c b/contrib/plugins/cache.c
>  >  > index 695fb969dc..a452aba01c 100644
>  >  > --- a/contrib/plugins/cache.c
>  >  > +++ b/contrib/plugins/cache.c
>  >  > @@ -363,6 +363,11 @@ static void vcpu_mem_access(unsigned int vcpu_index, qemu_plugin_meminfo_t info,
>  >  >      effective_addr = hwaddr ? qemu_plugin_hwaddr_phys_addr(hwaddr) : vaddr;
>  >  >  
>  >  >      g_mutex_lock(&mtx);
>  >  > +    if (dcache == NULL) {
>  >  > +        g_mutex_unlock(&mtx);
>  >  > +        return;
>  >  > +    }
>  >  > +
>  >  >      if (!access_cache(dcache, effective_addr)) {
>  >  >          insn = (InsnData *) userdata;
>  >  >          insn->dmisses++;
>  >  > @@ -380,6 +385,11 @@ static void vcpu_insn_exec(unsigned int vcpu_index, void *userdata)
>  >  >      g_mutex_lock(&mtx);
>  >  >      insn_addr = ((InsnData *) userdata)->addr;
>  >  >  
>  >  > +    if (icache == NULL) {
>  >  > +        g_mutex_unlock(&mtx);
>  >  > +        return;
>  >  > +    }
>  >  > +
>  >  >      if (!access_cache(icache, insn_addr)) {
>  >  >          insn = (InsnData *) userdata;
>  >  >          insn->imisses++;
>  >  > @@ -406,12 +416,24 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
>  >  >              effective_addr = (uint64_t) qemu_plugin_insn_vaddr(insn);
>  >  >          }
>  >  >  
>  >  > +        g_mutex_lock(&mtx);
>  >  > +
>  >  > +        /*
>  >  > +         * is the plugin_exit callback called? If so, any further callback
>  >  > +         * registration is useless as it won't get accounted for after calling
>  >  > +         * plugin_exit once already, and also will use miss_ht after it's freed
>  >  > +         */
>  >  > +        if (miss_ht == NULL) {
>  >  > +            g_mutex_unlock(&mtx);
>  >  > +            return;
>  >  > +        }
>  >  > +
>  >  >          /*
>  >  >           * Instructions might get translated multiple times, we do not create
>  >  >           * new entries for those instructions. Instead, we fetch the same
>  >  >           * entry from the hash table and register it for the callback again.
>  >  >           */
>  >  > -        g_mutex_lock(&mtx);
>  >  > +
>  >  >          data = g_hash_table_lookup(miss_ht, GUINT_TO_POINTER(effective_addr));
>  >  >          if (data == NULL) {
>  >  >              data = g_new0(InsnData, 1);
>  >  > @@ -527,13 +549,20 @@ static void log_top_insns()
>  >  >  
>  >  >  static void plugin_exit(qemu_plugin_id_t id, void *p)
>  >  >  {
>  >  > +    g_mutex_lock(&mtx);
>  >  >      log_stats();
>  >  >      log_top_insns();
>  >  >  
>  >  >      cache_free(dcache);
>  >  > +    dcache = NULL;
>  >  > +
>  >  >      cache_free(icache);
>  >  > +    icache = NULL;
>  >  >  
>  >  >      g_hash_table_destroy(miss_ht);
>  >  > +    miss_ht = NULL;
>  >  > +
>  >  > +    g_mutex_unlock(&mtx);
>  >  >  }
>  >  >  
>  >  >  static void policy_init()
>  >
>  >  -- 
>  >  Alex Bennée
>
>  -- 
>  Alex Bennée


-- 
Alex Bennée


  reply	other threads:[~2021-07-19 12:51 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-14 17:21 [PATCH 0/6] plugins/cache: multicore cache emulation and minor Mahmoud Mandour
2021-07-14 17:21 ` [PATCH 1/6] plugins/cache: Fixed a bug with destroying FIFO metadata Mahmoud Mandour
2021-07-19  9:21   ` Alex Bennée
2021-07-14 17:21 ` [PATCH 2/6] plugins/cache: limited the scope of a mutex lock Mahmoud Mandour
2021-07-19  9:34   ` Alex Bennée
2021-07-14 17:21 ` [PATCH 3/6] plugins/cache: Fixed a use-after-free bug with multithreaded usermode Mahmoud Mandour
2021-07-19  9:45   ` Alex Bennée
2021-07-19 10:46     ` Mahmoud Mandour
2021-07-19 11:06       ` Alex Bennée
2021-07-19 11:28         ` Mahmoud Mandour
2021-07-19 12:48           ` Alex Bennée [this message]
2021-07-14 17:21 ` [PATCH 4/6] plugins/cache: Supported multicore cache modelling Mahmoud Mandour
2021-07-19 10:52   ` Alex Bennée
2021-07-14 17:21 ` [PATCH 5/6] docs/devel/tcg-plugins: added cores arg to cache plugin Mahmoud Mandour
2021-07-14 17:21 ` [PATCH 6/6] plugins/cache: Fixed "function decl. is not a prototype" warnings Mahmoud Mandour
2021-07-19 12:38   ` Alex Bennée
2021-07-20 12:46 ` [PATCH 0/6] plugins/cache: multicore cache emulation and minor Alex Bennée

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o8aypiwu.fsf@linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=cota@braap.org \
    --cc=ma.mandourr@gmail.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.