Slowness with multi-thread TCG?

From: Frederic Barrat <fbarrat@linux.ibm.com>
To: qemu-devel@nongnu.org, qemu-ppc@nongnu.org
Subject: Slowness with multi-thread TCG?
Date: Mon, 27 Jun 2022 20:25:35 +0200
Message-ID: <111e5b6c-41a7-89a4-b4d2-2eda1a295ffa@linux.ibm.com>

[ Resending as it was meant for the qemu-ppc list ]

Hello,

I've been looking at why our qemu powernv model is so slow when booting
a compressed linux kernel with multiple vcpus and multi-thread tcg.
With only one vcpu, the decompression time of the kernel is what it is,
but with multiple vcpus the decompression is actually slower. And
worse: it degrades very fast with the number of vcpus!

Rough measurements of the decompression time on an x86 laptop with
multi-thread tcg and using the qemu powernv10 machine:
1 vcpu => 15 seconds
2 vcpus => 45 seconds
4 vcpus => 1 min 30 seconds

Looking in detail, when the firmware (skiboot) hands over execution to
the linux kernel, there's one main thread entering some bootstrap code
and running the kernel decompression algorithm. All the other secondary
threads are left spinning in skiboot (1 thread per vcpu). So on paper,
with multi-thread tcg and assuming the system has enough available
physical cpus, I would expect the decompression to hog one physical cpu
and the time needed to be constant, no matter the number of vcpus.

All the secondary threads are left spinning in code like this:

	for (;;) {
		if (cpu_check_jobs(cpu))  // reading cpu-local data
			break;
		if (reconfigure_idle)     // global variable
			break;
		barrier();
	}

The barrier is to force reading the memory with each iteration. It's 
defined as:

   asm volatile("" : : : "memory");
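
As a minimal sketch of the idle loop and barrier described above, the following C fragment polls a per-cpu job flag and a global flag, with a compiler-only barrier forcing both to be re-read on every pass. The `struct cpu_thread` layout, the `job_pending` field, the `spin_for_work` name and the `max_polls` bound are assumptions made for illustration; they are not skiboot's actual definitions, and the real loop is unbounded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Compiler-only barrier, equivalent to the definition quoted above.
 * It emits no instruction; it only stops the compiler from keeping
 * memory values cached in registers across this point. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical stand-in for the firmware's per-cpu data. */
struct cpu_thread {
    bool job_pending;
};

/* Global flag, named after the variable in the quoted loop. */
static bool reconfigure_idle;

/* Hypothetical reimplementation: reads cpu-local data only. */
static bool cpu_check_jobs(const struct cpu_thread *cpu)
{
    return cpu->job_pending;
}

/* Spin until a job is posted or a reconfiguration is requested.
 * Returns the number of completed polls. The max_polls bound exists
 * only so this sketch terminates when nothing is ever posted. */
static unsigned long spin_for_work(const struct cpu_thread *cpu,
                                   unsigned long max_polls)
{
    unsigned long polls = 0;

    assert(cpu != NULL);
    while (polls < max_polls) {
        if (cpu_check_jobs(cpu))   /* reading cpu-local data */
            break;
        if (reconfigure_idle)      /* global variable */
            break;
        barrier();                 /* force both reads to happen again */
        polls++;
    }
    return polls;
}
```

Calling `spin_for_work(&cpu, 1000)` with neither flag set runs all 1000 polls; with either flag set it returns 0 right away. The barrier constrains only the compiler and is not a hardware memory barrier.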

Some time later, the main thread in the linux kernel will get the 
secondary threads out of that loop by posting a job.

My first thought was that the translation of that code through tcg was 
somehow causing some abnormally slow behavior, maybe due to some 
non-obvious contention between the threads. However, if I send the 
threads spinning forever with simply:

     for (;;) ;

supposedly removing any contention, then the decompression time is the same.

Ironically, the behavior seen with single thread tcg is what I would 
expect: 1 thread decompressing in 15 seconds, all the other threads 
spinning for that same amount of time, all sharing the same physical 
cpu, so it all adds up nicely: I see 60 seconds decompression time with 
4 vcpus (4x15). Which means multi-thread tcg is slower by quite a bit. 
And single thread tcg hogs one physical cpu of the laptop vs. 4 physical 
cpus for the slower multi-thread tcg.
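
The arithmetic behind these expectations can be written down as a back-of-the-envelope model. This is only a sketch: it assumes single thread tcg shares one physical cpu equally among all vcpus, and that ideal multi-thread tcg gives the decompressing vcpu a physical cpu of its own. The function names are hypothetical, and the table simply restates the rough measurements quoted earlier in this message.

```c
#include <assert.h>
#include <stddef.h>

struct sample {
    unsigned vcpus;
    unsigned seconds;
};

/* Rough multi-thread tcg measurements from this message. */
static const struct sample observed_mttcg[] = {
    { 1, 15 },
    { 2, 45 },
    { 4, 90 },
};

/* Single thread tcg: every vcpu (the busy one and the spinning ones)
 * gets an equal share of one physical cpu, so the time scales linearly. */
static unsigned single_thread_model_s(unsigned vcpus, unsigned one_vcpu_s)
{
    return vcpus * one_vcpu_s;
}

/* Ideal multi-thread tcg: spinning vcpus run on other physical cpus,
 * so the decompression time should not depend on the vcpu count. */
static unsigned ideal_mttcg_model_s(unsigned vcpus, unsigned one_vcpu_s)
{
    (void)vcpus;
    return one_vcpu_s;
}

/* Observed multi-thread time as a multiple of the 1-vcpu time.
 * Returns 0 if that vcpu count was not measured. */
static unsigned observed_slowdown(unsigned vcpus)
{
    size_t n = sizeof observed_mttcg / sizeof observed_mttcg[0];
    unsigned base = observed_mttcg[0].seconds;

    assert(base != 0);
    for (size_t i = 0; i < n; i++) {
        if (observed_mttcg[i].vcpus == vcpus)
            return observed_mttcg[i].seconds / base;
    }
    return 0;
}
```

With a 15 second baseline, `single_thread_model_s(4, 15)` gives the 60 seconds seen with single thread tcg and `ideal_mttcg_model_s(4, 15)` gives 15, while `observed_slowdown(4)` shows the measured multi-thread run taking 6 times the baseline.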

Does anybody have an idea of what might be happening, or suggestions
for how to investigate further?
Thanks for your help!

   Fred
