From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=45815 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OQo4m-0005SJ-JR for qemu-devel@nongnu.org; Mon, 21 Jun 2010 16:58:41 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1OQo4l-0005DX-6G for qemu-devel@nongnu.org; Mon, 21 Jun 2010 16:58:40 -0400 Received: from fmmailgate03.web.de ([217.72.192.234]:34371) by eggs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OQo4k-0005DJ-QL for qemu-devel@nongnu.org; Mon, 21 Jun 2010 16:58:39 -0400 Message-ID: <4C1FD278.5040403@web.de> Date: Mon, 21 Jun 2010 22:58:32 +0200 From: Jan Kiszka MIME-Version: 1.0 References: <4C1BA0B4.2010803@siemens.com> <4C1BCEB0.6050601@codemonkey.ws> <20100621193123.GA14083@amt.cnet> <4C1FCABA.3060207@web.de> In-Reply-To: <4C1FCABA.3060207@web.de> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigA95CCC59148B707995394970" Sender: jan.kiszka@web.de Subject: [Qemu-devel] Re: [PATCH] fix smp with tcg mode and --enable-io-thread List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Marcelo Tosatti Cc: Glauber Costa , qemu-devel This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigA95CCC59148B707995394970 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Jan Kiszka wrote: > Marcelo Tosatti wrote: >> Clear exit_request when iothread grabs the global lock.=20 >> >> Signed-off-by: Marcelo Tosatti >> >> diff --git a/cpu-exec.c b/cpu-exec.c >> index 026980a..74cb973 100644 >> --- a/cpu-exec.c >> +++ b/cpu-exec.c >> @@ -236,10 +236,8 @@ int cpu_exec(CPUState *env1) >> asm(""); >> env =3D env1; >> =20 >> - if (exit_request) { >> + if (exit_request) >> env->exit_request =3D 1; >> - exit_request =3D 0; >> - } >=20 > Coding style... >=20 >> =20 >> #if defined(TARGET_I386) >> if (!kvm_enabled()) { >> diff --git a/cpus.c b/cpus.c >> index fcd0f09..ef1ab22 100644 >> --- a/cpus.c >> +++ b/cpus.c >> @@ -598,6 +598,7 @@ void qemu_mutex_lock_iothread(void) >> } >> qemu_mutex_unlock(&qemu_fair_mutex); >> } >> + exit_request =3D 0; >> } >> =20 >> void qemu_mutex_unlock_iothread(void) >> >> >=20 > I looked into this a bit as well, and that's what I also have in my > queue. >=20 > But things are still widely broken: pause_all_vcpus and run_on_cpu as > there is no guarantee that all VCPUs regularly call into > qemu_wait_io_event. Also breakpoints don't work, not only in SMP mode. >=20 > The former issue is mostly fine with this patch: >=20 > diff --git a/cpu-exec.c b/cpu-exec.c > index 026980a..9f4191d 100644 > --- a/cpu-exec.c > +++ b/cpu-exec.c > @@ -236,9 +236,8 @@ int cpu_exec(CPUState *env1) > asm(""); > env =3D env1; > =20 > - if (exit_request) { > + if (unlikely(exit_request)) { > env->exit_request =3D 1; > - exit_request =3D 0; > } > =20 > #if defined(TARGET_I386) > diff --git a/cpus.c b/cpus.c > index fcd0f09..2ce839d 100644 > --- a/cpus.c > +++ b/cpus.c > @@ -39,7 +39,6 @@ > #define SIG_IPI SIGUSR1 > #endif > =20 > -static CPUState *cur_cpu; > static CPUState *next_cpu; > =20 > /***********************************************************/ > @@ -331,6 +330,7 @@ int qemu_init_main_loop(void) > return ret; > =20 > qemu_cond_init(&qemu_pause_cond); > + qemu_cond_init(&qemu_system_cond); > qemu_mutex_init(&qemu_fair_mutex); > qemu_mutex_init(&qemu_global_mutex); > qemu_mutex_lock(&qemu_global_mutex); > @@ -401,10 +401,12 @@ static void qemu_wait_io_event_common(CPUState *e= nv) > flush_queued_work(env); > } > =20 > -static void qemu_wait_io_event(CPUState *env) > +static void qemu_tcg_wait_io_event(void) > { > + CPUState *env; > + > while (!tcg_has_work()) > - qemu_cond_timedwait(env->halt_cond, &qemu_global_mutex, 1000);= > + qemu_cond_timedwait(tcg_halt_cond, &qemu_global_mutex, 1000); > =20 > qemu_mutex_unlock(&qemu_global_mutex); > =20 > @@ -417,7 +419,10 @@ static void qemu_wait_io_event(CPUState *env) > qemu_mutex_unlock(&qemu_fair_mutex); > =20 > qemu_mutex_lock(&qemu_global_mutex); > - qemu_wait_io_event_common(env); > + > + for (env =3D first_cpu; env !=3D NULL; env =3D env->next_cpu) { > + qemu_wait_io_event_common(env); > + } > } > =20 > static void qemu_kvm_eat_signal(CPUState *env, int timeout) > @@ -502,7 +507,7 @@ static void *tcg_cpu_thread_fn(void *arg) > =20 > while (1) { > tcg_cpu_exec(); > - qemu_wait_io_event(cur_cpu); > + qemu_tcg_wait_io_event(); > } > =20 > return NULL; > @@ -768,11 +773,11 @@ bool tcg_cpu_exec(void) > =20 > if (next_cpu =3D=3D NULL) > next_cpu =3D first_cpu; > - for (; next_cpu !=3D NULL; next_cpu =3D next_cpu->next_cpu) { > - CPUState *env =3D cur_cpu =3D next_cpu; > + for (; next_cpu !=3D NULL && !exit_request; next_cpu =3D next_cpu-= >next_cpu) { > + CPUState *env =3D next_cpu; > =20 > qemu_clock_enable(vm_clock, > - (cur_cpu->singlestep_enabled & SSTEP_NOTIMER= ) =3D=3D 0); > + (env->singlestep_enabled & SSTEP_NOTIMER) =3D= =3D 0); > =20 > if (qemu_alarm_pending()) > break; > @@ -787,6 +792,7 @@ bool tcg_cpu_exec(void) > break; > } > } > + exit_request =3D 0; > return tcg_has_work(); > } > =20 >=20 > I haven't looked into the breakpoint bug yet as performance (likely > VCPU scheduling) in iothread mode is still suffering compared to classi= c > mode, specifically when you go up with the number of VCPUs (try -smp 8)= =2E Hmm, retesting this, I cannot reproduce the performance regression anymore. Likely I confused something or had instrumentations applied. Locking up still works "reliably". :( Jan > And there is some race that cause a lock up in qemu_mutex_lock_iothread= > after a while (the cpu_unlink_tb seems to race with the linking - just = a > guess so far). >=20 > Anyway, all this is getting terribly complex, hard to grasp, and maybe > therefore fragile as well. Specifically the SMP round-robin scheduling > looks like it requires a redesign for iothread mode (my feeling is it > only works by chance ATM). Part of the issue may not be new, but could > have been shadowed by the frequently interrupting I/O load when using a= > single thread. The rest seems to lack a bit a user base... >=20 > Jan >=20 --------------enigA95CCC59148B707995394970 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (GNU/Linux) Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org iEYEARECAAYFAkwf0nwACgkQitSsb3rl5xQYLwCfTdafMll5qzSEyNUkC9edalMA 7GIAnAj/C+hD41VEIPduZUMW91sXl35Y =EW+d -----END PGP SIGNATURE----- --------------enigA95CCC59148B707995394970--