Date: Wed, 21 Sep 2016 18:27:46 -0400
From: "Emilio G. Cota"
To: Paolo Bonzini
Cc: qemu-devel@nongnu.org, serge.fdrv@gmail.com, alex.bennee@linaro.org,
	sergey.fedorov@linaro.org
Subject: Re: [Qemu-devel] [PATCH 16/16] cpus-common: lock-free fast path for cpu_exec_start/end
Message-ID: <20160921222746.GB30386@flamenco>
References: <1474289459-15242-1-git-send-email-pbonzini@redhat.com>
 <1474289459-15242-17-git-send-email-pbonzini@redhat.com>
In-Reply-To: <1474289459-15242-17-git-send-email-pbonzini@redhat.com>

On Mon, Sep 19, 2016 at 14:50:59 +0200, Paolo Bonzini wrote:
(snip)
> @@ -212,24 +216,75 @@ void end_exclusive(void)
>  /* Wait for exclusive ops to finish, and begin cpu execution.  */
>  void cpu_exec_start(CPUState *cpu)
>  {
> -    qemu_mutex_lock(&qemu_cpu_list_mutex);
> -    exclusive_idle();
>      cpu->running = true;
> -    qemu_mutex_unlock(&qemu_cpu_list_mutex);
> +
> +    /* Write cpu->running before reading pending_cpus.  */
> +    smp_mb();
> +
> +    /* 1. start_exclusive saw cpu->running == true and pending_cpus >= 1.
> +     * After taking the lock we'll see cpu->has_waiter == true and run---not
> +     * for long because start_exclusive kicked us.  cpu_exec_end will
> +     * decrement pending_cpus and signal the waiter.
> +     *
> +     * 2. start_exclusive saw cpu->running == false but pending_cpus >= 1.
> +     * This includes the case when an exclusive item is running now.
> +     * Then we'll see cpu->has_waiter == false and wait for the item to
> +     * complete.
> +     *
> +     * 3. pending_cpus == 0.  Then start_exclusive is definitely going to
> +     * see cpu->running == true, and it will kick the CPU.
> +     */
> +    if (pending_cpus) {

I'd consider doing

    if (unlikely(pending_cpus)) {

since the exclusive section is a slow path, and will become even more of
one in the near future.
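For reference, here's a minimal standalone sketch of the branch-hint
idiom (illustration only: QEMU already provides unlikely() via
include/qemu/compiler.h, so in the tree the local #define below would
not be needed):

    /* branch_hint_sketch.c -- compile with gcc or clang. */
    #include <stdio.h>

    /* Mirrors the usual definition; in QEMU, include the header instead. */
    #define unlikely(x) __builtin_expect(!!(x), 0)

    static int pending_cpus;    /* stand-in for the real global */

    static void fast_path_demo(void)
    {
        if (unlikely(pending_cpus)) {
            /* Slow path: this is where we'd take the CPU list mutex. */
            puts("slow path: exclusive work pending");
            return;
        }
        /* Straight-line fast path for the common pending_cpus == 0 case. */
        puts("fast path");
    }

    int main(void)
    {
        fast_path_demo();       /* takes the fast path */
        pending_cpus = 1;
        fast_path_demo();       /* takes the slow path */
        return 0;
    }

With the hint, the compiler will typically move the locked block out of
line, so the common case stays a single predicted-not-taken branch.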
> +        qemu_mutex_lock(&qemu_cpu_list_mutex);
> +        if (!cpu->has_waiter) {
> +            /* Not counted in pending_cpus, let the exclusive item
> +             * run.  Since we have the lock, set cpu->running to true
> +             * while holding it instead of retrying.
> +             */
> +            cpu->running = false;
> +            exclusive_idle();
> +            /* Now pending_cpus is zero.  */
> +            cpu->running = true;
> +        } else {
> +            /* Counted in pending_cpus, go ahead.  */
> +        }
> +        qemu_mutex_unlock(&qemu_cpu_list_mutex);
> +    }
>  }
>  
>  /* Mark cpu as not executing, and release pending exclusive ops.  */
>  void cpu_exec_end(CPUState *cpu)
>  {
> -    qemu_mutex_lock(&qemu_cpu_list_mutex);
>      cpu->running = false;
> -    if (pending_cpus > 1) {
> -        pending_cpus--;
> -        if (pending_cpus == 1) {
> -            qemu_cond_signal(&exclusive_cond);
> +
> +    /* Write cpu->running before reading pending_cpus.  */
> +    smp_mb();
> +
> +    /* 1. start_exclusive saw cpu->running == true.  Then it will increment
> +     * pending_cpus and wait for exclusive_cond.  After taking the lock
> +     * we'll see cpu->has_waiter == true.
> +     *
> +     * 2. start_exclusive saw cpu->running == false but here pending_cpus >= 1.
> +     * This includes the case when an exclusive item started after setting
> +     * cpu->running to false and before we read pending_cpus.  Then we'll see
> +     * cpu->has_waiter == false and not touch pending_cpus.  The next call to
> +     * cpu_exec_start will run exclusive_idle if still necessary, thus waiting
> +     * for the item to complete.
> +     *
> +     * 3. pending_cpus == 0.  Then start_exclusive is definitely going to
> +     * see cpu->running == false, and it can ignore this CPU until the
> +     * next cpu_exec_start.
> +     */
> +    if (pending_cpus) {

ditto

> +        qemu_mutex_lock(&qemu_cpu_list_mutex);
> +        if (cpu->has_waiter) {
> +            cpu->has_waiter = false;
> +            if (--pending_cpus == 1) {
> +                qemu_cond_signal(&exclusive_cond);
> +            }
(snip)

Another suggestion is to consistently access pending_cpus atomically,
since now we're accessing it with and without the CPU list mutex held.

Thanks,

		Emilio
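P.S. To make the atomic-access suggestion concrete, here is a rough
sketch of what it could look like.  This is not a patch: it elides the
unchanged lines and assumes the atomic_read()/atomic_set() helpers from
include/qemu/atomic.h.

    /* Lockless readers use atomic_read(); stores still happen under
     * qemu_cpu_list_mutex, so a plain read plus atomic_set() is enough
     * on the writer side, and the lockless reads cannot tear. */

    void cpu_exec_start(CPUState *cpu)
    {
        cpu->running = true;

        /* Write cpu->running before reading pending_cpus.  */
        smp_mb();

        if (unlikely(atomic_read(&pending_cpus))) {   /* lockless read */
            qemu_mutex_lock(&qemu_cpu_list_mutex);
            /* ... as in the patch ... */
            qemu_mutex_unlock(&qemu_cpu_list_mutex);
        }
    }

    /* In cpu_exec_end(), with the mutex held: */
    if (cpu->has_waiter) {
        cpu->has_waiter = false;
        atomic_set(&pending_cpus, pending_cpus - 1);  /* store under lock */
        if (pending_cpus == 1) {
            qemu_cond_signal(&exclusive_cond);
        }
    }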