From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from casper.infradead.org (casper.infradead.org [90.155.50.34]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3B6DF1BCA06 for ; Tue, 6 Aug 2024 08:27:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=90.155.50.34 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722932844; cv=none; b=dnvkzYLpCKHMR0ftj4aA0an6o24IZrLINwRRDlPyX3yBc52tVRWBxCv5t5rihyB7mczMiCoNzMYPlQLeC54e/6C1q/eK/dE4qsBZqEi7cIOu2qIwfzT6c4J5ymF3azAFsvzlGXgP/7ToaYEqvtjOcJ1T4Q4W/eLEg3rnS55y6OI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722932844; c=relaxed/simple; bh=eIesG8t60+K8FeaonUdws/9nqAlhzL9g5yfZvQx+N1g=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=G6UGAwJnRCdhBDUGa41soY6ZXIn7QLmRqljvnp4QetsymC1RoGKT9rEwJ5qTTyXNvnldxIDjux7cvnxAMbZpJvSi+SAHN5+yHZBy7Qt4KmiRDlZ4OB7v96QL1qM6eb1Zu3vdrRFdWQNc+2Ve+dJ7XBcTRSG8TS35k3ZKCogOJuo= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=oNkFNfgL; arc=none smtp.client-ip=90.155.50.34 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="oNkFNfgL" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=gGf1RRMLsvUqCDo/9XbUQDO+27CXmxKKBoYyC5G8/rg=; b=oNkFNfgLGqDL52M/3JAZQ/hjHi yo7Xzbi9gXLhBsdoGUKv8tOP/WfubI70Cg8H6VF+Rotgz+cjHeFZg2Jcibhbfq/QpAJUjgUOjovol y2Bk98ExWzGvW+C1BVixGcqgO72ONyWGatv6Cvm/Ykj5uGz5MlbmhdMHhCJ2IQeFvqu5lkc3FD+u9 nxQK7dPIro4nvLoqdSCYh9EvdvnXhB9Tw3NopugJj8sI0BYjSdGbxJ0mM3Vak9Gjs26UINuffKKiD OfJOFqBAxE4VKxfk0/yYT19qtf36uv4/fQ9BNNyAZb8+yYB3VG9vO/o9dDFW5a+Iiho87BdOWLCoc YHDYavyw==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.97.1 #2 (Red Hat Linux)) id 1sbFXJ-00000005Rcc-06kN; Tue, 06 Aug 2024 08:27:17 +0000 Received: by noisy.programming.kicks-ass.net (Postfix, from userid 1000) id 3FD4B30049D; Tue, 6 Aug 2024 10:27:16 +0200 (CEST) Date: Tue, 6 Aug 2024 10:27:16 +0200 From: Peter Zijlstra To: Tejun Heo Cc: Linus Torvalds , linux-kernel@vger.kernel.org, David Vernet , Ingo Molnar , Alexei Starovoitov , Thomas Gleixner Subject: Re: [GIT PULL] sched_ext: Initial pull request for v6.11 Message-ID: <20240806082716.GP37996@noisy.programming.kicks-ass.net> References: <20240723163358.GM26750@noisy.programming.kicks-ass.net> <20240802122034.GZ12673@noisy.programming.kicks-ass.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: On Fri, Aug 02, 2024 at 08:47:51AM -1000, Tejun Heo wrote: > > > +static bool consume_remote_task(struct rq *rq, struct scx_dispatch_q *dsq, > > > + struct task_struct *p, struct rq *task_rq) > > > +{ > > > + bool moved = false; > > > + > > > + lockdep_assert_held(&dsq->lock); /* released on return */ > > > + > > > + /* > > > + * @dsq is locked and @p is on a remote rq. @p is currently protected by > > > + * @dsq->lock. We want to pull @p to @rq but may deadlock if we grab > > > + * @task_rq while holding @dsq and @rq locks. As dequeue can't drop the > > > + * rq lock or fail, do a little dancing from our side. See > > > + * move_task_to_local_dsq(). > > > + */ > > > + WARN_ON_ONCE(p->scx.holding_cpu >= 0); > > > + task_unlink_from_dsq(p, dsq); > > > + dsq_mod_nr(dsq, -1); > > > + p->scx.holding_cpu = raw_smp_processor_id(); > > > + raw_spin_unlock(&dsq->lock); > > > + > > > + double_lock_balance(rq, task_rq); > > > + > > > + moved = move_task_to_local_dsq(rq, p, 0); > > > + > > > + double_unlock_balance(rq, task_rq); > > > + > > > + return moved; > > > +} > > > > I've gotta ask, why are you using the double_lock_balance() pattern > > instead of the one in move_queued_task() that does: > > > > lock src > > dequeue src, task > > set_task_cpu task, dst > > unlock src > > > > lock dst > > enqueue dst, task > > unlock dst > > When !CONFIG_PREEMPTION, double_lock_balance() seems cheaper than unlocking > and locking unconditionally. Because SCX schedulers can do a lot more hot > migrations, I thought it'd be better to go that way. I haven't actually > measured anything tho, so I could be wrong. So I think the theory is something like this. If you take a spinlock, you wait-time W is N times the hold-time H, where the hold-time is avg/max (depending on your analysis goals) time you hold the lock for, and N is the contention level or number of waiters etc. Now, when you go nest locks, your hold-time increases with the wait-time of the nested lock. In this case, since it's the 'same' lock, your hold-time gets a recursive wait-time term, that is: H'=H+N*H. This blows up your wait-time, which makes contention worse. Because what was W=N*H then becomes W=N*(N*H). Anyway, at the time we saw great benefits from moving away from the double-lock thing, it might be worth looking into when/if you see significant lock contention; because obviously if the locks are not contended it all doesn't matter.