Date: Fri, 27 Mar 2026 17:00:17 +0100
From: Peter Zijlstra
To: K Prateek Nayak
Cc: John Stultz, LKML, Joel Fernandes, Qais Yousef, Ingo Molnar,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Valentin Schneider,
	Steven Rostedt, Ben Segall, Zimuzo Ezeozue, Mel Gorman, Will Deacon,
	Waiman Long, Boqun Feng, "Paul E.
McKenney", Metin Kaya, Xuewen Yan, Thomas Gleixner, Daniel Lezcano,
	Suleiman Souhlal, kuyo chang, hupu, kernel-team@android.com
Subject: Re: [PATCH v26 00/10] Simple Donor Migration for Proxy Execution
Message-ID: <20260327160017.GK3738010@noisy.programming.kicks-ass.net>
References: <20260324191337.1841376-1-jstultz@google.com>
	<36e96f87-a682-436e-aefc-13e2e5810019@amd.com>
	<20260327114844.GQ2872@noisy.programming.kicks-ass.net>
	<33e60181-1809-44e1-bc4c-8ac7f79d49d6@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org
In-Reply-To: <33e60181-1809-44e1-bc4c-8ac7f79d49d6@amd.com>

On Fri, Mar 27, 2026 at 07:03:19PM +0530, K Prateek Nayak wrote:
> Hello Peter,
>
> On 3/27/2026 5:18 PM, Peter Zijlstra wrote:
> > I tried to have a quick look, but I find it *very* hard to make sense of
> > the differences.
>
> Couple of concerns I had with the current approach is:
>
> 1. Why can't we simply do block_task() + wake_up_process() for return
>    migration?

So the way things are set up now, we have the blocked task 'on_rq', so
ttwu() will take the ttwu_runnable() path, and we wake the task on the
'wrong' CPU.

At this point '->state == TASK_RUNNING' and schedule() will pick it and
... we hit '->blocked_on == PROXY_WAKING', which leads to
proxy_force_return(), which does deactivate_task()+activate_task() as
per a normal migration, and then all is well.

Right?

You're asking why proxy_force_return() doesn't use block_task()+ttwu()?
That seems really wrong at that point -- after all: '->state ==
TASK_RUNNING'.

Or; are you asking why we don't block_task() at the point where we set
'->blocked_on = PROXY_WAKING'? And then let ttwu() sort things out?

I suspect the latter is really hard to do vs lock ordering, but I've not
thought it through.
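To make the sequence above easier to follow, here is a minimal userspace sketch of it. This is a toy model, not kernel code: `struct toy_task`, the `toy_*` helpers and the `wake_cpu` field as used here are hypothetical stand-ins, and locking, runqueues and the donor/owner relationship are left out entirely. It only shows the ordering described in the mail: the wakeup finds the task on a runqueue and leaves it on the 'wrong' CPU, and the later pick notices PROXY_WAKING and forces the return with a deactivate+activate pair.

```c
/* Toy userspace model of the described flow. NOT kernel code:
 * every type and helper below is hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum toy_state { TOY_BLOCKED, TOY_RUNNING };
enum toy_blocked_on { TOY_NONE, TOY_ON_MUTEX, TOY_PROXY_WAKING };

struct toy_task {
	enum toy_state state;
	enum toy_blocked_on blocked_on;
	bool on_rq;	/* blocked donors stay queued under proxy execution */
	int cpu;	/* CPU whose runqueue currently holds the task */
	int wake_cpu;	/* CPU the task should return to (assumed field) */
};

/* Stand-in for setting '->blocked_on = PROXY_WAKING'. */
static void toy_mark_waking(struct toy_task *p)
{
	p->blocked_on = TOY_PROXY_WAKING;
}

/* Stand-in for ttwu() taking the ttwu_runnable() path: the task is
 * already queued, so it is marked running where it is, no migration. */
static bool toy_ttwu(struct toy_task *p)
{
	if (!p->on_rq)
		return false;	/* full wakeup path, not modelled here */
	p->state = TOY_RUNNING;
	return true;
}

/* Stand-in for proxy_force_return(): dequeue, move, enqueue, as a
 * normal migration would. */
static void toy_force_return(struct toy_task *p)
{
	p->on_rq = false;	/* models deactivate_task() */
	p->cpu = p->wake_cpu;
	p->on_rq = true;	/* models activate_task() */
	p->blocked_on = TOY_NONE;
}

/* Stand-in for schedule() on 'cpu' picking p. Returns true when p may
 * run on 'cpu', false when it was sent back and another pick is needed. */
static bool toy_pick(struct toy_task *p, int cpu)
{
	assert(p->on_rq && p->cpu == cpu);
	if (p->blocked_on == TOY_PROXY_WAKING) {
		toy_force_return(p);
		return false;
	}
	return p->state == TOY_RUNNING;
}
```

Walking a task that sits on CPU 1 with `wake_cpu` 0 through `toy_mark_waking()`, `toy_ttwu()` and `toy_pick(p, 1)` leaves it queued on CPU 0 with the blocked-on marker cleared, which is the "and then all is well" end state.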
One thing you *can* do is frob ttwu_runnable() to 'refuse' to wake the
task, and then it goes into the normal path and will do the migration.
I've done things like that before.

Does that fix all the return-migration cases?

> 2. Why does proxy_needs_return() (this comes later in John's tree but I
>    moved it up ahead) need the proxy_task_runnable_but_waking() override
>    of the ttwu_state_match() machinery?
>    (https://github.com/johnstultz-work/linux-dev/commit/28ad4d3fa847b90713ca18a623d1ee7f73b648d9)

Since it comes later, I've not seen it and not given it thought ;-)

(I mean, I've probably seen it at some point, but being the gold-fish
that I am, I have no recollection, so I might as well not have seen it).

A brief look now makes me confused. The comment fails to describe how
that situation could ever come to pass.

> 3. How can proxy_deactivate() see a TASK_RUNNING for blocked donor?

I was looking at that.. I'm not sure. I mean, having the clause doesn't
hurt, but yeah, dunno.

> Speaking of that commit, I would like you or Juri to confirm if it is
> okay to set a throttled deadline task as rq->donor for a while until it
> hits resched.

I think that should be okay.
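For the ttwu_runnable() 'refuse' idea raised earlier in this reply, a rough userspace sketch of the shape it could take follows. Again this is a toy model under stated assumptions, not a proposal for the actual code: all `toy_*` names and the `wake_cpu` field are hypothetical, and the dequeue inside the refusing branch is an assumption made here so that the modelled normal path has something to enqueue. The mail does not say how or where that dequeue would happen, and lock ordering is ignored.

```c
/* Toy userspace model of a "refusing" ttwu_runnable(). NOT kernel code:
 * every type and helper below is hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum toy_blocked_on { TOY_NONE, TOY_PROXY_WAKING };

struct toy_task {
	bool on_rq;
	bool running;
	enum toy_blocked_on blocked_on;
	int cpu;	/* CPU whose runqueue currently holds the task */
	int wake_cpu;	/* CPU the normal wakeup path would select (assumed) */
};

/* Stand-in for ttwu_runnable(): normally wakes a queued task in place,
 * but refuses when the task is marked PROXY_WAKING. */
static bool toy_ttwu_runnable(struct toy_task *p)
{
	if (!p->on_rq)
		return false;
	if (p->blocked_on == TOY_PROXY_WAKING) {
		p->on_rq = false;	/* assumption: dequeue before refusing */
		return false;
	}
	p->running = true;
	return true;
}

/* Stand-in for ttwu(): try the runnable shortcut, otherwise fall into
 * the normal path, which selects a CPU and enqueues there. */
static void toy_ttwu(struct toy_task *p)
{
	if (toy_ttwu_runnable(p))
		return;
	assert(!p->on_rq);
	p->cpu = p->wake_cpu;		/* migration happens on this path */
	p->blocked_on = TOY_NONE;
	p->on_rq = true;
	p->running = true;
}
```

In this model a PROXY_WAKING task queued on CPU 1 ends up queued on its `wake_cpu` after a single `toy_ttwu()` call, while a task without the marker is woken in place.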