Date: Fri, 7 Nov 2025 16:19:30 +0100
From: Juri Lelli
To: John Stultz
Cc: LKML, Joel Fernandes, Qais Yousef, Ingo Molnar, Peter Zijlstra,
 Vincent Guittot, Dietmar Eggemann, Valentin Schneider, Steven Rostedt,
 Ben Segall, Zimuzo Ezeozue, Mel Gorman, Will Deacon, Waiman Long,
 Boqun Feng, "Paul E.
 McKenney", Metin Kaya, Xuewen Yan, K Prateek Nayak, Thomas Gleixner,
 Daniel Lezcano, Suleiman Souhlal, kuyo chang, hupu,
 kernel-team@android.com
Subject: Re: [PATCH v23 6/9] sched: Handle blocked-waiter migration (and return migration)
References: <20251030001857.681432-1-jstultz@google.com>
 <20251030001857.681432-7-jstultz@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20251030001857.681432-7-jstultz@google.com>

Hi,

On 30/10/25 00:18, John Stultz wrote:
> Add logic to handle migrating a blocked waiter to a remote
> cpu where the lock owner is runnable.
>
> Additionally, as the blocked task may not be able to run
> on the remote cpu, add logic to handle return migration once
> the waiting task is given the mutex.
>
> Because tasks may get migrated to where they cannot run, also
> modify the scheduling classes to avoid sched class migrations on
> mutex blocked tasks, leaving find_proxy_task() and related logic
> to do the migrations and return migrations.
>
> This was split out from the larger proxy patch, and
> significantly reworked.
>
> Credits for the original patch go to:
>   Peter Zijlstra (Intel)
>   Juri Lelli
>   Valentin Schneider
>   Connor O'Brien
>
> Signed-off-by: John Stultz
> ---

...

> +#ifdef CONFIG_SCHED_PROXY_EXEC
> +static inline void proxy_set_task_cpu(struct task_struct *p, int cpu)
> +{
> +	unsigned int wake_cpu;
> +
> +	/*
> +	 * Since we are enqueuing a blocked task on a cpu it may
> +	 * not be able to run on, preserve wake_cpu when we
> +	 * __set_task_cpu so we can return the task to where it
> +	 * was previously runnable.
> +	 */
> +	wake_cpu = p->wake_cpu;
> +	__set_task_cpu(p, cpu);
> +	p->wake_cpu = wake_cpu;
> +}
> +#endif /* CONFIG_SCHED_PROXY_EXEC */
> +

...

> +static void proxy_migrate_task(struct rq *rq, struct rq_flags *rf,
> +			       struct task_struct *p, int target_cpu)
>  {
> -	if (!__proxy_deactivate(rq, donor)) {
> -		/*
> -		 * XXX: For now, if deactivation failed, set donor
> -		 * as unblocked, as we aren't doing proxy-migrations
> -		 * yet (more logic will be needed then).
> -		 */
> -		clear_task_blocked_on(donor, NULL);
> +	struct rq *target_rq = cpu_rq(target_cpu);
> +
> +	lockdep_assert_rq_held(rq);
> +
> +	/*
> +	 * Since we're going to drop @rq, we have to put(@rq->donor) first,
> +	 * otherwise we have a reference that no longer belongs to us.
> +	 *
> +	 * Additionally, as we put_prev_task(prev) earlier, it's possible that
> +	 * prev will migrate away as soon as we drop the rq lock, however we
> +	 * still have it marked as rq->curr, as we've not yet switched tasks.
> +	 *
> +	 * So call proxy_resched_idle() to let go of the references before
> +	 * we release the lock.
> +	 */
> +	proxy_resched_idle(rq);
> +
> +	WARN_ON(p == rq->curr);
> +
> +	deactivate_task(rq, p, 0);
> +	proxy_set_task_cpu(p, target_cpu);

We use proxy_set_task_cpu() here. BTW, can you comment/expand on why an
ad-hoc set_task_cpu() is needed for proxy?

> +
> +	/*
> +	 * We have to zap callbacks before unlocking the rq
> +	 * as another CPU may jump in and call sched_balance_rq
> +	 * which can trip the warning in rq_pin_lock() if we
> +	 * leave callbacks set.
> +	 */
> +	zap_balance_callbacks(rq);
> +	rq_unpin_lock(rq, rf);
> +	raw_spin_rq_unlock(rq);
> +	raw_spin_rq_lock(target_rq);
> +
> +	activate_task(target_rq, p, 0);
> +	wakeup_preempt(target_rq, p, 0);
> +
> +	raw_spin_rq_unlock(target_rq);
> +	raw_spin_rq_lock(rq);
> +	rq_repin_lock(rq, rf);
> +}
> +
> +static void proxy_force_return(struct rq *rq, struct rq_flags *rf,
> +			       struct task_struct *p)
> +{
> +	struct rq *this_rq, *target_rq;
> +	struct rq_flags this_rf;
> +	int cpu, wake_flag = 0;
> +
> +	lockdep_assert_rq_held(rq);
> +	WARN_ON(p == rq->curr);
> +
> +	get_task_struct(p);
> +
> +	/*
> +	 * We have to zap callbacks before unlocking the rq
> +	 * as another CPU may jump in and call sched_balance_rq
> +	 * which can trip the warning in rq_pin_lock() if we
> +	 * leave callbacks set.
> +	 */
> +	zap_balance_callbacks(rq);
> +	rq_unpin_lock(rq, rf);
> +	raw_spin_rq_unlock(rq);
> +
> +	/*
> +	 * We drop the rq lock, and re-grab task_rq_lock to get
> +	 * the pi_lock (needed for select_task_rq) as well.
> +	 */
> +	this_rq = task_rq_lock(p, &this_rf);
> +	update_rq_clock(this_rq);
> +
> +	/*
> +	 * Since we let go of the rq lock, the task may have been
> +	 * woken or migrated to another rq before we got the
> +	 * task_rq_lock. So re-check we're on the same RQ. If
> +	 * not, the task has already been migrated and that CPU
> +	 * will handle any further migrations.
> +	 */
> +	if (this_rq != rq)
> +		goto err_out;
> +
> +	/* Similarly, if we've been dequeued, someone else will wake us */
> +	if (!task_on_rq_queued(p))
> +		goto err_out;
> +
> +	/*
> +	 * Since we should only be calling here from __schedule()
> +	 * -> find_proxy_task(), no one else should have
> +	 * assigned current out from under us. But check and warn
> +	 * if we see this, then bail.
> +	 */
> +	if (task_current(this_rq, p) || task_on_cpu(this_rq, p)) {
> +		WARN_ONCE(1, "%s rq: %i current/on_cpu task %s %d on_cpu: %i\n",
> +			  __func__, cpu_of(this_rq),
> +			  p->comm, p->pid, p->on_cpu);
> +		goto err_out;
>  	}
> -	return NULL;
> +
> +	proxy_resched_idle(this_rq);
> +	deactivate_task(this_rq, p, 0);
> +	cpu = select_task_rq(p, p->wake_cpu, &wake_flag);
> +	set_task_cpu(p, cpu);

But then we use the 'standard' set_task_cpu() for the return migration.
Is that intended?

> +	target_rq = cpu_rq(cpu);
> +	clear_task_blocked_on(p, NULL);
> +	task_rq_unlock(this_rq, p, &this_rf);
> +
> +	/* Drop this_rq and grab target_rq for activation */
> +	raw_spin_rq_lock(target_rq);
> +	activate_task(target_rq, p, 0);
> +	wakeup_preempt(target_rq, p, 0);
> +	put_task_struct(p);
> +	raw_spin_rq_unlock(target_rq);
> +
> +	/* Finally, re-grab the original rq lock and return to pick-again */
> +	raw_spin_rq_lock(rq);
> +	rq_repin_lock(rq, rf);
> +	return;
> +
> +err_out:
> +	put_task_struct(p);
> +	task_rq_unlock(this_rq, p, &this_rf);
> +	raw_spin_rq_lock(rq);
> +	rq_repin_lock(rq, rf);
> +	return;

Thanks,
Juri
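
To make the two set_task_cpu() questions concrete, here is a small
userspace toy model of the wake_cpu bookkeeping. It is only a sketch and
not kernel code: every toy_* name is hypothetical, it assumes that
__set_task_cpu() overwrites p->wake_cpu as a side effect, and
toy_select_cpu() is a crude stand-in for select_task_rq() that simply
trusts the hint. It shows one possible reading of the asymmetry, not a
statement of the patch author's intent.

```c
/* Toy userspace model; none of the toy_* names exist in the kernel. */
struct toy_task {
	int cpu;	/* CPU whose runqueue currently holds the task */
	int wake_cpu;	/* placement hint consulted on the next wakeup */
};

/* Models the assumed __set_task_cpu() side effect: wake_cpu is overwritten. */
static void toy_set_task_cpu(struct toy_task *p, int cpu)
{
	p->cpu = cpu;
	p->wake_cpu = cpu;
}

/* Models the quoted proxy_set_task_cpu(): save wake_cpu, then restore it. */
static void toy_proxy_set_task_cpu(struct toy_task *p, int cpu)
{
	int wake_cpu = p->wake_cpu;

	toy_set_task_cpu(p, cpu);
	p->wake_cpu = wake_cpu;
}

/* Hypothetical stand-in for select_task_rq(): just trust the hint. */
static int toy_select_cpu(const struct toy_task *p)
{
	return p->wake_cpu;
}

/*
 * Proxy-migrate a blocked task from home_cpu to owner_cpu, then do the
 * return migration. Returns the CPU the task ends up on.
 */
static int toy_round_trip(int home_cpu, int owner_cpu, int preserve_hint)
{
	struct toy_task p = { .cpu = home_cpu, .wake_cpu = home_cpu };

	if (preserve_hint)
		toy_proxy_set_task_cpu(&p, owner_cpu);
	else
		toy_set_task_cpu(&p, owner_cpu);

	/* Return migration: pick a CPU from the hint, then move there. */
	toy_set_task_cpu(&p, toy_select_cpu(&p));
	return p.cpu;
}
```

In this model, toy_round_trip(1, 3, 1) brings the task back to CPU 1,
while toy_round_trip(1, 3, 0) leaves it on CPU 3 because the hint was
lost during the outbound migration. On the return path the hint has
already been consumed, so overwriting it there is harmless in the toy
model. Whether the same reasoning holds for the real code is exactly the
question raised above.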