Date: Wed, 18 Mar 2026 07:35:21 +0100
From: Juri Lelli
To: John Stultz
Cc: LKML, Joel Fernandes, Qais Yousef, Ingo Molnar, Peter Zijlstra,
    Vincent Guittot, Dietmar Eggemann, Valentin Schneider, Steven Rostedt,
    Ben Segall, Zimuzo Ezeozue, Mel Gorman, Will Deacon, Waiman Long,
    Boqun Feng, "Paul E. McKenney", Metin Kaya, Xuewen Yan,
    K Prateek Nayak, Thomas Gleixner, Daniel Lezcano, Suleiman Souhlal,
    kuyo chang, hupu, kernel-team@android.com
Subject: Re: [PATCH v25 9/9] sched: Handle blocked-waiter migration (and return migration)
References: <20260313023022.2902479-1-jstultz@google.com>
 <20260313023022.2902479-10-jstultz@google.com>
In-Reply-To: <20260313023022.2902479-10-jstultz@google.com>

Hello,

I couldn't convince myself the below is not potentially racy ...

On 13/03/26 02:30, John Stultz wrote:

...

> +static void proxy_migrate_task(struct rq *rq, struct rq_flags *rf,
> +			       struct task_struct *p, int target_cpu)
>  {
> -	if (!__proxy_deactivate(rq, donor)) {
> -		/*
> -		 * XXX: For now, if deactivation failed, set donor
> -		 * as unblocked, as we aren't doing proxy-migrations
> -		 * yet (more logic will be needed then).
> -		 */
> -		clear_task_blocked_on(donor, NULL);
> +	struct rq *target_rq = cpu_rq(target_cpu);
> +
> +	lockdep_assert_rq_held(rq);
> +
> +	/*
> +	 * Since we're going to drop @rq, we have to put(@rq->donor) first,
> +	 * otherwise we have a reference that no longer belongs to us.
> +	 *
> +	 * Additionally, as we put_prev_task(prev) earlier, its possible that
> +	 * prev will migrate away as soon as we drop the rq lock, however we
> +	 * still have it marked as rq->curr, as we've not yet switched tasks.
> +	 *
> +	 * So call proxy_resched_idle() to let go of the references before
> +	 * we release the lock.
> +	 */
> +	proxy_resched_idle(rq);
> +
> +	WARN_ON(p == rq->curr);
> +
> +	deactivate_task(rq, p, DEQUEUE_NOCLOCK);
> +	proxy_set_task_cpu(p, target_cpu);
> +
> +	/*
> +	 * We have to zap callbacks before unlocking the rq
> +	 * as another CPU may jump in and call sched_balance_rq
> +	 * which can trip the warning in rq_pin_lock() if we
> +	 * leave callbacks set.
> +	 */
> +	zap_balance_callbacks(rq);
> +	rq_unpin_lock(rq, rf);
> +	raw_spin_rq_unlock(rq);
> +
> +	attach_one_task(target_rq, p);

We release the rq lock between deactivate and attach (and we hold neither
wait_lock nor blocked_lock, as they are out of scope at this point).
Can't something like the following happen?

- Task A: blocked on mutex M, queued on CPU 0
- Task B: owns mutex M, running on CPU 1

CPU 0 (migrating A→CPU 1)            CPU 1 (B finishes critical section)
-------------------------            ------------------------------------
find_proxy_task(donor=A):
  owner = B, owner_cpu = 1
  action = MIGRATE
  // guard releases wait_lock
proxy_migrate_task(A, cpu=1):
  deactivate_task(rq0, A)
    → A->on_rq = 0
  proxy_set_task_cpu(A, 1)
    → A->cpu = 1
  raw_spin_rq_unlock(rq0)
    → RQ0 LOCK RELEASED
                                     // Task B running
                                     mutex_unlock(M):
                                       lock(&M->wait_lock)  // ← Can grab it
                                       A->blocked_on = PROXY_WAKING
                                       unlock(&M->wait_lock)
                                       wake_up_q():
                                         try_to_wake_up(A):
                                           sees A->on_rq == 0
                                           cpu = select_task_rq(A)
                                             → returns CPU 2
                                           set_task_cpu(A, 2)
                                           ttwu_queue(A, 2)
                                             → A enqueued on CPU 2
                                             → A->on_rq = 1, A->cpu = 2
attach_one_task(rq1, A):
  attach_task(rq1, A):
    WARN_ON_ONCE(task_rq(A) != rq1)
      → Fires! task_rq(A) = rq2
    activate_task(rq1, A)
      → Double-enqueue! A->on_rq already = 1

What am I missing? :)

Thanks,
Juri