Message-ID: <548efd52-e45f-41fa-a477-bc5112d7b00c@redhat.com>
Date: Tue, 2 Apr 2024 11:30:11 -0400
Subject: Re: [PATCH 1/2] cgroup/cpuset: Make cpuset hotplug processing synchronous
From: Waiman Long
To: Michal Koutný
Cc: Tejun Heo, Zefan Li, Johannes Weiner, Thomas Gleixner, Peter Zijlstra,
 "Rafael J. Wysocki", Len Brown, Pavel Machek, Shuah Khan,
 linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 linux-pm@vger.kernel.org, linux-kselftest@vger.kernel.org,
 Frederic Weisbecker, "Paul E. McKenney", Ingo Molnar, Valentin Schneider,
 Anna-Maria Behnsen, Alex Shi, Vincent Guittot, Barry Song
References: <20240401145858.2656598-1-longman@redhat.com>
 <20240401145858.2656598-2-longman@redhat.com>

On 4/2/24 10:13, Michal Koutný wrote:
> Hello Waiman.
>
> (I have no opinion on the overall locking reworks, only the bits about
> v1 migrations caught my attention.)
>
> On Mon, Apr 01, 2024 at 10:58:57AM -0400, Waiman Long wrote:
> ...
>> @@ -4383,12 +4377,20 @@ hotplug_update_tasks_legacy(struct cpuset *cs,
>>          /*
>>           * Move tasks to the nearest ancestor with execution resources,
>>           * This is full cgroup operation which will also call back into
>> -         * cpuset. Should be done outside any lock.
>> +         * cpuset. Execute it asynchronously using workqueue.
> ...to avoid deadlock on cpus_read_lock
>
> Is this the reason?
> Also, what happens with the tasks in the window till the migration
> happens?
> Is it handled gracefully that their cpu is gone?

Yes, cpus_read_lock() may potentially be called on that path, which
would lead to a deadlock. So unless we reverse the current
cgroup_mutex --> cpu_hotplug_lock ordering, it is not safe to call
cgroup_transfer_tasks() directly.

>
>
>> -        if (is_empty) {
>> -                mutex_unlock(&cpuset_mutex);
>> -                remove_tasks_in_empty_cpuset(cs);
>> -                mutex_lock(&cpuset_mutex);
>> +        if (is_empty && css_tryget_online(&cs->css)) {
>> +                struct cpuset_remove_tasks_struct *s;
>> +
>> +                s = kzalloc(sizeof(*s), GFP_KERNEL);
> Is there a benefit of having a work for each cpuset?
> Instead of traversing whole top_cpuset once in the async work.

We could do that too. The downside is that we have to repeat the
iteration once the workfn is invoked, but it has the advantage of not
needing any memory allocation. I am OK with either way. Let's see what
other folks think about that.

Cheers,
Longman
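
For illustration only, the single-pass alternative raised above (one deferred walk over all cpusets instead of one allocated work item per empty cpuset) might be modeled in userspace roughly as below. This is a sketch under stated assumptions, not kernel code: `struct model_cpuset`, `nearest_populated_ancestor()` and `drain_empty_cpusets()` are hypothetical names, the hierarchy is a flat array with parent pointers, and locking, css reference counting and the real cgroup_transfer_tasks() path are all left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a cpuset; not the kernel's struct cpuset. */
struct model_cpuset {
    struct model_cpuset *parent; /* NULL for the root */
    int ncpus;                   /* CPUs still usable after hotplug */
    int ntasks;                  /* tasks currently attached */
};

/* Walk up the hierarchy to the first ancestor that still has CPUs. */
static struct model_cpuset *
nearest_populated_ancestor(struct model_cpuset *cs)
{
    struct model_cpuset *p = cs->parent;

    while (p && p->ncpus == 0)
        p = p->parent;
    return p;
}

/*
 * One deferred pass over every cpuset: any cpuset left without CPUs
 * hands its tasks to the nearest ancestor that has some. Nothing is
 * allocated per cpuset; the cost is re-checking each one here.
 * Returns the number of tasks moved.
 */
static int drain_empty_cpusets(struct model_cpuset *sets, size_t n)
{
    int moved = 0;

    assert(sets != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        struct model_cpuset *cs = &sets[i];
        struct model_cpuset *dst;

        if (cs->ncpus != 0 || cs->ntasks == 0)
            continue;           /* still has CPUs, or nothing to move */

        dst = nearest_populated_ancestor(cs);
        if (!dst)
            continue;           /* no ancestor with CPUs in this model */

        dst->ntasks += cs->ntasks;
        moved += cs->ntasks;
        cs->ntasks = 0;
    }
    return moved;
}
```

The trade-off in this model matches the one described in the reply: the per-cpuset variant has to allocate a work item for each empty cpuset, while the single pass allocates nothing but has to re-examine every cpuset when it finally runs.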