Date: Wed, 18 Dec 2024 08:56:21 +0100
From: Michal Hocko
To: Chen Ridong
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, yosryahmed@google.com,
	roman.gushchin@linux.dev, shakeel.butt@linux.dev,
	muchun.song@linux.dev, davidf@vimeo.com, vbabka@suse.cz,
	handai.szj@taobao.com, rientjes@google.com,
	kamezawa.hiroyu@jp.fujitsu.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	chenridong@huawei.com, wangweiyang2@huawei.com
Subject: Re: [PATCH v1] memcg: fix soft lockup in the OOM process
References: <20241217121828.3219752-1-chenridong@huaweicloud.com>
	<872c5042-01d6-4ff3-94bc-8df94e1e941c@huaweicloud.com>
In-Reply-To: <872c5042-01d6-4ff3-94bc-8df94e1e941c@huaweicloud.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 18-12-24 15:44:34, Chen Ridong wrote:
>
>
> On 2024/12/17 20:54, Michal Hocko wrote:
> > On Tue 17-12-24 12:18:28, Chen Ridong wrote:
> > [...]
> >> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> >> index 1c485beb0b93..14260381cccc 100644
> >> --- a/mm/oom_kill.c
> >> +++ b/mm/oom_kill.c
> >> @@ -390,6 +390,7 @@ static int dump_task(struct task_struct *p, void *arg)
> >>  	if (!is_memcg_oom(oc) && !oom_cpuset_eligible(p, oc))
> >>  		return 0;
> >>
> >> +	cond_resched();
> >>  	task = find_lock_task_mm(p);
> >>  	if (!task) {
> >>  		/*
> >
> > This is called from RCU read lock for the global OOM killer path and I
> > do not think you can schedule there. I do not remember specifics of task
> > traversal for cgroup path but I guess that you might need to silence the
> > soft lockup detector instead or come up with a different iteration
> > scheme.
>
> Thank you, Michal.
>
> I made a mistake. I added cond_resched in the mem_cgroup_scan_tasks
> function below the fn, but after reconsideration, it may cause
> unnecessary scheduling for other callers of mem_cgroup_scan_tasks.
> Therefore, I moved it into the dump_task function. However, I missed the
> RCU lock from the global OOM.
>
> I think we can use touch_nmi_watchdog in place of cond_resched, which
> can silence the soft lockup detector. Do you think that is acceptable?

It is certainly a way to go, though not the best one. Maybe we need
different solutions for the global and for the memcg OOMs. During a
global OOM we rarely care about latency, as the whole system is likely
to be struggling anyway. Memcg OOMs are much more likely. Having that
many tasks in a memcg certainly calls for further partitioning, so with
a proper configuration the OOM latency shouldn't be very visible. But I
am wondering whether the cgroup task iteration could use cond_resched()
while the global one would call touch_nmi_watchdog() every N iterations.
I might be missing something, but I do not see any locking required
outside of css_task_iter_*.
-- 
Michal Hocko
SUSE Labs
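
For illustration only, the split suggested above can be modelled in
userspace C. This is a rough sketch, not kernel code and not a proposed
patch: cond_resched_stub() and touch_nmi_watchdog_stub() are
hypothetical stand-ins that merely count calls, scan_tasks_memcg() and
scan_tasks_global() are made-up names, and the interval is a free
parameter because the message only says "every N iterations". It also
assumes, as the message does, that the cgroup path holds no lock that
would forbid rescheduling between tasks.

```c
#include <assert.h>
#include <stddef.h>

/* Call counters for the userspace stand-ins below. */
static unsigned long resched_calls;
static unsigned long watchdog_touches;
static unsigned long tasks_dumped;

/* Hypothetical stand-in for the kernel's cond_resched(). */
static void cond_resched_stub(void) { resched_calls++; }

/* Hypothetical stand-in for the kernel's touch_nmi_watchdog(). */
static void touch_nmi_watchdog_stub(void) { watchdog_touches++; }

typedef void (*task_fn)(size_t task_index);

/* Stand-in for dump_task(): only records that a task was visited. */
static void dump_task_stub(size_t task_index)
{
	(void)task_index;
	tasks_dumped++;
}

/*
 * Model of the memcg path: assumed to run in a context that may
 * reschedule, so it offers to yield after every task.
 */
static void scan_tasks_memcg(size_t ntasks, task_fn fn)
{
	for (size_t i = 0; i < ntasks; i++) {
		fn(i);
		cond_resched_stub();
	}
}

/*
 * Model of the global path: it runs under rcu_read_lock() and so must
 * not reschedule. Instead it touches the watchdog once every
 * `interval` tasks.
 */
static void scan_tasks_global(size_t ntasks, task_fn fn, size_t interval)
{
	assert(interval > 0);

	for (size_t i = 0; i < ntasks; i++) {
		fn(i);
		if ((i + 1) % interval == 0)
			touch_nmi_watchdog_stub();
	}
}
```

With these stubs, scanning 5000 tasks through scan_tasks_memcg() offers
to reschedule 5000 times, while scan_tasks_global() with an interval of
1024 touches the watchdog 4 times and never reschedules.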