From: Roman Gushchin <roman.gushchin@linux.dev>
Subject: Re: [PATCH v2 0/5] Introduce memcg_stock_pcp remote draining
Date: Sun, 5 Feb 2023 11:49:13 -0800
To: Leonardo Brás
Cc: Michal Hocko, Marcelo Tosatti, Johannes Weiner, Shakeel Butt, Muchun Song, Andrew Morton, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org

Hi Leonardo!

> Yes, but we are exchanging an "always schedule_work_on()", which is a kind of
> contention, for a "sometimes we hit spinlock contention".
>
> For the spinlock proposal, on the local cpu side, the *worst case* contention
> is:
> 1 - wait the spin_unlock() for a complete drain,
> 2 - wait a cache hit for the local per-cpu cacheline
>
> What is currently implemented (the schedule_work_on() approach), for the local
> cpu side there is *always* this contention:
> 1 - wait for a context switch,
> 2 - wait a cache hit from its local per-cpu cacheline,
> 3 - wait a complete drain,
> 4 - then wait for a new context switch to the current thread.

I think both Michal and I are thinking of a more generic case in which the cpu
is not exclusively consumed by one special process, so that the draining work
can be executed during an idle time. In this case the work is basically free.

And the introduction of a spin_lock() on the hot path is what we're concerned
about. I agree that on some hardware platforms it won't be that expensive, but
in general not having any spinlocks is so much better.

>
> So moving from schedule_work_on() to spinlocks will save 2 context switches per
> cpu every time drain_all_stock() is called.
>
> On the remote cpu side, my tests show that doing the remote draining is faster
> than scheduling a local draining, so it's also a gain.
>
> Also, IIUC the possible contention in the spinlock approach happens only on
> page-faulting and syscalls, versus the schedule_work_on() approach that can
> interrupt the user workload at any time.
>
> In fact, not interrupting the user workload on isolated cpus is just a bonus of
> using spinlocks.

I believe it significantly depends on the preemption model: you're right regarding
fully preemptive kernels, but with voluntary/none preemption it's exactly the
opposite: the draining work will be executed at some point later (probably with
zero cost), while the remote access from another cpu will potentially cause
delays on the spin lock as well as a need to refill the stock.
Overall I'd expect a noticeable performance regression from an introduction of
spin locks and remote draining. Maybe not on all platforms, but at least on some.
That's my main concern. And I don't think the problem we're aiming to solve here
justifies this potential regression.

Thanks!