From mboxrd@z Thu Jan 1 00:00:00 1970 From: Glauber Costa Subject: Re: [PATCH v5 2/2] decrement static keys on real destroy time Date: Thu, 17 May 2012 07:09:29 +0400 Message-ID: <4FB46BE9.6080503@parallels.com> References: <1336767077-25351-1-git-send-email-glommer@parallels.com> <1336767077-25351-3-git-send-email-glommer@parallels.com> <20120516141342.911931e7.akpm@linux-foundation.org> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20120516141342.911931e7.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> Sender: cgroups-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org List-ID: Content-Type: text/plain; charset="us-ascii"; format="flowed" To: Andrew Morton Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, devel-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org, netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Tejun Heo , Li Zefan , Johannes Weiner , Michal Hocko On 05/17/2012 01:13 AM, Andrew Morton wrote: > On Fri, 11 May 2012 17:11:17 -0300 > Glauber Costa wrote: > >> We call the destroy function when a cgroup starts to be removed, >> such as by a rmdir event. >> >> However, because of our reference counters, some objects are still >> inflight. Right now, we are decrementing the static_keys at destroy() >> time, meaning that if we get rid of the last static_key reference, >> some objects will still have charges, but the code to properly >> uncharge them won't be run. >> >> This becomes a problem specially if it is ever enabled again, because >> now new charges will be added to the staled charges making keeping >> it pretty much impossible. >> >> We just need to be careful with the static branch activation: >> since there is no particular preferred order of their activation, >> we need to make sure that we only start using it after all >> call sites are active. This is achieved by having a per-memcg >> flag that is only updated after static_key_slow_inc() returns. >> At this time, we are sure all sites are active. >> >> This is made per-memcg, not global, for a reason: >> it also has the effect of making socket accounting more >> consistent. The first memcg to be limited will trigger static_key() >> activation, therefore, accounting. But all the others will then be >> accounted no matter what. After this patch, only limited memcgs >> will have its sockets accounted. > > So I'm scratching my head over what the actual bug is, and how > important it is. AFAICT it will cause charging stats to exhibit some > inaccuracy when memcg's are being torn down? > > I don't know how serious this in in the real world and so can't decide > which kernel version(s) we should fix. > > When fixing bugs, please always fully describe the bug's end-user > impact, so that I and others can make these sorts of decisions. Hi Andrew. I believe that was described in patch 0/2 ? In any case, this is something we need fixed, but it is not -stable material or anything. The bug leads to misaccounting when we quickly enable and disable limit in a loop. We have a synthetic script to demonstrate that.