From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Down
Subject: Re: [RFC PATCH 0/8] memcg: Enable fine-grained per process memory control
Date: Mon, 7 Sep 2020 12:47:45 +0100
Message-ID: <20200907114745.GA1076657@chrisdown.name>
References: <20200817140831.30260-1-longman@redhat.com>
 <20200818091453.GL2674@hirez.programming.kicks-ass.net>
 <20200818092617.GN28270@dhcp22.suse.cz>
 <20200818095910.GM2674@hirez.programming.kicks-ass.net>
 <20200818100516.GO28270@dhcp22.suse.cz>
 <20200818101844.GO2674@hirez.programming.kicks-ass.net>
 <20200818134900.GA829964@cmpxchg.org>
 <20200821193716.GU3982@worktop.programming.kicks-ass.net>
 <20200824165850.GA932571@cmpxchg.org>
Mime-Version: 1.0
Return-path:
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chrisdown.name; s=google;
 h=date:from:to:cc:subject:message-id:references:mime-version
 :content-disposition:in-reply-to:user-agent;
 bh=395Y0lUxtoxk+N6Xwo36+dBOIhVjSE/2+oZvPClU7KM=;
 b=cuAJnS9cKwag10JrR5XWKLTSDpsVij2RIxygdpxgcJAAdzqC0pGMP0broEdFrjuV2u
 YK1YpzjfS7JWBVBZPRmTUcfnAukLTXmsLJ2XSzMOmdbEBvGaIW/6SMoesJ2yxCpA/WG2
 m8QnLvvHNNu3zgtJUbsMfzQakaNE4sxXw5yG0=
Content-Disposition: inline
In-Reply-To: <20200824165850.GA932571@cmpxchg.org>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:
Content-Type: text/plain; charset="us-ascii"; format="flowed"
Content-Transfer-Encoding: 7bit
To: Johannes Weiner
Cc: Peter Zijlstra, Michal Hocko, Waiman Long, Andrew Morton,
 Vladimir Davydov, Jonathan Corbet, Alexey Dobriyan, Ingo Molnar,
 Juri Lelli, Vincent Guittot, linux-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 cgroups@vger.kernel.org, linux-mm@kvack.org

Johannes Weiner writes:
>That all being said, the semantics of the new 'high' limit in cgroup2
>have allowed us to move reclaim/limit enforcement out of the
>allocation context and into the userspace return path.
>
>See the call to mem_cgroup_handle_over_high() from
>tracehook_notify_resume(), and the comments in try_charge() around
>set_notify_resume().
>
>This already solves the free->alloc ordering problem by allowing the
>allocation to exceed the limit temporarily until at least all locks
>are dropped, we know we can sleep etc., before performing enforcement.
>
>That means we may not need the timed sleeps anymore for that purpose,
>and could bring back directed waits for freeing-events again.
>
>What do you think? Any hazards around indefinite sleeps in that resume
>path? It's called before __rseq_handle_notify_resume and the
>arch-specific resume callback (which appears to be a no-op currently).
>
>Chris, Michal, what are your thoughts? It would certainly be simpler
>conceptually on the memcg side.

I'm not against that, although I personally don't feel very strongly about it
either way, since the current behaviour clearly works in practice.
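
For anyone following the thread without the code open, the ordering
Johannes describes (charge may overshoot 'high' in allocation context,
enforcement happens later on the return to userspace) can be modelled
roughly as below. This is a userspace toy under stated assumptions, not
kernel code: every toy_* name is hypothetical, the bool flag only stands
in for what set_notify_resume() arranges, and "reclaim" is reduced to
subtracting the excess. It says nothing about how
mem_cgroup_handle_over_high() actually reclaims or throttles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; all names here are hypothetical, not kernel APIs. */
struct toy_memcg {
	size_t usage;	/* pages currently charged */
	size_t high;	/* 'high' limit, in pages */
};

struct toy_task {
	struct toy_memcg *memcg;
	bool notify_resume;	/* stand-in for the notify-resume flag */
	size_t reclaimed;	/* pages reclaimed on this task's behalf */
};

/*
 * Allocation context: the charge always succeeds and never sleeps.
 * Going over 'high' only marks the task for later enforcement.
 */
static void toy_charge(struct toy_task *t, size_t nr_pages)
{
	t->memcg->usage += nr_pages;
	if (t->memcg->usage > t->memcg->high)
		t->notify_resume = true;
}

/*
 * Return-to-userspace path: no locks are held, so this is where
 * enforcement (and, in a real system, any sleeping) can happen.
 */
static void toy_resume_to_user(struct toy_task *t)
{
	size_t excess = 0;

	if (!t->notify_resume)
		return;
	t->notify_resume = false;

	if (t->memcg->usage > t->memcg->high)
		excess = t->memcg->usage - t->memcg->high;

	t->memcg->usage -= excess;	/* stand-in for reclaim */
	t->reclaimed += excess;
	assert(t->memcg->usage <= t->memcg->high);
}
```

With high = 100, charging 80 and then 50 pages leaves usage at 130 with
the flag set. The next toy_resume_to_user() call brings usage back to
100 and clears the flag. The point of the split is that toy_charge()
never has to block, which is what sidesteps the free->alloc ordering
problem mentioned above.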