From: Minchan Kim
To: vinayak menon
CC: Vinayak Menon, Andrew Morton, Johannes Weiner, Rik van Riel, Shiraz Hashim, "linux-mm@kvack.org"
Date: Tue, 31 Jan 2017 08:40:28 +0900
Subject: Re: [PATCH] mm: vmscan: do not pass reclaimed slab to vmpressure
Message-ID: <20170130234028.GA7942@bbox>
References: <1485344318-6418-1-git-send-email-vinmenon@codeaurora.org> <20170125232713.GB20811@bbox> <20170126141836.GA3584@bbox>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Vinayak,

Sorry for the late response. It was the Lunar New Year holidays.

On Fri, Jan 27, 2017 at 01:43:23PM +0530, vinayak menon wrote:
> >
> > Thanks for the explanation. However, such a case can happen with a THP page
> > as well as slab. In the case of a THP page, nr_scanned is 1 but nr_reclaimed
> > could be 512, so I think vmpressure should have logic to prevent underflow
> > regardless of slab shrinking.
> >
> I see.
> Going to send a vmpressure fix. But wouldn't the THP case result in
> incorrect vmpressure reporting even if we fix the vmpressure underflow
> problem?

If a THP page is reclaimed, it reports lower pressure due to the bigger
reclaim ratio (i.e., reclaimed/scanned) compared to normal pages, but
that's not a problem, is it? The VM reclaimed more memory than expected,
so memory pressure isn't severe now.

>
> >>
> >> >
> >> >> unsigned arithmetic results in the pressure value to be
> >> >> huge, thus resulting in a critical event being sent to
> >> >> root cgroup. Fix this by not passing the reclaimed slab
> >> >> count to vmpressure, with the assumption that vmpressure
> >> >> should show the actual pressure on LRU which is now
> >> >> diluted by adding reclaimed slab without a corresponding
> >> >> scanned value.
> >> >
> >> > I can't guess the justification for your assumption from the description.
> >> > Why do we consider only LRU pages for vmpressure? Could you elaborate
> >> > a bit?
> >> >
> >> When we encountered the false events from vmpressure, thought the problem
> >> could be that slab scanned is not included in sc->nr_scanned, like it is done
> >> for reclaimed. But later thought vmpressure works only on the scanned and
> >> reclaimed from LRU. I can explain what I understand, let me know if this is
> >> incorrect.
> >> vmpressure is an index which tells the pressure on LRU, and thus an
> >> indicator of thrashing. In shrink_node when we come out of the inner do-while
> >> loop after shrinking the lruvec, the scanned and reclaimed correspond to the
> >> pressure felt on the LRUs which in turn indicates the pressure on VM. The
> >> moment we add the slab reclaimed pages to the reclaimed, we dilute the
> >> actual pressure felt on LRUs.
> >> When slab scanned/reclaimed is not included
> >> in the vmpressure, the values will indicate the actual pressure and if there
> >> were a lot of slab reclaimed pages it will result in lesser pressure
> >> on LRUs in the next run which will again be indicated by vmpressure. i.e. the
> >
> > I think there is no intention to exclude slab by design of vmpressure.
> > Because slab is memory consumption, freeing slab pages really helps
> > the memory pressure. Also, there might be slab-intensive workloads rather
> > than LRU. It would be great if vmpressure works well with that case.
> > But the problem with involving slab for vmpressure is it's not fair with
> > LRU pages. LRU pages have a 1:1 cost model for scan:free but slab shrinking
> > depends on each slab's object population. It means it's impossible to
> > get a stable cost model with the current slab shrinking model, unfortunately.
> > So I don't object to this patch although I want to see slab shrink model's
> > change which is heavy-handed work.
> >
> Looking at the code, the slab reclaimed pages started getting passed to
> vmpressure after the commit ("mm: vmscan: invoke slab shrinkers from
> shrink_zone()").
> But as you said, this may be helpful for slab-intensive workloads. But in its
> current form I think it results in incorrect vmpressure reporting because of not
> accounting the slab scanned pages. Resending the patch with a modified
> commit msg since the underflow issue is fixed separately. Thanks Minchan.

Makes sense.

Thanks, Vinayak!
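
For reference, the underflow discussed in this thread can be reproduced
outside the kernel. The following is a minimal userspace sketch, not the
kernel's code: the function names pressure_naive and pressure_guarded are
hypothetical, the formula is a simplified model of a scanned/reclaimed
ratio expressed in percent, and the guard is only one possible way to
avoid the wraparound. With scanned = 1 and reclaimed = 512 (the THP case
above), the naive version wraps around to a huge value, while the guarded
one reports zero pressure.

```c
#include <stdint.h>

/*
 * Simplified model of a scanned/reclaimed pressure ratio, in percent.
 * Hypothetical helper for illustration; not copied from the kernel.
 * Assumes scanned > 0.
 */
uint64_t pressure_naive(uint64_t scanned, uint64_t reclaimed)
{
	uint64_t scale = scanned + reclaimed;
	/* If reclaimed > scanned, this unsigned subtraction wraps around. */
	uint64_t pressure = scale - (reclaimed * scale / scanned);

	return pressure * 100 / scale;
}

/* Same model, but treats reclaimed >= scanned as "no pressure". */
uint64_t pressure_guarded(uint64_t scanned, uint64_t reclaimed)
{
	if (scanned == 0 || reclaimed >= scanned)
		return 0;
	return pressure_naive(scanned, reclaimed);
}
```

For example, pressure_naive(100, 50) and pressure_guarded(100, 50) both
give 50, but pressure_naive(1, 512) yields a value far above 100, which
any percentage threshold would read as critical.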