Date: Fri, 19 Feb 2016 14:41:28 -0500
From: Johannes Weiner <hannes@cmpxchg.org>
To: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton, Mel Gorman, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] mm: scale kswapd watermarks in proportion to memory
Message-ID: <20160219194128.GA17342@cmpxchg.org>
References: <1455813719-2395-1-git-send-email-hannes@cmpxchg.org> <1455826543.15821.64.camel@redhat.com>
In-Reply-To: <1455826543.15821.64.camel@redhat.com>

On Thu, Feb 18, 2016 at 03:15:43PM -0500, Rik van Riel wrote:
> On Thu, 2016-02-18 at 11:41 -0500, Johannes Weiner wrote:
> > On machines with 140G of memory and enterprise flash storage, we
> > have seen read and write bursts routinely exceed the kswapd
> > watermarks and cause thundering herds in direct reclaim.
> > Unfortunately, the only way to tune kswapd aggressiveness is through
> > adjusting min_free_kbytes - the system's emergency reserves - which
> > is entirely unrelated to the system's latency requirements. To get
> > kswapd to maintain a 250M buffer of free memory, the emergency
> > reserves need to be set to 1G. That is a lot of memory wasted for no
> > good reason.
> >
> > On the other hand, it's reasonable to assume that allocation bursts
> > and overall allocation concurrency scale with memory capacity, so it
> > makes sense to make kswapd aggressiveness a function of that as
> > well.
> >
> > Change the kswapd watermark scale factor from the currently fixed
> > 25% of the tunable emergency reserve to a tunable 0.1% of memory.
> >
> > On a 140G machine, this raises the default watermark steps - the
> > distance between min and low, and between low and high - from 16M to
> > 143M.
>
> This is an excellent idea for a large system, but your patch reduces
> the gap between the watermarks on small systems.
>
> On an 8GB zone, your patch halves the gap between the watermarks, and
> on smaller systems it would be even worse.

You're right, I'll address that in v2.

> Would it make sense to keep using the old calculation on small
> systems, whenever its result exceeds that of the new calculation?
>
> Taking the max of the two calculations would keep the issue you are
> fixing on large systems from reappearing on smaller ones.

Yes, I think enforcing a reasonable minimum this way makes sense.

Thanks, Rik.
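
For illustration, a rough, untested sketch of what that combined
calculation could look like in __setup_per_zone_wmarks() - the
watermark_scale_factor name and its unit of fractions of 10,000 are
assumptions for v2, not settled interface:

	unsigned long min = min_wmark_pages(zone);
	unsigned long gap;

	/*
	 * Watermark distance: whichever is bigger, the old calculation
	 * (25% of the zone's min watermark, i.e. of its share of
	 * min_free_kbytes) or the new one (a tunable fraction of the
	 * zone's managed memory). Small systems keep at least their
	 * current kswapd buffer; large systems get the wider gaps.
	 */
	gap = max_t(u64, min >> 2,
		    mult_frac(zone->managed_pages,
			      watermark_scale_factor, 10000));

	zone->watermark[WMARK_LOW]  = min + gap;
	zone->watermark[WMARK_HIGH] = min + gap * 2;

With max() semantics the new knob can only ever widen the distance
relative to the current 25% calculation, never shrink it.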