Date: Tue, 27 Jul 2010 11:59:41 +0800
From: Wu Fengguang
To: Jan Kara
Cc: Andrew Morton, Christoph Hellwig, Peter Zijlstra, Richard Kennedy,
    Dave Chinner, "linux-fsdevel@vger.kernel.org",
    Linux Memory Management List, LKML
Subject: Re: [PATCH 2/6] writeback: reduce calls to global_page_state in balance_dirty_pages()
Message-ID: <20100727035941.GA15007@localhost>
References: <20100711020656.340075560@intel.com>
    <20100711021748.735126772@intel.com>
    <20100726151946.GH3280@quack.suse.cz>
In-Reply-To: <20100726151946.GH3280@quack.suse.cz>

> > This patch slightly changes behavior by replacing clip_bdi_dirty_limit()
> > with the explicit check (nr_reclaimable + nr_writeback >= dirty_thresh)
> > to avoid exceeding the dirty limit. Since the bdi dirty limit is mostly
> > accurate we don't need to clip routinely. A simple dirty limit check
> > would be enough.
> >
> > The check is necessary because, in principle, we should throttle
> > everything calling balance_dirty_pages() when we're over the total
> > limit, as said by Peter.
> >
> > We now set and clear dirty_exceeded not only based on bdi dirty limits,
> > but also on the global dirty limits. This is a bit counterintuitive, but
> > the global limits are the ultimate goal and shall always be imposed.
> Thinking about this again - what you did is a rather big change for systems
> with more active BDIs. For example, if I have two disks sda and sdb and
> write for some time to sda, then the dirty limit for sdb gets scaled down.
> So when we start writing to sdb we'll heavily throttle the threads until
> the dirty limit for sdb ramps up, regardless of how far we are from
> reaching the global limit...

The global threshold check is added in place of clip_bdi_dirty_limit()
for safety and is not intended as a behavior change. If it ever leads to
a big behavior change or a regression, that would indicate that the
per-bdi threshold calculation is too permissive.

Did you see the global dirty threshold get exceeded when writing to 2+
devices? Occasionally exceeding it by a small amount should be OK, though.

I tried the following debug patch and saw no warnings when running two
concurrent cp's to a local disk and NFS.

Index: linux-next/mm/page-writeback.c
===================================================================
--- linux-next.orig/mm/page-writeback.c	2010-07-27 11:26:18.063817669 +0800
+++ linux-next/mm/page-writeback.c	2010-07-27 11:26:53.335855847 +0800
@@ -513,6 +513,11 @@
 		if (!dirty_exceeded)
 			break;
 
+		if (nr_reclaimable + nr_writeback >= dirty_thresh)
+			printk ("XXX: dirty exceeded: %lu + %lu = %lu ++ %lu\n",
+				nr_reclaimable, nr_writeback, dirty_thresh,
+				nr_reclaimable + nr_writeback - dirty_thresh);
+
 		/*
 		 * Throttle it only when the background writeback cannot
 		 * catch-up. This avoids (excessively) small writeouts

Thanks,
Fengguang
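
For readers following the thread, the two conditions being discussed can be sketched in user-space C. This is only an illustration under stated assumptions, not the code in mm/page-writeback.c: the struct dirty_snapshot type and the helper function names are hypothetical, and the bdi_* fields are assumed per-device counterparts of the global counters named in the patch description. Only the global comparison (nr_reclaimable + nr_writeback >= dirty_thresh) is taken directly from the mail.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical snapshot of the counters compared in the thread.
 * The global field names follow the variables quoted in the mail;
 * the struct itself and the bdi_* fields are assumptions made for
 * this example.
 */
struct dirty_snapshot {
	unsigned long nr_reclaimable;     /* global reclaimable (dirty) pages */
	unsigned long nr_writeback;       /* global pages under writeback */
	unsigned long dirty_thresh;       /* global dirty limit */
	unsigned long bdi_nr_reclaimable; /* same counter for one device */
	unsigned long bdi_nr_writeback;   /* same counter for one device */
	unsigned long bdi_thresh;         /* that device's share of the limit */
};

/* True when the device has reached its own (scaled) limit. */
static bool bdi_limit_exceeded(const struct dirty_snapshot *s)
{
	return s->bdi_nr_reclaimable + s->bdi_nr_writeback >= s->bdi_thresh;
}

/* True when the global limit is reached: the condition the debug patch tests. */
static bool global_limit_exceeded(const struct dirty_snapshot *s)
{
	return s->nr_reclaimable + s->nr_writeback >= s->dirty_thresh;
}

/* Reaching either limit counts as exceeding the dirty limit. */
static bool dirty_exceeded(const struct dirty_snapshot *s)
{
	return bdi_limit_exceeded(s) || global_limit_exceeded(s);
}
```

In the sda/sdb scenario from the quoted mail, a snapshot for sdb with a small bdi_thresh makes bdi_limit_exceeded() true while global_limit_exceeded() stays false, so the writer is throttled by the per-device limit and the debug printk never fires. The printk would only show up in the opposite case, where the global sum reaches dirty_thresh.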