Date: Wed, 8 Jun 2011 07:51:28 +0800
From: Wu Fengguang
To: Andrew Morton
Cc: Jan Kara, Dave Chinner, Christoph Hellwig,
    "linux-fsdevel@vger.kernel.org", LKML
Subject: Re: [PATCH 02/15] writeback: update dirtied_when for synced inode to prevent livelock
Message-ID: <20110607235127.GB19547@localhost>
References: <20110607213236.634026193@intel.com> <20110607213853.635444678@intel.com> <20110607160245.9270aa27.akpm@linux-foundation.org>
In-Reply-To: <20110607160245.9270aa27.akpm@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 08, 2011 at 07:02:45AM +0800, Andrew Morton wrote:
> On Wed, 08 Jun 2011 05:32:38 +0800
> Wu Fengguang wrote:
>
> > Explicitly update .dirtied_when on synced inodes, so that they are no
> > longer considered for writeback in the next round.
>
> It sounds like this somewhat answers my questions for [1/15].
>
> But I'm not seeing a description of exactly what caused the livelock.

The exact livelock condition is, during sync(1):

(1) no new inodes are dirtied
(2) an inode being actively dirtied

On (2), the inode will be tagged and synced with .nr_to_write=LONG_MAX.
When finished, it will be redirty_tail()ed because it's still dirty
and (.nr_to_write > 0). redirty_tail() won't update its ->dirtied_when
on condition (1).
The sync work will then revisit it on the next queue_io() and find it
eligible again because its old ->dirtied_when predates the sync work
start time.

I'll add the above to the changelog.

> > We'll do more aggressive "keep writeback as long as we wrote something"
> > logic in wb_writeback(). The "use LONG_MAX .nr_to_write" trick in commit
> > b9543dac5bbc ("writeback: avoid livelocking WB_SYNC_ALL writeback") will
> > no longer be enough to stop sync livelock.
> >
> > It can prevent both of the following livelock schemes:
> >
> > - while true; do echo data >> f; done
> > - while true; do touch f; done
>
> You're kidding. This livelocks sync(1)? When did we break this?

There are no reported real cases of the "touch f" style livelock. It is
only a theoretical possibility, and the more metadata is dirtied
concurrently, the more likely it is to happen.

> Why is this? Because the inode keeps on getting rotated to head-of-list?

Yes, when the inode is always redirty_tail()ed without updating its
->dirtied_when.

Thanks,
Fengguang
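
For illustration, the livelock condition described above can be modelled
with a small toy simulation. This is only a sketch under simplifying
assumptions, not the kernel code: the Inode class, sync_passes() and the
integer clock are hypothetical names, and eligibility is reduced to a
single comparison of dirtied_when against the sync start time.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    """Hypothetical stand-in for a dirty inode (not the kernel's struct inode)."""
    dirtied_when: int  # toy timestamp of when the inode was first dirtied


def sync_passes(inode: Inode, sync_start: int, update_dirtied_when: bool,
                max_passes: int = 5) -> int:
    """Count the queue_io()-like passes that pick up one inode which a writer
    keeps re-dirtying while sync runs and no other inodes are being dirtied.
    max_passes caps the loop so that the livelock case terminates."""
    now = sync_start
    passes = 0
    while passes < max_passes:
        # Assumption: eligible only if dirtied before the sync work started.
        if inode.dirtied_when >= sync_start:
            break
        passes += 1
        now += 1  # time moves on while the inode is written
        # The writer keeps dirtying, so the inode is requeued still dirty.
        if update_dirtied_when:
            # Behaviour proposed by the patch: stamp the synced inode.
            inode.dirtied_when = now
        # Otherwise dirtied_when keeps its old value, as under condition (1).
    return passes


stuck = sync_passes(Inode(dirtied_when=0), sync_start=10, update_dirtied_when=False)
fixed = sync_passes(Inode(dirtied_when=0), sync_start=10, update_dirtied_when=True)
print(stuck, fixed)
```

With the timestamp left unchanged, the inode is picked up on every pass
until the cap is reached; once dirtied_when is updated after a sync, the
next pass no longer finds it eligible.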