linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Kara <jack@suse.cz>, Wu Fengguang <fengguang.wu@intel.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	Johannes Weiner <hannes@cmpxchg.org>,
	Christoph Hellwig <hch@lst.de>,
	Jan Engelhardt <jengelh@medozas.de>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/5] writeback: stop background/kupdate works from livelocking other works
Date: Thu, 11 Nov 2010 17:44:24 +0100	[thread overview]
Message-ID: <20101111164424.GC3934@quack.suse.cz> (raw)
In-Reply-To: <20101110153729.81ae6b19.akpm@linux-foundation.org>

On Wed 10-11-10 15:37:29, Andrew Morton wrote:
> On Wed, 10 Nov 2010 00:56:32 +0100
> Jan Kara <jack@suse.cz> wrote:
> > On Tue 09-11-10 15:00:06, Andrew Morton wrote:
> > > On Tue, 9 Nov 2010 23:28:27 +0100
> > > Jan Kara <jack@suse.cz> wrote:
> > > >   New description which should address above questions:
> > > > Background writeback is easily livelockable in a loop in wb_writeback() by
> > > > a process continuously re-dirtying pages (or continuously appending to a
> > > > file). This is in fact intended as the target of background writeback is to
> > > > write dirty pages it can find as long as we are over
> > > > dirty_background_threshold.
> > > 
> > > Well.  The objective of the kupdate function is utterly different.
> > > 
> > > > But the above behavior gets inconvenient at times because no other work
> > > > queued in the flusher thread's queue gets processed. In particular,
> > > > since e.g. sync(1) relies on flusher thread to do all the IO for it,
> > > 
> > > That's fixable by doing the work synchronously within sync_inodes_sb(),
> > > rather than twiddling thumbs wasting a thread resource while waiting
> > > for kernel threads to do it.  As an added bonus, this even makes cpu
> > > time accounting more accurate ;)
> > > 
> > > Please remind me why we decided to hand the sync_inodes_sb() work off
> > > to other threads?
> >   Because when sync(1) does IO on it's own, it competes for the device with
> > the flusher thread running in parallel thus resulting in more seeks.
> 
> Skeptical.  Has that effect been demonstrated?  Has it been shown to be
> a significant problem?  A worse problem than livelocking the machine? ;)
  I don't think Jens has tested this when merging per-bdi writeback
patches. Just the argument makes sense to me (I've seen some loads where
application doing IO from balance_dirty_pages() together with pdflush doing
writeback did seriously harm performance and this looks like a similar
scenario). But I'd be also interested whether the theory and practice
matches here so I'll try to do some test to compare the two approaches.

> If this _is_ a problem then it's also a problem for fsync/msync.  But
> see below.
> 
> > > > sync(1) can hang forever waiting for flusher thread to do the work.
> > > > 
> > > > Generally, when a flusher thread has some work queued, someone submitted
> > > > the work to achieve a goal more specific than what background writeback
> > > > does. Moreover by working on the specific work, we also reduce amount of
> > > > dirty pages which is exactly the target of background writeout. So it makes
> > > > sense to give specific work a priority over a generic page cleaning.
> > > > 
> > > > Thus we interrupt background writeback if there is some other work to do. We
> > > > return to the background writeback after completing all the queued work.
> > > > 
> > ...
> > > > > So...  what prevents higher priority works (eg, sync(1)) from
> > > > > livelocking or seriously retarding background or kudate writeout?
> > > >   If other work than background or kupdate writeout livelocks, it's a bug
> > > > which should be fixed (either by setting sensible nr_to_write or by tagging
> > > > like we do it for WB_SYNC_ALL writeback). Of course, higher priority work
> > > > can be running when background or kupdate writeout would need to run as
> > > > well. But the idea here is that the purpose of background/kupdate types of
> > > > writeout is to get rid of dirty data and any type of writeout does this so
> > > > working on it we also work on background/kupdate writeout only possibly
> > > > less efficiently.
> > > 
> > > The kupdate function is a data-integrity/quality-of-service sort of
> > > thing.
> > > 
> > > And what I'm asking is whether this change enables scenarios in which
> > > these threads can be kept so busy that the kupdate function gets
> > > interrupted so frequently that we can have dirty memory not being
> > > written back for arbitrarily long periods of time?
> >   So let me compare:
> > What kupdate writeback does:
> >   queue inodes older than dirty_expire_centisecs
> >   while some inode in the queue
> >     write MAX_WRITEBACK_PAGES from each inode queued
> >     break if nr_to_write <= 0
> > 
> > What any other WB_SYNC_NONE writeback (let me call it "normal WB_SYNC_NONE
> > writeback") does:
> >   queue all dirty inodes 
> >   while some inode in the queue
> >     write MAX_WRITEBACK_PAGES from each inode queued
> >     break if nr_to_write <= 0
> > 
> > 
> > There only one kind of WB_SYNC_ALL writeback - the one which writes
> > everything.
> 
> fsync() uses WB_SYNC_ALL and it doesn't write "everything".
  Yes, but nobody submits fsync() work to a flusher thread (currently,
there's no way to tell flusher thread to operate only on a single inode).
But as you write below *if* somebody started doing it and fsync() would be
called often enough, we would have a problem.

> > So after WB_SYNC_ALL writeback (provided all livelocks are fixed ;)
> > obviously no old data should be unwritten in memory. Normal WB_SYNC_NONE
> > writeback differs from a kupdate one *only* in the fact that we queue all
> > inodes instead of only the old ones.
> 
> whoa.
> 
> That's only true for sync(), or sync_filesystem().  It isn't true for,
> say, fsync() and msync().  Now when someone comes along and changes
> fsync/msync to use flusher threads ("resulting in less seeks") then we
> could get into a situation where fsync-serving flusher threads never
> serve kupdate?
> 
> > We start writing old inodes first and
> 
> OT, but: your faith in those time-ordered inode lists is touching ;)
> Put a debug function in there which checks that the lists _are_
> time-ordered, and call that function from every site in the kernel
> which modifies the lists.   I bet there are still gremlins.
  ;-)

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2010-11-11 16:44 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-08 23:09 [PATCH 0/5] writeback livelock fixes Wu Fengguang
2010-11-08 23:09 ` [PATCH 1/5] writeback: integrated background writeback work Wu Fengguang
2010-11-08 23:09 ` [PATCH 2/5] writeback: trace wakeup event for background writeback Wu Fengguang
2010-11-08 23:09 ` [PATCH 3/5] writeback: stop background/kupdate works from livelocking other works Wu Fengguang
2010-11-09 21:13   ` Andrew Morton
2010-11-09 22:28     ` Jan Kara
2010-11-09 23:00       ` Andrew Morton
2010-11-09 23:56         ` Jan Kara
2010-11-10 23:37           ` Andrew Morton
2010-11-11  0:40             ` Wu Fengguang
2010-11-11 13:32               ` Christoph Hellwig
2010-11-11 16:44             ` Jan Kara [this message]
2010-11-08 23:09 ` [PATCH 4/5] writeback: avoid livelocking WB_SYNC_ALL writeback Wu Fengguang
2010-11-09 22:43   ` Andrew Morton
2010-11-09 23:18     ` Jan Kara
2010-11-10  2:26       ` Wu Fengguang
2010-11-08 23:09 ` [PATCH 5/5] writeback: check skipped pages on WB_SYNC_ALL Wu Fengguang
2010-11-09 22:47   ` Andrew Morton
2010-11-09 23:16     ` Wu Fengguang
2010-11-08 23:23 ` [PATCH 0/5] writeback livelock fixes Wu Fengguang
  -- strict thread matches above, loose matches on Subject: below --
2010-11-10  2:35 [PATCH 0/5] writeback livelock fixes v2 Wu Fengguang
2010-11-10  2:35 ` [PATCH 3/5] writeback: stop background/kupdate works from livelocking other works Wu Fengguang
2010-11-10  3:55   ` Wu Fengguang
2010-11-10 16:26     ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101111164424.GC3934@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=fengguang.wu@intel.com \
    --cc=hannes@cmpxchg.org \
    --cc=hch@lst.de \
    --cc=jengelh@medozas.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).