From: Andres Freund <andres@2ndquadrant.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	lsf@lists.linux-foundation.org,
	Wu Fengguang <fengguang.wu@intel.com>,
	rhaas@anarazel.de
Subject: Re: [Lsf] Postgresql performance problems with IO latency, especially during fsync()
Date: Wed, 26 Mar 2014 22:55:18 +0100
Message-ID: <20140326215518.GH9066@alap3.anarazel.de>
In-Reply-To: <CALCETrUc1YvNc3EKb4ex579rCqBfF=84_h5bvbq49o62k2KpmA@mail.gmail.com>

On 2014-03-26 14:41:31 -0700, Andy Lutomirski wrote:
> On Wed, Mar 26, 2014 at 12:11 PM, Andres Freund <andres@anarazel.de> wrote:
> > Hi,
> >
> > At LSF/MM there was a slot about postgres' problems with the kernel. Our
> > top#1 concern is frequent slow read()s that happen while another process
> > calls fsync(), even though we'd be perfectly fine if that fsync() took
> > ages.
> > The "conclusion" of that part was that it'd be very useful to have a
> > demonstration of the problem without needing a full blown postgres
> > setup. I've quickly hacked something together, that seems to show the
> > problem nicely.
> >
> > For a bit of context: lwn.net/SubscriberLink/591723/940134eb57fcc0b8/
> > and the "IO Scheduling" bit in
> > http://archives.postgresql.org/message-id/20140310101537.GC10663%40suse.de
> >
> 
> For your amusement: running this program in KVM on a 2GB disk image
> failed, but it caused the *host* to go out to lunch for several
> seconds while failing.  In fact, it seems to have caused the host to
> fall over so badly that the guest decided that the disk controller was
> timing out.  The host is btrfs, and I think that btrfs is *really* bad
> at this kind of workload.

Also, unless you changed the parameters, it's a) using a 48GB disk file,
and b) writing really rather fast ;)

> Even using ext4 is no good.  I think that dm-crypt is dying under the
> load.  So I won't test your program for real :/

Try reducing data_size to RAM * 2 and NUM_RANDOM_READERS to something
smaller. If it still doesn't work, consider increasing the two nsleep()s...

I didn't have a good idea how to scale those to the current machine in a
halfway automatic fashion.
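
One way the "RAM * 2" rule above could be derived at runtime is sketched
below. This is only an illustration, not code from the test program:
scaled_data_size() and its fallback argument are hypothetical names, and
it assumes Linux/glibc, where sysconf() accepts _SC_PHYS_PAGES (that
name is an extension, not plain POSIX).

```c
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the original test program):
 * returns twice the physical RAM size in bytes, or `fallback` if the
 * system does not report its memory size.
 */
uint64_t scaled_data_size(uint64_t fallback)
{
    long pages = sysconf(_SC_PHYS_PAGES);     /* glibc/Linux extension */
    long page_size = sysconf(_SC_PAGESIZE);

    /* sysconf() returns -1 when a value is unavailable */
    if (pages <= 0 || page_size <= 0)
        return fallback;

    /* RAM * 2, as suggested above */
    return 2 * (uint64_t) pages * (uint64_t) page_size;
}
```

The result could then be assigned to data_size at startup, e.g.
data_size = scaled_data_size(some_default). It does not address scaling
NUM_RANDOM_READERS or the nsleep() intervals.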

> > Possible solutions:
> > * Add a fadvise(UNDIRTY), that doesn't stall on a full IO queue like
> >   sync_file_range() does.
> > * Make IO triggered by writeback regard IO priorities and add it to
> >   schedulers other than CFQ
> > * Add a tunable that allows limiting the amount of dirty memory before
> >   writeback on a per process basis.
> > * ...?
> 
> I thought the problem wasn't so much that priorities weren't respected
> but that the fsync call fills up the queue, so everything starts
> contending for the right to enqueue a new request.

I think it's both, actually. If I understand correctly, there's not even a
correct association with the originator anymore during an fsync-triggered
flush?

> Since fsync blocks until all of its IO finishes anyway, what if it
> could just limit itself to a much smaller number of outstanding
> requests?

Yeah, that could already help. If you remove the fsync()s, the problem
will still appear periodically, because writeback is triggered with a
vengeance. That'd need to be fixed in a similar way.

> I'm not sure I understand the request queue stuff, but here's an idea.
>  The block core contains this little bit of code:

I haven't read enough of the code yet to comment intelligently ;)

Greetings,

Andres Freund

-- 
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



