linux-btrfs.vger.kernel.org archive mirror
From: Chris Mason <chris.mason@oracle.com>
To: Bron Gondwana <brong@fastmail.fm>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Poor performance unlinking hard-linked files (repost)
Date: Fri, 19 Nov 2010 09:10:08 -0500
Message-ID: <1290175586-sup-2461@think>
In-Reply-To: <20101118214631.GC2401@brong.net>

Excerpts from Bron Gondwana's message of 2010-11-18 16:46:31 -0500:
> On Thu, Nov 18, 2010 at 10:30:47AM -0500, Chris Mason wrote:
> > Excerpts from Bron Gondwana's message of 2010-11-16 23:11:48 -0500:
> > > > > a) program creates piles of small temporary files, hard
> > > > >    links them out to different directories, unlinks the
> > > > >    originals.
> > > > > 
> > > > > b) filesystem size: ~ 300GB (backed by hardware RAID5)
> > > > > 
> > > > > c) as the filesystem grows (currently about 30% full) 
> > > > >    the unlink performance becomes horrible.  Watching
> > > > >    iostat, there's a lot of reading going on as well.
> > > > 
> > > > It sounds like the unlink speed is limited by the reading, and the reads
> > > > are coming from one of two places.  We're either reading to cache cold
> > > > block groups or we're reading to find the directory entries.
> > > 
> > > All the unlinks for a single process will be happening in the same
> > > directory (though the hard linked copies will be all over)
> > > 
> > > > Could you sysrq-w while the performance is bad?  That would narrow it
> > > > down.
> > > 
> > > Here's one:
> > > 
> > > http://pastebin.com/Tg7agv42
> > 
> > Ok, we're mixing unlinks and fsyncs.  Is it fsyncing directories too?
> 
> Nup.  I'm pretty sure it doesn't, just files.  Yes - there will certainly
> be fsyncs going on as well - Cyrus is very careful to fsync everything it
> cares about at the file level, but all it does with directories is mkdir
> them if they don't exist.

Could you double check this one please?  Fsyncing the directory is a ton
more expensive; I just want to make sure it isn't part of the workload.
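
To make the distinction concrete, here is a minimal sketch of the two
calls in question.  It is Python, not what Cyrus itself runs, and the
helper names write_and_fsync and fsync_dir are made up for illustration.
The first is the per-file fsync described in the quoted text; the second
is the directory fsync being asked about.

```python
import os


def write_and_fsync(path, data):
    """Write a file and fsync it (the per-file case)."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()                 # push Python's buffer to the kernel
        os.fsync(f.fileno())      # ask the kernel to flush this file to disk


def fsync_dir(dirpath):
    """fsync the directory itself, i.e. its entries (the costlier case)."""
    # O_DIRECTORY is Linux/POSIX; fall back to 0 where it is not defined.
    flags = os.O_RDONLY | getattr(os, "O_DIRECTORY", 0)
    fd = os.open(dirpath, flags)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
```

If the application never opens a directory and calls fsync on that
descriptor, only the first pattern is part of the workload.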

Otherwise it looks like we're seeking to read in the inode and unlink
it.  One possibility is that we're not giving the elevator enough clues
about the IO being synchronous.

Are you using cfq or deadline?  I bet we can improve the latencies using
READ_SYNC.
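
One way to answer the scheduler question is to read the sysfs file
/sys/block/<dev>/queue/scheduler, where the active scheduler is shown in
square brackets.  A small sketch, assuming a Linux system with sysfs
mounted; the function names are hypothetical.

```python
import glob
import re


def active_scheduler(text):
    """Return the bracketed name from a line such as 'noop deadline [cfq]'."""
    m = re.search(r"\[([^\]]+)\]", text)
    if m:
        return m.group(1)
    # Some devices list a single scheduler with no brackets.
    return text.strip() or None


def schedulers():
    """Map each block device name to its active I/O scheduler."""
    result = {}
    for path in glob.glob("/sys/block/*/queue/scheduler"):
        dev = path.split("/")[3]  # /sys/block/<dev>/queue/scheduler
        try:
            with open(path) as f:
                result[dev] = active_scheduler(f.read())
        except OSError:
            continue  # device vanished or file is unreadable
    return result
```

Calling schedulers() returns a dict keyed by device name; on a machine
without sysfs it simply returns an empty dict.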

-chris


> 
> This is just a single "sync_server" process on an experimental server.  A 
> real server under full load is going to have multiple processes doing
> fsyncs and unlinks.
> 
> A significant portion of unlinks are of files that have another link on
> the filesystem.  Every mailbox "move" is implemented as a copy (hardlink)
> plus expunge (delayed unlink).  The "delay" works by marking the message
> to be deleted in the cyrus.index metadata file, and then deleting later
> (tunable: 7 to 14 days in our case, depending on when the next weekend is).
> 
> Bron.
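
The access pattern described in this thread (small files written and
fsynced, hard-linked into other directories, originals unlinked) can be
sketched as below.  This is a rough stand-in rather than Cyrus's actual
code: the function name, directory names, file counts and sizes are all
assumptions, and a fresh, small filesystem will not reproduce the age or
fill level of the one under discussion.

```python
import os
import time


def run_workload(root, n_files=1000, n_dirs=10, size=4096):
    """Create, fsync, hard-link and unlink files; return unlink time in seconds."""
    spool = os.path.join(root, "spool")
    os.makedirs(spool, exist_ok=True)
    targets = []
    for i in range(n_dirs):
        d = os.path.join(root, "mbox%d" % i)
        os.makedirs(d, exist_ok=True)  # mkdir if missing, never fsynced
        targets.append(d)

    payload = b"x" * size
    originals = []
    for i in range(n_files):
        src = os.path.join(spool, "tmp%d" % i)
        with open(src, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())       # file-level fsync only
        # Hard link the file out to one of the target directories.
        os.link(src, os.path.join(targets[i % n_dirs], "msg%d" % i))
        originals.append(src)

    # Time only the unlink phase; every file still has a second link.
    start = time.monotonic()
    for src in originals:
        os.unlink(src)
    return time.monotonic() - start
```

Running it against a directory on the filesystem under test, for example
run_workload("/mnt/test/bench") with a path of your choosing, gives a
number to compare between schedulers or kernel versions.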

Thread overview: 12+ messages
2010-11-13  3:25 Poor performance unlinking hard-linked files Bron Gondwana
2010-11-16 12:54 ` Poor performance unlinking hard-linked files (repost) Bron Gondwana
2010-11-16 13:38   ` Chris Mason
2010-11-17  4:11     ` Bron Gondwana
2010-11-17  9:56       ` Bron Gondwana
2010-11-18 15:30       ` Chris Mason
2010-11-18 21:46         ` Bron Gondwana
2010-11-19 14:10           ` Chris Mason [this message]
2010-11-19 21:58             ` Bron Gondwana
2010-11-30  9:35               ` Bron Gondwana
2010-11-30 12:49                 ` Chris Mason
2010-11-30 23:24                   ` Bron Gondwana
