From: Bron Gondwana <brong@fastmail.fm>
To: Chris Mason <chris.mason@oracle.com>
Cc: Bron Gondwana <brong@fastmail.fm>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Poor performance unlinking hard-linked files (repost)
Date: Fri, 19 Nov 2010 08:46:31 +1100
Message-ID: <20101118214631.GC2401@brong.net>
In-Reply-To: <1290094104-sup-8656@think>

On Thu, Nov 18, 2010 at 10:30:47AM -0500, Chris Mason wrote:
> Excerpts from Bron Gondwana's message of 2010-11-16 23:11:48 -0500:
> > > > a) program creates piles of small temporary files, hard
> > > >    links them out to different directories, unlinks the
> > > >    originals.
> > > > 
> > > > b) filesystem size: ~ 300Gb (backed by hardware RAID5)
> > > > 
> > > > c) as the filesystem grows (currently about 30% full) 
> > > >    the unlink performance becomes horrible.  Watching
> > > >    iostat, there's a lot of reading going on as well.
> > > 
> > > It sounds like the unlink speed is limited by the reading, and the reads
> > > are coming from one of two places.  We're either reading to cache cold
> > > block groups or we're reading to find the directory entries.
> > 
> > All the unlinks for a single process will be happening in the same
> > directory (though the hard linked copies will be all over)
> > 
> > > Could you sysrq-w while the performance is bad?  That would narrow it
> > > down.
> > 
> > Here's one:
> > 
> > http://pastebin.com/Tg7agv42
> 
> Ok, we're mixing unlinks and fsyncs.  Is it fsyncing directories too?

Nup.  I'm pretty sure it doesn't, just files.  Yes - there will certainly
be fsyncs going on as well - Cyrus is very careful to fsync everything it
cares about at the file level, but all it does with directories is mkdir
them if they don't exist.
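
As a rough illustration of that pattern (a sketch only, not Cyrus code;
the helper name write_message and the paths are made up for the
example), the file is fsynced but the directory is only created if
missing and never fsynced itself:

```python
import os

def write_message(directory, name, data):
    """Hypothetical helper: write one file durably, without fsyncing its directory."""
    # mkdir the directory if it doesn't exist; no fsync on the directory itself
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name)
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        # file-level fsync only
        os.fsync(f.fileno())
    return path
```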

This is just a single "sync_server" process on an experimental server.  A
real server under full load is going to have multiple processes doing
fsyncs and unlinks.

A significant portion of unlinks are of files that have another link on
the filesystem.  Every mailbox "move" is implemented as a copy (hardlink)
plus expunge (delayed unlink).  The "delay" works by marking the message
to be deleted in the cyrus.index metadata file, and then deleting later
(tunable: 7 to 14 days in our case, depending on when the next weekend is).
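
A minimal sketch of that move-then-expunge sequence, under stated
assumptions: this is not the Cyrus implementation, the function names
are invented, and a plain dict of path -> time marked stands in for the
cyrus.index metadata:

```python
import os

def move_message(src_path, dst_dir):
    """'Move' step 1: hard link the message into the destination directory.
    The source is left in place; it is only marked for later deletion."""
    os.makedirs(dst_dir, exist_ok=True)
    dst_path = os.path.join(dst_dir, os.path.basename(src_path))
    os.link(src_path, dst_path)
    return dst_path

def expunge(pending, now, delay_seconds):
    """'Move' step 2, run later: unlink everything marked at least
    delay_seconds ago. Returns the entries that are still pending.

    pending is a hypothetical stand-in for the index metadata:
    a dict mapping file path -> time it was marked deleted."""
    remaining = {}
    for path, marked_at in pending.items():
        if now - marked_at >= delay_seconds:
            # This is the unlink of a file that still has another link.
            os.unlink(path)
        else:
            remaining[path] = marked_at
    return remaining
```

With a 7 day delay, a message marked at time 0 is unlinked by a run at
day 8 while its hard-linked copy stays in place; that delayed unlink of
a multiply-linked file is the operation being discussed here.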

Bron.
