From: "Ted Ts'o" <tytso@mit.edu>
To: Charles Samuels <charles@cariden.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: Queuing of disk writes
Date: Tue, 5 Apr 2011 15:37:47 -0400
Message-ID: <20110405193747.GG2832@thunk.org>
In-Reply-To: <201104041050.12731.charles@cariden.com>

On Mon, Apr 04, 2011 at 10:50:12AM -0700, Charles Samuels wrote:
>
> > Who or what is calling fsync()? Is it being called by your
> > application because you want to initiate writeout? Or is it being
> > called by some completely unrelated process?
>
> It's being called by my own process. When fsync finishes, I update
> another file with some offset counters, fsync that, and with some
> luck, my writes are transactional.
OK, how often are you calling fsync()? Is this something where you
are trying to get transactional guarantees by calling fsync() between
each transaction? And if so, how big are your transactions? If you
are trying to call fsync() 10+ times/second, then your only hope
really is going to be a battery-backed RAID controller card, as David
Lang has already suggested.
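
For reference, the two-step commit described in the quoted text (write the data, fsync it, then update the offset counters and fsync those) might look roughly like the sketch below. The helper names pwrite_all() and commit_record() and the single 64-bit counter layout are hypothetical, chosen only for illustration; the sketch also assumes that an 8-byte counter write is not torn by a crash, which a real design would need to verify or protect with a checksum.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Write exactly len bytes at offset off, continuing after short writes. */
static int pwrite_all(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n <= 0)
            return -1;
        p += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: append one record to data_fd at *committed,
 * make it durable, and only then publish the new end offset in meta_fd.
 * After a crash, a reader trusts only the bytes below the stored counter.
 */
static int commit_record(int data_fd, int meta_fd,
                         const void *rec, size_t len, uint64_t *committed)
{
    assert(rec != NULL && committed != NULL);

    uint64_t new_end = *committed + len;

    if (pwrite_all(data_fd, rec, len, (off_t)*committed) < 0)
        return -1;
    if (fsync(data_fd) < 0)     /* step 1: the record reaches stable storage */
        return -1;
    if (pwrite_all(meta_fd, &new_end, sizeof new_end, 0) < 0)
        return -1;
    if (fsync(meta_fd) < 0)     /* step 2: the counter reaches stable storage */
        return -1;

    *committed = new_end;
    return 0;
}
```

Each call costs two fsync() round trips, which is why the number of transactions per second matters so much here.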
> What would be good use of sync_file_range? It looks pretty useful,
> but I don't know how to make good use of it. For example,
> SYNC_FILE_RANGE_WRITE, wouldn't linux start this pretty much
> immediately?
No, not necessarily. Generally Linux will pause for a bit to
hopefully allow writes to coalesce.

The reason why I suggested sync_file_range() is because you mentioned
that you tried waiting until there was a large amount of data in the
page cache, and then you called fsync() and that was taking forever.
From that I assumed you didn't necessarily have ACID or transactional
requirements.

The advantage of using sync_file_range() is that instead of forcing a
blocking write for *all* of the data pages, you can do it on only part
of your data pages. This would keep the writing from interfering
with subsequent reads that were taking place on your database.
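
As a rough illustration of flushing only part of a file, here is a minimal sketch built on the Linux-specific sync_file_range(2) call. The helper names start_writeback(), flush_range() and write_in_chunks(), and the idea of flushing one chunk at a time, are assumptions made for the example, not something prescribed in this thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Ask the kernel to start writing the dirty pages in [off, off + len)
 * without waiting for the I/O to complete. */
static int start_writeback(int fd, off_t off, off_t len)
{
    assert(off >= 0 && len >= 0);
    return sync_file_range(fd, off, len, SYNC_FILE_RANGE_WRITE);
}

/* Write out the dirty pages in [off, off + len) and block until that
 * range (and only that range) has been written. */
static int flush_range(int fd, off_t off, off_t len)
{
    return sync_file_range(fd, off, len,
                           SYNC_FILE_RANGE_WAIT_BEFORE |
                           SYNC_FILE_RANGE_WRITE |
                           SYNC_FILE_RANGE_WAIT_AFTER);
}

/* Hypothetical writer: emit buf in chunk-sized pieces and flush each
 * piece as it goes, so dirty data never piles up in the page cache. */
static int write_in_chunks(int fd, const char *buf, size_t total, size_t chunk)
{
    assert(chunk > 0);

    off_t off = 0;
    while ((size_t)off < total) {
        size_t want = total - (size_t)off;
        if (want > chunk)
            want = chunk;

        ssize_t n = pwrite(fd, buf + off, want, off);
        if (n <= 0)
            return -1;
        if (flush_range(fd, off, n) < 0)
            return -1;
        off += n;
    }
    return 0;
}
```

Note that sync_file_range() does not flush file metadata or the drive's write cache, so a sketch like this only smooths out writeback; it is not a substitute for fsync() where durability is required.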
All of this goes by the boards if you need data integrity guarantees,
of course; in that case you need to call fsync() after each atomic
transaction update...
- Ted