From: Ted Ts'o <tytso@mit.edu>
To: Ken Sumrall <ksumrall@google.com>
Cc: Andreas Dilger <adilger@dilger.ca>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH] Support systems without posix_memalign() and memalign()
Date: Wed, 2 May 2012 16:21:19 -0400
Message-ID: <20120502202119.GB18002@thunk.org>
In-Reply-To: <CAJr8W+i3vjmVw5Bxh5pDveuXvkC_q6YkayuA8SfF=kjSLqEFAg@mail.gmail.com>

On Wed, May 02, 2012 at 01:01:35PM -0700, Ken Sumrall wrote:
> I thought this patch enabled O_DIRECT functionality on the Mac:
> 
>   http://sourceforge.net/tracker/index.php?func=detail&aid=3140289&group_id=2406&atid=102406

Yeah, I was looking at this.  It turns out that F_NOCACHE has the
effect of O_DIRECT, but it does not require memory alignment.  There
are some hints that you get really bad performance if the read/write
size is not a multiple of 4k, and/or is not well-aligned ---
presumably in that case Mac OS uses a bounce buffer.
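
A minimal sketch of how a caller might request uncached I/O on both
platforms. The helper name open_uncached() is hypothetical and is not
part of e2fsprogs; falling back to a normal cached open when O_DIRECT is
rejected with EINVAL is an assumption of this sketch, not something the
thread settles.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes O_DIRECT in <fcntl.h> on glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` and ask the OS to bypass its cache.
 * Uses O_DIRECT where the platform defines it and F_NOCACHE (Mac OS)
 * otherwise.  Returns a file descriptor, or -1 with errno set.
 */
static int open_uncached(const char *path, int flags)
{
	int fd;

#ifdef O_DIRECT
	fd = open(path, flags | O_DIRECT);
	if (fd >= 0 || errno != EINVAL)
		return fd;
	/* O_DIRECT was rejected for this file; retry with a cached open. */
#endif
	fd = open(path, flags);
#ifdef F_NOCACHE
	/* F_NOCACHE imposes no alignment rule; a failure here is not fatal. */
	if (fd >= 0)
		(void) fcntl(fd, F_NOCACHE, 1);
#endif
	return fd;
}
```

A caller would use the returned descriptor exactly like one from open(2);
only the O_DIRECT path may additionally need aligned buffers.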

In practice, Linux works the same way --- we'll fall back to using the
page cache if the alignment requirements are not met.  However, there
have historically been bugs that caused this not to work correctly (and
in fact caused data corruption):

	https://bugzilla.redhat.com/show_bug.cgi?id=471613
	http://kerneltrap.org/mailarchive/linux-fsdevel/2008/11/14/4099714

... and the formal documentation of O_DIRECT in the open(2) man page
in Linux makes very few guarantees about what the requirements are
for O_DIRECT to work correctly:

       The  O_DIRECT  flag may impose alignment restrictions on the length and
       address of userspace buffers and the file offset  of  I/Os.   In  Linux
       alignment restrictions vary by file system and kernel version and might
       be absent entirely. 
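
Since the man page names three things that may need alignment (buffer
address, length, and file offset), a conservative caller can check all
three before issuing a direct I/O request. The sketch below does that;
the helper name is hypothetical, and the alignment value to pass in
(for example 512 or 4096) is an assumption the caller has to make, since
the documentation does not promise one.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 1 if the buffer address, the length and
 * the file offset are all multiples of `align`, else 0.
 * `align` must be a nonzero power of two.
 */
static int dio_request_is_aligned(const void *buf, size_t len,
				  off_t offset, size_t align)
{
	uintptr_t mask = (uintptr_t) align - 1;

	/* OR the three values together so one test covers all of them. */
	return (((uintptr_t) buf | (uintptr_t) len |
		 (uintptr_t) offset) & mask) == 0;
}
```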

So there are two possibilities; one is that we change
ext2fs_get_memalign() so it gets aligned memory on a best-efforts
basis, but does not guarantee that the memory will be aligned.

The other is we keep ext2fs_get_memalign() as an interface which
returns an error if the requested memory alignment cannot be honored,
and then add code so that we only try using ext2fs_get_memalign() if
we are trying to use O_DIRECT (and make an exception for F_NOCACHE on
Mac systems that don't have memalign or posix_memalign).
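
The two options above can be sketched roughly as follows. This is not
the e2fsprogs implementation: the function names, the ENOTSUP error
code, and defaulting HAVE_POSIX_MEMALIGN to 1 are all assumptions made
so the example is self-contained (a real build would get that macro
from configure).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* for posix_memalign() */
#endif
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: normally set by configure; defaulted here for the sketch. */
#ifndef HAVE_POSIX_MEMALIGN
#define HAVE_POSIX_MEMALIGN 1
#endif

/* Second option: fail if the requested alignment cannot be honored. */
static int get_memalign_strict(size_t size, size_t align, void **ptr)
{
#if HAVE_POSIX_MEMALIGN
	return posix_memalign(ptr, align, size); /* 0, EINVAL or ENOMEM */
#else
	(void) size;
	(void) align;
	*ptr = NULL;
	return ENOTSUP;	/* no aligned allocator on this system */
#endif
}

/* First option: try for aligned memory, else settle for malloc(). */
static int get_memalign_best_effort(size_t size, size_t align, void **ptr)
{
	if (get_memalign_strict(size, align, ptr) == 0)
		return 0;
	*ptr = malloc(size);	/* alignment is no longer guaranteed */
	return *ptr ? 0 : ENOMEM;
}

/* Lets a caller find out whether the best-effort result is aligned. */
static int is_aligned(const void *p, size_t align)
{
	return ((uintptr_t) p % align) == 0;
}
```

With the strict variant, the caller decides up front whether alignment
matters (for instance, only when O_DIRECT is in use); with the
best-effort variant, the caller has to check the result if it cares.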

The first is tempting, but it just seems a little dirty to me.  On the
other hand, we've never published a formal API specification for
ext2fs_get_memalign(), and it's relatively unlikely there are users of
the API outside of e2fsprogs.

						- Ted

Thread overview: 5+ messages
2012-05-02 18:30 [PATCH] Support systems without posix_memalign() and memalign() Theodore Ts'o
2012-05-02 18:52 ` Andreas Dilger
2012-05-02 19:40   ` Ted Ts'o
2012-05-02 20:01     ` Ken Sumrall
2012-05-02 20:21       ` Ted Ts'o [this message]
