From: Theodore Tso <tytso@mit.edu>
To: jim owens <jowens@hp.com>
Cc: linux-ext4@vger.kernel.org
Subject: Re: RFC: Clarifying Direct I/O Semantics
Date: Fri, 21 Aug 2009 20:07:45 -0400
Message-ID: <20090822000745.GP9529@mit.edu>
In-Reply-To: <4A8F1FA5.5080501@hp.com>

On Fri, Aug 21, 2009 at 06:28:53PM -0400, jim owens wrote:
>> The Linux man page does not state what happens if the alignment
>> restrictions are not met; does the kernel start running rogue or
>> nethack; does it send a signal such as SIGSEGV or SIGABRT, and kill the
>> running process; or does it fall back to buffered I/O? Today, the answer
>> is the latter; but it's not specified anywhere.
>
> retval = -EINVAL; is what __blockdev_direct_IO does in that case
> and what I was making btrfs directIO do.  but fall back is OK too
> if we really want. what existing code fixes up the EINVAL?

You're right; I thought it did the fallback in all cases, but it only
does it when writing into holes.  Oops.  I should have tested this
before saying it.

I'll fix up the wiki page.
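
To make the corrected behavior concrete from the application side, here is
a minimal userspace sketch, not kernel code.  It assumes a 4096-byte
alignment requirement (the real value depends on the device and the
filesystem), and the names dio_is_aligned() and write_direct_or_buffered()
are hypothetical helpers invented for this illustration.  Since a
misaligned O_DIRECT request can come back as EINVAL rather than silently
falling back, the caller has to do the buffered retry itself:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on Linux */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* no O_DIRECT here: degrade to buffered I/O */
#endif

/*
 * Return nonzero if the buffer address, file offset and length are all
 * multiples of align.  align must be nonzero; 4096 is only an assumed
 * value, the true requirement is device and filesystem specific.
 */
int dio_is_aligned(const void *buf, off_t offset, size_t len, size_t align)
{
    return (uintptr_t)buf % align == 0 &&
           (uint64_t)offset % align == 0 &&
           len % align == 0;
}

/*
 * Try the write with O_DIRECT first.  If the open or the write is
 * rejected with EINVAL, repeat the write through the page cache.
 * Any other error is returned to the caller unchanged.
 */
ssize_t write_direct_or_buffered(const char *path, const void *buf,
                                 size_t len, off_t offset)
{
    ssize_t n;
    int saved;
    int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);

    if (fd >= 0) {
        n = pwrite(fd, buf, len, offset);
        saved = errno;
        close(fd);
        if (n >= 0 || saved != EINVAL) {
            errno = saved;
            return n;
        }
    } else if (errno != EINVAL) {
        return -1;
    }

    /* Userspace fallback: same write, without O_DIRECT. */
    fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    n = pwrite(fd, buf, len, offset);
    saved = errno;
    close(fd);
    errno = saved;
    return n;
}
```

An application could call dio_is_aligned() up front to avoid the failed
attempt altogether, and keep the EINVAL retry only as a safety net.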

>> This is relatively well understood by most implementors and users of
>> O_DIRECT as part of the "oral lore", so simply updating the Linux man
>> page should not be controversial.
>>
>
> The following section includes "sparse" AKA "allocating" writes but
> just says "extending".  Either sparse-filling writes need to be covered
> separately or we should say "allocating" instead of "extending".

Yup, good point.

> Possibly it should just be stated that directIO write data integrity
> is based on the setting of posix O_SYNC and O_DSYNC.  Then it is their
> choice to run slow-and-safe or fast.  O_SYNC requires metadata on disk.

The question in my mind is whether we should guarantee that the data
block is written synchronously for allocating writes when the file
metadata is not written synchronously; what's the point?  After all,
the application can't distinguish between the data block not making it
out to disk and the metadata that makes the data block accessible
after a crash not making it out.  Why should one be synchronous but
not the other?

						- Ted
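
For readers following the O_SYNC / O_DSYNC suggestion quoted above, the
sketch below shows how an application would express that choice at open
time.  It is only an illustration of the flag combinations under
discussion, not a statement of what any filesystem guarantees for
allocating writes; the enum dio_integrity and the function
dio_open_flags() are hypothetical names made up for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on Linux */
#endif
#include <fcntl.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* no O_DIRECT here: degrade to buffered I/O */
#endif

/* Hypothetical integrity levels an application might choose between. */
enum dio_integrity {
    DIO_FAST,       /* O_DIRECT only: no synchronous-completion request */
    DIO_DATA_SYNC,  /* O_DSYNC: data, plus metadata needed to read it back */
    DIO_FULL_SYNC   /* O_SYNC: data and all file metadata */
};

/* Build the open(2) flags for a direct-I/O writer at the given level. */
int dio_open_flags(enum dio_integrity level)
{
    int flags = O_WRONLY | O_CREAT | O_DIRECT;

    if (level == DIO_DATA_SYNC)
        flags |= O_DSYNC;
    else if (level == DIO_FULL_SYNC)
        flags |= O_SYNC;
    return flags;
}
```

A caller would pass the result straight to open(), for example
open(path, dio_open_flags(DIO_DATA_SYNC), 0644), and accept the slower
writes in exchange for the stronger completion semantics.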
