All of lore.kernel.org
 help / color / mirror / Atom feed
From: gmate.amit@gmail.com (Kumar amit mehta)
To: kernelnewbies@lists.kernelnewbies.org
Subject: Direct IO and Page cache
Date: Fri, 26 Jul 2013 00:02:31 -0400	[thread overview]
Message-ID: <20130726040231.GD8707@gmail.com> (raw)
In-Reply-To: <CAK-9PRBfj_VgYTcs=Pi7VqfinFx5x3w+=tn8hJ5cCxeJwGVQPg@mail.gmail.com>

On Fri, Jul 26, 2013 at 05:14:21PM +0800, Chinmay V S wrote:
> > We have direct I/O(O_DIRECT), for example raw devices(/dev/rawctl) that
> > map to the block devices and we also have page cache. Now If I've
> > understood this correctly, direct I/O will bypass this page cache, which
> > is fine, I'll not get into the performance debate, but what about data
> > consistency. Kernel cannot and __should'nt__ try to control how the
> > applications are being written. So one bad day somebody comes up with
> > an application which does both these two types of IO(one that goes
> > through page cache and the other that doesn't) and in that application,
> > one instance is writing directly to the backend device and the other
> > instance, who is not aware of this write, goes ahead and writes to the
> > page cache, and that write would be written later to the backend device.
> > So wouldn't we end up corrupting the on disk data.
> 
> Yes. And that is the responsibility of the application. While the
> existence of O_DIRECT may not be common sense, anyone who knows about
> it *must* know that it bypasses the kernel page-cache and hence *must*
> know the consequences of doing cached and direct I/O on the same file
> simultaneously.
> 
> > I can think of multiple other scenarios which could corrupt the on-disk
> > data, if there isn't any safeguarding policies employed by the kernel.
> > But I'm very much sure that kernel is aware of such nasty attempts, and
> > I'd like to know how does kernel takes care of this.
> 
> O_DIRECT is an explicit flag not enabled by default.
> 
> It is the app's responsibility to ensure that it does NOT misuse the
> feature. Essentially specifying the O_DIRECT flag is the app's way of
> saying - "Hey kernel, i know what i am doing. Please step aside and
> let me talk to the hardware directly. Please do NOT interfere."
> 
> The kernel happily obliges.
> 
> Later, the app should NOT go crying back to kernel (and blaming it),
> if the app manages to screw-up the direct "relationship" with the
> hardware.

So leaving the hardware at the mercy of the application doesn't sound
like a good practice. This __may__ compromise kernel stability too. Also
think of this:

In app1:
fdx = open("blah" , O_RW|O_DIRECT);
write(fdx,buf,sizeof(buf));
	
In app2(unaware of app1):
fdy = open("blah", O_RW);
write(fdy,buf, sizeof(buf));

I think this isn't highly unlikely to do, and if you agree with me then
we may end up with same could-be/would-be data-corruption. Now who should
be blamed here, app1, app2 or the kernel? Or it will be handled
differently here?

!!amit

  reply	other threads:[~2013-07-26  4:02 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-26  3:15 Direct IO and Page cache Kumar amit mehta
2013-07-26  9:14 ` Chinmay V S
2013-07-26  4:02   ` Kumar amit mehta [this message]
2013-07-26 10:21     ` Chinmay V S
2013-07-26 10:31       ` Chinmay V S
2013-07-26  6:04         ` Kumar amit mehta
2013-07-26 14:59     ` Valdis.Kletnieks at vt.edu
2013-07-26 15:52       ` Kumar amit mehta
2013-07-26 14:56 ` Valdis.Kletnieks at vt.edu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130726040231.GD8707@gmail.com \
    --to=gmate.amit@gmail.com \
    --cc=kernelnewbies@lists.kernelnewbies.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.