From: Chris Mason <chris.mason@oracle.com>
To: jim owens <owens6336@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>,
Josef Bacik <josef@redhat.com>,
linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH V2] Btrfs: Full direct I/O and AIO read implementation.
Date: Tue, 16 Feb 2010 10:49:10 -0500
Message-ID: <20100216154910.GE3497@think>
In-Reply-To: <4B79CD76.8050304@gmail.com>

On Mon, Feb 15, 2010 at 05:40:54PM -0500, jim owens wrote:
> Christoph Hellwig wrote:
> >On Mon, Feb 15, 2010 at 05:26:34PM -0500, jim owens wrote:
> >>My understanding is the current 4k drives normally operate in
> >>512 byte read/write access mode unless you set them to run
> >>as 4k only.
> >>
> >>In 512 byte mode, they buffer internally on writes. It is probably
> >>just as safe as any other drive on a power hit, as in anything may
> >>be trash.
> >>
> >>btrfs read of 512 byte boundaries is safe because we only write
> >>in 4k boundaries (hopefully we can detect and align on the drive).
> >
> >There are drives that still have 512 byte logical, but 4k physical
> >blocks, this includes all the consumer (SATA) drives. You can also
> >have drives with 4k physical and logical block size, this includes
> >many S/390 DASD devices, and also samples of enterprise SAS drives.
> >
> >The logical block size is the addressing limit for the OS, so your
> >above scenario is correct for the 512 byte logical / 4k physical
> >devices, but not the 4k logical / 4k physical devices. Never mind
> >other corner cases like 2k block size CD-ROM which could in theory
> >be used in a read-only btrfs filesystem (very unlikely, but..).
> >
> >So no, you really can't go under the bdev_logical_block_size()
> >advertized by the device, and that may as well be over 512 bytes.
>
> I agree fully with all of that. What I did not say is the
> current btrfs direct IO code does not go below the drive
> logical block size. If the drive says 4k and the user tries
> to read any other multiple, the code returns an error.
>
> The confusion is that detection occurs only when I go to
> build the bio because it is there that I know the drive
> and extract the drive block size to check alignment.
>
> We only know what drive is being used when we have the
> extent info because we can have multiple drives in btrfs.
>
> The early 512 check is the idiot check.

The 512 check just needs to be replaced with a number that we store
during device scan. We won't try to bend physics w/4k drives ;)
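
A minimal sketch of that idea, in plain userspace C rather than
kernel code. The struct and both function names are made up for
illustration, and it assumes the number we store is the largest
logical block size seen across the scanned devices:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scanned device; not a real btrfs structure. */
struct scanned_dev {
    uint32_t logical_block_size;   /* e.g. 512, 2048 or 4096 bytes */
};

/*
 * Walk the devices once at scan time and remember the largest logical
 * block size. Block sizes are powers of two, so anything aligned to the
 * largest one is aligned to every smaller one as well.
 */
static uint32_t scan_max_logical_block_size(const struct scanned_dev *devs,
                                            size_t ndevs)
{
    uint32_t max = 512;            /* assumed floor, same as the old check */

    for (size_t i = 0; i < ndevs; i++)
        if (devs[i].logical_block_size > max)
            max = devs[i].logical_block_size;
    return max;
}

/*
 * Early check: offset and length must both be multiples of the stored
 * block size. The mask trick is only valid for power-of-two sizes.
 */
static bool dio_aligned(uint64_t offset, uint64_t len, uint32_t blocksize)
{
    uint64_t mask = (uint64_t)blocksize - 1;

    return ((offset | len) & mask) == 0;
}
```

With a stored value like this, the early check can be stricter than
a particular extent's drive requires, so the per-drive check at bio
build time would still be the one that knows the real answer.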
-chris