From: Michael Tokarev <mjt@tls.msk.ru>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: Alignment size?
Date: Fri, 13 Aug 2010 10:24:46 +0400
Message-ID: <4C64E52E.2060806@msgid.tls.msk.ru>
In-Reply-To: <20100812234911.GC10429@dastard>
13.08.2010 03:49, Dave Chinner wrote:
> On Fri, Aug 13, 2010 at 02:10:39AM +0400, Michael Tokarev wrote:
>> Hello.
>>
>> I have used XFS for a long time on many different
>> servers, and it works well. But now I've encountered
>> an unexpected problem.
>>
>> The question is: on one of our servers, XFS requires
>> different alignment size for O_DIRECT operations than
>> on others. Usually it's 512 bytes, but on this server
>> it is 4096 - both min_io and alignment (this is from
>> XFS_IOC_DIOINFO ioctl).
>
> It'll be a filesystem set up with a 4k sector size, then. Check the
> output of xfs_info.
yes, xfs_info reports sectsz=4096, I noticed this yesterday.
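For reference, the check discussed above can be sketched roughly as follows. This is a minimal illustration, not something taken from Oracle or the XFS tools: it assumes the three-field `struct dioattr` layout (d_mem, d_miniosz, d_maxiosz) from the XFS headers and the x86-style ioctl number encoding; `query_dioinfo` and `dio_ok` are hypothetical helper names.

```python
import struct

# struct dioattr: three unsigned 32-bit fields (d_mem, d_miniosz, d_maxiosz).
DIOATTR = struct.Struct("=III")

# _IOR('X', 30, struct dioattr), assuming the x86-style ioctl encoding
# (direction bits in the top two bits, size in bits 16..29).
XFS_IOC_DIOINFO = (2 << 30) | (DIOATTR.size << 16) | (ord("X") << 8) | 30


def query_dioinfo(path):
    """Return (d_mem, d_miniosz, d_maxiosz) for a file on XFS (Linux only)."""
    import fcntl
    import os

    fd = os.open(path, os.O_RDONLY)
    try:
        # With an immutable bytes argument, ioctl() returns the filled buffer.
        buf = fcntl.ioctl(fd, XFS_IOC_DIOINFO, bytes(DIOATTR.size))
    finally:
        os.close(fd)
    return DIOATTR.unpack(buf)


def dio_ok(offset, length, d_miniosz):
    """True if an O_DIRECT request is aligned to the minimum I/O size."""
    return length > 0 and offset % d_miniosz == 0 and length % d_miniosz == 0
```

With d_miniosz=4096, as reported on this server, `dio_ok(0, 512, 4096)` is false, which matches the 512-byte requests being refused; with d_miniosz=512 the same request passes.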
>> I'm not sure what the reason for this is.
>> On this server, the underlying block device is raid5
>> (linux sw raid), but we had other machines with raid5
>> which didn't have that alignment requirement.
>>
>> The problem with that is that Oracle db, which we use
>> with XFS a lot, refuses to work on this machine, or,
>> rather, XFS refuses to process I/O in 512-byte chunks
>> from oracle (control files and redolog files).
>
> A clear case of application failure. I guess Oracle have some work
> to do to support 4k sector drives where they won't be able to do 512
> byte direct IOs at all....
Sure thing, that's oracle10, and at least at that time
there was no generic way to determine the required I/O
size. Now there is, and I hope oracle12 will support
different sector sizes.
But this is not the point.
>> Is there a way to remedy this somehow, without
>> reformatting whole 600+ gb?
>
> Not really. If it is 4k sector size, then there is some extremely
> dangerous voodoo that you could do to realign and resize the AG
> headers, followed by a full xfs_repair run to fix up all the block
> accounting. This is not something I'd recommend anyone ever does,
> and for only 600GB of data it would probably take more time to work
> out how to do it correctly (using disposable filesystem images) than
> it would to dump, mkfs and restore...
Ugh. I see. Well, I was afraid of that, but I'm already
sorta-prepared for that, after "sleeping with this idea"... ;)
It'll take ages for sure, but there's no other choice for
now.
So the question that remains is: why?
It's an old machine (PIV era), with old scsi disks (74GB,
non-hotswap), the same disks as used on numerous other
machines out there, where there's no such issue.
Plain old linux software raid array, also as used on many
other systems.
At that time, everything used 512-byte sectors for sure.
The array and filesystem were re-created last year (we
added another drive to it), but I don't think there was
a kernel version at that time that supported sector
sizes >512 bytes either (it was 2.6.27 I think).
So why did xfs decide the sector size is 4K??
And a related question: is there a way to create an
xfs fs with the right sector size? The filesystem
has been ok for years, not only on this machine, and I'm
quite afraid to replace it with something else (e.g.
ext4) in a hurry without good prior testing.
By the way, how can one check the "sector size" of a
block device nowadays? I think I saw something about
sysfs, but I see nothing of that sort in the 2.6.32 kernel
(which is used on this and other systems).
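One way to look at this from sysfs is sketched below. It is only an illustration under assumptions: the attribute names `logical_block_size`, `physical_block_size` and `hw_sector_size` under `/sys/block/<dev>/queue/` are the ones mainline kernels use, but whether a particular 2.6.32 build exposes all of them is not confirmed here, so missing attributes are simply skipped. `block_sizes` and the device name `md0` are hypothetical.

```python
from pathlib import Path


def block_sizes(dev, sysfs_root="/sys"):
    """Collect the sector-size attributes a block device exposes in sysfs.

    Returns a dict mapping attribute name to size in bytes; attributes
    that do not exist on this kernel are left out.
    """
    queue = Path(sysfs_root) / "block" / dev / "queue"
    sizes = {}
    for name in ("logical_block_size", "physical_block_size", "hw_sector_size"):
        attr = queue / name
        if attr.exists():
            sizes[name] = int(attr.read_text().strip())
    return sizes


if __name__ == "__main__":
    # "md0" is a placeholder; substitute the real array or disk name.
    print(block_sizes("md0"))
```

Comparing the result for the md array with the results for its member disks would show whether the 4096 value comes from the block layer or was chosen at mkfs time.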
Thanks!
/mjt