From: Linda Knippers <linda.knippers@hp.com>
To: Boaz Harrosh <boaz@plexistor.com>, Dave Chinner <david@fromorbit.com>
Cc: Jeff Moyer <jmoyer@redhat.com>,
"matthew r. wilcox" <matthew.r.wilcox@intel.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Vishal Verma <vishal.l.verma@intel.com>
Subject: Re: regression introduced by "block: Add support for DAX reads/writes to block devices"
Date: Mon, 10 Aug 2015 12:32:08 -0400
Message-ID: <55C8D208.1070903@hp.com>
In-Reply-To: <55C714D0.8070003@plexistor.com>

On 8/9/2015 4:52 AM, Boaz Harrosh wrote:
> On 08/06/2015 11:34 PM, Dave Chinner wrote:
>> On Thu, Aug 06, 2015 at 10:52:47AM +0300, Boaz Harrosh wrote:
>>> On 08/06/2015 06:24 AM, Dave Chinner wrote:
>>>> On Wed, Aug 05, 2015 at 09:42:54PM -0400, Linda Knippers wrote:
>>>>> On 08/05/2015 06:01 PM, Dave Chinner wrote:
>>>>>> On Wed, Aug 05, 2015 at 04:19:08PM -0400, Jeff Moyer wrote:
>>> <>
>>>>>>>
>>>>>>> I sat down with Linda to look into it, and the problem is that mkfs.xfs
>>>>>>> sets the blocksize of the device to 512 (via BLKBSZSET), and then reads
>>>>>>> from the last sector of the device. This results in dax_io trying to do
>>>>>>> a page-sized I/O at 512 bytes from the end of the device.
>>>>>>
>>>
>>> This part I do not understand. how is mkfs.xfs reading the sector?
>>> Is it through open(/dev/pmem0,...) ? O_DIRECT?
>>
>> mkfs.xfs uses O_DIRECT. Only if open(O_DIRECT) fails or mkfs.xfs is
>> told that it is working on an image file does it fall back to
>> buffered IO. All of the XFS userspace tools work this way to prevent
>> page cache pollution issues with read-once or write-once data during
>> operation.
>>
>
> Thanks, yes makes sense. This is a bug at the DAX implementation of
> bdev. Since as you know with DAX there is no difference between
> O_DIRECT and buffered, we must support any aligned IO. I bet it
> should be something with bdev not giving 4K buffer-heads to dax.c.
>
> Or ... It might just be the infamous bug where the actual partition
> they used was not 4k aligned on its start sector. So the last sector IO
> after partition translation came out wrong. This bug then should be
> fixed by: https://lists.01.org/pipermail/linux-nvdimm/2015-July/001555.html
> by:Vishal Verma
>
> Vishal I think we should add CC: stable@vger.kernel.org to your patch
> because of these fdisk bugs.

That patch does cause 'mkfs -t xfs' to work.

Before:

$ sudo mkfs -t xfs -f /dev/pmem3
meta-data=/dev/pmem3             isize=256    agcount=4, agsize=524288 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=2097152, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
mkfs.xfs: read failed: Numerical result out of range
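
For reference, the arithmetic behind that failure can be sketched as below. This is only an illustration in Python, not kernel code: the helper name io_overrun is made up, and the device size is assumed to equal blocks * bsize from the mkfs output above (that is, the filesystem is assumed to span the whole device). It shows that a 512-byte read of the last sector fits, while a page-sized I/O starting at the same offset runs past the end of the device.

```python
PAGE_SIZE = 4096    # size of the I/O that dax_io attempts, per the report above
SECTOR_SIZE = 512   # block size mkfs.xfs sets via BLKBSZSET


def io_overrun(device_bytes, offset, length):
    """Return how many bytes an I/O of `length` starting at `offset`
    extends past the end of a device of `device_bytes` (0 if it fits).
    Hypothetical helper, for illustration only."""
    return max(0, offset + length - device_bytes)


# Assumption: device size = blocks * bsize from the mkfs output (8 GiB).
device_bytes = 2097152 * 4096
last_sector = device_bytes - SECTOR_SIZE

# The read mkfs.xfs asks for: one 512-byte sector at the end of the device.
print("sector-sized read overrun:", io_overrun(device_bytes, last_sector, SECTOR_SIZE))

# The same offset expanded to a page-sized I/O.
print("page-sized read overrun:  ", io_overrun(device_bytes, last_sector, PAGE_SIZE))
```

Under those assumptions the page-sized I/O overshoots by PAGE_SIZE - SECTOR_SIZE bytes, which is consistent with the out-of-range error above.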

After:

$ sudo mkfs -t xfs -f /dev/pmem3
meta-data=/dev/pmem3             isize=256    agcount=4, agsize=524288 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=2097152, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

$ cat /sys/block/pmem3/queue/logical_block_size
512
$ cat /sys/block/pmem3/queue/physical_block_size
4096
$ cat /sys/block/pmem3/queue/hw_sector_size
512
$ cat /sys/block/pmem3/queue/minimum_io_size
4096

Before the patch, physical_block_size was 512 and minimum_io_size was 0.

What about logical_block_size and hw_sector_size, which are still 512?

So do we want to change pmem rather than changing DAX?
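
In case it helps anyone compare these limits before and after the patch, here is a minimal Python sketch that collects the same four sysfs attributes and reports which ones are still smaller than a page. The function names (read_queue_limits, sub_page_attrs) and the sysfs_root parameter are made up for illustration; only the attribute names and the /sys/block/<dev>/queue layout come from the output above.

```python
from pathlib import Path

# The four queue attributes shown in the cat output above.
QUEUE_ATTRS = (
    "logical_block_size",
    "physical_block_size",
    "hw_sector_size",
    "minimum_io_size",
)


def read_queue_limits(dev, sysfs_root="/sys/block"):
    """Read the queue attributes for `dev` (e.g. "pmem3") into a dict.
    Attributes that do not exist on this kernel are skipped.
    Hypothetical helper, for illustration only."""
    queue = Path(sysfs_root) / dev / "queue"
    limits = {}
    for attr in QUEUE_ATTRS:
        path = queue / attr
        if path.exists():
            limits[attr] = int(path.read_text().strip())
    return limits


def sub_page_attrs(limits, page_size=4096):
    """Return the sorted names of the limits that are below `page_size`."""
    return sorted(name for name, value in limits.items() if value < page_size)


if __name__ == "__main__":
    limits = read_queue_limits("pmem3")  # device name taken from this thread
    for name, value in limits.items():
        print(f"{name}: {value}")
    print("below page size:", sub_page_attrs(limits))
```

With the post-patch values shown above, the attributes reported as below page size would be hw_sector_size and logical_block_size, which is the case I am asking about.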
-- ljk
>
>> Cheers,
>> Dave.
>
> Thanks
> Boaz
>