From: Linda Knippers <linda.knippers@hp.com>
To: Boaz Harrosh <boaz@plexistor.com>, Dave Chinner <david@fromorbit.com>
Cc: Jeff Moyer <jmoyer@redhat.com>,
	"matthew r. wilcox" <matthew.r.wilcox@intel.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	Vishal Verma <vishal.l.verma@intel.com>
Subject: Re: regression introduced by "block: Add support for DAX reads/writes to block devices"
Date: Mon, 10 Aug 2015 12:32:08 -0400
Message-ID: <55C8D208.1070903@hp.com>
In-Reply-To: <55C714D0.8070003@plexistor.com>

On 8/9/2015 4:52 AM, Boaz Harrosh wrote:
> On 08/06/2015 11:34 PM, Dave Chinner wrote:
>> On Thu, Aug 06, 2015 at 10:52:47AM +0300, Boaz Harrosh wrote:
>>> On 08/06/2015 06:24 AM, Dave Chinner wrote:
>>>> On Wed, Aug 05, 2015 at 09:42:54PM -0400, Linda Knippers wrote:
>>>>> On 08/05/2015 06:01 PM, Dave Chinner wrote:
>>>>>> On Wed, Aug 05, 2015 at 04:19:08PM -0400, Jeff Moyer wrote:
>>> <>
>>>>>>>
>>>>>>> I sat down with Linda to look into it, and the problem is that mkfs.xfs
>>>>>>> sets the blocksize of the device to 512 (via BLKBSZSET), and then reads
>>>>>>> from the last sector of the device.  This results in dax_io trying to do
>>>>>>> a page-sized I/O at 512 bytes from the end of the device.
>>>>>>
>>>
>>> This part I do not understand. How is mkfs.xfs reading the sector?
>>> Is it through open(/dev/pmem0,...)? O_DIRECT?
>>
>> mkfs.xfs uses O_DIRECT. Only if open(O_DIRECT) fails or mkfs.xfs is
>> told that it is working on an image file does it fall back to
>> buffered IO. All of the XFS userspace tools work this way to prevent
>> page cache pollution issues with read-once or write-once data during
>> operation.
>>
> 
> Thanks, yes makes sense. This is a bug at the DAX implementation of
> bdev. Since as you know with DAX there is no difference between
> O_DIRECT and buffered, we must support any aligned IO. I bet it
> should be something with bdev not giving 4K buffer-heads to dax.c.
> 
> Or ... It might just be the infamous bug where the actual partition
> they used was not 4k aligned on its start sector. So the last sector IO
> after partition translation came out wrong. This bug should then be
> fixed by Vishal Verma's patch:
> https://lists.01.org/pipermail/linux-nvdimm/2015-July/001555.html
> 
> Vishal I think we should add CC: stable@vger.kernel.org to your patch
> because of these fdisk bugs.

With that patch applied, 'mkfs -t xfs' works.

Before:
$ sudo mkfs -t xfs -f /dev/pmem3
meta-data=/dev/pmem3             isize=256    agcount=4, agsize=524288 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=2097152, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
mkfs.xfs: read failed: Numerical result out of range
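
For reference, here is a minimal sketch of the arithmetic behind the
failure Jeff described earlier in the thread. It is not kernel code. The
device size is an assumption taken from the data section in the mkfs output
above (2097152 blocks of 4096 bytes), the 4096-byte page size is assumed,
and the helper names are made up for illustration. "Numerical result out
of range" is the glibc message for ERANGE, which is consistent with an
I/O that runs past the end of the device.

```python
PAGE_SIZE = 4096     # assumed page size
SECTOR_SIZE = 512    # block size that mkfs.xfs sets via BLKBSZSET


def last_sector_offset(dev_size, sector_size=SECTOR_SIZE):
    """Byte offset of the last sector of a device of dev_size bytes."""
    return dev_size - sector_size


def io_overruns(dev_size, offset, length):
    """True if an I/O of `length` bytes at `offset` ends past the device."""
    return offset + length > dev_size


# Assumed device size: 2097152 data blocks * 4096 bytes (from the mkfs output).
dev_size = 2097152 * 4096
off = last_sector_offset(dev_size)

# The 512-byte read that mkfs.xfs asks for fits inside the device.
print("512-byte read overruns: ", io_overruns(dev_size, off, SECTOR_SIZE))
# A page-sized I/O issued at the same offset does not fit.
print("page-sized I/O overruns:", io_overruns(dev_size, off, PAGE_SIZE))
```

The point is only that a read which is valid at 512-byte granularity
becomes out of range once it is widened to a full page at the same offset.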

After:

$ sudo mkfs -t xfs -f /dev/pmem3
meta-data=/dev/pmem3             isize=256    agcount=4, agsize=524288 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=2097152, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

$ cat /sys/block/pmem3/queue/logical_block_size
512
$ cat /sys/block/pmem3/queue/physical_block_size
4096
$ cat /sys/block/pmem3/queue/hw_sector_size
512
$ cat /sys/block/pmem3/queue/minimum_io_size
4096

Before the patch, physical_block_size was 512 and minimum_io_size was 0.
Should logical_block_size and hw_sector_size still be 512?

So do we want to change pmem rather than changing DAX?
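
In case it helps anyone compare devices, here is a small sketch that
gathers the same four queue limits from sysfs and lists the ones still
below the page size. The helper names, the sysfs_root parameter, and the
4096-byte page size are assumptions for illustration; the attribute names
are the ones shown in the cat output above.

```python
from pathlib import Path

# Queue attributes shown in the sysfs output above.
LIMITS = (
    "logical_block_size",
    "physical_block_size",
    "hw_sector_size",
    "minimum_io_size",
)


def read_queue_limits(dev, sysfs_root="/sys/block"):
    """Read the queue limits for block device `dev` as integers."""
    queue = Path(sysfs_root) / dev / "queue"
    return {name: int((queue / name).read_text()) for name in LIMITS}


def sub_page_limits(limits, page_size=4096):
    """Names of the limits whose value is smaller than page_size, sorted."""
    return sorted(name for name, value in limits.items() if value < page_size)


# Values reported for pmem3 with the patch applied (copied from above).
after_patch = {
    "logical_block_size": 512,
    "physical_block_size": 4096,
    "hw_sector_size": 512,
    "minimum_io_size": 4096,
}
print(sub_page_limits(after_patch))
```

On a live system, read_queue_limits("pmem3") would supply the dictionary
instead of the hard-coded values.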

-- ljk
> 
>> Cheers,
>> Dave.
> 
> Thanks
> Boaz
> 
