* [PATCH 0/2] fix regression in direct I/O to pmem devices
@ 2015-08-14 20:15 Jeff Moyer
2015-08-14 20:15 ` [PATCH 1/2] dax: fix O_DIRECT I/O to the last block of a blockdev Jeff Moyer
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Jeff Moyer @ 2015-08-14 20:15 UTC (permalink / raw)
To: matthew.r.wilcox; +Cc: linux-fsdevel, linux-kernel, linux-nvdimm
Linda Knippers noticed that commit bbab37ddc20b (block: Add support
for DAX reads/writes to block devices) caused a regression in mkfs.xfs.
Further investigation also uncovered issues related to misaligned
partitions. This patch set addresses the two issues.
[PATCH 1/2] dax: fix O_DIRECT I/O to the last block of a blockdev
[PATCH 2/2] blockdev: don't set S_DAX for misaligned partitions
fs/block_dev.c | 7 +++++++
fs/dax.c | 3 ++-
2 files changed, 9 insertions(+), 1 deletion(-)
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] dax: fix O_DIRECT I/O to the last block of a blockdev
2015-08-14 20:15 [PATCH 0/2] fix regression in direct I/O to pmem devices Jeff Moyer
@ 2015-08-14 20:15 ` Jeff Moyer
2015-08-14 20:53 ` Linda Knippers
2015-08-14 20:15 ` [PATCH 2/2] blockdev: don't set S_DAX for misaligned partitions Jeff Moyer
2015-08-14 20:22 ` [PATCH 0/2] fix regression in direct I/O to pmem devices Dan Williams
2 siblings, 1 reply; 11+ messages in thread
From: Jeff Moyer @ 2015-08-14 20:15 UTC (permalink / raw)
To: matthew.r.wilcox; +Cc: linux-fsdevel, linux-kernel, linux-nvdimm
commit bbab37ddc20b (block: Add support for DAX reads/writes to
block devices) caused a regression in mkfs.xfs. That utility
sets the block size of the device to the logical block size
using the BLKBSZSET ioctl, and then issues a single sector read
from the last sector of the device. This results in the dax_io
code trying to do a page-sized read from 512 bytes from the end
of the device. The result is -ERANGE being returned to userspace.
The fix is to align the block to the page size before calling
get_block.
Thanks to willy for simplifying my original patch.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
---
fs/dax.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/dax.c b/fs/dax.c
index a7f77e1..ef35a20 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -116,7 +116,8 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
 		unsigned len;
 		if (pos == max) {
 			unsigned blkbits = inode->i_blkbits;
-			sector_t block = pos >> blkbits;
+			long page = pos >> PAGE_SHIFT;
+			sector_t block = page << (PAGE_SHIFT - blkbits);
 			unsigned first = pos - (block << blkbits);
 			long size;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
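To make the arithmetic in patch 1/2 concrete, here is a minimal userspace sketch in C. It is not the kernel code: `SKETCH_PAGE_SHIFT`, `block_unaligned`, `block_page_aligned`, and `first_offset` are hypothetical names chosen for illustration, and a 4 KiB page size is assumed. It only mirrors the shifts shown in the diff above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages (PAGE_SHIFT == 12). */
#define SKETCH_PAGE_SHIFT 12u

/* Pre-patch computation: index of the block that contains pos. */
static uint64_t block_unaligned(uint64_t pos, unsigned blkbits)
{
    return pos >> blkbits;
}

/*
 * Post-patch computation: index of the first block of the page that
 * contains pos, so a page-sized mapping starts on a page boundary.
 */
static uint64_t block_page_aligned(uint64_t pos, unsigned blkbits)
{
    assert(blkbits <= SKETCH_PAGE_SHIFT);
    uint64_t page = pos >> SKETCH_PAGE_SHIFT;
    return page << (SKETCH_PAGE_SHIFT - blkbits);
}

/* Byte offset of pos inside the chosen block (the diff's `first`). */
static uint64_t first_offset(uint64_t pos, uint64_t block, unsigned blkbits)
{
    return pos - (block << blkbits);
}
```

As a worked case, take a 1 MiB device with 512-byte blocks (blkbits = 9). The last sector starts at byte 1048064. The old computation picks block 2047, so a page-sized request starting there would run past the end of the device, which matches the failure described in the commit message. The page-aligned computation picks block 2040, and the requested sector then sits at offset 3584 within that page.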
* [PATCH 2/2] blockdev: don't set S_DAX for misaligned partitions
2015-08-14 20:15 [PATCH 0/2] fix regression in direct I/O to pmem devices Jeff Moyer
2015-08-14 20:15 ` [PATCH 1/2] dax: fix O_DIRECT I/O to the last block of a blockdev Jeff Moyer
@ 2015-08-14 20:15 ` Jeff Moyer
2015-08-14 20:46 ` Andreas Dilger
2015-08-14 20:22 ` [PATCH 0/2] fix regression in direct I/O to pmem devices Dan Williams
2 siblings, 1 reply; 11+ messages in thread
From: Jeff Moyer @ 2015-08-14 20:15 UTC (permalink / raw)
To: matthew.r.wilcox; +Cc: linux-fsdevel, linux-kernel, linux-nvdimm
The dax code doesn't currently support misaligned partitions,
so disable O_DIRECT via dax until such time as that support
materializes.
Suggested-by: Boaz Harrosh <boaz@plexistor.com>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
---
fs/block_dev.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 1982437..1170f8c 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1241,6 +1241,13 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, int for_part)
 				goto out_clear;
 			}
 			bd_set_size(bdev, (loff_t)bdev->bd_part->nr_sects << 9);
+			/*
+			 * If the partition is not aligned on a page
+			 * boundary, we can't do dax I/O to it.
+			 */
+			if ((bdev->bd_part->start_sect % (PAGE_SIZE / 512)) ||
+			    (bdev->bd_part->nr_sects % (PAGE_SIZE / 512)))
+				bdev->bd_inode->i_flags &= ~S_DAX;
 		}
 	} else {
 		if (bdev->bd_contains == bdev) {
--
1.8.3.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
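The alignment test in patch 2/2 can be tried outside the kernel with a small sketch. The names `SKETCH_PAGE_SIZE`, `SKETCH_SECTOR_SIZE`, and `partition_is_page_aligned` are hypothetical, and a 4 KiB page with 512-byte sectors is assumed. The function returns true exactly when the condition in the diff above would leave S_DAX set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this sketch: 4 KiB pages and 512-byte sectors. */
#define SKETCH_PAGE_SIZE   4096u
#define SKETCH_SECTOR_SIZE 512u

/*
 * A partition qualifies only if both its starting sector and its
 * length in sectors are multiples of the number of sectors per page.
 */
static bool partition_is_page_aligned(uint64_t start_sect, uint64_t nr_sects)
{
    const uint64_t sectors_per_page = SKETCH_PAGE_SIZE / SKETCH_SECTOR_SIZE;

    return (start_sect % sectors_per_page) == 0 &&
           (nr_sects % sectors_per_page) == 0;
}
```

Under these assumptions, a partition starting at sector 2048 with 204800 sectors passes the check, while one starting at sector 63, or one whose length is 204801 sectors, does not.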
* Re: [PATCH 0/2] fix regression in direct I/O to pmem devices
2015-08-14 20:15 [PATCH 0/2] fix regression in direct I/O to pmem devices Jeff Moyer
2015-08-14 20:15 ` [PATCH 1/2] dax: fix O_DIRECT I/O to the last block of a blockdev Jeff Moyer
2015-08-14 20:15 ` [PATCH 2/2] blockdev: don't set S_DAX for misaligned partitions Jeff Moyer
@ 2015-08-14 20:22 ` Dan Williams
2015-08-14 20:23 ` Dan Williams
2 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2015-08-14 20:22 UTC (permalink / raw)
To: Jeff Moyer
Cc: Matthew R Wilcox, linux-fsdevel, linux-nvdimm,
linux-kernel@vger.kernel.org
On Fri, Aug 14, 2015 at 1:15 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Linda Knippers noticed that commit bbab37ddc20b (block: Add support
> for DAX reads/writes to block devices) caused a regression in mkfs.xfs.
> Further investigation also uncovered issues related to misaligned
> partitions. This patch set addresses the two issues.
>
> [PATCH 1/2] dax: fix O_DIRECT I/O to the last block of a blockdev
> [PATCH 2/2] blockdev: don't set S_DAX for misaligned partitions
Should these be "Cc: <stable@vger.kernel.org>" given the regression
fix is applicable back to 4.0?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/2] fix regression in direct I/O to pmem devices
2015-08-14 20:22 ` [PATCH 0/2] fix regression in direct I/O to pmem devices Dan Williams
@ 2015-08-14 20:23 ` Dan Williams
2015-08-14 20:27 ` Jeff Moyer
0 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2015-08-14 20:23 UTC (permalink / raw)
To: Jeff Moyer
Cc: Matthew R Wilcox, linux-fsdevel, linux-nvdimm,
linux-kernel@vger.kernel.org
On Fri, Aug 14, 2015 at 1:22 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Fri, Aug 14, 2015 at 1:15 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
>> Linda Knippers noticed that commit bbab37ddc20b (block: Add support
>> for DAX reads/writes to block devices) caused a regression in mkfs.xfs.
>> Further investigation also uncovered issues related to misaligned
>> partitions. This patch set addresses the two issues.
>>
>> [PATCH 1/2] dax: fix O_DIRECT I/O to the last block of a blockdev
>> [PATCH 2/2] blockdev: don't set S_DAX for misaligned partitions
>
> Should these be "Cc: <stable@vger.kernel.org>" given the regression
> fix is applicable back to 4.0?
Sorry, regression *since* 4.0.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/2] fix regression in direct I/O to pmem devices
2015-08-14 20:23 ` Dan Williams
@ 2015-08-14 20:27 ` Jeff Moyer
0 siblings, 0 replies; 11+ messages in thread
From: Jeff Moyer @ 2015-08-14 20:27 UTC (permalink / raw)
To: Dan Williams
Cc: Matthew R Wilcox, linux-fsdevel, linux-nvdimm,
linux-kernel@vger.kernel.org
Dan Williams <dan.j.williams@intel.com> writes:
> On Fri, Aug 14, 2015 at 1:22 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>> On Fri, Aug 14, 2015 at 1:15 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
>>> Linda Knippers noticed that commit bbab37ddc20b (block: Add support
>>> for DAX reads/writes to block devices) caused a regression in mkfs.xfs.
>>> Further investigation also uncovered issues related to misaligned
>>> partitions. This patch set addresses the two issues.
>>>
>>> [PATCH 1/2] dax: fix O_DIRECT I/O to the last block of a blockdev
>>> [PATCH 2/2] blockdev: don't set S_DAX for misaligned partitions
>>
>> Should these be "Cc: <stable@vger.kernel.org>" given the regression
>> fix is applicable back to 4.0?
>
> Sorry, regression *since* 4.0.
$ git describe --contains bbab37ddc20b
v4.2-rc1~2^2~4
Looks like this was not sprung on the public yet.
-Jeff
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] blockdev: don't set S_DAX for misaligned partitions
2015-08-14 20:15 ` [PATCH 2/2] blockdev: don't set S_DAX for misaligned partitions Jeff Moyer
@ 2015-08-14 20:46 ` Andreas Dilger
2015-08-14 20:55 ` Jeff Moyer
0 siblings, 1 reply; 11+ messages in thread
From: Andreas Dilger @ 2015-08-14 20:46 UTC (permalink / raw)
To: Jeff Moyer; +Cc: matthew.r.wilcox, linux-fsdevel, linux-kernel, linux-nvdimm
On Aug 14, 2015, at 2:15 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
>
> The dax code doesn't currently support misaligned partitions,
> so disable O_DIRECT via dax until such time as that support
> materializes.
>
> Suggested-by: Boaz Harrosh <boaz@plexistor.com>
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
> ---
> fs/block_dev.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 1982437..1170f8c 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -1241,6 +1241,13 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, int for_part)
> goto out_clear;
> }
> bd_set_size(bdev, (loff_t)bdev->bd_part->nr_sects << 9);
> + /*
> + * If the partition is not aligned on a page
> + * boundary, we can't do dax I/O to it.
> + */
> + if ((bdev->bd_part->start_sect % (PAGE_SIZE / 512)) ||
> + (bdev->bd_part->nr_sects % (PAGE_SIZE / 512)))
Maybe I'm missing something, but doesn't the second condition above
disable DAX for the case that the 1/2 patch is fixing (i.e. the last
sectors at the end of a non-PAGE_SIZE-multiple device)? It seems a
shame to disable DAX for the whole device because of the last sector.
> + bdev->bd_inode->i_flags &= ~S_DAX;
> }
> } else {
> if (bdev->bd_contains == bdev) {
> --
> 1.8.3.1
Cheers, Andreas
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] dax: fix O_DIRECT I/O to the last block of a blockdev
2015-08-14 20:15 ` [PATCH 1/2] dax: fix O_DIRECT I/O to the last block of a blockdev Jeff Moyer
@ 2015-08-14 20:53 ` Linda Knippers
2015-09-08 16:10 ` Linda Knippers
0 siblings, 1 reply; 11+ messages in thread
From: Linda Knippers @ 2015-08-14 20:53 UTC (permalink / raw)
To: Jeff Moyer, matthew.r.wilcox; +Cc: linux-fsdevel, linux-nvdimm, linux-kernel
On 8/14/2015 4:15 PM, Jeff Moyer wrote:
> commit bbab37ddc20b (block: Add support for DAX reads/writes to
> block devices) caused a regression in mkfs.xfs. That utility
> sets the block size of the device to the logical block size
> using the BLKBSZSET ioctl, and then issues a single sector read
> from the last sector of the device. This results in the dax_io
> code trying to do a page-sized read from 512 bytes from the end
> of the device. The result is -ERANGE being returned to userspace.
>
> The fix is to align the block to the page size before calling
> get_block.
>
> Thanks to willy for simplifying my original patch.
>
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Tested-by: Linda Knippers <linda.knippers@hp.com>
> ---
> fs/dax.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index a7f77e1..ef35a20 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -116,7 +116,8 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
> unsigned len;
> if (pos == max) {
> unsigned blkbits = inode->i_blkbits;
> - sector_t block = pos >> blkbits;
> + long page = pos >> PAGE_SHIFT;
> + sector_t block = page << (PAGE_SHIFT - blkbits);
> unsigned first = pos - (block << blkbits);
> long size;
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] blockdev: don't set S_DAX for misaligned partitions
2015-08-14 20:46 ` Andreas Dilger
@ 2015-08-14 20:55 ` Jeff Moyer
0 siblings, 0 replies; 11+ messages in thread
From: Jeff Moyer @ 2015-08-14 20:55 UTC (permalink / raw)
To: Andreas Dilger
Cc: matthew.r.wilcox, linux-fsdevel, linux-kernel, linux-nvdimm
Andreas Dilger <adilger@dilger.ca> writes:
> On Aug 14, 2015, at 2:15 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
>>
>> The dax code doesn't currently support misaligned partitions,
>> so disable O_DIRECT via dax until such time as that support
>> materializes.
>>
>> Suggested-by: Boaz Harrosh <boaz@plexistor.com>
>> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
>> ---
>> fs/block_dev.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>> index 1982437..1170f8c 100644
>> --- a/fs/block_dev.c
>> +++ b/fs/block_dev.c
>> @@ -1241,6 +1241,13 @@ static int __blkdev_get(struct block_device *bdev, fmode_t mode, int for_part)
>> goto out_clear;
>> }
>> bd_set_size(bdev, (loff_t)bdev->bd_part->nr_sects << 9);
>> + /*
>> + * If the partition is not aligned on a page
>> + * boundary, we can't do dax I/O to it.
>> + */
>> + if ((bdev->bd_part->start_sect % (PAGE_SIZE / 512)) ||
>> + (bdev->bd_part->nr_sects % (PAGE_SIZE / 512)))
>
> Maybe I'm missing something, but doesn't the second condition above
> disable DAX for the case that the 1/2 patch is fixing (i.e. the last
> sectors at the end of a non-PAGE_SIZE-multiple device)? It seems a
> shame to disable DAX for the whole device because of the last sector.
No. Patch 1/2 fixes a 512 byte read of the last sector of a properly
aligned partition. The goal is to eventually fix things so we can enable
the dax path for misaligned partitions, but it's not going to happen in
time for 4.2. Also, keep in mind that this is just for opening the
block device itself with O_DIRECT.
Thanks for taking a look.
Cheers,
Jeff
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] dax: fix O_DIRECT I/O to the last block of a blockdev
2015-08-14 20:53 ` Linda Knippers
@ 2015-09-08 16:10 ` Linda Knippers
2015-09-08 16:23 ` Dan Williams
0 siblings, 1 reply; 11+ messages in thread
From: Linda Knippers @ 2015-09-08 16:10 UTC (permalink / raw)
To: Linda Knippers, Jeff Moyer, matthew.r.wilcox
Cc: linux-fsdevel, linux-nvdimm, linux-kernel, Ross Zwisler
This patch and the 2/2 patch don't seem to have gone anywhere.
Willy? or Ross?
-- ljk
On 8/14/2015 4:53 PM, Linda Knippers wrote:
> On 8/14/2015 4:15 PM, Jeff Moyer wrote:
>> commit bbab37ddc20b (block: Add support for DAX reads/writes to
>> block devices) caused a regression in mkfs.xfs. That utility
>> sets the block size of the device to the logical block size
>> using the BLKBSZSET ioctl, and then issues a single sector read
>> from the last sector of the device. This results in the dax_io
>> code trying to do a page-sized read from 512 bytes from the end
>> of the device. The result is -ERANGE being returned to userspace.
>>
>> The fix is to align the block to the page size before calling
>> get_block.
>>
>> Thanks to willy for simplifying my original patch.
>>
>> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
>
> Tested-by: Linda Knippers <linda.knippers@hp.com>
>
>> ---
>> fs/dax.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/dax.c b/fs/dax.c
>> index a7f77e1..ef35a20 100644
>> --- a/fs/dax.c
>> +++ b/fs/dax.c
>> @@ -116,7 +116,8 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
>> unsigned len;
>> if (pos == max) {
>> unsigned blkbits = inode->i_blkbits;
>> - sector_t block = pos >> blkbits;
>> + long page = pos >> PAGE_SHIFT;
>> + sector_t block = page << (PAGE_SHIFT - blkbits);
>> unsigned first = pos - (block << blkbits);
>> long size;
>>
>>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] dax: fix O_DIRECT I/O to the last block of a blockdev
2015-09-08 16:10 ` Linda Knippers
@ 2015-09-08 16:23 ` Dan Williams
0 siblings, 0 replies; 11+ messages in thread
From: Dan Williams @ 2015-09-08 16:23 UTC (permalink / raw)
To: Linda Knippers
Cc: Linda Knippers, Jeff Moyer, Matthew R Wilcox, linux-fsdevel,
linux-nvdimm, linux-kernel@vger.kernel.org
On Tue, Sep 8, 2015 at 9:10 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
> This patch and the 2/2 patch don't seem to have gone anywhere.
> Willy? or Ross?
>
Yes, these should have gone into 4.2. The nvdimm.git tree will pick
them up after 4.3-rc1 and tag them for -stable.
^ permalink raw reply [flat|nested] 11+ messages in thread