* [PATCH] xfs: return correct XFS_IOC_DIOINFO for DAX inode
From: Eric Sandeen @ 2019-04-02 15:44 UTC
To: linux-xfs; +Cc: Jeff Moyer
pmem is byte addressable, and indeed byte-aligned DIO works on
a DAX file. So, teach XFS_IOC_DIOINFO to return the correct
alignment information if IS_DAX(inode).
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 6ecdbb3..35eae7d 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1919,12 +1919,21 @@ xfs_file_ioctl(
 	}
 	case XFS_IOC_DIOINFO: {
 		struct dioattr	da;
-		xfs_buftarg_t	*target =
-			XFS_IS_REALTIME_INODE(ip) ?
-			mp->m_rtdev_targp : mp->m_ddev_targp;
 
-		da.d_mem = da.d_miniosz = target->bt_logical_sectorsize;
-		da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
+		if (IS_DAX(inode)) {
+			/* pmem is byte addressable */
+			da.d_mem = 1;
+			da.d_miniosz = 1;
+			da.d_maxiosz = INT_MAX;
+		} else {
+			xfs_buftarg_t	*target =
+				XFS_IS_REALTIME_INODE(ip) ?
+				mp->m_rtdev_targp : mp->m_ddev_targp;
+
+			da.d_mem = target->bt_logical_sectorsize;
+			da.d_miniosz = target->bt_logical_sectorsize;
+			da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
+		}
 
 		if (copy_to_user(arg, &da, sizeof(da)))
 			return -EFAULT;
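
Editorial sketch (not part of the patch): the d_maxiosz expression in the hunk above rounds INT_MAX down to a multiple of the minimum I/O size. The helper below reproduces only that arithmetic in userspace C, to show why the DAX branch can assign INT_MAX directly. The function name max_io_size is hypothetical, and the sketch assumes miniosz is a power of two, as sector sizes are.

```c
#include <limits.h>

/*
 * Illustrative only: mirrors the d_maxiosz expression from the patch.
 * Clearing the low bits of INT_MAX rounds it down to a multiple of
 * miniosz. Assumes miniosz is a power of two (1, 512, 4096, ...).
 */
static unsigned int max_io_size(unsigned int miniosz)
{
	return INT_MAX & ~(miniosz - 1);
}
```

With miniosz == 1 the mask is all ones, so the result is INT_MAX itself, which matches the value the DAX branch assigns. With a 512-byte logical sector size the result is 2147483136.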

* Re: [PATCH] xfs: return correct XFS_IOC_DIOINFO for DAX inode
From: Darrick J. Wong @ 2019-04-02 17:56 UTC
To: Eric Sandeen; +Cc: linux-xfs, Jeff Moyer
On Tue, Apr 02, 2019 at 10:44:38AM -0500, Eric Sandeen wrote:
> pmem is byte addressable, and indeed byte-aligned DIO works on
> a DAX file. So, teach XFS_IOC_DIOINFO to return the correct
> alignment information if IS_DAX(inode).
If it's a DAX filesystem, do we want to try to steer people towards
things like 2MB pages since (in theory) we can get away with fewer page
table mappings? And (seeing as it's mmap that cares, not directio)
would preferred page mapping sizes be more appropriately advertised
in a different ioctl?
--D
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
>
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 6ecdbb3..35eae7d 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1919,12 +1919,21 @@ xfs_file_ioctl(
> }
> case XFS_IOC_DIOINFO: {
> struct dioattr da;
> - xfs_buftarg_t *target =
> - XFS_IS_REALTIME_INODE(ip) ?
> - mp->m_rtdev_targp : mp->m_ddev_targp;
>
> - da.d_mem = da.d_miniosz = target->bt_logical_sectorsize;
> - da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
> + if (IS_DAX(inode)) {
> + /* pmem is byte addressable */
> + da.d_mem = 1;
> + da.d_miniosz = 1;
> + da.d_maxiosz = INT_MAX;
> + } else {
> + xfs_buftarg_t *target =
> + XFS_IS_REALTIME_INODE(ip) ?
> + mp->m_rtdev_targp : mp->m_ddev_targp;
> +
> + da.d_mem = target->bt_logical_sectorsize;
> + da.d_miniosz = target->bt_logical_sectorsize;
> + da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
> + }
>
> if (copy_to_user(arg, &da, sizeof(da)))
> return -EFAULT;
>

* Re: [PATCH] xfs: return correct XFS_IOC_DIOINFO for DAX inode
From: Eric Sandeen @ 2019-04-02 18:08 UTC
To: Darrick J. Wong; +Cc: linux-xfs, Jeff Moyer
On 4/2/19 12:56 PM, Darrick J. Wong wrote:
> On Tue, Apr 02, 2019 at 10:44:38AM -0500, Eric Sandeen wrote:
>> pmem is byte addressable, and indeed byte-aligned DIO works on
>> a DAX file. So, teach XFS_IOC_DIOINFO to return the correct
>> alignment information if IS_DAX(inode).
>
> If it's a DAX filesystem, do we want to try to steer people towards
> things like 2MB pages since (in theory) we can get away with fewer page
> table mappings? And (seeing as that's mmap that cares, not directio)
> would advertising preferential page mapping sizes be more appropriate
> advertised in a different ioctl?
The xfsctl(3) manpage documents XFS_IOC_DIOINFO as providing the
minimum/required alignments to avoid DIO failure. Says nothing about
optimal. So, if you'd like to advertise preferences, it seems like
this is not the interface to use...
-Eric
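
Editorial sketch (not part of the thread): one way a caller might apply the minimum/required values that the reply above refers to. The struct dio_limits type and the dio_request_ok() helper are hypothetical; the struct only mirrors the three fields the patch fills in (d_mem, d_miniosz, d_maxiosz). The interpretation of those fields is an assumption based on the xfsctl(3) description: buffer address aligned to d_mem, file offset and length multiples of d_miniosz, and length no larger than d_maxiosz.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the fields filled in by the ioctl. */
struct dio_limits {
	uint32_t d_mem;		/* required buffer address alignment */
	uint32_t d_miniosz;	/* minimum I/O size and offset alignment */
	uint32_t d_maxiosz;	/* maximum I/O size */
};

/*
 * Return true if a direct I/O request satisfies the advertised limits.
 * All three limits are assumed to be nonzero.
 */
static bool dio_request_ok(const struct dio_limits *lim, const void *buf,
			   uint64_t offset, size_t len)
{
	if ((uintptr_t)buf % lim->d_mem)
		return false;
	if (offset % lim->d_miniosz || len % lim->d_miniosz)
		return false;
	return len <= lim->d_maxiosz;
}
```

Under these assumptions, limits of 1/1/INT_MAX (the DAX case in the patch) accept any buffer address, offset, and length up to INT_MAX, while 512-byte limits reject a request at file offset 1.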
> --D
>
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>> ---
>>
>> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
>> index 6ecdbb3..35eae7d 100644
>> --- a/fs/xfs/xfs_ioctl.c
>> +++ b/fs/xfs/xfs_ioctl.c
>> @@ -1919,12 +1919,21 @@ xfs_file_ioctl(
>> }
>> case XFS_IOC_DIOINFO: {
>> struct dioattr da;
>> - xfs_buftarg_t *target =
>> - XFS_IS_REALTIME_INODE(ip) ?
>> - mp->m_rtdev_targp : mp->m_ddev_targp;
>>
>> - da.d_mem = da.d_miniosz = target->bt_logical_sectorsize;
>> - da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
>> + if (IS_DAX(inode)) {
>> + /* pmem is byte addressable */
>> + da.d_mem = 1;
>> + da.d_miniosz = 1;
>> + da.d_maxiosz = INT_MAX;
>> + } else {
>> + xfs_buftarg_t *target =
>> + XFS_IS_REALTIME_INODE(ip) ?
>> + mp->m_rtdev_targp : mp->m_ddev_targp;
>> +
>> + da.d_mem = target->bt_logical_sectorsize;
>> + da.d_miniosz = target->bt_logical_sectorsize;
>> + da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
>> + }
>>
>> if (copy_to_user(arg, &da, sizeof(da)))
>> return -EFAULT;
>>

* Re: [PATCH] xfs: return correct XFS_IOC_DIOINFO for DAX inode
From: Dave Chinner @ 2019-04-02 21:31 UTC
To: Darrick J. Wong; +Cc: Eric Sandeen, linux-xfs, Jeff Moyer
On Tue, Apr 02, 2019 at 10:56:32AM -0700, Darrick J. Wong wrote:
> On Tue, Apr 02, 2019 at 10:44:38AM -0500, Eric Sandeen wrote:
> > pmem is byte addressable, and indeed byte-aligned DIO works on
> > a DAX file. So, teach XFS_IOC_DIOINFO to return the correct
> > alignment information if IS_DAX(inode).
>
> If it's a DAX filesystem, do we want to try to steer people towards
> things like 2MB pages since (in theory) we can get away with fewer page
> table mappings? And (seeing as that's mmap that cares, not directio)
> would advertising preferential page mapping sizes be more appropriate
> advertised in a different ioctl?
>
> --D
>
> > Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> > ---
> >
> > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> > index 6ecdbb3..35eae7d 100644
> > --- a/fs/xfs/xfs_ioctl.c
> > +++ b/fs/xfs/xfs_ioctl.c
> > @@ -1919,12 +1919,21 @@ xfs_file_ioctl(
> > }
> > case XFS_IOC_DIOINFO: {
> > struct dioattr da;
> > - xfs_buftarg_t *target =
> > - XFS_IS_REALTIME_INODE(ip) ?
> > - mp->m_rtdev_targp : mp->m_ddev_targp;
> >
> > - da.d_mem = da.d_miniosz = target->bt_logical_sectorsize;
> > - da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
> > + if (IS_DAX(inode)) {
> > + /* pmem is byte addressable */
> > + da.d_mem = 1;
> > + da.d_miniosz = 1;
> > + da.d_maxiosz = INT_MAX;
I don't think we want to open that can of worms.
Have you run fsx on dax mixing mmap/dio with byte range granularity?
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com

* Re: [PATCH] xfs: return correct XFS_IOC_DIOINFO for DAX inode
From: Eric Sandeen @ 2019-04-03 1:08 UTC
To: Dave Chinner, Darrick J. Wong; +Cc: Eric Sandeen, linux-xfs, Jeff Moyer
On 4/2/19 4:31 PM, Dave Chinner wrote:
> On Tue, Apr 02, 2019 at 10:56:32AM -0700, Darrick J. Wong wrote:
>> On Tue, Apr 02, 2019 at 10:44:38AM -0500, Eric Sandeen wrote:
>>> pmem is byte addressable, and indeed byte-aligned DIO works on
>>> a DAX file. So, teach XFS_IOC_DIOINFO to return the correct
>>> alignment information if IS_DAX(inode).
>>
>> If it's a DAX filesystem, do we want to try to steer people towards
>> things like 2MB pages since (in theory) we can get away with fewer page
>> table mappings? And (seeing as that's mmap that cares, not directio)
>> would advertising preferential page mapping sizes be more appropriate
>> advertised in a different ioctl?
>>
>> --D
>>
>>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>>> ---
>>>
>>> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
>>> index 6ecdbb3..35eae7d 100644
>>> --- a/fs/xfs/xfs_ioctl.c
>>> +++ b/fs/xfs/xfs_ioctl.c
>>> @@ -1919,12 +1919,21 @@ xfs_file_ioctl(
>>> }
>>> case XFS_IOC_DIOINFO: {
>>> struct dioattr da;
>>> - xfs_buftarg_t *target =
>>> - XFS_IS_REALTIME_INODE(ip) ?
>>> - mp->m_rtdev_targp : mp->m_ddev_targp;
>>>
>>> - da.d_mem = da.d_miniosz = target->bt_logical_sectorsize;
>>> - da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1);
>>> + if (IS_DAX(inode)) {
>>> + /* pmem is byte addressable */
>>> + da.d_mem = 1;
>>> + da.d_miniosz = 1;
>>> + da.d_maxiosz = INT_MAX;
>
> I don't think we want to open that can of worms.
It's already open... byte-granularity dax+dio succeeds today.
Does it work? ;)
> Have you run fsx on dax mixing mmap/dio with byte range granularity?
Like:
# fsx -Z -r 1 -w 1 daxfile
?
yes (now that you asked) ;)
not to a bazillion ops, but I've not seen a failure yet.
This is on simulated pmem, on a 5.0 kernel.
-Eric
> Cheers,
>
> Dave.
>