* [PATCH] zonefs: do not use append if device does not support it
@ 2023-06-26 16:47 Andreas Hindborg
2023-06-26 17:54 ` Johannes Thumshirn
2023-06-27 3:45 ` Christoph Hellwig
0 siblings, 2 replies; 10+ messages in thread
From: Andreas Hindborg @ 2023-06-26 16:47 UTC (permalink / raw)
To: Damien Le Moal
Cc: open list:ZONEFS FILESYSTEM, gost.dev, Andreas Hindborg,
Naohiro Aota, Johannes Thumshirn, open list,
Andreas Hindborg (Samsung)
From: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
Zonefs will try to use `zonefs_file_dio_append()` for direct sync writes even if
device `max_zone_append_sectors` is zero. This will cause the IO to fail as the
io vector is truncated to zero. It also causes a call to
`invalidate_inode_pages2_range()` with end set to UINT_MAX, which is probably
not intentional. Thus, do not use append when device does not support it.
Signed-off-by: Andreas Hindborg (Samsung) <nmi@metaspace.dk>
---
fs/zonefs/file.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 132f01d3461f..c97fe2aa20b0 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -536,9 +536,11 @@ static ssize_t zonefs_write_checks(struct kiocb *iocb, struct iov_iter *from)
 static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct inode *inode = file_inode(iocb->ki_filp);
+	struct block_device *bdev = inode->i_sb->s_bdev;
 	struct zonefs_inode_info *zi = ZONEFS_I(inode);
 	struct zonefs_zone *z = zonefs_inode_zone(inode);
 	struct super_block *sb = inode->i_sb;
+	unsigned int max_append = bdev_max_zone_append_sectors(bdev);
 	bool sync = is_sync_kiocb(iocb);
 	bool append = false;
 	ssize_t ret, count;
@@ -581,7 +583,7 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
 		append = sync;
 	}
 
-	if (append) {
+	if (append && max_append) {
 		ret = zonefs_file_dio_append(iocb, from);
 	} else {
 		/*
base-commit: 45a3e24f65e90a047bef86f927ebdc4c710edaa1
--
2.41.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
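Editor's note: the truncation described in the commit message can be illustrated with a small userspace model. This is only a sketch, not the zonefs code: it assumes the append path caps each request at `max_zone_append_sectors` converted to bytes and rounded down to the filesystem block size. The names `align_down()` and `append_len()` are hypothetical helpers introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors, as in the kernel */

/* Round x down to a multiple of a, where a must be a power of two. */
static size_t align_down(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

/*
 * Hypothetical model: number of bytes one append may carry once the
 * device limit (in sectors) has been applied to the requested length.
 */
static size_t append_len(size_t requested, unsigned int max_append_sectors,
			 size_t block_size)
{
	size_t max = align_down((size_t)max_append_sectors << SECTOR_SHIFT,
				block_size);

	return requested < max ? requested : max;
}
```

Under these assumptions, a limit of zero sectors caps every request at zero bytes, which matches the "io vector is truncated to zero" failure reported above.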
* Re: [PATCH] zonefs: do not use append if device does not support it
2023-06-26 16:47 [PATCH] zonefs: do not use append if device does not support it Andreas Hindborg
@ 2023-06-26 17:54 ` Johannes Thumshirn
2023-06-26 18:23 ` Andreas Hindborg (Samsung)
2023-06-27 3:45 ` Christoph Hellwig
1 sibling, 1 reply; 10+ messages in thread
From: Johannes Thumshirn @ 2023-06-26 17:54 UTC (permalink / raw)
To: Andreas Hindborg, Damien Le Moal
Cc: open list:ZONEFS FILESYSTEM, gost.dev@samsung.com,
Andreas Hindborg, Naohiro Aota, Johannes Thumshirn, open list
On 26.06.23 18:47, Andreas Hindborg wrote:
> From: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
>
> Zonefs will try to use `zonefs_file_dio_append()` for direct sync writes even if
> device `max_zone_append_sectors` is zero. This will cause the IO to fail as the
> io vector is truncated to zero. It also causes a call to
> `invalidate_inode_pages2_range()` with end set to UINT_MAX, which is probably
> not intentional. Thus, do not use append when device does not support it.
>
I'm sorry but I think it has been stated often enough that for Linux Zone Append
is a mandatory feature for a Zoned Block Device. Therefore this path is essentially
dead code as max_zone_append_sectors will always be greater than zero.
So this is a clear NAK from my side.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] zonefs: do not use append if device does not support it
2023-06-26 17:54 ` Johannes Thumshirn
@ 2023-06-26 18:23 ` Andreas Hindborg (Samsung)
2023-06-27 0:21 ` Damien Le Moal
0 siblings, 1 reply; 10+ messages in thread
From: Andreas Hindborg (Samsung) @ 2023-06-26 18:23 UTC (permalink / raw)
To: Johannes Thumshirn
Cc: Damien Le Moal, open list:ZONEFS FILESYSTEM, gost.dev@samsung.com,
Naohiro Aota, Johannes Thumshirn, open list
Johannes Thumshirn <Johannes.Thumshirn@wdc.com> writes:
> On 26.06.23 18:47, Andreas Hindborg wrote:
>> From: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
>>
>> Zonefs will try to use `zonefs_file_dio_append()` for direct sync writes even if
>> device `max_zone_append_sectors` is zero. This will cause the IO to fail as the
>> io vector is truncated to zero. It also causes a call to
>> `invalidate_inode_pages2_range()` with end set to UINT_MAX, which is probably
>> not intentional. Thus, do not use append when device does not support it.
>>
>
> I'm sorry but I think it has been stated often enough that for Linux Zone Append
> is a mandatory feature for a Zoned Block Device. Therefore this path is essentially
> dead code as max_zone_append_sectors will always be greater than zero.
>
> So this is a clear NAK from my side.
OK, thanks for clarifying 👍 I came across this bugging out while
playing around with zone append for ublk. The code makes sense if the
stack expects append to always be present.
I didn't follow the discussion, could you reiterate why the policy is
that zoned devices _must_ support append?
Best regards,
Andreas
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] zonefs: do not use append if device does not support it
2023-06-26 18:23 ` Andreas Hindborg (Samsung)
@ 2023-06-27 0:21 ` Damien Le Moal
2023-06-27 5:45 ` Andreas Hindborg (Samsung)
0 siblings, 1 reply; 10+ messages in thread
From: Damien Le Moal @ 2023-06-27 0:21 UTC (permalink / raw)
To: Andreas Hindborg (Samsung), Johannes Thumshirn
Cc: open list:ZONEFS FILESYSTEM, gost.dev@samsung.com, Naohiro Aota,
Johannes Thumshirn, open list
On 6/27/23 03:23, Andreas Hindborg (Samsung) wrote:
>
> Johannes Thumshirn <Johannes.Thumshirn@wdc.com> writes:
>
>> On 26.06.23 18:47, Andreas Hindborg wrote:
>>> From: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
>>>
>>> Zonefs will try to use `zonefs_file_dio_append()` for direct sync writes even if
>>> device `max_zone_append_sectors` is zero. This will cause the IO to fail as the
>>> io vector is truncated to zero. It also causes a call to
>>> `invalidate_inode_pages2_range()` with end set to UINT_MAX, which is probably
>>> not intentional. Thus, do not use append when device does not support it.
>>>
>>
>> I'm sorry but I think it has been stated often enough that for Linux Zone Append
>> is a mandatory feature for a Zoned Block Device. Therefore this path is essentially
>> dead code as max_zone_append_sectors will always be greater than zero.
>>
>> So this is a clear NAK from my side.
>
> OK, thanks for clarifying 👍 I came across this bugging out while
> playing around with zone append for ublk. The code makes sense if the
> stack expects append to always be present.
>
> I didn't follow the discussion, could you reiterate why the policy is
> that zoned devices _must_ support append?
To avoid support fragmentation and for performance. btrfs zoned block device
support requires zone append and using that command makes writes much faster as
we do not have to go through zone locking.
Note that for zonefs, I plan to add async zone append support as well, linked
with O_APPEND use to further improve write performance with ZNS drives.
>
> Best regards,
> Andreas
>
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 10+ messages in thread
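Editor's note: the performance argument above can be pictured with a toy userspace model of a single zone. It is a sketch under stated assumptions, not a kernel interface: `struct toy_zone`, `toy_zone_write()` and `toy_zone_append()` are hypothetical names. A regular write must name the current write pointer exactly, so concurrent writers have to be serialized (the zone locking mentioned above); with append, the zone chooses the location and reports it back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy zone: wp never exceeds capacity. */
struct toy_zone {
	uint64_t capacity; /* writable sectors in the zone */
	uint64_t wp;       /* write pointer, relative to the zone start */
};

/* Regular write: fails unless the caller targets the write pointer. */
static int toy_zone_write(struct toy_zone *z, uint64_t offset, uint64_t nr)
{
	assert(z != NULL);
	if (offset != z->wp || nr > z->capacity - z->wp)
		return -1;
	z->wp += nr;
	return 0;
}

/* Append: the zone picks the position and returns it in *written_at. */
static int toy_zone_append(struct toy_zone *z, uint64_t nr,
			   uint64_t *written_at)
{
	assert(z != NULL && written_at != NULL);
	if (nr > z->capacity - z->wp)
		return -1;
	*written_at = z->wp;
	z->wp += nr;
	return 0;
}
```

In this model a second writer that still believes the write pointer is at 0 gets an error from `toy_zone_write()`, while `toy_zone_append()` succeeds for any caller as long as space remains.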
* Re: [PATCH] zonefs: do not use append if device does not support it
2023-06-26 16:47 [PATCH] zonefs: do not use append if device does not support it Andreas Hindborg
2023-06-26 17:54 ` Johannes Thumshirn
@ 2023-06-27 3:45 ` Christoph Hellwig
2023-06-27 4:45 ` Damien Le Moal
1 sibling, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2023-06-27 3:45 UTC (permalink / raw)
To: Andreas Hindborg
Cc: Damien Le Moal, open list:ZONEFS FILESYSTEM, gost.dev,
Andreas Hindborg, Naohiro Aota, Johannes Thumshirn, open list
On Mon, Jun 26, 2023 at 06:47:52PM +0200, Andreas Hindborg wrote:
> From: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
>
> Zonefs will try to use `zonefs_file_dio_append()` for direct sync writes even if
> device `max_zone_append_sectors` is zero. This will cause the IO to fail as the
> io vector is truncated to zero. It also causes a call to
> `invalidate_inode_pages2_range()` with end set to UINT_MAX, which is probably
> not intentional. Thus, do not use append when device does not support it.
How do you even manage to hit this code? Zone Append is a mandatory
feature and drivers need to check that it is available.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] zonefs: do not use append if device does not support it
2023-06-27 3:45 ` Christoph Hellwig
@ 2023-06-27 4:45 ` Damien Le Moal
2023-06-27 4:48 ` Christoph Hellwig
2023-06-27 5:14 ` Andreas Hindborg (Samsung)
0 siblings, 2 replies; 10+ messages in thread
From: Damien Le Moal @ 2023-06-27 4:45 UTC (permalink / raw)
To: Christoph Hellwig, Andreas Hindborg
Cc: open list:ZONEFS FILESYSTEM, gost.dev, Andreas Hindborg,
Naohiro Aota, Johannes Thumshirn, open list
On 6/27/23 12:45, Christoph Hellwig wrote:
> On Mon, Jun 26, 2023 at 06:47:52PM +0200, Andreas Hindborg wrote:
>> From: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
>>
>> Zonefs will try to use `zonefs_file_dio_append()` for direct sync writes even if
>> device `max_zone_append_sectors` is zero. This will cause the IO to fail as the
>> io vector is truncated to zero. It also causes a call to
>> `invalidate_inode_pages2_range()` with end set to UINT_MAX, which is probably
>> not intentional. Thus, do not use append when device does not support it.
>
> How do you even manage to hit this code? Zone Append is a mandatory
> feature and drivers need to check that it is available.
The ublk driver is probably missing that check? I have not looked at the code for
zone support.
But thinking about it, we would probably be better off having a generic check for
"q->limits.max_zone_append_sectors != 0" in blk_revalidate_disk_zones().
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] zonefs: do not use append if device does not support it
2023-06-27 4:45 ` Damien Le Moal
@ 2023-06-27 4:48 ` Christoph Hellwig
2023-06-27 4:50 ` Damien Le Moal
2023-06-27 5:14 ` Andreas Hindborg (Samsung)
1 sibling, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2023-06-27 4:48 UTC (permalink / raw)
To: Damien Le Moal
Cc: Christoph Hellwig, Andreas Hindborg, open list:ZONEFS FILESYSTEM,
gost.dev, Andreas Hindborg, Naohiro Aota, Johannes Thumshirn,
open list
On Tue, Jun 27, 2023 at 01:45:38PM +0900, Damien Le Moal wrote:
> But thinking about it, we would probably be better off having a generic check for
> "q->limits.max_zone_append_sectors != 0" in blk_revalidate_disk_zones().
Agreed.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] zonefs: do not use append if device does not support it
2023-06-27 4:48 ` Christoph Hellwig
@ 2023-06-27 4:50 ` Damien Le Moal
0 siblings, 0 replies; 10+ messages in thread
From: Damien Le Moal @ 2023-06-27 4:50 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Andreas Hindborg, open list:ZONEFS FILESYSTEM, gost.dev,
Andreas Hindborg, Naohiro Aota, Johannes Thumshirn, open list
On 6/27/23 13:48, Christoph Hellwig wrote:
> On Tue, Jun 27, 2023 at 01:45:38PM +0900, Damien Le Moal wrote:
>> But thinking about it, we would probably be better off having a generic check for
>> "q->limits.max_zone_append_sectors != 0" in blk_revalidate_disk_zones().
>
> Agreed.
I'll send something.
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 10+ messages in thread
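Editor's note: the generic check agreed on above could look roughly like the userspace sketch below. It is not the patch that was later sent and does not use the real block layer structures: `struct toy_queue_limits` and `toy_check_zone_append()` are hypothetical stand-ins, and the choice of `-ENODEV` as the error code is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of the queue limits. */
struct toy_queue_limits {
	bool zoned;                           /* device exposes zones */
	unsigned int max_zone_append_sectors; /* 0 means no zone append */
};

/*
 * Reject a zoned device that reports no zone append support, so that
 * upper layers such as zonefs never see a zero limit.
 */
static int toy_check_zone_append(const struct toy_queue_limits *lim)
{
	assert(lim != NULL);
	if (lim->zoned && lim->max_zone_append_sectors == 0)
		return -ENODEV; /* assumed error code */
	return 0;
}
```

Placing the check once at revalidation time, rather than in each filesystem, means a misconfigured driver fails early instead of producing zero-length appends later.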
* Re: [PATCH] zonefs: do not use append if device does not support it
2023-06-27 4:45 ` Damien Le Moal
2023-06-27 4:48 ` Christoph Hellwig
@ 2023-06-27 5:14 ` Andreas Hindborg (Samsung)
1 sibling, 0 replies; 10+ messages in thread
From: Andreas Hindborg (Samsung) @ 2023-06-27 5:14 UTC (permalink / raw)
To: Damien Le Moal
Cc: Christoph Hellwig, open list:ZONEFS FILESYSTEM, gost.dev,
Naohiro Aota, Johannes Thumshirn, open list
Damien Le Moal <dlemoal@kernel.org> writes:
> On 6/27/23 12:45, Christoph Hellwig wrote:
>> On Mon, Jun 26, 2023 at 06:47:52PM +0200, Andreas Hindborg wrote:
>>> From: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
>>>
>>> Zonefs will try to use `zonefs_file_dio_append()` for direct sync writes even if
>>> device `max_zone_append_sectors` is zero. This will cause the IO to fail as the
>>> io vector is truncated to zero. It also causes a call to
>>> `invalidate_inode_pages2_range()` with end set to UINT_MAX, which is probably
>>> not intentional. Thus, do not use append when device does not support it.
>>
>> How do you even manage to hit this code? Zone Append is a mandatory
>> feature and drivers need to check that it is available.
>
> The ublk driver is probably missing that check? I have not looked at the code for
> zone support.
>
> But thinking about it, we would probably be better off having a generic check for
> "q->limits.max_zone_append_sectors != 0" in blk_revalidate_disk_zones().
I was playing with ublk zone support. It seems I made it buggy by
allowing zone append size to go to zero.
Adding the check would be a nice help to people like me that will
implement whatever in their driver :)
Best regards
Andreas
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] zonefs: do not use append if device does not support it
2023-06-27 0:21 ` Damien Le Moal
@ 2023-06-27 5:45 ` Andreas Hindborg (Samsung)
0 siblings, 0 replies; 10+ messages in thread
From: Andreas Hindborg (Samsung) @ 2023-06-27 5:45 UTC (permalink / raw)
To: Damien Le Moal
Cc: Johannes Thumshirn, open list:ZONEFS FILESYSTEM,
gost.dev@samsung.com, Naohiro Aota, Johannes Thumshirn, open list
Damien Le Moal <dlemoal@kernel.org> writes:
> On 6/27/23 03:23, Andreas Hindborg (Samsung) wrote:
>>
>> Johannes Thumshirn <Johannes.Thumshirn@wdc.com> writes:
>>
>>> On 26.06.23 18:47, Andreas Hindborg wrote:
>>>> From: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
>>>>
>>>> Zonefs will try to use `zonefs_file_dio_append()` for direct sync writes even if
>>>> device `max_zone_append_sectors` is zero. This will cause the IO to fail as the
>>>> io vector is truncated to zero. It also causes a call to
>>>> `invalidate_inode_pages2_range()` with end set to UINT_MAX, which is probably
>>>> not intentional. Thus, do not use append when device does not support it.
>>>>
>>>
>>> I'm sorry but I think it has been stated often enough that for Linux Zone Append
>>> is a mandatory feature for a Zoned Block Device. Therefore this path is essentially
>>> dead code as max_zone_append_sectors will always be greater than zero.
>>>
>>> So this is a clear NAK from my side.
>>
>> OK, thanks for clarifying 👍 I came across this bugging out while
>> playing around with zone append for ublk. The code makes sense if the
>> stack expects append to always be present.
>>
>> I didn't follow the discussion, could you reiterate why the policy is
>> that zoned devices _must_ support append?
>
> To avoid support fragmentation and for performance. btrfs zoned block device
> support requires zone append and using that command makes writes much faster as
> we do not have to go through zone locking.
> Note that for zonefs, I plan to add async zone append support as well, linked
> with O_APPEND use to further improve write performance with ZNS drives.
>
Thanks for clarifying, Damien 👍
BR Andreas
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-06-27 5:46 UTC | newest]
Thread overview: 10+ messages
2023-06-26 16:47 [PATCH] zonefs: do not use append if device does not support it Andreas Hindborg
2023-06-26 17:54 ` Johannes Thumshirn
2023-06-26 18:23 ` Andreas Hindborg (Samsung)
2023-06-27 0:21 ` Damien Le Moal
2023-06-27 5:45 ` Andreas Hindborg (Samsung)
2023-06-27 3:45 ` Christoph Hellwig
2023-06-27 4:45 ` Damien Le Moal
2023-06-27 4:48 ` Christoph Hellwig
2023-06-27 4:50 ` Damien Le Moal
2023-06-27 5:14 ` Andreas Hindborg (Samsung)