Linux CIFS filesystem development
 help / color / mirror / Atom feed
* Can fallocate() ops be emulated better using SMB request compounding?
@ 2023-12-07 15:58 David Howells
  2023-12-07 17:40 ` Jeremy Allison
  0 siblings, 1 reply; 8+ messages in thread
From: David Howells @ 2023-12-07 15:58 UTC (permalink / raw)
  To: Steve French, Namjae Jeon, jra
  Cc: dhowells, ronniesahlberg, Tom Talpey, Stefan Metzmacher, jlayton,
	linux-cifs, samba-technical

Hi Steve, Namjae, Jeremy,

At the moment certain fallocate() operations aren't very well implemented in
the cifs filesystem on Linux, either because the protocol doesn't fully
support them or because the ops being used don't also set the EOF marker at
the same time and a separate RPC must be made to do that.

For instance:

 - FALLOC_FL_ZERO_RANGE does some zeroing and then sets the EOF as two
   distinctly separate operations.  The code prevents you from doing this op
   under some circumstances as it doesn't have an oplock and doesn't want to
   race with a third party (note that smb3_punch_hole() doesn't have this
   check).

 - FALLOC_FL_COLLAPSE_RANGE uses COPYCHUNK to move the file down and then sets
   the EOF as two separate operations as there is no protocol op for this.
   However, the copy will likely fail if the ranges overlap and it's
   non-atomic with respect to a third party.

 - FALLOC_FL_INSERT_RANGE has the same issues as FALLOC_FL_COLLAPSE_RANGE.

Question: Would it be possible to do all of these better by using compounding
with SMB2_FLAGS_RELATED_OPERATIONS?  In particular, if two components of a
compound are marked related, does the second get skipped if the first fails?
Further, are the two ops then essentially done atomically?

If this is the case, then for FALLOC_FL_ZERO_RANGE, just compounding the
SET_ZERO_DATA with the SET-EOF will reduce or eliminate the race window.

For FALLOC_FL_COLLAPSE/INSERT_RANGE, we could compound the COPYCHUNK and
SET-EOF.  As long as the SET-EOF won't happen if the COPYCHUNK fails, this
will reduce the race.

However, for COLLAPSE/INSERT, we can go further: recognise the { COPYCHUNK,
SET-EOF } compound on the server and see if the file positions, chunk length
EOF and future EOF are consistent with a collapse/insert request and, if so,
convert the pair of them to a single fallocate() call and try that; if that
fails, fall back to copy_file_range() and ftruncate().


As an alternative, at least for removing the 3rd-party races, is it possible
to make sure we have an appropriate oplock around the two components in each
case?  It would mean potentially more trips to the server, but would remove
the window, I think.

David


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Can fallocate() ops be emulated better using SMB request compounding?
  2023-12-07 15:58 Can fallocate() ops be emulated better using SMB request compounding? David Howells
@ 2023-12-07 17:40 ` Jeremy Allison
  2023-12-07 17:50   ` David Howells
  2023-12-07 18:12   ` Paulo Alcantara
  0 siblings, 2 replies; 8+ messages in thread
From: Jeremy Allison @ 2023-12-07 17:40 UTC (permalink / raw)
  To: David Howells
  Cc: Steve French, Namjae Jeon, ronniesahlberg, Tom Talpey,
	Stefan Metzmacher, jlayton, linux-cifs, samba-technical

On Thu, Dec 07, 2023 at 03:58:46PM +0000, David Howells wrote:
>Hi Steve, Namjae, Jeremy,
>
>At the moment certain fallocate() operations aren't very well implemented in
>the cifs filesystem on Linux, either because the protocol doesn't fully
>support them or because the ops being used don't also set the EOF marker at
>the same time and a separate RPC must be made to do that.
>
>For instance:
>
> - FALLOC_FL_ZERO_RANGE does some zeroing and then sets the EOF as two
>   distinctly separate operations.  The code prevents you from doing this op
>   under some circumstances as it doesn't have an oplock and doesn't want to
>   race with a third party (note that smb3_punch_hole() doesn't have this
>   check).
>
> - FALLOC_FL_COLLAPSE_RANGE uses COPYCHUNK to move the file down and then sets
>   the EOF as two separate operations as there is no protocol op for this.
>   However, the copy will likely fail if the ranges overlap and it's
>   non-atomic with respect to a third party.
>
> - FALLOC_FL_INSERT_RANGE has the same issues as FALLOC_FL_COLLAPSE_RANGE.
>
>Question: Would it be possible to do all of these better by using compounding
>with SMB2_FLAGS_RELATED_OPERATIONS?  In particular, if two components of a
>compound are marked related, does the second get skipped if the first fails?

Yes:

https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-smb2/46dd4182-62d3-4e30-9fe5-e2ec124edca1

"When the current operation requires a FileId and the previous operation
either contains or generates a FileId, if the previous operation fails
with an error, the server SHOULD<253> fail the current operation with
the same error code returned by the previous operation."

>Further, are the two ops then essentially done atomically?

No. They are processed (at least in Samba) as two separate
requests and can be raced by local or other remote access.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Can fallocate() ops be emulated better using SMB request compounding?
  2023-12-07 17:40 ` Jeremy Allison
@ 2023-12-07 17:50   ` David Howells
  2023-12-07 18:18     ` Jeff Layton
                       ` (2 more replies)
  2023-12-07 18:12   ` Paulo Alcantara
  1 sibling, 3 replies; 8+ messages in thread
From: David Howells @ 2023-12-07 17:50 UTC (permalink / raw)
  To: Jeremy Allison
  Cc: dhowells, Steve French, Namjae Jeon, ronniesahlberg, Tom Talpey,
	Stefan Metzmacher, jlayton, linux-cifs, samba-technical

Jeremy Allison <jra@samba.org> wrote:

> >Further, are the two ops then essentially done atomically?
> 
> No. They are processed (at least in Samba) as two separate
> requests and can be raced by local or other remote access.

So just compounding them would leave us in the same situation we are in now -
which would be fine.

What do you think about the idea of having the server see a specifically
arranged compounded pair and turn them into an op that can't otherwise be
represented in the protocol?

Or is it better to try and get the protocol extended?

David


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Can fallocate() ops be emulated better using SMB request compounding?
  2023-12-07 17:40 ` Jeremy Allison
  2023-12-07 17:50   ` David Howells
@ 2023-12-07 18:12   ` Paulo Alcantara
  1 sibling, 0 replies; 8+ messages in thread
From: Paulo Alcantara @ 2023-12-07 18:12 UTC (permalink / raw)
  To: Jeremy Allison, David Howells
  Cc: Steve French, Namjae Jeon, ronniesahlberg, Tom Talpey,
	Stefan Metzmacher, jlayton, linux-cifs, samba-technical

Jeremy Allison <jra@samba.org> writes:

> On Thu, Dec 07, 2023 at 03:58:46PM +0000, David Howells wrote:
>>Hi Steve, Namjae, Jeremy,
>>
>>At the moment certain fallocate() operations aren't very well implemented in
>>the cifs filesystem on Linux, either because the protocol doesn't fully
>>support them or because the ops being used don't also set the EOF marker at
>>the same time and a separate RPC must be made to do that.
>>
>>For instance:
>>
>> - FALLOC_FL_ZERO_RANGE does some zeroing and then sets the EOF as two
>>   distinctly separate operations.  The code prevents you from doing this op
>>   under some circumstances as it doesn't have an oplock and doesn't want to
>>   race with a third party (note that smb3_punch_hole() doesn't have this
>>   check).
>>
>> - FALLOC_FL_COLLAPSE_RANGE uses COPYCHUNK to move the file down and then sets
>>   the EOF as two separate operations as there is no protocol op for this.
>>   However, the copy will likely fail if the ranges overlap and it's
>>   non-atomic with respect to a third party.
>>
>> - FALLOC_FL_INSERT_RANGE has the same issues as FALLOC_FL_COLLAPSE_RANGE.
>>
>>Question: Would it be possible to do all of these better by using compounding
>>with SMB2_FLAGS_RELATED_OPERATIONS?  In particular, if two components of a
>>compound are marked related, does the second get skipped if the first fails?
>
> Yes:
>
> https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-smb2/46dd4182-62d3-4e30-9fe5-e2ec124edca1
>
> "When the current operation requires a FileId and the previous operation
> either contains or generates a FileId, if the previous operation fails
> with an error, the server SHOULD<253> fail the current operation with
> the same error code returned by the previous operation."

David, you could extend smb2_compound_op() like I did for compound
create+{get,set}_reparse+getinfo+close in [1][2][3].

[1] https://lore.kernel.org/r/20231126025510.28147-2-pc@manguebit.com
[2] https://lore.kernel.org/r/20231126025510.28147-3-pc@manguebit.com
[3] https://lore.kernel.org/r/20231126025510.28147-8-pc@manguebit.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Can fallocate() ops be emulated better using SMB request compounding?
  2023-12-07 17:50   ` David Howells
@ 2023-12-07 18:18     ` Jeff Layton
  2023-12-07 18:32     ` Jeremy Allison
  2023-12-08 14:03     ` Steve French
  2 siblings, 0 replies; 8+ messages in thread
From: Jeff Layton @ 2023-12-07 18:18 UTC (permalink / raw)
  To: David Howells, Jeremy Allison
  Cc: Steve French, Namjae Jeon, ronniesahlberg, Tom Talpey,
	Stefan Metzmacher, linux-cifs, samba-technical

On Thu, 2023-12-07 at 17:50 +0000, David Howells wrote:
> Jeremy Allison <jra@samba.org> wrote:
> 
> > > Further, are the two ops then essentially done atomically?
> > 
> > No. They are processed (at least in Samba) as two separate
> > requests and can be raced by local or other remote access.
> 
> So just compounding them would leave us in the same situation we are in now -
> which would be fine.
> 
> What do you think about the idea of having the server see a specifically
> arranged compounded pair and turn them into an op that can't otherwise be
> represented in the protocol?
> 
> Or is it better to try and get the protocol extended?
> 

I think you have to extend the protocol. You might be able to fix one
server to treat the two operations atomically, but there are tons of SMB
servers in the field. The client can't really depend that that will
always do the right thing.

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Can fallocate() ops be emulated better using SMB request compounding?
  2023-12-07 17:50   ` David Howells
  2023-12-07 18:18     ` Jeff Layton
@ 2023-12-07 18:32     ` Jeremy Allison
  2023-12-07 20:25       ` Tom Talpey
  2023-12-08 14:03     ` Steve French
  2 siblings, 1 reply; 8+ messages in thread
From: Jeremy Allison @ 2023-12-07 18:32 UTC (permalink / raw)
  To: David Howells
  Cc: Steve French, Namjae Jeon, ronniesahlberg, Tom Talpey,
	Stefan Metzmacher, jlayton, linux-cifs, samba-technical

On Thu, Dec 07, 2023 at 05:50:50PM +0000, David Howells wrote:
>Jeremy Allison <jra@samba.org> wrote:
>
>> >Further, are the two ops then essentially done atomically?
>>
>> No. They are processed (at least in Samba) as two separate
>> requests and can be raced by local or other remote access.
>
>So just compounding them would leave us in the same situation we are in now -
>which would be fine.
>
>What do you think about the idea of having the server see a specifically
>arranged compounded pair and turn them into an op that can't otherwise be
>represented in the protocol?

Complex, ugly code. How long does the server wait
for the second operation before proceeding with
the first ?

>Or is it better to try and get the protocol extended?

If this is a Linux -> Linux op, we have a protocol
space (the SMB3+POSIX) we can extend without having
to go via Microsoft. But this would need to be very carefully designed.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Can fallocate() ops be emulated better using SMB request compounding?
  2023-12-07 18:32     ` Jeremy Allison
@ 2023-12-07 20:25       ` Tom Talpey
  0 siblings, 0 replies; 8+ messages in thread
From: Tom Talpey @ 2023-12-07 20:25 UTC (permalink / raw)
  To: Jeremy Allison, David Howells
  Cc: Steve French, Namjae Jeon, ronniesahlberg, Stefan Metzmacher,
	jlayton, linux-cifs, samba-technical

On 12/7/2023 1:32 PM, Jeremy Allison wrote:
> On Thu, Dec 07, 2023 at 05:50:50PM +0000, David Howells wrote:
>> Jeremy Allison <jra@samba.org> wrote:
>>
>>> >Further, are the two ops then essentially done atomically?
>>>
>>> No. They are processed (at least in Samba) as two separate
>>> requests and can be raced by local or other remote access.
>>
>> So just compounding them would leave us in the same situation we are 
>> in now -
>> which would be fine.
>>
>> What do you think about the idea of having the server see a specifically
>> arranged compounded pair and turn them into an op that can't otherwise be
>> represented in the protocol?
> 
> Complex, ugly code. How long does the server wait
> for the second operation before proceeding with
> the first ?

So existing SMB operations do this, but somewhat arbitrarily. An
operation can "go async" in the middle of a compound, by returning
STATUS_PENDING and subsequently completing with a second response.
Some operations pretty much always do this, for example directory
change notifications.

>> Or is it better to try and get the protocol extended?
> 
> If this is a Linux -> Linux op, we have a protocol
> space (the SMB3+POSIX) we can extend without having
> to go via Microsoft. But this would need to be very carefully designed.

True, but as you say, risky as heck. Don't forget the rather large
third-party SMB3 support from other vendors who may or may not be
on board.

I would STRONGLY discourage considering extending compounds in an
attempt to provide atomicity. But, perhaps some super-hairy ioctl-type
request, coupled with persistent session support? Heavy lift.

Tom.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Can fallocate() ops be emulated better using SMB request compounding?
  2023-12-07 17:50   ` David Howells
  2023-12-07 18:18     ` Jeff Layton
  2023-12-07 18:32     ` Jeremy Allison
@ 2023-12-08 14:03     ` Steve French
  2 siblings, 0 replies; 8+ messages in thread
From: Steve French @ 2023-12-08 14:03 UTC (permalink / raw)
  To: David Howells
  Cc: Jeremy Allison, Namjae Jeon, ronniesahlberg, Tom Talpey,
	Stefan Metzmacher, jlayton, linux-cifs, samba-technical

> What do you think about the idea of having the server see a specifically
> arranged compounded pair and turn them into an op that can't otherwise be
> represented in the protocol?

That makes sense for some cases (open, queryinfo e.g.) and has been
done in the past for SMB3 servers.

On Thu, Dec 7, 2023 at 11:50 AM David Howells <dhowells@redhat.com> wrote:
>
> Jeremy Allison <jra@samba.org> wrote:
>
> > >Further, are the two ops then essentially done atomically?
> >
> > No. They are processed (at least in Samba) as two separate
> > requests and can be raced by local or other remote access.
>
> So just compounding them would leave us in the same situation we are in now -
> which would be fine.
>
> What do you think about the idea of having the server see a specifically
> arranged compounded pair and turn them into an op that can't otherwise be
> represented in the protocol?
>
> Or is it better to try and get the protocol extended?
>
> David
>


-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-12-08 14:03 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-07 15:58 Can fallocate() ops be emulated better using SMB request compounding? David Howells
2023-12-07 17:40 ` Jeremy Allison
2023-12-07 17:50   ` David Howells
2023-12-07 18:18     ` Jeff Layton
2023-12-07 18:32     ` Jeremy Allison
2023-12-07 20:25       ` Tom Talpey
2023-12-08 14:03     ` Steve French
2023-12-07 18:12   ` Paulo Alcantara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox