* Re: [Lsf-pc] [LSF/MM ATTEND] Over-the-wire data compression
From: Jan Kara @ 2024-03-15 12:22 UTC (permalink / raw)
To: Enzo Matsumiya; +Cc: lsf-pc, linux-fsdevel, linux-cifs
Hello Enzo,
it is good to also CC appropriate public mailing lists so that other
attendees can discuss your proposal. I have added some I found relevant.
Honza
On Thu 14-03-24 15:14:49, Enzo Matsumiya wrote:
> Hello,
>
> Having implemented data compression for SMB2 messages in cifs.ko, I'd
> like to attend LSF/MM to discuss:
>
> - implementation decisions, both in the protocol level and in the
> compression algorithms; e.g. performance improvements, what could,
> if possible/wanted, turn into a lib/ module, etc
>
> - compression algorithms in general; talk about algorithms to determine
> if/how compressible a blob of data is
> * several such algorithms already exist and are used by on-disk
> compression tools, but for over-the-wire compression maybe the
> fastest one with good (not great nor best) predictability
> could work?
>
> - overlapping modules/areas that have the need/desire to compress
> transmitting data and their status quo in the topic; difficulties
> where I could help and/or achievements that I could learn from
>
>
> Cheers,
>
> Enzo
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [Lsf-pc] [LSF/MM ATTEND] Over-the-wire data compression
From: David Disseldorp @ 2024-03-18 10:59 UTC (permalink / raw)
To: Enzo Matsumiya; +Cc: Jan Kara, lsf-pc, linux-fsdevel, linux-cifs
Hi Enzo,
...
> On Thu 14-03-24 15:14:49, Enzo Matsumiya wrote:
> > Hello,
> >
> > Having implemented data compression for SMB2 messages in cifs.ko, I'd
> > like to attend LSF/MM to discuss:
> >
> > - implementation decisions, both in the protocol level and in the
> > compression algorithms; e.g. performance improvements, what could,
> > if possible/wanted, turn into a lib/ module, etc
> >
> > - compression algorithms in general; talk about algorithms to determine
> > if/how compressible a blob of data is
> > * several such algorithms already exist and are used by on-disk
> > compression tools, but for over-the-wire compression maybe the
> > fastest one with good (not great nor best) predictability
> > could work?
Ideally there could be some overlap between on-disk and over-the-wire
compression algorithm support. That could allow optimally aligned /
sized IOs to avoid unnecessary compression / decompression cycles on an
SMB server / client if the underlying filesystem supports encoded I/O
via e.g. BTRFS_IOC_ENCODED_READ/WRITE.
IIUC, we currently have:
SMB: LZ77, LZ77+Huffman (DEFLATE?), LZNT1, LZ4
Btrfs: zlib/DEFLATE, LZO, Zstd
Bcachefs: zlib/DEFLATE, LZ4, Zstd. Currently no encoded I/O support.
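To make the overlap concrete, here is a minimal sketch that intersects the lists above. The names `ALGORITHMS`, `ENCODED_IO`, `shared_with_smb` and `passthrough_candidates` are made up for illustration, and it assumes SMB's LZ77+Huffman is not interchangeable with DEFLATE (hence the question mark above). Real passthrough would also depend on framing, alignment and I/O sizes, which this ignores.

```python
# Algorithm names as listed above; "DEFLATE" stands for zlib/DEFLATE.
# Assumption: SMB's LZ77+Huffman is NOT counted as DEFLATE.
ALGORITHMS = {
    "SMB": {"LZ77", "LZ77+Huffman", "LZNT1", "LZ4"},
    "Btrfs": {"DEFLATE", "LZO", "Zstd"},
    "Bcachefs": {"DEFLATE", "LZ4", "Zstd"},
}

# Filesystems with encoded I/O support, per the list above.
ENCODED_IO = {"Btrfs"}


def shared_with_smb(fs):
    """Algorithms that both the filesystem and SMB list by name."""
    return ALGORITHMS[fs] & ALGORITHMS["SMB"]


def passthrough_candidates(fs):
    """Shared algorithms on filesystems that can also hand out encoded data."""
    if fs not in ENCODED_IO:
        return set()
    return shared_with_smb(fs)


for fs in ("Btrfs", "Bcachefs"):
    print(fs, sorted(shared_with_smb(fs)), sorted(passthrough_candidates(fs)))
```

Under those assumptions neither filesystem is a passthrough candidate today: Btrfs has encoded I/O but no shared algorithm, while Bcachefs shares LZ4 but has no encoded I/O.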
Cheers, David
* Re: [Lsf-pc] [LSF/MM ATTEND] Over-the-wire data compression
From: Enzo Matsumiya @ 2024-03-22 21:23 UTC (permalink / raw)
To: David Disseldorp; +Cc: Jan Kara, lsf-pc, linux-fsdevel, linux-cifs
Hi Dave,
On 03/18, David Disseldorp wrote:
>Hi Enzo,
>
>...
>> On Thu 14-03-24 15:14:49, Enzo Matsumiya wrote:
>> > Hello,
>> >
>> > Having implemented data compression for SMB2 messages in cifs.ko, I'd
>> > like to attend LSF/MM to discuss:
>> >
>> > - implementation decisions, both in the protocol level and in the
>> > compression algorithms; e.g. performance improvements, what could,
>> > if possible/wanted, turn into a lib/ module, etc
>> >
>> > - compression algorithms in general; talk about algorithms to determine
>> > if/how compressible a blob of data is
>> > * several such algorithms already exist and are used by on-disk
>> > compression tools, but for over-the-wire compression maybe the
>> > fastest one with good (not great nor best) predictability
>> > could work?
>
>Ideally there could be some overlap between on-disk and over-the-wire
>compression algorithm support. That could allow optimally aligned /
>sized IOs to avoid unnecessary compression / decompression cycles on an
>SMB server / client if the underlying filesystem supports encoded I/O
>via e.g. BTRFS_IOC_ENCODED_READ/WRITE.
That's exactly the kind of discussion I had in mind when I mentioned
'modules/subsystems with such overlapping requirements/desire'. It is
not only about the feature/integration perspective; performance is
something I really want to get right from the beginning.
Which brought me to the 'how to detect uncompressible data' subject;
practical test at hand: when writing this 289MiB ISO file to an SMB
share with compression enabled, only 7 out of 69 WRITE requests
(~10%) are compressed.
(this is not the problem since SMB2 compression is supposed to be
done on a best-effort basis)
So, best effort... for 90% of this particular ISO file, cifs.ko "compressed"
those requests, reached an output with size >= the input size, discarded it
all, and sent the original uncompressed request instead => lots of CPU
cycles wasted. It would be nice to not try to compress such data right off
the bat, or at least to detect it with minimal parsing.
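As a rough illustration of the idea (not what cifs.ko does), the sketch below puts a cheap entropy estimate in front of the best-effort attempt. Everything in it is an assumption: the names `sample_entropy` and `maybe_compress`, the `SAMPLE_SIZE` and `ENTROPY_LIMIT` values, and zlib standing in for whatever algorithm was negotiated.

```python
import math
import os
import zlib
from collections import Counter

SAMPLE_SIZE = 4096    # hypothetical: roughly how many bytes to inspect
ENTROPY_LIMIT = 7.5   # hypothetical cutoff in bits per byte (maximum is 8.0)


def sample_entropy(data, sample_size=SAMPLE_SIZE):
    """Shannon entropy (bits/byte) of a strided sample of data."""
    if not data:
        return 0.0
    step = max(1, len(data) // sample_size)
    sample = data[::step]          # crude: periodic data can fool the stride
    total = len(sample)
    counts = Counter(sample)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def maybe_compress(payload):
    """Return (bytes_to_send, was_compressed) on a best-effort basis."""
    if sample_entropy(payload) > ENTROPY_LIMIT:
        return payload, False      # looks incompressible: skip the attempt
    out = zlib.compress(payload, 1)    # zlib is only a stand-in here
    if len(out) >= len(payload):
        return payload, False      # attempt did not pay off: send original
    return out, True


if __name__ == "__main__":
    for name, buf in (("random", os.urandom(1 << 16)),
                      ("text", b"hello world " * 4096)):
        sent, compressed = maybe_compress(buf)
        print(name, len(buf), len(sent), compressed)
```

The precheck only looks at a few thousand bytes per request, so a wrong guess costs little either way: a missed opportunity sends the data uncompressed, and a false positive falls back to the existing size comparison.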
>IIUC, we currently have:
>SMB: LZ77, LZ77+Huffman (DEFLATE?), LZNT1, LZ4
>Btrfs: zlib/DEFLATE, LZO, Zstd
>Bcachefs: zlib/DEFLATE, LZ4, Zstd. Currently no encoded I/O support.
The algorithms required by SMB2 look generic at first glance, but due
to some minor, yet very important, implementation details, I couldn't
make a Windows Server decompress a DEFLATE'd buffer, for example. So
I'm not really sure how such integration with other subsystems would
play out.
LZ4 might change this, but I haven't implemented it yet (btw thanks for
pointing me to its support in the newest MS-SMB2 :)).
Cheers,
Enzo
* Re: [Lsf-pc] [LSF/MM ATTEND] Over-the-wire data compression
From: David Disseldorp @ 2024-03-25 10:40 UTC (permalink / raw)
To: Enzo Matsumiya; +Cc: Jan Kara, lsf-pc, linux-fsdevel, linux-cifs
Hi Enzo,
On Fri, 22 Mar 2024 18:23:54 -0300, Enzo Matsumiya wrote:
> Which brought me to the 'how to detect uncompressible data' subject;
> practical test at hand: when writing this 289MiB ISO file to an SMB
> share with compression enabled, only 7 out of 69 WRITE requests
> (~10%) are compressed.
>
> (this is not the problem since SMB2 compression is supposed to be
> done on a best-effort basis)
>
> So, best effort... for 90% of this particular ISO file, cifs.ko "compressed"
> those requests, reached an output with size >= the input size, discarded it
> all, and sent the original uncompressed request instead => lots of CPU
> cycles wasted. It would be nice to not try to compress such data right off
> the bat, or at least to detect it with minimal parsing.
Sounds like storing some compressible vs non-compressible write metrics
alongside a compression-capable SMB2 FILEID would allow for a simple
attempt-compression-on-next-write prediction mechanism. However, you'd
be forced to re-learn compressibility with each reconnect or store it.
FILE_ATTRIBUTE_COMPRESSED might also be available as a (user-provided)
hint.
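A minimal sketch of such a predictor, assuming an in-memory table keyed by file ID. The class name `CompressionPredictor`, its methods and the three tuning knobs are all hypothetical; persisting the table across reconnects and using the attribute hint are left out.

```python
from collections import defaultdict


class CompressionPredictor:
    """Per-file history of whether compressing WRITEs paid off."""

    def __init__(self, min_samples=4, min_hit_rate=0.25, probe_every=16):
        # All three knobs are hypothetical tuning values.
        self.min_samples = min_samples
        self.min_hit_rate = min_hit_rate
        self.probe_every = probe_every
        self.stats = defaultdict(
            lambda: {"tried": 0, "won": 0, "skipped": 0})

    def should_try(self, file_id):
        """Decide whether the next WRITE on file_id should be compressed."""
        s = self.stats[file_id]
        if s["tried"] < self.min_samples:
            return True                  # still learning about this file
        if s["won"] / s["tried"] >= self.min_hit_rate:
            return True                  # compression has paid off often enough
        s["skipped"] += 1
        # Probe now and then so changed content gets re-evaluated.
        return s["skipped"] % self.probe_every == 0

    def record(self, file_id, compressed_smaller):
        """Store the outcome of one compression attempt."""
        s = self.stats[file_id]
        s["tried"] += 1
        s["won"] += bool(compressed_smaller)

    def forget(self, file_id):
        """Drop history, e.g. on reconnect when the file ID is invalidated."""
        self.stats.pop(file_id, None)
```

The caller would ask `should_try()` before each WRITE and call `record()` after every attempt. With numbers like the ISO case quoted above (7 of 69 requests compressed), the overall hit rate sits below a 25% threshold, so most attempts after warm-up would be skipped apart from the periodic probes.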