* Re: [Lsf-pc] [LSF/MM ATTEND] Over-the-wire data compression
       [not found] <rnx34bfst5gyomkwooq2pvkxsjw5mrx5vxszhz7m4hy54yuma5@huwvwzgvrrru>
From: Jan Kara @ 2024-03-15 12:22 UTC
To: Enzo Matsumiya; +Cc: lsf-pc, linux-fsdevel, linux-cifs

Hello Enzo,

it is good to also CC appropriate public mailing lists so that other
attendees can discuss your proposal. Added some I found relevant.

								Honza

On Thu 14-03-24 15:14:49, Enzo Matsumiya wrote:
> Hello,
>
> Having implemented data compression for SMB2 messages in cifs.ko, I'd
> like to attend LSF/MM to discuss:
>
> - implementation decisions, both in the protocol level and in the
>   compression algorithms; e.g. performance improvements, what could,
>   if possible/wanted, turn into a lib/ module, etc
>
> - compression algorithms in general; talk about algorithms to determine
>   if/how compressible a blob of data is
>   * several such algorithms already exist and are used by on-disk
>     compression tools, but for over-the-wire compression maybe the
>     fastest one with good (not great nor best) predictability
>     could work?
>
> - overlapping modules/areas that have the need/desire to compress
>   transmitting data and their status quo in the topic; difficulties
>   where I could help and/or achievements that I could learn from
>
>
> Cheers,
>
> Enzo
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [Lsf-pc] [LSF/MM ATTEND] Over-the-wire data compression
From: David Disseldorp @ 2024-03-18 10:59 UTC
To: Enzo Matsumiya; +Cc: Jan Kara, lsf-pc, linux-fsdevel, linux-cifs

Hi Enzo,

...
> On Thu 14-03-24 15:14:49, Enzo Matsumiya wrote:
> > Hello,
> >
> > Having implemented data compression for SMB2 messages in cifs.ko, I'd
> > like to attend LSF/MM to discuss:
> >
> > - implementation decisions, both in the protocol level and in the
> >   compression algorithms; e.g. performance improvements, what could,
> >   if possible/wanted, turn into a lib/ module, etc
> >
> > - compression algorithms in general; talk about algorithms to determine
> >   if/how compressible a blob of data is
> >   * several such algorithms already exist and are used by on-disk
> >     compression tools, but for over-the-wire compression maybe the
> >     fastest one with good (not great nor best) predictability
> >     could work?

Ideally there could be some overlap between on-disk and over-the-wire
compression algorithm support. That could allow optimally aligned /
sized IOs to avoid unnecessary compression / decompression cycles on an
SMB server / client if the underlying filesystem supports encoded I/O
via e.g. BTRFS_IOC_ENCODED_READ/WRITE.

IIUC, we currently have:
SMB: LZ77, LZ77+Huffman (DEFLATE?), LZNT1, LZ4
Btrfs: zlib/DEFLATE, LZO, Zstd
Bcachefs: zlib/DEFLATE, LZ4, Zstd. Currently no encoded I/O support.

Cheers, David

* Re: [Lsf-pc] [LSF/MM ATTEND] Over-the-wire data compression
From: Enzo Matsumiya @ 2024-03-22 21:23 UTC
To: David Disseldorp; +Cc: Jan Kara, lsf-pc, linux-fsdevel, linux-cifs

Hi Dave,

On 03/18, David Disseldorp wrote:
>Hi Enzo,
>
>...
>> On Thu 14-03-24 15:14:49, Enzo Matsumiya wrote:
>> > Hello,
>> >
>> > Having implemented data compression for SMB2 messages in cifs.ko, I'd
>> > like to attend LSF/MM to discuss:
>> >
>> > - implementation decisions, both in the protocol level and in the
>> >   compression algorithms; e.g. performance improvements, what could,
>> >   if possible/wanted, turn into a lib/ module, etc
>> >
>> > - compression algorithms in general; talk about algorithms to determine
>> >   if/how compressible a blob of data is
>> >   * several such algorithms already exist and are used by on-disk
>> >     compression tools, but for over-the-wire compression maybe the
>> >     fastest one with good (not great nor best) predictability
>> >     could work?
>
>Ideally there could be some overlap between on-disk and over-the-wire
>compression algorithm support. That could allow optimally aligned /
>sized IOs to avoid unnecessary compression / decompression cycles on an
>SMB server / client if the underlying filesystem supports encoded I/O
>via e.g. BTRFS_IOC_ENCODED_READ/WRITE.

That's exactly the kind of discussion I'd be interested in when I
mentioned 'modules/subsystems with such overlapping
requirements/desire', and not only from the feature/integration
perspective, but the performance part is something I really wanted to
get right (good) from the beginning.

Which brought me to the 'how to detect uncompressible data' subject;
practical test at hand: when writing this 289MiB ISO file to an SMB
share with compression enabled, only 7 out of 69 WRITE requests
(~10%) are compressed.

(this is not the problem since SMB2 compression is supposed to be
done on a best-effort basis)

So, best effort... for 90% of this particular ISO file, cifs.ko "compressed"
those requests, reached an output with size >= the input size, discarded it
all, and sent the original uncompressed request instead => lots of CPU
cycles wasted. Would be nice to not try to compress such data right off
the bat, or at least with minimal parsing, instead.

>IIUC, we currently have:
>SMB: LZ77, LZ77+Huffman (DEFLATE?), LZNT1, LZ4
>Btrfs: zlib/DEFLATE, LZO, Zstd
>Bcachefs: zlib/DEFLATE, LZ4, Zstd. Currently no encoded I/O support.

The algorithms required by SMB2 look generic from an initial POV, but
due to some minor, but very important, implementation details, I
couldn't make a Windows Server decompress a DEFLATE'd buffer, for
example.

So I'm not really sure how such integration with other subsystems would
play out. LZ4 might change this, but I haven't implemented it yet (btw
thanks for pointing me to its support in newest MS-SMB2 :)).

Cheers,

Enzo

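The "minimal parsing" idea in the message above can be illustrated with a
small userspace sketch. This is not cifs.ko code and not an SMB2 algorithm:
it assumes a Shannon-entropy estimate over a few evenly spaced samples as
the cheap pre-check, uses zlib purely as a stand-in compressor, and the
names shannon_entropy, looks_compressible and maybe_compress, the 4096-byte
sample budget and the 7.5 bits/byte threshold are all hypothetical choices
made for illustration.

```python
import math
import os
import zlib
from collections import Counter
from typing import Tuple


def shannon_entropy(sample: bytes) -> float:
    """Return the Shannon entropy of `sample` in bits per byte (0.0 to 8.0)."""
    if not sample:
        return 0.0
    total = len(sample)
    counts = Counter(sample)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_compressible(data: bytes, sample_size: int = 4096,
                       threshold: float = 7.5) -> bool:
    """Cheap pre-check (assumed heuristic): low sampled entropy suggests
    the buffer is worth handing to the real compressor."""
    if len(data) <= sample_size:
        sample = data
    else:
        # Read 16 evenly spaced chunks instead of the whole buffer.
        chunk = sample_size // 16
        step = len(data) // 16
        sample = b"".join(data[i * step:i * step + chunk] for i in range(16))
    return shannon_entropy(sample) < threshold


def maybe_compress(data: bytes) -> Tuple[bytes, bool]:
    """Return (payload, was_compressed), falling back to the original
    bytes when compression is skipped or does not shrink the data."""
    if not looks_compressible(data):
        return data, False          # skipped: no compressor cycles spent
    out = zlib.compress(data, 1)    # zlib is only a stand-in here
    if len(out) >= len(data):
        return data, False          # best effort failed: send uncompressed
    return out, True


if __name__ == "__main__":
    for name, buf in (("zeros", bytes(65536)), ("random", os.urandom(65536))):
        payload, compressed = maybe_compress(buf)
        print(name, len(buf), "->", len(payload), "compressed:", compressed)
```

The size check after compression is kept as a safety net, because a sampled
estimate can be wrong in both directions; the pre-check only aims to avoid
most of the wasted attempts.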
* Re: [Lsf-pc] [LSF/MM ATTEND] Over-the-wire data compression
From: David Disseldorp @ 2024-03-25 10:40 UTC
To: Enzo Matsumiya; +Cc: Jan Kara, lsf-pc, linux-fsdevel, linux-cifs

Hi Enzo,

On Fri, 22 Mar 2024 18:23:54 -0300, Enzo Matsumiya wrote:

> Which brought me to the 'how to detect uncompressible data' subject;
> practical test at hand: when writing this 289MiB ISO file to an SMB
> share with compression enabled, only 7 out of 69 WRITE requests
> (~10%) are compressed.
>
> (this is not the problem since SMB2 compression is supposed to be
> done on a best-effort basis)
>
> So, best effort... for 90% of this particular ISO file, cifs.ko "compressed"
> those requests, reached an output with size >= the input size, discarded it
> all, and sent the original uncompressed request instead => lots of CPU
> cycles wasted. Would be nice to not try to compress such data right off
> the bat, or at least with minimal parsing, instead.

Sounds like storing some compressible vs non-compressible write metrics
alongside a compression-capable SMB2 FILEID would allow for a simple
attempt-compression-on-next-write prediction mechanism. However, you'd
be forced to re-learn compressibility with each reconnect or store it.
FILE_ATTRIBUTE_COMPRESSED might also be available as a (user-provided)
hint.

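The attempt-compression-on-next-write idea above could look roughly like
the following sketch. It is only one possible reading of the suggestion,
not an existing cifs.ko mechanism: the class and method names, the win-ratio
rule, and the periodic re-probe (retry_every) are all assumptions, and the
state lives in memory only, so it would indeed have to be re-learned after a
reconnect unless it were stored somewhere.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Hashable


@dataclass
class FileStats:
    attempts: int = 0   # writes where compression was actually tried
    wins: int = 0       # attempts whose output was smaller than the input
    skipped: int = 0    # writes sent uncompressed since the last attempt


class CompressionPredictor:
    """Hypothetical per-file-ID predictor: try compression while a file
    is unknown or has a good history, otherwise skip and re-probe rarely."""

    def __init__(self, min_attempts: int = 4, min_win_ratio: float = 0.25,
                 retry_every: int = 16) -> None:
        self.min_attempts = min_attempts
        self.min_win_ratio = min_win_ratio
        self.retry_every = retry_every
        self._stats = defaultdict(FileStats)

    def should_attempt(self, file_id: Hashable) -> bool:
        """Decide whether the next write for `file_id` should be compressed.
        Note: this updates the skip counter as a side effect."""
        st = self._stats[file_id]
        if st.attempts < self.min_attempts:
            return True                      # still learning this file
        if st.wins / st.attempts >= self.min_win_ratio:
            return True                      # history says it pays off
        st.skipped += 1
        if st.skipped >= self.retry_every:
            st.skipped = 0
            return True                      # occasional re-probe
        return False

    def record(self, file_id: Hashable, compressed_ok: bool) -> None:
        """Store the outcome of one compression attempt."""
        st = self._stats[file_id]
        st.attempts += 1
        st.wins += int(compressed_ok)

    def forget(self, file_id: Hashable) -> None:
        """Drop the learned state, e.g. on close or reconnect."""
        self._stats.pop(file_id, None)
```

A caller would ask should_attempt() before each WRITE and call record()
only when compression was really tried. A user-provided hint such as
FILE_ATTRIBUTE_COMPRESSED could seed the initial state, but that is left
out of this sketch.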