public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Willy Tarreau <w@1wt.eu>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Willem Pinckaers <willem@lekkertech.net>,
	"Don A. Bailey" <donb@securitymouse.com>,
	Willy Tarreau <w@1wt.eu>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: [ 44/48] Documentation: lzo: document part of the encoding
Date: Sun, 16 Nov 2014 22:54:12 +0100	[thread overview]
Message-ID: <20141116215330.475702542@1wt.eu> (raw)
In-Reply-To: <28c765bc23bd4bae1611534e510f49f8@local>

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Willy Tarreau <w@1wt.eu>

Add a complete description of the LZO format as processed by the
decompressor. I have not found a public specification of this format
hence this analysis, which will be used to better understand the code.

Cc: Willem Pinckaers <willem@lekkertech.net>
Cc: "Don A. Bailey" <donb@securitymouse.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit d98a0526434d27e261f622cf9d2e0028b5ff1a00)
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 Documentation/lzo.txt | 164 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 164 insertions(+)
 create mode 100644 Documentation/lzo.txt

diff --git a/Documentation/lzo.txt b/Documentation/lzo.txt
new file mode 100644
index 0000000..ea45dd3
--- /dev/null
+++ b/Documentation/lzo.txt
@@ -0,0 +1,164 @@
+
+LZO stream format as understood by Linux's LZO decompressor
+===========================================================
+
+Introduction
+
+  This is not a specification. No specification seems to be publicly available
+  for the LZO stream format. This document describes what input format the LZO
+  decompressor as implemented in the Linux kernel understands. The file subject
+  of this analysis is lib/lzo/lzo1x_decompress_safe.c. No analysis was made on
+  the compressor nor on any other implementations though it seems likely that
+  the format matches the standard one. The purpose of this document is to
+  better understand what the code does in order to propose more efficient fixes
+  for future bug reports.
+
+Description
+
+  The stream is composed of a series of instructions, operands, and data. The
+  instructions consist in a few bits representing an opcode, and bits forming
+  the operands for the instruction, whose size and position depend on the
+  opcode and on the number of literals copied by previous instruction. The
+  operands are used to indicate :
+
+    - a distance when copying data from the dictionary (past output buffer)
+    - a length (number of bytes to copy from dictionary)
+    - the number of literals to copy, which is retained in variable "state"
+      as a piece of information for next instructions.
+
+  Optionally depending on the opcode and operands, extra data may follow. These
+  extra data can be a complement for the operand (eg: a length or a distance
+  encoded on larger values), or a literal to be copied to the output buffer.
+
+  The first byte of the block follows a different encoding from other bytes, it
+  seems to be optimized for literal use only, since there is no dictionary yet
+  prior to that byte.
+
+  Lengths are always encoded on a variable size starting with a small number
+  of bits in the operand. If the number of bits isn't enough to represent the
+  length, up to 255 may be added in increments by consuming more bytes with a
+  rate of at most 255 per extra byte (thus the compression ratio cannot exceed
+  around 255:1). The variable length encoding using #bits is always the same :
+
+       length = byte & ((1 << #bits) - 1)
+       if (!length) {
+               length = ((1 << #bits) - 1)
+               length += 255*(number of zero bytes)
+               length += first-non-zero-byte
+       }
+       length += constant (generally 2 or 3)
+
+  For references to the dictionary, distances are relative to the output
+  pointer. Distances are encoded using very few bits belonging to certain
+  ranges, resulting in multiple copy instructions using different encodings.
+  Certain encodings involve one extra byte, others involve two extra bytes
+  forming a little-endian 16-bit quantity (marked LE16 below).
+
+  After any instruction except the large literal copy, 0, 1, 2 or 3 literals
+  are copied before starting the next instruction. The number of literals that
+  were copied may change the meaning and behaviour of the next instruction. In
+  practice, only one instruction needs to know whether 0, less than 4, or more
+  literals were copied. This is the information stored in the <state> variable
+  in this implementation. This number of immediate literals to be copied is
+  generally encoded in the last two bits of the instruction but may also be
+  taken from the last two bits of an extra operand (eg: distance).
+
+  End of stream is declared when a block copy of distance 0 is seen. Only one
+  instruction may encode this distance (0001HLLL), it takes one LE16 operand
+  for the distance, thus requiring 3 bytes.
+
+  IMPORTANT NOTE : in the code some length checks are missing because certain
+  instructions are called under the assumption that a certain number of bytes
+  follow because it has already been garanteed before parsing the instructions.
+  They just have to "refill" this credit if they consume extra bytes. This is
+  an implementation design choice independant on the algorithm or encoding.
+
+Byte sequences
+
+  First byte encoding :
+
+      0..17   : follow regular instruction encoding, see below. It is worth
+                noting that codes 16 and 17 will represent a block copy from
+                the dictionary which is empty, and that they will always be
+                invalid at this place.
+
+      18..21  : copy 0..3 literals
+                state = (byte - 17) = 0..3  [ copy <state> literals ]
+                skip byte
+
+      22..255 : copy literal string
+                length = (byte - 17) = 4..238
+                state = 4 [ don't copy extra literals ]
+                skip byte
+
+  Instruction encoding :
+
+      0 0 0 0 X X X X  (0..15)
+        Depends on the number of literals copied by the last instruction.
+        If last instruction did not copy any literal (state == 0), this
+        encoding will be a copy of 4 or more literal, and must be interpreted
+        like this :
+
+           0 0 0 0 L L L L  (0..15)  : copy long literal string
+           length = 3 + (L ?: 15 + (zero_bytes * 255) + non_zero_byte)
+           state = 4  (no extra literals are copied)
+
+        If last instruction used to copy between 1 to 3 literals (encoded in
+        the instruction's opcode or distance), the instruction is a copy of a
+        2-byte block from the dictionary within a 1kB distance. It is worth
+        noting that this instruction provides little savings since it uses 2
+        bytes to encode a copy of 2 other bytes but it encodes the number of
+        following literals for free. It must be interpreted like this :
+
+           0 0 0 0 D D S S  (0..15)  : copy 2 bytes from <= 1kB distance
+           length = 2
+           state = S (copy S literals after this block)
+         Always followed by exactly one byte : H H H H H H H H
+           distance = (H << 2) + D + 1
+
+        If last instruction used to copy 4 or more literals (as detected by
+        state == 4), the instruction becomes a copy of a 3-byte block from the
+        dictionary from a 2..3kB distance, and must be interpreted like this :
+
+           0 0 0 0 D D S S  (0..15)  : copy 3 bytes from 2..3 kB distance
+           length = 3
+           state = S (copy S literals after this block)
+         Always followed by exactly one byte : H H H H H H H H
+           distance = (H << 2) + D + 2049
+
+      0 0 0 1 H L L L  (16..31)
+           Copy of a block within 16..48kB distance (preferably less than 10B)
+           length = 2 + (L ?: 7 + (zero_bytes * 255) + non_zero_byte)
+        Always followed by exactly one LE16 :  D D D D D D D D : D D D D D D S S
+           distance = 16384 + (H << 14) + D
+           state = S (copy S literals after this block)
+           End of stream is reached if distance == 16384
+
+      0 0 1 L L L L L  (32..63)
+           Copy of small block within 16kB distance (preferably less than 34B)
+           length = 2 + (L ?: 31 + (zero_bytes * 255) + non_zero_byte)
+        Always followed by exactly one LE16 :  D D D D D D D D : D D D D D D S S
+           distance = D + 1
+           state = S (copy S literals after this block)
+
+      0 1 L D D D S S  (64..127)
+           Copy 3-4 bytes from block within 2kB distance
+           state = S (copy S literals after this block)
+           length = 3 + L
+         Always followed by exactly one byte : H H H H H H H H
+           distance = (H << 3) + D + 1
+
+      1 L L D D D S S  (128..255)
+           Copy 5-8 bytes from block within 2kB distance
+           state = S (copy S literals after this block)
+           length = 5 + L
+         Always followed by exactly one byte : H H H H H H H H
+           distance = (H << 3) + D + 1
+
+Authors
+
+  This document was written by Willy Tarreau <w@1wt.eu> on 2014/07/19 during an
+  analysis of the decompression code available in Linux 3.16-rc5. The code is
+  tricky, it is possible that this document contains mistakes or that a few
+  corner cases were overlooked. In any case, please report any doubt, fix, or
+  proposed updates to the author(s) so that the document can be updated.
-- 
1.7.12.2.21.g234cd45.dirty




  parent reply	other threads:[~2014-11-16 22:00 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <28c765bc23bd4bae1611534e510f49f8@local>
2014-11-16 21:53 ` [ 00/48] 2.6.32.64-longterm review Willy Tarreau
2014-11-16 21:53 ` [ 01/48] x86_32, entry: Do syscall exit work on badsys Willy Tarreau
2014-11-16 21:53 ` [ 02/48] x86_32, entry: Store badsys error code in %eax Willy Tarreau
2014-11-16 21:53 ` [ 03/48] x86_32, entry: Clean up sysenter_badsys declaration Willy Tarreau
2014-11-16 21:53 ` [ 04/48] MIPS: Cleanup flags in syscall flags handlers Willy Tarreau
2014-11-16 21:53 ` [ 05/48] MIPS: asm: thread_info: Add _TIF_SECCOMP flag Willy Tarreau
2014-11-16 21:53 ` [ 06/48] fix autofs/afs/etc. magic mountpoint breakage Willy Tarreau
2014-11-16 21:53 ` [ 07/48] ALSA: control: Make sure that id->index does not Willy Tarreau
2014-11-16 21:53 ` [ 08/48] ALSA: control: Handle numid overflow Willy Tarreau
2014-11-16 21:53 ` [ 09/48] sctp: Fix sk_ack_backlog wrap-around problem Willy Tarreau
2014-11-16 21:53 ` [ 10/48] mm: try_to_unmap_cluster() should lock_page() before Willy Tarreau
2014-11-16 21:53 ` [ 11/48] filter: prevent nla extensions to peek beyond the end Willy Tarreau
2014-11-16 21:53 ` [ 12/48] ALSA: control: Protect user controls against Willy Tarreau
2014-11-16 21:53 ` [ 13/48] ptrace,x86: force IRET path after a ptrace_stop() Willy Tarreau
2014-11-16 21:53 ` [ 14/48] sym53c8xx_2: Set DID_REQUEUE return code when aborting Willy Tarreau
2014-11-16 21:53 ` [ 15/48] tcp: fix tcp_match_skb_to_sack() for unaligned SACK at Willy Tarreau
2014-11-16 21:53 ` [ 16/48] igmp: fix the problem when mc leave group Willy Tarreau
2014-11-16 21:53 ` [ 17/48] appletalk: Fix socket referencing in skb Willy Tarreau
2014-11-16 21:53 ` [ 18/48] net: sctp: fix information leaks in ulpevent layer Willy Tarreau
2014-11-16 21:53 ` [ 19/48] sunvnet: clean up objects created in vnet_new() on Willy Tarreau
2014-11-16 21:53 ` [ 20/48] ipv4: fix buffer overflow in ip_options_compile() Willy Tarreau
2014-11-16 21:53 ` [ 21/48] net: sctp: inherit auth_capable on INIT collisions Willy Tarreau
2014-11-16 21:53 ` [ 22/48] net: sendmsg: fix NULL pointer dereference Willy Tarreau
2014-12-01 11:45   ` Luis Henriques
2014-12-01 12:30     ` Willy Tarreau
2014-11-16 21:53 ` [ 23/48] tcp: Fix integer-overflows in TCP veno Willy Tarreau
2014-11-16 21:53 ` [ 24/48] tcp: Fix integer-overflow in TCP vegas Willy Tarreau
2014-11-16 21:53 ` [ 25/48] macvlan: Initialize vlan_features to turn on offload Willy Tarreau
2014-11-16 21:53 ` [ 26/48] net: Correctly set segment mac_len in skb_segment() Willy Tarreau
2014-11-16 21:53 ` [ 27/48] iovec: make sure the caller actually wants anything in Willy Tarreau
2014-11-16 21:53 ` [ 28/48] sctp: fix possible seqlock seadlock in Willy Tarreau
2014-11-16 21:53 ` [ 29/48] Revert "nfsd: correctly handle return value from Willy Tarreau
2014-11-16 21:53 ` [ 30/48] dm crypt: fix access beyond the end of allocated space Willy Tarreau
2014-11-16 21:53 ` [ 31/48] gianfar: disable vlan tag insertion by default Willy Tarreau
2014-11-16 21:54 ` [ 32/48] USB: kobil_sct: fix non-atomic allocation in write Willy Tarreau
2014-11-16 21:54 ` [ 33/48] fix misuses of f_count() in ppp and netlink Willy Tarreau
2014-11-16 21:54 ` [ 34/48] net: sctp: fix skb_over_panic when receiving malformed Willy Tarreau
2014-11-16 21:54 ` [ 35/48] tty: Fix high cpu load if tty is unreleaseable Willy Tarreau
2014-11-16 21:54 ` [ 36/48] netfilter: nf_log: account for size of NLMSG_DONE Willy Tarreau
2014-11-16 21:54 ` [ 37/48] netfilter: nfnetlink_log: fix maximum packet length Willy Tarreau
2014-11-16 21:54 ` [ 38/48] ring-buffer: Always reset iterator to reader page Willy Tarreau
2014-11-16 21:54 ` [ 39/48] md/raid6: avoid data corruption during recovery of Willy Tarreau
2014-11-16 21:54 ` [ 40/48] net: pppoe: use correct channel MTU when using Willy Tarreau
2014-11-16 21:54 ` [ 41/48] ARM: 7668/1: fix memset-related crashes caused by Willy Tarreau
2014-11-16 21:54 ` [ 42/48] ARM: 7670/1: fix the memset fix Willy Tarreau
2014-11-16 21:54 ` [ 43/48] lib/lzo: Update LZO compression to current upstream Willy Tarreau
2014-11-16 21:54 ` Willy Tarreau [this message]
2014-11-16 21:54 ` [ 45/48] lzo: check for length overrun in variable length Willy Tarreau
2014-11-16 21:54 ` [ 46/48] USB: add new zte 3g-dongles pid to option.c Willy Tarreau
2014-11-16 21:54 ` [ 47/48] futex: Unlock hb->lock in futex_wait_requeue_pi() Willy Tarreau
2014-11-16 21:54 ` [ 48/48] isofs: Fix unbounded recursion when processing Willy Tarreau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141116215330.475702542@1wt.eu \
    --to=w@1wt.eu \
    --cc=donb@securitymouse.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=willem@lekkertech.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox