All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: "Oleksii Kurochko" <oleksii.kurochko@gmail.com>,
	"Jürgen Groß" <jgross@suse.com>,
	"Jan Beulich" <jbeulich@suse.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Anthony PERARD" <anthony.perard@vates.tech>,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH] tools/xg: increase LZMA_BLOCK_SIZE for uncompressing the kernel
Date: Mon, 16 Dec 2024 12:35:01 +0100	[thread overview]
Message-ID: <Z2AQZnRw6CwxQ4mo@mail-itl> (raw)
In-Reply-To: <Z1_mWbWnwNOG64ji@macbook.local>

[-- Attachment #1: Type: text/plain, Size: 4536 bytes --]

On Mon, Dec 16, 2024 at 09:35:37AM +0100, Roger Pau Monné wrote:
> Adding Oleksii, as this IMO wants to be a blocker for 4.20.
> 
> On Wed, Oct 09, 2024 at 03:03:28PM +0200, Marek Marczykowski-Górecki wrote:
> > On Wed, Oct 09, 2024 at 01:38:32PM +0200, Jürgen Groß wrote:
> > > On 09.10.24 13:15, Jan Beulich wrote:
> > > > On 09.10.2024 13:08, Andrew Cooper wrote:
> > > > > On 09/10/2024 11:26 am, Juergen Gross wrote:
> > > > > > On 09.10.24 12:19, Jan Beulich wrote:
> > > > > > > On 09.10.2024 11:52, Marek Marczykowski-Górecki wrote:
> > > > > > > > On Wed, Oct 09, 2024 at 09:19:57AM +0200, Jan Beulich wrote:
> > > > > > > > > On 08.10.2024 23:32, Marek Marczykowski-Górecki wrote:
> > > > > > > > > > --- a/tools/libs/guest/xg_dom_bzimageloader.c
> > > > > > > > > > +++ b/tools/libs/guest/xg_dom_bzimageloader.c
> > > > > > > > > > @@ -272,8 +272,7 @@ static int _xc_try_lzma_decode(
> > > > > > > > > >        return retval;
> > > > > > > > > >    }
> > > > > > > > > >    -/* 128 Mb is the minimum size (half-way) documented to work for
> > > > > > > > > > all inputs. */
> > > > > > > > > > -#define LZMA_BLOCK_SIZE (128*1024*1024)
> > > > > > > > > > +#define LZMA_BLOCK_SIZE (256*1024*1024)
> > > > > > > > > 
> > > > > > > > > That's as arbitrary as before, now just not even with a comment at
> > > > > > > > > least
> > > > > > > > > hinting at it being arbitrary. Quoting from one of the LZMA API
> > > > > > > > > headers:
> > > > > > > > > 
> > > > > > > > >       * Decoder already supports dictionaries up to 4 GiB - 1 B (i.e.
> > > > > > > > >       * UINT32_MAX), so increasing the maximum dictionary size of the
> > > > > > > > >       * encoder won't cause problems for old decoders.
> > > > > > > > > 
> > > > > > > > > IOW - what if the Linux folks decided to increase the dictionary size
> > > > > > > > > further? I therefore wonder whether we don't need to make this more
> > > > > > > > > dynamic, perhaps by peeking into the header to obtain the dictionary
> > > > > > > > > size used. The one thing I'm not sure about is whether there can't be
> > > > > > > > > multiple such headers throughout the file, and hence (in principle)
> > > > > > > > > differing dictionary sizes.
> > > > > > > > 
> > > > > > > > What is the purpose of this block size limit? From the error
> > > > > > > > message, it
> > > > > > > > seems to be avoiding excessive memory usage during decompression (which
> > > > > > > > could be DoS via OOM). If that's the case, then taking the limit from
> > > > > > > > the kernel binary itself will miss this point (especially in case of
> > > > > > > > pygrub or similar, but there may be other cases of not-fully-trusted
> > > > > > > > kernel binaries).
> > > > > > > 
> > > > > > > Indeed. The question then simply is: Where do we want to draw the line
> > > > > > > between what we permit and what we reject?
> > > > > > 
> > > > > > IMHO the most natural solution would be to use guest memory for this
> > > > > > purpose.
> > > > > > OTOH this probably would require a significant rework of libxenguest.
> > > > > 
> > > > > That was XSA-25.  There are toolstack-provided limits on kernel&initrd
> > > > > sizes.
> > > > 
> > > > Which probably can't be directly applied to dictionary size used during
> > > > (de)compression.
> > > 
> > > My point still stands: using GUEST memory for all the decompression work
> > > would avoid all these problems. If the guest memory isn't sufficient, a
> > > decompression by e.g. grub wouldn't work either.
> > 
> > Doing that would probably require mapping guest memory to dom0 for this
> > purpose. And probably quite severe changes to the decompressing code
> > (liblzma?) to actually use that memory instead of standard heap. I don't
> > think it's a feasible short term fix.
> > Theoretically this could be made configurable (if nothing else, then via
> > an env variable or even build-time setting...), but honestly it feels
> > like an overkill.
> 
> As a compromise that could likely be done in time for the release,
> would it be feasible to fetch the dictionary size from the header and
> cap it at certain boundary using max(<header val>, <boundary>)?

Isn't the current constant more or less that already? It's named
LZMA_BLOCK_SIZE, but the lzma_stream_decoder() argument it's used for is
"memlimit", described as "Memory usage limit as bytes".

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2024-12-16 11:35 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-08 21:32 [PATCH] tools/xg: increase LZMA_BLOCK_SIZE for uncompressing the kernel Marek Marczykowski-Górecki
2024-10-09  7:19 ` Jan Beulich
2024-10-09  9:52   ` Marek Marczykowski-Górecki
2024-10-09 10:19     ` Jan Beulich
2024-10-09 10:26       ` Juergen Gross
2024-10-09 11:08         ` Andrew Cooper
2024-10-09 11:15           ` Jan Beulich
2024-10-09 11:38             ` Jürgen Groß
2024-10-09 13:03               ` Marek Marczykowski-Górecki
2024-12-16  8:35                 ` Roger Pau Monné
2024-12-16 11:35                   ` Marek Marczykowski-Górecki [this message]
2024-12-18 17:58 ` Roger Pau Monné
2024-12-18 18:23   ` Andrew Cooper
2024-12-19 14:58   ` Anthony PERARD

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z2AQZnRw6CwxQ4mo@mail-itl \
    --to=marmarek@invisiblethingslab.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=anthony.perard@vates.tech \
    --cc=jbeulich@suse.com \
    --cc=jgross@suse.com \
    --cc=oleksii.kurochko@gmail.com \
    --cc=roger.pau@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.