public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Kees Cook <keescook@chromium.org>
Cc: Yinghai Lu <yinghai@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Baoquan He <bhe@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Borislav Petkov <bp@alien8.de>, Vivek Goyal <vgoyal@redhat.com>,
	Andy Lutomirski <luto@kernel.org>,
	lasse.collin@tukaani.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Young <dyoung@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/6] x86/boot: Move compressed kernel to end of decompression buffer
Date: Fri, 29 Apr 2016 09:18:05 +0200	[thread overview]
Message-ID: <20160429071805.GC28320@gmail.com> (raw)
In-Reply-To: <1461888548-32439-3-git-send-email-keescook@chromium.org>


* Kees Cook <keescook@chromium.org> wrote:

> From: Yinghai Lu <yinghai@kernel.org>
> 
> This change makes later calculations about where the kernel is located
> easier to reason about. To better understand this change, we must first
> clarify what VO and ZO are. They were introduced in commits by hpa:
> 
> 77d1a49 x86, boot: make symbols from the main vmlinux available
> 37ba7ab x86, boot: make kernel_alignment adjustable; new bzImage fields
> 
> Specifically:
> 
> VO:
> - uncompressed kernel image
> - size: VO__end - VO__text ("VO_INIT_SIZE" define)
> 
> ZO:
> - bootable compressed kernel image (boot/compressed/vmlinux)
> - head text + compressed kernel (VO and relocs table) + decompressor code
> - size: ZO__end - ZO_startup_32 ("ZO_INIT_SIZE" define, though see below)
> 
> The INIT_SIZE definition is used to find the larger of the two image sizes:
> 
>  #define ZO_INIT_SIZE    (ZO__end - ZO_startup_32 + ZO_z_extract_offset)
>  #define VO_INIT_SIZE    (VO__end - VO__text)
>  #if ZO_INIT_SIZE > VO_INIT_SIZE
>  #define INIT_SIZE ZO_INIT_SIZE
>  #else
>  #define INIT_SIZE VO_INIT_SIZE
>  #endif
> 
> The current code uses extract_offset to decide where to position the
> copied ZO (i.e. ZO starts at extract_offset). (This is why ZO_INIT_SIZE
> currently includes the extract_offset.)

Yeah, so I rewrote the above to:

=================>
This change makes later calculations about where the kernel is located
easier to reason about. To better understand this change, we must first
clarify what 'VO' and 'ZO' are. These values were introduced in commits
by hpa:

  77d1a4999502 ("x86, boot: make symbols from the main vmlinux available")
  37ba7ab5e33c ("x86, boot: make kernel_alignment adjustable; new bzImage fields")

Specifically:

All names prefixed with 'VO_':

 - relate to the uncompressed kernel image

 - the size of the VO image is: VO__end-VO__text ("VO_INIT_SIZE" define)

All names prefixed with 'ZO_':

 - relate to the bootable compressed kernel image (boot/compressed/vmlinux),
   which is composed of the following memory areas:
     - head text
     - compressed kernel (VO image and relocs table)
     - decompressor code

 - the size of the ZO image is: ZO__end - ZO_startup_32 ("ZO_INIT_SIZE" define, though see below)

The 'INIT_SIZE' value is used to find the larger of the two image sizes:

 #define ZO_INIT_SIZE    (ZO__end - ZO_startup_32 + ZO_z_extract_offset)
 #define VO_INIT_SIZE    (VO__end - VO__text)

 #if ZO_INIT_SIZE > VO_INIT_SIZE
 # define INIT_SIZE ZO_INIT_SIZE
 #else
 # define INIT_SIZE VO_INIT_SIZE
 #endif

The current code uses extract_offset to decide where to position the
copied ZO (i.e. ZO starts at extract_offset). (This is why ZO_INIT_SIZE
currently includes the extract_offset.)
<=================

Assuming the edits I made are correct, this is the point where the changelog lost 
me. It does not explain why ZO_z_extract_offset exists. Why isn't the ZO copied to 
offset 0?

I had to go into arch/x86/boot/compressed/mkpiggy.c, where ZO_z_extract_offset is 
generated, to find the answer: it's needed because we are trying to minimize the 
amount of RAM used for the whole act of creating an uncompressed, executable, 
properly relocation-linked kernel image in system memory. We do this so that 
kernels can be booted on even very small systems.

To achieve the goal of minimal memory consumption we have implemented an in-place 
decompression strategy: instead of cleanly separating the VO and ZO images and 
also allocating some memory for the decompression code's runtime needs, we instead 
create this elaborate layout of memory buffers where the output (decompressed) 
stream, as it progresses, overlaps with and destroys the input (compressed) 
stream. This can only be done safely if the ZO image is placed to the end of the 
VO range, plus a certain amount of safety distance to make sure that when the last 
bytes of the VO range are decompressed, the compressed stream pointer is safely 
beyond the end of the VO range. Correct?

This is a very essential central concept to the whole code, but nowhere is it 
described clearly!

But more importantly, especially in view of address space randomization, we should 
realize that the days of 8 MB i386-DX systems are gone, and we should get rid of 
all this crazy obfuscation that is hindering development in this area. I also 
suspect that the actual temporary allocation size reduction savings from this 
trick are relatively small, compared to the resulting total memory size.

So my suggestion: let's just cleanly separate all the data areas and not try to do 
any clever overlapping: the benefit will be minimal, and any system that has main 
RAM less than twice of the VO+ZO image sizes is fundamentally unbootable and 
unusable anyway.

I.e. have a really clean size calculation of:

	ZO + VO + decompressor-stacks-size + decompressor-data-size

and decompress accordingly without tricks, without overlaps, without any chance 
for corruption - and, most importantly, without this metric ton of obfuscation 
that very few people have managed to fight their way through in the last couple of 
years, and which hinders essential features ...

Agreed?

Thanks,

	Ingo

  reply	other threads:[~2016-04-29  7:18 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-29  0:09 [PATCH 0/6] x86/boot: Improve compressed kernel handling Kees Cook
2016-04-29  0:09 ` [PATCH 1/6] x86/KASLR: Handle kernel relocation above 2G Kees Cook
2016-04-29  8:02   ` [tip:x86/boot] x86/KASLR: Handle kernel relocations above 2G correctly tip-bot for Baoquan He
2016-04-29  0:09 ` [PATCH 2/6] x86/boot: Move compressed kernel to end of decompression buffer Kees Cook
2016-04-29  7:18   ` Ingo Molnar [this message]
2016-04-29  7:48     ` Kees Cook
2016-04-29  8:07       ` Ingo Molnar
2016-04-29  9:51         ` Ingo Molnar
2016-04-29  9:51   ` [tip:x86/boot] x86/boot: Move compressed kernel to the end of the " tip-bot for Yinghai Lu
2016-08-16  4:01   ` [PATCH 2/6] x86/boot: Move compressed kernel to end of " Matt Mullins
2016-08-16 19:19     ` Yinghai Lu
2016-08-17  2:25       ` Matt Mullins
2016-10-03 21:50         ` Simon Glass
2016-11-30 16:52           ` Andy Shevchenko
2016-04-29  0:09 ` [PATCH 3/6] x86/boot: Calculate decompression size during boot not build Kees Cook
2016-04-29  9:52   ` [tip:x86/boot] " tip-bot for Yinghai Lu
2016-04-29  0:09 ` [PATCH 4/6] x86/boot: Fix "run_size" calculation Kees Cook
2016-04-29  9:52   ` [tip:x86/boot] " tip-bot for Yinghai Lu
2016-04-29  0:09 ` [PATCH 5/6] x86/KASLR: Clean up unused code from old "run_size" Kees Cook
2016-04-29  9:52   ` [tip:x86/boot] x86/KASLR: Clean up unused code from old 'run_size' and rename it to 'kernel_total_size' tip-bot for Yinghai Lu
2016-04-29  0:09 ` [PATCH 6/6] x86/boot: Correctly bounds-check relocations Kees Cook
2016-04-29  9:53   ` [tip:x86/boot] " tip-bot for Yinghai Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160429071805.GC28320@gmail.com \
    --to=mingo@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bhe@redhat.com \
    --cc=bp@alien8.de \
    --cc=dyoung@redhat.com \
    --cc=hpa@zytor.com \
    --cc=keescook@chromium.org \
    --cc=lasse.collin@tukaani.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=vgoyal@redhat.com \
    --cc=yinghai@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox