From: Vivek Goyal <vgoyal@redhat.com>
To: Borislav Petkov <bp@alien8.de>
Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
ebiederm@xmission.com, hpa@zytor.com, mjg59@srcf.ucam.org,
greg@kroah.com, jkosina@suse.cz, dyoung@redhat.com,
chaowang@redhat.com, bhe@redhat.com, akpm@linux-foundation.org
Subject: Re: [PATCH 10/13] kexec: Load and Relocate purgatory at kernel load time
Date: Wed, 11 Jun 2014 15:24:48 -0400 [thread overview]
Message-ID: <20140611192448.GH10723@redhat.com> (raw)
In-Reply-To: <20140610163128.GA6652@nazgul.tnic>
On Tue, Jun 10, 2014 at 06:31:28PM +0200, Borislav Petkov wrote:
[..]
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index 213308a..0f24b61 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -1556,6 +1556,8 @@ source kernel/Kconfig.hz
> > config KEXEC
> > bool "kexec system call"
> > select BUILD_BIN2C
> > + select CRYPTO
> > + select CRYPTO_SHA256
>
> Ok, but why automatically enable crypto? There's still the old kexec
> method where we don't check any signatures.
Hi Boris,
Thanks for reviewing the patches.
This new syscall requires sha256 even if signature checking does not
happen. Purgatory verifies checksum of segments.
I had to select CRYPTO also otherwise CONFIG_CRYPTO=m broke the build.
>
> Which begs the more important question - shouldn't this new in-kernel
> loading method support also kexec'ing of kernels without any signature
> verifications at all?
I think yes it should allow kexecing kernels without any signature also.
In fact in long term, we should deprecate the old syscall and maintain
this new one.
Now, when does signature checking kick in? I think we can define a new
config option say KEXEC_ENFORCE_KERNEL_SIG_VERIFICATION. This option
will make sure kernel signature are verified.
If KEXEC_ENFORCE_KERNEL_SIG_VERIFICATION=n, even then signature
verification should be enforced if secureboot is enabled on the platform.
As signature verification is not part of this series, I was planning to
post these changes as part of next series.
>
> I mean, the main use case is secure boot and all but why not leave it
> configurable for people to decide?
I will make it configurable in next series. This series does not do
any signature verification yet. Above CRYPTO and CRYPTO_SHA256 I had
to select to make sure checksum verfication logic in purgatory works
fine.
[..]
> > +/* Apply purgatory relocations */
> > +int arch_kexec_apply_relocations_add(Elf64_Shdr *sechdrs,
>
> apply_..._add? "arch_kexec_apply_relocations" seems fine to me.
Hmmm.., not sure why I did this. I will change it (until and unless find
a good reason for doing this).
[..]
> > + case R_X86_64_PC32:
> > + value -= (u64)address;
> > + *(u32 *)location = value;
> > + break;
> > + default:
> > + pr_err("kexec: Unknown rela relocation: %llu\n",
>
> Yep, the "kexec: " string should come from pr_fmt as in the other mail.
Sure. Will use pr_fmt to prefix "kexec" string.
>
> > + ELF64_R_TYPE(rel[i].r_info));
> > + return -ENOEXEC;
> > + }
> > + }
> > + return 0;
> > +
> > +overflow:
> > + pr_err("kexec: overflow in relocation type %d value 0x%lx\n",
>
> Ditto.
Will do.
[..]
> > struct kimage {
> > kimage_entry_t head;
> > kimage_entry_t *entry;
> > @@ -143,6 +165,9 @@ struct kimage {
> >
> > /* Image loader handling the kernel can store a pointer here */
> > void *image_loader_data;
> > +
> > + /* Information for loading purgatory */
> > + struct purgatory_info purgatory_info;
>
> Having the member name with the same name as the struct is kinda
> confusing. Also, you could shorten it, which, in turn, would give
> shorter code lines at the sites it is accessed. I.e.,
>
> struct purgatory_info pinfo;
>
> should be just fine IMHO.
Hmm... I have seen at other places using same name as structure. But I am
not particular about it will change. Anyway, on most of the places
I use a pointer to access it.
struct purgaotry_info *pi = &image->purgatory_info;
[..]
> > +/* Apply relocations for rela section */
> > +int __attribute__ ((weak))
> > +arch_kexec_apply_relocations_add(Elf_Shdr *sechdrs, unsigned int nr_sections,
> > + unsigned int relsec)
> > +{
> > + pr_err(KERN_ERR "kexec: REL relocation unsupported\n");
>
> pr_err *and* KERN_ERR. Double error level? :-)
Yep. Will fix it. :-)
>
> > + return -ENOEXEC;
> > +}
> > +
> > /*
> > * Free up tempory buffers allocated which are not needed after image has
> > * been loaded.
> > @@ -355,6 +376,12 @@ static void kimage_file_post_load_cleanup(struct kimage *image)
> > vfree(image->cmdline_buf);
> > image->cmdline_buf = NULL;
> >
> > + vfree(image->purgatory_info.purgatory_buf);
> > + image->purgatory_info.purgatory_buf = NULL;
>
> Here's what I mean - that's definitely too long. Maybe
Sure will change it.
>
> vree(image->pinfo.pbuf);
> image->pinfo.pbuf = NULL;
>
> (yep, we shortened purgatory_buf too). Now this looks like proper kernel
> code to me :-)
I would like to retain purgaotry_buf. To shorten it I could do this.
struct purgatory_info *pi = &image->purgatory_info;
vfree(pi->purgatory_buf);
pi->purgatory_buf = NULL;
I like the clarity in variable names.
>
> > +
> > + vfree(image->purgatory_info.sechdrs);
> > + image->purgatory_info.sechdrs = NULL;
> > +
> > /* See if architcture has anything to cleanup post load */
> > arch_kimage_file_post_load_cleanup(image);
> > }
> > @@ -1370,6 +1397,10 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > if (ret)
> > goto out;
> >
> > + ret = kexec_calculate_store_digests(image);
>
> This function name could be shortened too: kexec_calc_digests() or so.
> The actual storing could be a separate kexec_store_digests. I.e., a
> function does one thing only.
I would like to keep it one function. Reason being that apart from
digest, we also store the list of regions which has been checkummed. And
you will notice that we skip the purgatory region during checksum
calculation.
So I will have to return quite some information from calc() function. Size
of digest, actual digest buffer which will need to be freed by caller,
and list of sha regions which will need to be freed by caller. Keeping
it call in one function makes it little simpler actually.
[..]
> > +/* Calculate and store the digest of segments */
> > +static int kexec_calculate_store_digests(struct kimage *image)
> > +{
> > + struct crypto_shash *tfm;
> > + struct shash_desc *desc;
> > + int ret = 0, i, j, zero_buf_sz = 256, sha_region_sz;
>
> 256 - a magic constant.
Just wanted a small zero buffer. Is there any global zero buffer available
in kernel. If not, I could use a PAGE_SIZE zero buffer instead.
>
> > + size_t desc_size, nullsz;
> > + char *digest = NULL;
> > + void *zero_buf;
> > + struct kexec_sha_region *sha_regions;
> > +
> > + tfm = crypto_alloc_shash("sha256", 0, 0);
> > + if (IS_ERR(tfm)) {
> > + ret = PTR_ERR(tfm);
> > + goto out;
>
> The "out" label kfrees digest but we haven't allocated it yet...
kfree() can handle NULL and digest is initialized to null.
>
> > + }
> > +
> > + desc_size = crypto_shash_descsize(tfm) + sizeof(*desc);
> > + desc = kzalloc(desc_size, GFP_KERNEL);
> > + if (!desc) {
> > + ret = -ENOMEM;
> > + goto out_free_tfm;
> > + }
> > +
> > + zero_buf = kzalloc(zero_buf_sz, GFP_KERNEL);
> > + if (!zero_buf) {
> > + ret = -ENOMEM;
> > + goto out_free_desc;
> > + }
> > +
> > + sha_region_sz = KEXEC_SEGMENT_MAX * sizeof(struct kexec_sha_region);
> > + sha_regions = vzalloc(sha_region_sz);
> > + if (!sha_regions)
> > + goto out_free_zero_buf;
> > +
> > + desc->tfm = tfm;
> > + desc->flags = 0;
> > +
> > + ret = crypto_shash_init(desc);
> > + if (ret < 0)
> > + goto out_free_sha_regions;
> > +
> > + digest = kzalloc(SHA256_DIGEST_SIZE, GFP_KERNEL);
> > + if (!digest) {
> > + ret = -ENOMEM;
> > + goto out_free_sha_regions;
> > + }
>
> ... but this digest is a simple allocation. Could it be you moved it
> down but forgot to readjust the labels?
I see what you are saying. I will fix the allocation ordering and lable
ordering.
>
> > +
> > + /* Traverse through all segments */
>
> Yep, we can see that. Why does it need an explicit comment?
Will remove.
>
> > + for (j = i = 0; i < image->nr_segments; i++) {
> > + struct kexec_segment *ksegment;
> > + ksegment = &image->segment[i];
> > +
> > + /*
> > + * Skip purgatory as it will be modified once we put digest
> > + * info in purgatory
> > + */
>
> Now this comment is perfect right there! :-) It needs a full-stop though.
Will do.
>
> > + if (ksegment->kbuf == image->purgatory_info.purgatory_buf)
> > + continue;
> > +
> > + ret = crypto_shash_update(desc, ksegment->kbuf,
> > + ksegment->bufsz);
>
> Arg alignment.
Will do.
>
> > + if (ret)
> > + break;
> > +
> > + nullsz = ksegment->memsz - ksegment->bufsz;
> > + while (nullsz) {
> > + unsigned long bytes = nullsz;
> > + if (bytes > zero_buf_sz)
> > + bytes = zero_buf_sz;
> > + ret = crypto_shash_update(desc, zero_buf, bytes);
> > + if (ret)
> > + break;
> > + nullsz -= bytes;
> > + }
>
> Now this trailing buffer "drainage" could very well use a comment on
> what's going on.
Sure, will put a comment.
>
> > +
> > + if (ret)
> > + break;
> > +
> > + sha_regions[j].start = ksegment->mem;
> > + sha_regions[j].len = ksegment->memsz;
> > + j++;
> > + }
> > +
> > + if (!ret) {
> > + ret = crypto_shash_final(desc, digest);
> > + if (ret)
> > + goto out_free_sha_regions;
> > + ret = kexec_purgatory_get_set_symbol(image, "sha_regions",
> > + sha_regions, sha_region_sz, 0);
> > + if (ret)
> > + goto out_free_sha_regions;
> > +
> > + ret = kexec_purgatory_get_set_symbol(image, "sha256_digest",
> > + digest, SHA256_DIGEST_SIZE, 0);
> > + if (ret)
> > + goto out_free_sha_regions;
>
> Yeah, this block could be a separate kexec_store_digests() function.
As explained above, splitting it out in a separate function requires
carrying atleast two pointers and their sizes. And these pointers will need to
be freed by store() functions. I guess keeping it right here is simpler.
>
> > + }
> > +
> > +out_free_sha_regions:
> > + vfree(sha_regions);
> > +out_free_zero_buf:
> > + kfree(zero_buf);
> > +out_free_desc:
> > + kfree(desc);
> > +out_free_tfm:
> > + kfree(tfm);
> > +out:
> > + kfree(digest);
> > + return ret;
> > +}
> > +
> > +/* Actually load and relcoate purgatory. Lot of code taken from kexec-tools */
>
> s/relcoate/relocate/
Will fix.
>
> > +static int elf_rel_load_relocate(struct kimage *image, unsigned long min,
> > + unsigned long max, int top_down)
>
> Another function which is too big and does at least two things and could
> probably be nicely split into two.
>
> > +{
> > + struct purgatory_info *pi = &image->purgatory_info;
> > + unsigned long align, buf_align, bss_align, buf_sz, bss_sz, bss_pad;
> > + unsigned long memsz, entry, load_addr, data_addr, bss_addr, off;
> > + unsigned char *buf_addr, *src;
> > + int i, ret = 0, entry_sidx = -1;
> > + Elf_Shdr *sechdrs = NULL, *sechdrs_c;
> > + void *purgatory_buf = NULL;
> > +
> > + /*
> > + * sechdrs_c points to section headers in purgatory and are read
> > + * only. No modifications allowed.
> > + */
>
> Then do
>
> const Elf_Shdr * const sechdrs_c = (void *)pi->ehdr + pi->ehdr->e_shoff;
>
> to enforce it?
Ok, will try to use it.
>
> > + sechdrs_c = (void *)pi->ehdr + pi->ehdr->e_shoff;
> > +
> > + /*
> > + * We can not modify sechdrs_c[] and its fields. It is read only.
> > + * Copy it over to a local copy where one can store some temporary
> > + * data and free it at the end. We need to modify ->sh_addr and
>
> What is freeing it when we store it into pi->sechdrs and return? Or
> doesn't it need to be freed?
kimage_file_post_load_cleanup() takes care of freeing it up. Till then
we need to keep this information around.
>
> > + * ->sh_offset fields to keep track permanent and temporary locations
> > + * of sections.
> > + */
> > + sechdrs = vzalloc(pi->ehdr->e_shnum * sizeof(Elf_Shdr));
> > + if (!sechdrs)
> > + return -ENOMEM;
> > +
> > + memcpy(sechdrs, sechdrs_c, pi->ehdr->e_shnum * sizeof(Elf_Shdr));
> > +
> > + /*
> > + * We seem to have multiple copies of sections. First copy is which
> > + * is embedded in kernel in read only section. Some of these sections
> > + * will be copied to a temporary buffer and relocated. And these
> > + * sections will finally be copied to their final detination at
>
> "destination"
Will fix.
[..]
> > + /* Add buffer to segment list */
> > + ret = kexec_add_buffer(image, purgatory_buf, buf_sz, memsz,
> > + buf_align, min, max, top_down,
> > + &pi->purgatory_load_addr);
> > + if (ret)
> > + goto out;
> > +
> > + /* Load SHF_ALLOC sections */
>
> Here could start a new function.
I can try but I think there is too much of context information which
will need to be passed through function parameters.
[..]
> > + /* update entry based on entry section position */
> > + if (entry_sidx >= 0)
> > + entry += sechdrs[entry_sidx].sh_addr;
> > +
> > + /* Set the entry point of purgatory */
> > + image->start = entry;
> > +
> > + /* Apply relocations */
>
> >From here-on could start a new function.
You seem to be asking to make three functions, parse(), load() and
relocate(). Aagain, I will have a closer look again and see if it
easily doable. My feeling is that they are very tightly coupled and
will need many function parameters.
[..]
> > + for (i = 0; i < ehdr->e_shnum; i++) {
> > + if (sechdrs[i].sh_type != SHT_SYMTAB)
> > + continue;
> > +
> > + if (sechdrs[i].sh_link > ehdr->e_shnum)
> > + /* Invalid stratab section number */
>
> "strtab"
Will Fix.
[..]
> > +int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
> > + void *buf, unsigned int size, bool get_value)
> > +{
> > + Elf_Sym *sym;
> > + Elf_Shdr *sechdrs;
> > + struct purgatory_info *pi = &image->purgatory_info;
> > + char *sym_buf;
> > +
> > + sym = kexec_purgatory_find_symbol(pi, name);
> > + if (!sym)
> > + return -EINVAL;
> > +
> > + if (sym->st_size != size) {
> > + pr_debug("Symbol: %s size is not right\n", name);
>
> Should probably be pr_err because it is an error, right? And then
>
> pr_err("Symbol %s size mismatch: %d vs %d\n", name, sym->st_size, size);
It is an error, that's why I return -EINVAL. We are always not verbose
for all the errors. I kind of felt that it does not have to be of type
KERN_ERR. But I don't feel strongly about it. Will change it to pr_err().
Also will include additional information about expected size and actual
size.
>
> > + return -EINVAL;
> > + }
> > +
> > + sechdrs = pi->sechdrs;
> > +
> > + if (sechdrs[sym->st_shndx].sh_type == SHT_NOBITS) {
> > + pr_debug("Symbol: %s is in a bss section. Cannot get/set\n",
>
> ... Cannot %s\n", (get_value ? "get" : "set"), name);
Will do.
Thanks
Vivek
next prev parent reply other threads:[~2014-06-11 19:25 UTC|newest]
Thread overview: 104+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-06-03 13:06 [RFC PATCH 00/13][V3] kexec: A new system call to allow in kernel loading Vivek Goyal
2014-06-03 13:06 ` [PATCH 01/13] bin2c: Move bin2c in scripts/basic Vivek Goyal
2014-06-03 16:01 ` Borislav Petkov
2014-06-03 17:13 ` Vivek Goyal
2014-06-03 13:06 ` [PATCH 02/13] kernel: Build bin2c based on config option CONFIG_BUILD_BIN2C Vivek Goyal
2014-06-04 9:13 ` Borislav Petkov
2014-06-03 13:06 ` [PATCH 03/13] kexec: Move segment verification code in a separate function Vivek Goyal
2014-06-04 9:32 ` Borislav Petkov
2014-06-04 18:47 ` Vivek Goyal
2014-06-04 20:30 ` Borislav Petkov
2014-06-05 14:05 ` Vivek Goyal
2014-06-05 14:07 ` Borislav Petkov
2014-06-03 13:06 ` [PATCH 04/13] resource: Provide new functions to walk through resources Vivek Goyal
2014-06-04 10:24 ` Borislav Petkov
2014-06-05 13:58 ` Vivek Goyal
2014-06-03 13:06 ` [PATCH 05/13] kexec: Make kexec_segment user buffer pointer a union Vivek Goyal
2014-06-04 10:34 ` Borislav Petkov
2014-06-03 13:06 ` [PATCH 06/13] kexec: New syscall kexec_file_load() declaration Vivek Goyal
2014-06-04 15:18 ` Borislav Petkov
2014-06-05 9:56 ` WANG Chao
2014-06-05 15:16 ` Vivek Goyal
2014-06-05 15:22 ` Vivek Goyal
2014-06-06 6:34 ` WANG Chao
2014-06-03 13:06 ` [PATCH 07/13] kexec: Implementation of new syscall kexec_file_load Vivek Goyal
2014-06-05 11:15 ` Borislav Petkov
2014-06-05 20:17 ` Vivek Goyal
2014-06-06 2:11 ` Borislav Petkov
2014-06-06 18:02 ` Vivek Goyal
2014-06-11 14:13 ` Borislav Petkov
2014-06-11 17:04 ` Vivek Goyal
2014-06-06 6:56 ` WANG Chao
2014-06-06 18:19 ` Vivek Goyal
2014-06-09 2:11 ` Dave Young
2014-06-09 5:35 ` WANG Chao
2014-06-09 15:41 ` Vivek Goyal
2014-06-13 7:50 ` Borislav Petkov
2014-06-13 8:00 ` WANG Chao
2014-06-13 8:10 ` Borislav Petkov
2014-06-13 8:24 ` WANG Chao
2014-06-13 8:30 ` Borislav Petkov
2014-06-13 12:49 ` Vivek Goyal
2014-06-13 12:46 ` Vivek Goyal
2014-06-13 15:36 ` Borislav Petkov
2014-06-16 17:38 ` Vivek Goyal
2014-06-16 20:05 ` Borislav Petkov
2014-06-16 20:53 ` Vivek Goyal
2014-06-16 21:09 ` Borislav Petkov
2014-06-16 21:25 ` H. Peter Anvin
2014-06-16 21:43 ` Vivek Goyal
2014-06-16 22:10 ` Borislav Petkov
2014-06-16 22:49 ` H. Peter Anvin
2014-06-09 15:30 ` Vivek Goyal
2014-06-03 13:06 ` [PATCH 08/13] purgatory/sha256: Provide implementation of sha256 in purgaotory context Vivek Goyal
2014-06-03 13:06 ` [PATCH 09/13] purgatory: Core purgatory functionality Vivek Goyal
2014-06-05 20:05 ` Borislav Petkov
2014-06-06 19:51 ` Vivek Goyal
2014-06-13 10:17 ` Borislav Petkov
2014-06-16 17:25 ` Vivek Goyal
2014-06-16 20:10 ` Borislav Petkov
2014-06-03 13:06 ` [PATCH 10/13] kexec: Load and Relocate purgatory at kernel load time Vivek Goyal
2014-06-10 16:31 ` Borislav Petkov
2014-06-11 19:24 ` Vivek Goyal [this message]
2014-06-13 16:14 ` Borislav Petkov
2014-06-03 13:07 ` [PATCH 11/13] kexec-bzImage: Support for loading bzImage using 64bit entry Vivek Goyal
2014-06-15 16:35 ` Borislav Petkov
2014-06-15 16:56 ` H. Peter Anvin
2014-06-16 20:06 ` Vivek Goyal
2014-06-16 20:57 ` Borislav Petkov
2014-06-16 21:15 ` Vivek Goyal
2014-06-16 21:27 ` Borislav Petkov
2014-06-16 21:45 ` Vivek Goyal
2014-06-24 17:31 ` Vivek Goyal
2014-06-24 18:23 ` Borislav Petkov
2014-06-03 13:07 ` [PATCH 12/13] kexec: Support for Kexec on panic using new system call Vivek Goyal
2014-06-17 21:43 ` Borislav Petkov
2014-06-18 14:20 ` Vivek Goyal
2014-06-03 13:07 ` [PATCH 13/13] kexec: Support kexec/kdump on EFI systems Vivek Goyal
2014-06-18 15:43 ` Borislav Petkov
2014-06-18 16:06 ` Borislav Petkov
2014-06-18 17:39 ` Vivek Goyal
2014-06-03 13:12 ` [RFC PATCH 00/13][V3] kexec: A new system call to allow in kernel loading Vivek Goyal
2014-06-04 9:22 ` WANG Chao
2014-06-04 17:50 ` Vivek Goyal
2014-06-04 19:39 ` Michael Kerrisk
2014-06-05 14:04 ` Vivek Goyal
2014-06-06 5:45 ` Michael Kerrisk (man-pages)
2014-06-06 18:04 ` Vivek Goyal
2014-06-05 8:31 ` Dave Young
2014-06-05 15:01 ` Vivek Goyal
2014-06-06 7:37 ` Dave Young
2014-06-06 20:04 ` Vivek Goyal
2014-06-09 1:57 ` Dave Young
2014-06-06 20:37 ` H. Peter Anvin
2014-06-06 20:58 ` Matt Fleming
2014-06-06 21:00 ` H. Peter Anvin
2014-06-06 21:02 ` Matt Fleming
2014-06-12 5:42 ` Dave Young
2014-06-12 12:36 ` Vivek Goyal
2014-06-17 14:24 ` Vivek Goyal
2014-06-18 1:45 ` Dave Young
2014-06-18 1:52 ` Dave Young
2014-06-18 12:40 ` Vivek Goyal
2014-06-16 21:13 ` Borislav Petkov
2014-06-17 13:24 ` Vivek Goyal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140611192448.GH10723@redhat.com \
--to=vgoyal@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=bhe@redhat.com \
--cc=bp@alien8.de \
--cc=chaowang@redhat.com \
--cc=dyoung@redhat.com \
--cc=ebiederm@xmission.com \
--cc=greg@kroah.com \
--cc=hpa@zytor.com \
--cc=jkosina@suse.cz \
--cc=kexec@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mjg59@srcf.ucam.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).