linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vivek Goyal <vgoyal@redhat.com>
To: Borislav Petkov <bp@alien8.de>
Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
	ebiederm@xmission.com, hpa@zytor.com, mjg59@srcf.ucam.org,
	greg@kroah.com, jkosina@suse.cz, dyoung@redhat.com,
	chaowang@redhat.com, bhe@redhat.com, akpm@linux-foundation.org
Subject: Re: [PATCH 12/13] kexec: Support for Kexec on panic using new system call
Date: Wed, 18 Jun 2014 10:20:58 -0400	[thread overview]
Message-ID: <20140618142058.GA16552@redhat.com> (raw)
In-Reply-To: <20140617214310.GC20603@pd.tnic>

On Tue, Jun 17, 2014 at 11:43:10PM +0200, Borislav Petkov wrote:

[..]
> > diff --git a/arch/x86/include/asm/crash.h b/arch/x86/include/asm/crash.h
> > new file mode 100644
> > index 0000000..2dd2eb8
> > --- /dev/null
> > +++ b/arch/x86/include/asm/crash.h
> > @@ -0,0 +1,9 @@
> > +#ifndef _ASM_X86_CRASH_H
> > +#define _ASM_X86_CRASH_H
> > +
> > +int load_crashdump_segments(struct kimage *image);
> 
> I guess crash_load_segments(..) as you're prefixing the other exported
> functions with "crash_".

Ok, I can make that change.

[..]
> > +/* Alignment required for elf header segment */
> > +#define ELF_CORE_HEADER_ALIGN   4096
> > +
> > +/* This primarily reprsents number of split ranges due to exclusion */
> 
> "represents"

Will do.

> 
> > +#define CRASH_MAX_RANGES	16
> > +
> > +struct crash_mem_range {
> > +	u64 start, end;
> > +};
> > +
> > +struct crash_mem {
> > +	unsigned int nr_ranges;
> > +	struct crash_mem_range ranges[CRASH_MAX_RANGES];
> > +};
> > +
> > +/* Misc data about ram ranges needed to prepare elf headers */
> > +struct crash_elf_data {
> > +	struct kimage *image;
> > +	/*
> > +	 * Total number of ram ranges we have after various ajustments for
> 
> "adjustments"

Will do.

[..]
> > @@ -39,6 +82,7 @@ int in_crash_kexec;
> >   */
> >  crash_vmclear_fn __rcu *crash_vmclear_loaded_vmcss = NULL;
> >  EXPORT_SYMBOL_GPL(crash_vmclear_loaded_vmcss);
> > +unsigned long crash_zero_bytes;
> 
> Ah, that's the empty_zero_page...

Ok, will look into moving to empty_zero_page.

[..]
> > +static int fill_up_crash_elf_data(struct crash_elf_data *ced,
> > +					struct kimage *image)
> > +{
> > +	unsigned int nr_ranges = 0;
> > +
> > +	ced->image = image;
> > +
> > +	walk_system_ram_range(0, -1, &nr_ranges,
> > +				get_nr_ram_ranges_callback);
> > +
> > +	ced->max_nr_ranges = nr_ranges;
> > +
> > +	/*
> > +	 * We don't create ELF headers for GART aperture as an attempt
> > +	 * to dump this memory in second kernel leads to hang/crash.
> > +	 * If gart aperture is present, one needs to exclude that region
> > +	 * and that could lead to need of extra phdr.
> > +	 */
> > +	walk_ram_res("GART", IORESOURCE_MEM, 0, -1,
> > +				ced, get_gart_ranges_callback);
> > +
> > +	/*
> > +	 * If we have gart region, excluding that could potentially split
> > +	 * a memory range, resulting in extra header. Account for  that.
> > +	 */
> > +	if (ced->gart_end)
> > +		ced->max_nr_ranges++;
> > +
> > +	/* Exclusion of crash region could split memory ranges */
> > +	ced->max_nr_ranges++;
> > +
> > +	/* If crashk_low_res is there, another range split possible */
> 
> You mean "is not 0"?

Yes. Will make comment more clear.

> 
> > +	if (crashk_low_res.end != 0)
> > +		ced->max_nr_ranges++;
> > +
> > +	return 0;
> 
> Returns unconditional 0 - make function void then.

Will do.

[..]
> > +		if (mstart > start && mend < end) {
> > +			/* Split original range */
> > +			mem->ranges[i].end = mstart - 1;
> > +			temp_range.start = mend + 1;
> > +			temp_range.end = end;
> > +		} else if (mstart != start)
> > +			mem->ranges[i].end = mstart - 1;
> > +		else
> > +			mem->ranges[i].start = mend + 1;
> > +		break;
> > +	}
> > +
> > +	/* If a split happend, add the split in array */
> 
> "happened" ... "split to array"

Ok. Will fix.


> 
> > +	if (!temp_range.end)
> > +		return 0;
> > +
> > +	/* Split happened */
> > +	if (i == CRASH_MAX_RANGES - 1) {
> > +		pr_err("Too many crash ranges after split\n");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	/* Location where new range should go */
> > +	j = i + 1;
> > +	if (j < mem->nr_ranges) {
> > +		/* Move over all ranges one place */
> 
> 			...  all ranges one slot towards the end */
> 

Will change.

[..]
> > +static int prepare_elf64_headers(struct crash_elf_data *ced,
> > +		void **addr, unsigned long *sz)
> > +{
> > +	Elf64_Ehdr *ehdr;
> > +	Elf64_Phdr *phdr;
> > +	unsigned long nr_cpus = num_possible_cpus(), nr_phdr, elf_sz;
> > +	unsigned char *buf, *bufp;
> > +	unsigned int cpu;
> > +	unsigned long long notes_addr;
> > +	int ret;
> > +
> > +	/* extra phdr for vmcoreinfo elf note */
> > +	nr_phdr = nr_cpus + 1;
> > +	nr_phdr += ced->max_nr_ranges;
> > +
> > +	/*
> > +	 * kexec-tools creates an extra PT_LOAD phdr for kernel text mapping
> > +	 * area on x86_64 (ffffffff80000000 - ffffffffa0000000).
> > +	 * I think this is required by tools like gdb. So same physical
> > +	 * memory will be mapped in two elf headers. One will contain kernel
> > +	 * text virtual addresses and other will have __va(physical) addresses.
> > +	 */
> > +
> > +	nr_phdr++;
> > +	elf_sz = sizeof(Elf64_Ehdr) + nr_phdr * sizeof(Elf64_Phdr);
> > +	elf_sz = ALIGN(elf_sz, ELF_CORE_HEADER_ALIGN);
> > +
> > +	buf = vzalloc(elf_sz);
> 
> Since you get zeroed memory, you can save yourself all assignments to 0
> below and thus slim this already terse function.

Will do.

[..]
> > +static int add_e820_entry(struct boot_params *params, struct e820entry *entry)
> > +{
> > +	unsigned int nr_e820_entries;
> > +
> > +	nr_e820_entries = params->e820_entries;
> > +	if (nr_e820_entries >= E820MAX)
> > +		return 1;
> 
> You're not testing for the error condition in any call site. Are we sure
> we will never hit E820MAX?

Actually there can be. Right now I am just handling the case of passing
as many e820 enties as can fit in bootparams and ignoring rest. Ideally
momory ranges more than E820MAX should be passed through setup data and I
have not handled that case yet.

Very few systems should run into that kind of scenario. I was thinking
that once these patches are in, I can look into enabling passing of
more than E820MAX entries using setup data.

I will put a TODO comment.

> 
> > +
> > +	memcpy(&params->e820_map[nr_e820_entries], entry,
> > +			sizeof(struct e820entry));
> > +	params->e820_entries++;
> > +	return 0;
> > +}
> > +
> > +static int memmap_entry_callback(u64 start, u64 end, void *arg)
> > +{
> > +	struct crash_memmap_data *cmd = arg;
> > +	struct boot_params *params = cmd->params;
> > +	struct e820entry ei;
> > +
> > +	ei.addr = start;
> > +	ei.size = end - start + 1;
> > +	ei.type = cmd->type;
> > +	add_e820_entry(params, &ei);
> > +
> > +	return 0;
> > +}
> > +
> > +static int memmap_exclude_ranges(struct kimage *image, struct crash_mem *cmem,
> > +		unsigned long long mstart, unsigned long long mend)
> 
> Arg alignment... multiple occurrences in this patch.

Will fix.

> 
> > +{
> > +	unsigned long start, end;
> > +	int ret = 0;
> > +
> > +	memset(cmem->ranges, 0, sizeof(cmem->ranges));
> > +
> > +	cmem->ranges[0].start = mstart;
> > +	cmem->ranges[0].end = mend;
> > +	cmem->nr_ranges = 1;
> > +
> > +	/* Exclude Backup region */
> > +	start = image->arch.backup_load_addr;
> > +	end = start + image->arch.backup_src_sz - 1;
> > +	ret = exclude_mem_range(cmem, start, end);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* Exclude elf header region */
> > +	start = image->arch.elf_load_addr;
> > +	end = start + image->arch.elf_headers_sz - 1;
> > +	ret = exclude_mem_range(cmem, start, end);
> > +	return ret;
> 
> 	return exclude_mem_range(cmem, start, end);

Will change.

> 
> > +}
> > +
> > +/* Prepare memory map for crash dump kernel */
> > +int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
> > +{
> > +	int i, ret = 0;
> > +	unsigned long flags;
> > +	struct e820entry ei;
> > +	struct crash_memmap_data cmd;
> > +	struct crash_mem *cmem;
> > +
> > +	cmem = vzalloc(sizeof(struct crash_mem));
> > +	if (!cmem)
> > +		return -ENOMEM;
> 
> You're getting zeroed memory already but you're zeroing it out again
> above in memmap_exclude_ranges().

Will remove extra zeoring.

[..]
> > +	/* Exclude some ranges from crashk_res and add rest to memmap */
> > +	ret = memmap_exclude_ranges(image, cmem, crashk_res.start,
> > +						crashk_res.end);
> > +	if (ret)
> > +		goto out;
> > +
> > +	for (i = 0; i < cmem->nr_ranges; i++) {
> > +		ei.addr = cmem->ranges[i].start;
> > +		ei.size = cmem->ranges[i].end - ei.addr + 1;
> > +		ei.type = E820_RAM;
> > +
> > +		/* If entry is less than a page, skip it */
> > +		if (ei.size < PAGE_SIZE)
> > +			continue;
> 
> You can do the size assignment and check first so that you don't have to
> do the rest if it is a less than a page.
> 

Ok, will do.

> > +		add_e820_entry(params, &ei);
> > +	}
> > +
> > +out:
> > +	vfree(cmem);
> > +	return ret;
> 
> This retval is not checked at the callsite in
> kexec_setup_boot_parameters().

Will check return code at call site.

[..]
> >  /*
> >   * Defines lowest physical address for various segments. Not sure where
> > @@ -130,11 +133,28 @@ void *bzImage64_load(struct kimage *image, char *kernel,
> >  		return ERR_PTR(-EINVAL);
> >  	}
> >  
> > +	/*
> > +	 * In case of crash dump, we will append elfcorehdr=<addr> to
> > +	 * command line. Make sure it does not overflow
> > +	 */
> > +	if (cmdline_len + MAX_ELFCOREHDR_STR_LEN > header->cmdline_size) {
> > +		ret = -EINVAL;
> 
> No need to assign anything to ret if you return ERR_PTR below.

Yep. Will remove it.

> 
> > +		pr_debug("Kernel command line too long\n");
> 
> This error message needs to differ from the one above - say something
> about "error appending elfcorehdr=...", for example.

Ok, Will fix it.

[..]
> > +	/* Setup copying of backup region */
> > +	if (image->type == KEXEC_TYPE_CRASH) {
> > +		ret = kexec_purgatory_get_set_symbol(image, "backup_dest",
> > +				&image->arch.backup_load_addr,
> > +				sizeof(image->arch.backup_load_addr), 0);
> > +		if (ret)
> > +			return ret;
> > +
> > +		ret = kexec_purgatory_get_set_symbol(image, "backup_src",
> > +				&image->arch.backup_src_start,
> > +				sizeof(image->arch.backup_src_start), 0);
> > +		if (ret)
> > +			return ret;
> > +
> > +		ret = kexec_purgatory_get_set_symbol(image, "backup_sz",
> > +				&image->arch.backup_src_sz,
> > +				sizeof(image->arch.backup_src_sz), 0);
> 
> Arg alignment is funny.

Will change.

Thanks
Vivek

  reply	other threads:[~2014-06-18 14:21 UTC|newest]

Thread overview: 104+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-03 13:06 [RFC PATCH 00/13][V3] kexec: A new system call to allow in kernel loading Vivek Goyal
2014-06-03 13:06 ` [PATCH 01/13] bin2c: Move bin2c in scripts/basic Vivek Goyal
2014-06-03 16:01   ` Borislav Petkov
2014-06-03 17:13     ` Vivek Goyal
2014-06-03 13:06 ` [PATCH 02/13] kernel: Build bin2c based on config option CONFIG_BUILD_BIN2C Vivek Goyal
2014-06-04  9:13   ` Borislav Petkov
2014-06-03 13:06 ` [PATCH 03/13] kexec: Move segment verification code in a separate function Vivek Goyal
2014-06-04  9:32   ` Borislav Petkov
2014-06-04 18:47     ` Vivek Goyal
2014-06-04 20:30       ` Borislav Petkov
2014-06-05 14:05         ` Vivek Goyal
2014-06-05 14:07           ` Borislav Petkov
2014-06-03 13:06 ` [PATCH 04/13] resource: Provide new functions to walk through resources Vivek Goyal
2014-06-04 10:24   ` Borislav Petkov
2014-06-05 13:58     ` Vivek Goyal
2014-06-03 13:06 ` [PATCH 05/13] kexec: Make kexec_segment user buffer pointer a union Vivek Goyal
2014-06-04 10:34   ` Borislav Petkov
2014-06-03 13:06 ` [PATCH 06/13] kexec: New syscall kexec_file_load() declaration Vivek Goyal
2014-06-04 15:18   ` Borislav Petkov
2014-06-05  9:56   ` WANG Chao
2014-06-05 15:16     ` Vivek Goyal
2014-06-05 15:22       ` Vivek Goyal
2014-06-06  6:34         ` WANG Chao
2014-06-03 13:06 ` [PATCH 07/13] kexec: Implementation of new syscall kexec_file_load Vivek Goyal
2014-06-05 11:15   ` Borislav Petkov
2014-06-05 20:17     ` Vivek Goyal
2014-06-06  2:11       ` Borislav Petkov
2014-06-06 18:02         ` Vivek Goyal
2014-06-11 14:13           ` Borislav Petkov
2014-06-11 17:04             ` Vivek Goyal
2014-06-06  6:56   ` WANG Chao
2014-06-06 18:19     ` Vivek Goyal
2014-06-09  2:11       ` Dave Young
2014-06-09  5:35         ` WANG Chao
2014-06-09 15:41           ` Vivek Goyal
2014-06-13  7:50             ` Borislav Petkov
2014-06-13  8:00               ` WANG Chao
2014-06-13  8:10                 ` Borislav Petkov
2014-06-13  8:24                   ` WANG Chao
2014-06-13  8:30                     ` Borislav Petkov
2014-06-13 12:49                 ` Vivek Goyal
2014-06-13 12:46               ` Vivek Goyal
2014-06-13 15:36                 ` Borislav Petkov
2014-06-16 17:38                   ` Vivek Goyal
2014-06-16 20:05                     ` Borislav Petkov
2014-06-16 20:53                       ` Vivek Goyal
2014-06-16 21:09                         ` Borislav Petkov
2014-06-16 21:25                           ` H. Peter Anvin
2014-06-16 21:43                             ` Vivek Goyal
2014-06-16 22:10                               ` Borislav Petkov
2014-06-16 22:49                               ` H. Peter Anvin
2014-06-09 15:30         ` Vivek Goyal
2014-06-03 13:06 ` [PATCH 08/13] purgatory/sha256: Provide implementation of sha256 in purgaotory context Vivek Goyal
2014-06-03 13:06 ` [PATCH 09/13] purgatory: Core purgatory functionality Vivek Goyal
2014-06-05 20:05   ` Borislav Petkov
2014-06-06 19:51     ` Vivek Goyal
2014-06-13 10:17       ` Borislav Petkov
2014-06-16 17:25         ` Vivek Goyal
2014-06-16 20:10           ` Borislav Petkov
2014-06-03 13:06 ` [PATCH 10/13] kexec: Load and Relocate purgatory at kernel load time Vivek Goyal
2014-06-10 16:31   ` Borislav Petkov
2014-06-11 19:24     ` Vivek Goyal
2014-06-13 16:14       ` Borislav Petkov
2014-06-03 13:07 ` [PATCH 11/13] kexec-bzImage: Support for loading bzImage using 64bit entry Vivek Goyal
2014-06-15 16:35   ` Borislav Petkov
2014-06-15 16:56     ` H. Peter Anvin
2014-06-16 20:06     ` Vivek Goyal
2014-06-16 20:57       ` Borislav Petkov
2014-06-16 21:15         ` Vivek Goyal
2014-06-16 21:27           ` Borislav Petkov
2014-06-16 21:45             ` Vivek Goyal
2014-06-24 17:31     ` Vivek Goyal
2014-06-24 18:23       ` Borislav Petkov
2014-06-03 13:07 ` [PATCH 12/13] kexec: Support for Kexec on panic using new system call Vivek Goyal
2014-06-17 21:43   ` Borislav Petkov
2014-06-18 14:20     ` Vivek Goyal [this message]
2014-06-03 13:07 ` [PATCH 13/13] kexec: Support kexec/kdump on EFI systems Vivek Goyal
2014-06-18 15:43   ` Borislav Petkov
2014-06-18 16:06   ` Borislav Petkov
2014-06-18 17:39     ` Vivek Goyal
2014-06-03 13:12 ` [RFC PATCH 00/13][V3] kexec: A new system call to allow in kernel loading Vivek Goyal
2014-06-04  9:22   ` WANG Chao
2014-06-04 17:50     ` Vivek Goyal
2014-06-04 19:39       ` Michael Kerrisk
2014-06-05 14:04         ` Vivek Goyal
2014-06-06  5:45           ` Michael Kerrisk (man-pages)
2014-06-06 18:04             ` Vivek Goyal
2014-06-05  8:31   ` Dave Young
2014-06-05 15:01     ` Vivek Goyal
2014-06-06  7:37       ` Dave Young
2014-06-06 20:04         ` Vivek Goyal
2014-06-09  1:57           ` Dave Young
2014-06-06 20:37         ` H. Peter Anvin
2014-06-06 20:58           ` Matt Fleming
2014-06-06 21:00             ` H. Peter Anvin
2014-06-06 21:02               ` Matt Fleming
2014-06-12  5:42 ` Dave Young
2014-06-12 12:36   ` Vivek Goyal
2014-06-17 14:24   ` Vivek Goyal
2014-06-18  1:45     ` Dave Young
2014-06-18  1:52       ` Dave Young
2014-06-18 12:40         ` Vivek Goyal
2014-06-16 21:13 ` Borislav Petkov
2014-06-17 13:24   ` Vivek Goyal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140618142058.GA16552@redhat.com \
    --to=vgoyal@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=bhe@redhat.com \
    --cc=bp@alien8.de \
    --cc=chaowang@redhat.com \
    --cc=dyoung@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=greg@kroah.com \
    --cc=hpa@zytor.com \
    --cc=jkosina@suse.cz \
    --cc=kexec@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mjg59@srcf.ucam.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).