All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andi Kleen <ak@muc.de>
To: virtualization@lists.osdl.org
Cc: Rusty Russell <rusty@rustcorp.com.au>,
	lkml - Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@osdl.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Paul Mackerras <paulus@samba.org>
Subject: Re: [PATCH 6/10] lguest code: the little linux hypervisor.
Date: Fri, 9 Feb 2007 11:09:19 +0100	[thread overview]
Message-ID: <200702091109.20061.ak@muc.de> (raw)
In-Reply-To: <1171012827.2718.42.camel@localhost.localdomain>

On Friday 09 February 2007 10:20, Rusty Russell wrote:
> This is the core of lguest: both the guest code (always compiled in to
> the image so it can boot under lguest), and the host code (lg.ko).
> 
> There is only one config prompt at the moment: lguest is currently
> designed to run exactly the same guest and host kernels so we can
> frob the ABI freely.
> 
> Unfortunately, we don't have the build infrastructure for "private"
> asm-offsets.h files, so there's a not-so-neat include in
> arch/i386/kernel/asm-offsets.c.

Ask the kbuild people to fix that? 

It indeed looks ugly.

I bet Xen et.al. could make good use of that too.

> +# This links the hypervisor in the right place and turns it into a C array.
> +$(obj)/hypervisor-raw: $(obj)/hypervisor.o
> +	@$(LD) -static -Tdata=`printf %#x $$(($(HYPE_ADDR)))` -Ttext=`printf %#x $$(($(HYPE_ADDR)+$(HYPE_DATA_SIZE)))` -o $@ $< && $(OBJCOPY) -O binary $@
> +$(obj)/hypervisor-blob.c: $(obj)/hypervisor-raw
> +	@od -tx1 -An -v $< | sed -e 's/^ /0x/' -e 's/$$/,/' -e 's/ /,0x/g' > $@

an .S file with .incbin is more efficient and simpler
(note it has to be an separate .S file, otherwise icecream/distcc break) 

It won't allow to show off any sed skills, but I guess we can live with that ;-)


> +static struct vm_struct *hypervisor_vma;
> +static int cpu_had_pge;
> +static struct {
> +	unsigned long offset;
> +	unsigned short segment;
> +} lguest_entry;
> +struct page *hype_pages; /* Contiguous pages. */

Statics? looks funky.  Why only a single hypervisor_vma?

> +struct lguest lguests[MAX_LGUEST_GUESTS];
> +DECLARE_MUTEX(lguest_lock);
> +
> +/* IDT entries are at start of hypervisor. */
> +const unsigned long *__lguest_default_idt_entries(void)
> +{
> +	return (void *)HYPE_ADDR;
> +}
> +
> +/* Next is switch_to_guest */
> +static void *__lguest_switch_to_guest(void)
> +{
> +	return (void *)HYPE_ADDR + HYPE_DATA_SIZE;
> +}
> +
> +/* Then we use everything else to hold guest state. */
> +struct lguest_state *__lguest_states(void)
> +{
> +	return (void *)HYPE_ADDR + sizeof(hypervisor_blob);

This cries for asm_offsets.h too, doesn't it? 

> +}
> +
> +static __init int map_hypervisor(void)
> +{
> +	unsigned int i;
> +	int err;
> +	struct page *pages[HYPERVISOR_PAGES], **pagep = pages;
> +
> +	hype_pages = alloc_pages(GFP_KERNEL|__GFP_ZERO,
> +				 get_order(HYPERVISOR_SIZE));

Wasteful because of the rounding. Probably wants reintroduction
of alloc_pages_exact()


> +
> +static __exit void unmap_hypervisor(void)
> +{
> +	vunmap(hypervisor_vma->addr);
> +	__free_pages(hype_pages, get_order(HYPERVISOR_SIZE));

Shouldn't you clean up the GDTs too? 

> +}
> +
> +/* IN/OUT insns: enough to get us past boot-time probing. */
> +static int emulate_insn(struct lguest *lg)
> +{
> +	u8 insn;
> +	unsigned int insnlen = 0, in = 0, shift = 0;
> +	unsigned long physaddr = guest_pa(lg, lg->state->regs.eip);
> +
> +	/* This only works for addresses in linear mapping... */
> +	if (lg->state->regs.eip < lg->page_offset)
> +		return 0;

Shouldn't there be a printk here?

> +/* Saves exporting idt_table from kernel */
> +static struct desc_struct *get_idt_table(void)
> +{
> +	struct Xgt_desc_struct idt;
> +
> +	asm("sidt %0":"=m" (idt));

Nasty, but ok.

> +	return (void *)idt.address;
> +}
> +
> +extern asmlinkage void math_state_restore(void);

No externs in .c files

> +
> +/* Trap page resets this when it reloads gs. */
> +static int new_gfp_eip(struct lguest *lg, struct lguest_regs *regs)
> +{
> +	u32 eip;
> +	get_user(eip, &lg->lguest_data->gs_gpf_eip);
> +	if (eip == regs->eip)
> +		return 0;
> +	put_user(regs->eip, &lg->lguest_data->gs_gpf_eip);

No fault checking? 

lhread/write use probably also needs to be double checked that a malicious
guest can't put the kernel into a loop.

> +	return 1;
> +}
> +
> +static void set_ts(unsigned int guest_ts)
> +{
> +	u32 cr0;
> +	if (guest_ts) {
> +		asm("movl %%cr0,%0":"=r" (cr0));
> +		if (!(cr0 & 8))
> +			asm("movl %0,%%cr0": :"r" (cr0|8));
> +	}

We have macros and defines for this in standard headers.
\
> +	while (!lg->dead) {
> +		unsigned int cr2 = 0; /* Damn gcc */
> +
> +		/* Hypercalls first: we might have been out to userspace */
> +		if (do_async_hcalls(lg))
> +			goto pending_dma;
> +
> +		if (regs->trapnum == LGUEST_TRAP_ENTRY) {
> +			/* Only do hypercall once. */
> +			regs->trapnum = 255;
> +			if (hypercall(lg, regs))
> +				goto pending_dma;
> +		}
> +
> +		if (signal_pending(current))
> +			return -EINTR

Probably needs freezer checking here somewhere.

> ; 
> +		maybe_do_interrupt(lg);
> +
> +		if (lg->dead)
> +			break;
> +
> +		if (lg->halted) {
> +			set_current_state(TASK_INTERRUPTIBLE);
> +			schedule_timeout(1);

1?  And what is that good for anyways?

> +				/* FIXME: If it's reloading %gs in a loop? */

Yes what then? Have you tried it?

In general i miss printks when things go wrong. Do you expect
all users to have a gdbstub ready? @)

> +pending_dma:
> +	put_user(lg->pending_dma, (unsigned long *)user);
> +	put_user(lg->pending_addr, (unsigned long *)user+1);

error checking? How do you avoid loops?


> +	if (cpu_has_pge) { /* We have a broader idea of "global". */
> +		cpu_had_pge = 1;
> +		on_each_cpu(adjust_pge, 0, 0, 1);

cpu hotplug? 

> +		clear_bit(X86_FEATURE_PGE, boot_cpu_data.x86_capability);
> +	}
> +	return 0;
> +}
> 
> +	case LHCALL_CRASH: {
> +		char msg[128];
> +		lhread(lg, msg, regs->edx, sizeof(msg));
> +		msg[sizeof(msg)-1] = '\0';

Might be safer to vet for isprint here

> +#define log(...)					\
> +	do {						\
> +		mm_segment_t oldfs = get_fs();		\
> +		char buf[100];				\

At least older gccs will accumulate the bufs in a function, eventually possibly blowing
the stack. Better use a function.


> +	/* If they're halted, we re-enable interrupts. */
> +	if (lg->halted) {
> +		/* Re-enable interrupts. */
> +		put_user(512, &lg->lguest_data->irq_enabled);

interesting magic number

> +	/* Ignore NMI, doublefault, hypercall, spurious interrupt. */
> +	if (i == 2 || i == 8 || i == 15 || i == LGUEST_TRAP_ENTRY)
> +		return;
> +	/* FIXME: We should handle debug and int3 */
> +	else if (i == 1 || i == 3)
> +		return;
> +	/* We intercept page fault, general protection fault and fpu missing */
> +	else if (i == 13)
> +		copy_trap(lg, &lg->gpf_trap, &d);
> +	else if (i == 14)
> +		copy_trap(lg, &lg->page_trap, &d);
> +	else if (i == 7)
> +		copy_trap(lg, &lg->fpu_trap, &d);
> +	/* Other traps go straight to guest. */
> +	else if (i < FIRST_EXTERNAL_VECTOR || i == SYSCALL_VECTOR)
> +		setup_idt(lg, i, &d);
> +	/* A virtual interrupt */
> +	else if (i < FIRST_EXTERNAL_VECTOR + LGUEST_IRQS)
> +		copy_trap(lg, &lg->interrupt[i-FIRST_EXTERNAL_VECTOR], &d);\

switch is not cool enough anymore?

>
> +	down(&lguest_lock);

i suspect mutexes are the new way to do this

> +	down_read(&current->mm->mmap_sem);
> +	if (get_futex_key((u32 __user *)addr, &key) != 0) {
> +		kill_guest(lg, "bad dma address %#lx", addr);
> +		goto unlock;

Risky? Use probe_kernel_address et.al.?

> +#if 0
> +/* FIXME: Use asm-offsets here... */

Remove?

> +extern int mce_disabled;

tststs

> +
> +/* FIXME: Update iff tsc rate changes. */

It does.


> +static fastcall void lguest_cpuid(unsigned int *eax, unsigned int *ebx,
> +				 unsigned int *ecx, unsigned int *edx)
> +{
> +	int is_feature = (*eax == 1);
> +
> +	asm volatile ("cpuid"
> +		      : "=a" (*eax),
> +			"=b" (*ebx),
> +			"=c" (*ecx),
> +			"=d" (*edx)
> +		      : "0" (*eax), "2" (*ecx));

What's wrong with the standard cpuid*() macros?

> +	extern struct Xgt_desc_struct cpu_gdt_descr;
> +	extern struct i386_pda boot_pda;

No externs in .c

> +
> +	paravirt_ops.name = "lguest";

Can you just statically initialize this and then copy over? 

> +	asm volatile ("mov %0, %%gs" : : "r" (__KERNEL_PDA) : "memory");

This will be %fs soon.


... haven't read everything else. the IO driver earlier was also not very closely looked at.

-Andi

  parent reply	other threads:[~2007-02-09 10:09 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-02-09  9:11 [PATCH 0/10] lguest Rusty Russell
2007-02-09  9:11 ` Rusty Russell
2007-02-09  9:14 ` [PATCH 1/10] lguest: Don't rely on last-linked fallthru when no paravirt handler Rusty Russell
2007-02-09  9:15   ` [PATCH 2/10] lguest: Export symbols for lguest as a module Rusty Russell
2007-02-09  9:32     ` Andi Kleen
2007-02-09 12:06       ` Rusty Russell
2007-02-09 13:58         ` Andi Kleen
2007-02-10 11:39           ` Rusty Russell
2007-02-09  9:17   ` [PATCH 3/10] lguest: Expose get_futex_key, get_key_refs and drop_key_refs Rusty Russell
2007-02-09  9:18   ` [PATCH 4/10] lguest: Initialize esp0 properly all the time Rusty Russell
2007-02-09  9:19     ` [PATCH 5/10] Make hvc_console.c compile on non-PowerPC Rusty Russell
2007-02-09  9:19       ` Rusty Russell
2007-02-09  9:20       ` [PATCH 6/10] lguest code: the little linux hypervisor Rusty Russell
2007-02-09  9:22         ` [PATCH 7/10] lguest: Simple lguest network driver Rusty Russell
2007-02-09  9:23           ` [PATCH 8/10] lguest: console driver Rusty Russell
2007-02-09  9:24             ` [PATCH 9/10] lguest: block driver Rusty Russell
2007-02-09  9:25               ` [PATCH 10/10] lguest: documentatation including example launcher Rusty Russell
2007-02-09  9:35         ` [PATCH 6/10] lguest code: the little linux hypervisor Andrew Morton
2007-02-09 11:00           ` Rusty Russell
2007-02-09 11:13             ` Zachary Amsden
2007-02-09 11:50               ` Andi Kleen
2007-02-09 11:54                 ` Zachary Amsden
2007-02-09 11:57                   ` Andi Kleen
2007-02-09 12:08                     ` Zachary Amsden
2007-02-09 22:29                 ` David Miller
2007-02-09 10:09         ` Andi Kleen [this message]
2007-02-09 12:39           ` Rusty Russell
2007-02-09 13:57             ` Andi Kleen
2007-02-09 15:01               ` Rusty Russell
2007-02-09 14:17             ` Sam Ravnborg
2007-02-09 15:23               ` Rusty Russell
2007-02-12 13:34                 ` [q] kbuild for private asm-offsets (Re: [PATCH 6/10] lguest code: the little linux hypervisor.) Oleg Verych
2007-02-12 17:24                   ` Andi Kleen
2007-02-12 21:41                   ` Sam Ravnborg
2007-02-12 23:41                   ` Rusty Russell
2007-02-13  3:10                     ` Oleg Verych
2007-02-13  3:10                       ` Oleg Verych
2007-02-16 15:55                       ` [pp] kbuild: lguest with private asm-offsets (and some bloat) Oleg Verych
2007-02-16 15:59                         ` [pp] kbuild: asm-offsets generalized Oleg Verych
2007-02-16 18:56                           ` Sam Ravnborg
2007-02-16 21:56                             ` Oleg Verych
2007-02-17  4:43                               ` Rusty Russell
2007-02-17  5:33                                 ` Oleg Verych
2007-04-01 20:42                           ` Sam Ravnborg
2007-04-01 21:08                             ` Oleg Verych
2007-04-01 21:03                               ` Sam Ravnborg
2007-02-09 10:55       ` [PATCH 6a/10] lguest: Config and headers Rusty Russell
2007-02-09 10:56         ` [PATCH 6b/10] lguest: the host code (lg.ko) Rusty Russell
2007-02-09 10:57           ` [PATCH 6c/10] lguest: the guest code Rusty Russell
2007-02-09 10:58             ` [PATCH 6d/10] lguest: the Makefiles Rusty Russell
2007-02-09 17:06             ` [PATCH 6c/10] lguest: the guest code Len Brown
2007-02-09 17:14               ` James Morris
2007-02-09 17:49                 ` Len Brown
2007-02-09 23:48                   ` [PATCH 11/10] lguest: use disable_acpi() Rusty Russell
2007-02-09  9:31   ` [PATCH 1/10] lguest: Don't rely on last-linked fallthru when no paravirt handler Andi Kleen
2007-02-09 11:52     ` Rusty Russell
2007-02-09 20:49       ` Jeremy Fitzhardinge

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200702091109.20061.ak@muc.de \
    --to=ak@muc.de \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulus@samba.org \
    --cc=rusty@rustcorp.com.au \
    --cc=sfr@canb.auug.org.au \
    --cc=virtualization@lists.osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.