From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Gregory Haskins"
Subject: Re: [PATCH] Support for in-kernel mmio handlers
Date: Thu, 05 Apr 2007 10:58:39 -0400
Message-ID: <4614C844.BA47.005A.0@novell.com>
References: <4613C73F.BA47.005A.0@novell.com> <4614A03C.2050707@qumranet.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=__PartBA9DE48F.0__="
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
To: "Avi Kivity"
In-Reply-To: <4614A03C.2050707-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
Sender: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
Errors-To: kvm-devel-bounces-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
List-Id: kvm.vger.kernel.org

--=__PartBA9DE48F.0__=
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Hi Avi,

I have addressed your comments and re-attached the fixed-up patch.  Most of
the things you suggested I implemented, but a few I didn't, so I will
comment inline...

>>> On Thu, Apr 5, 2007 at 3:07 AM, in message
<4614A03C.2050707-atKUWr5tajBWk0Htik3J/w@public.gmane.org>, Avi Kivity
wrote:
> Gregory Haskins wrote:
>> The MMIO registration code has been broken out as a new patch from the
>> in-kernel APIC work, with the following changes per Avi's request:
>>
>> 1) Supports dynamic registration
>> 2) Uses gpa_t addresses
>> 3) Explicit per-cpu mappings
>>
>> In addition, I have added the concept of distinct VCPU and VM level
>> registrations, where VCPU devices will eclipse competing VM
>> registrations (if any).  This will be key down the road, where LAPICs
>> should use VCPU registration but IOAPICs should use VM level.
>>
>> Signed-off-by: Gregory Haskins
>>
>> ---
>>  drivers/kvm/kvm.h      |   50 +++++++++++++++++++++++++++++++++++++++++++++
>>  drivers/kvm/kvm_main.c |   53 +++++++++++++++++++++++++++++++++++++++--------
>>  2 files changed, 94 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
>> index fceeb84..3334730 100644
>> --- a/drivers/kvm/kvm.h
>> +++ b/drivers/kvm/kvm.h
>> @@ -236,6 +236,54 @@ struct kvm_pio_request {
>>  	int rep;
>>  };
>>
>> +struct kvm_io_device {
>> +	unsigned long (*read)(struct kvm_io_device *this,
>> +			      gpa_t addr,
>> +			      unsigned long length);
>> +	void (*write)(struct kvm_io_device *this,
>> +		      gpa_t addr,
>> +		      unsigned long length,
>> +		      unsigned long val);
>>
>
> length could be just an int.

Done

>
>> +	int (*in_range)(struct kvm_io_device *this, gpa_t addr);
>>
>
> Do you see any reason to have this as a callback and not a pair of gpas?

I believe Dor replied earlier stating the reason: being able to support
holes.  Another reason that I can think of, and that I particularly like
about this design (which I am not claiming as my own), is that the device
can relocate (e.g. the LAPIC base address) without worrying about
reprogramming the bus.

>
>> +
>> +	void *private;
>> +	struct list_head link;
>>
>
> Having these in an array would be much more efficient.  A fixed size
> array of moderate size should suffice.

Done.  The maximum number of devices is currently 6, because anything
beyond that and I think we need to revisit the linear algorithm ;)

>
>> +};
>> +
>> +/* It would be nice to use something smarter than a linear search, TBD...
>> +   Thankfully we dont expect many devices to register (famous last words :),
>> +   so until then it will suffice. At least its abstracted so we can change
>> +   in one place.
>> + */
>>
>
> /*
>  * kernel comments look
>  * like this
>  */

Done

>
>> +struct kvm_io_bus {
>> +	struct list_head list;
>> +};
>> +
>> +static inline void
>> +kvm_io_bus_init(struct kvm_io_bus *bus)
>>
>
> function declarations on one line please.

Done (though I hate lines that runneth over 80 ;)

>
>> +{
>> +	INIT_LIST_HEAD(&bus->list);
>> +}
>> +
>> +static inline struct kvm_io_device*
>>
>
> C style pointers:
> struct blah *something();

Done.

>
>> +kvm_io_bus_find_dev(struct kvm_io_bus *bus, gpa_t addr)
>> +{
>> +	struct kvm_io_device *pos = NULL;
>> +
>> +	list_for_each_entry(pos, &bus->list, link) {
>> +		if(pos->in_range(pos, addr))
>> +			return pos;
>> +	}
>>
>
> space after if.  avoid redundant {}.  Have Documentation/CodingStyle
> tattooed somewhere easily accessible.

Done

>
>> +
>> +	return NULL;
>> +}
>> +
>> +static inline void
>> +kvm_io_bus_register_dev(struct kvm_io_bus *bus, struct kvm_io_device *dev)
>> +{
>> +	list_add_tail(&dev->link, &bus->list);
>> +}
>> +
>>  struct kvm_vcpu {
>>  	struct kvm *kvm;
>>  	union {
>> @@ -294,6 +342,7 @@ struct kvm_vcpu {
>>  	gpa_t mmio_phys_addr;
>>  	struct kvm_pio_request pio;
>>  	void *pio_data;
>> +	struct kvm_io_bus mmio_bus;
>>
>>  	int sigset_active;
>>  	sigset_t sigset;
>> @@ -345,6 +394,7 @@ struct kvm {
>>  	unsigned long rmap_overflow;
>>  	struct list_head vm_list;
>>  	struct file *filp;
>> +	struct kvm_io_bus mmio_bus;
>>
>
> The per-vcpu I/O bus is special in that it has exactly one component,
> and one which can change its address.  I think we can special case it
> and just check for apic addresses explicitly when searching the bus.

I am loath to make special cases if they can be avoided.  I think that,
performance-wise, a kvm_io_bus with one device won't be much different from
a special-case check against the APIC base.  And the advantage this buys us
is that future platforms (e.g. IA64?) may have more than one per-cpu MMIO
address.
I also realize that future platforms may diverge from the entire in-kernel
code base altogether, but I think the general and flexible way is better
when there are no compromising tradeoffs, even if it's only for
example/reference.  In this case I don't think there are any tradeoffs, so
I left it.  If you insist, I will pull it ;)

>
>>  };
>>
>>  struct kvm_stat {
>> diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
>> index 4473174..da119c0 100644
>> --- a/drivers/kvm/kvm_main.c
>> +++ b/drivers/kvm/kvm_main.c
>> @@ -294,6 +294,7 @@ static struct kvm *kvm_create_vm(void)
>>
>>  	spin_lock_init(&kvm->lock);
>>  	INIT_LIST_HEAD(&kvm->active_mmu_pages);
>> +	kvm_io_bus_init(&kvm->mmio_bus);
>>  	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
>>  		struct kvm_vcpu *vcpu = &kvm->vcpus[i];
>>
>> @@ -302,6 +303,7 @@ static struct kvm *kvm_create_vm(void)
>>  		vcpu->kvm = kvm;
>>  		vcpu->mmu.root_hpa = INVALID_PAGE;
>>  		INIT_LIST_HEAD(&vcpu->free_pages);
>> +		kvm_io_bus_init(&vcpu->mmio_bus);
>>  	spin_lock(&kvm_lock);
>>  	list_add(&kvm->vm_list, &vm_list);
>>  	spin_unlock(&kvm_lock);
>> @@ -1015,12 +1017,30 @@ static int emulator_write_std(unsigned long addr,
>>  	return X86EMUL_UNHANDLEABLE;
>>  }
>>
>> +static struct kvm_io_device* vcpu_find_mmio_dev(struct kvm_vcpu *vcpu,
>> +						gpa_t addr)
>> +{
>> +	struct kvm_io_device *mmio_dev;
>> +
>> +	/* First check the local CPU addresses */
>> +	mmio_dev = kvm_io_bus_find_dev(&vcpu->mmio_bus, addr);
>> +	if(!mmio_dev) {
>> +		/* Then check the entire VM */
>> +		mmio_dev = kvm_io_bus_find_dev(&vcpu->kvm->mmio_bus, addr);
>> +	}
>>
>
> space, comment, braces

I believe I fixed this, but I am a little confused about what you were
pointing out.  The space is obvious.  I believe you were pointing out that
the braces aren't needed because it's technically a single line, and that
the comment is fine.  If I need to change the comment too, let me know.
>
>> +
>> +	return mmio_dev;
>> +}
>> +
>>  static int emulator_read_emulated(unsigned long addr,
>>  				  unsigned long *val,
>>  				  unsigned int bytes,
>>  				  struct x86_emulate_ctxt *ctxt)
>>  {
>>  	struct kvm_vcpu *vcpu = ctxt->vcpu;
>> +	gpa_t gpa;
>> +	int i;
>> +	struct kvm_io_device *mmio_dev;
>>
>>  	if (vcpu->mmio_read_completed) {
>>  		memcpy(val, vcpu->mmio_data, bytes);
>> @@ -1029,18 +1049,24 @@ static int emulator_read_emulated(unsigned long addr,
>>  	} else if (emulator_read_std(addr, val, bytes, ctxt)
>>  		   == X86EMUL_CONTINUE)
>>  		return X86EMUL_CONTINUE;
>> -	else {
>> -		gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
>>
>> -		if (gpa == UNMAPPED_GVA)
>> -			return X86EMUL_PROPAGATE_FAULT;
>> -		vcpu->mmio_needed = 1;
>> -		vcpu->mmio_phys_addr = gpa;
>> -		vcpu->mmio_size = bytes;
>> -		vcpu->mmio_is_write = 0;
>> +	gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
>> +	if (gpa == UNMAPPED_GVA)
>> +		return vcpu_printf(vcpu, "not present\n"), X86EMUL_PROPAGATE_FAULT;
>>
>
>
> The vcpu_printf() snuck in somehow.

Oops.  Carry-over from the apic branch merge.  Fixed.

>
>>
>> -	return X86EMUL_UNHANDLEABLE;
>> +	/* Is thie MMIO handled locally? */
>> +	mmio_dev = vcpu_find_mmio_dev(vcpu, gpa);
>> +	if(mmio_dev) {
>> +		*val = mmio_dev->read(mmio_dev, gpa, bytes);
>> +		return X86EMUL_CONTINUE;
>>  	}
>> +
>> +	vcpu->mmio_needed = 1;
>> +	vcpu->mmio_phys_addr = gpa;
>> +	vcpu->mmio_size = bytes;
>> +	vcpu->mmio_is_write = 0;
>> +
>> +	return X86EMUL_UNHANDLEABLE;
>>  }
>>
>>  static int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
>> @@ -1070,6 +1096,8 @@ static int emulator_write_emulated(unsigned long addr,
>>  {
>>  	struct kvm_vcpu *vcpu = ctxt->vcpu;
>>  	gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
>> +	int i;
>> +	struct kvm_io_device *mmio_dev;
>>
>>  	if (gpa == UNMAPPED_GVA)
>>  		return X86EMUL_PROPAGATE_FAULT;
>> @@ -1077,6 +1105,13 @@ static int emulator_write_emulated(unsigned long addr,
>>  	if (emulator_write_phys(vcpu, gpa, val, bytes))
>>  		return X86EMUL_CONTINUE;
>>
>> +	/* Is thie MMIO handled locally? */
>>
>
> spelling

Done

>
>> +	mmio_dev = vcpu_find_mmio_dev(vcpu, gpa);
>> +	if(mmio_dev) {
>> +		mmio_dev->write(mmio_dev, gpa, bytes, val);
>> +		return X86EMUL_CONTINUE;
>> +	}
>> +
>>  	vcpu->mmio_needed = 1;
>>  	vcpu->mmio_phys_addr = gpa;
>>  	vcpu->mmio_size = bytes;
>>
>>
>>
>
> Please fix and *test*.  Boot at least 32-bit Windows with ACPI HAL and
> 64-bit Linux, the more the better of course.

I have confirmed that my 64-bit Linux guest boots fine.  I don't currently
have any other guests.
Careful review of the code leads me to believe this should be an inert
change, so I won't go through the effort of finding an XP CD to install
unless you insist ;)

Regards,
-Greg

--=__PartBA9DE48F.0__=
Content-Type: text/plain; name="in-kernel-mmio.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="in-kernel-mmio.patch"

diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index fceeb84..c1923df 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -236,6 +236,56 @@ struct kvm_pio_request {
 	int rep;
 };
 
+struct kvm_io_device {
+	unsigned long (*read)(struct kvm_io_device *this,
+			      gpa_t addr,
+			      int length);
+	void (*write)(struct kvm_io_device *this,
+		      gpa_t addr,
+		      int length,
+		      unsigned long val);
+	int (*in_range)(struct kvm_io_device *this, gpa_t addr);
+
+	void *private;
+};
+
+/* It would be nice to use something smarter than a linear search, TBD...
+ * Thankfully we dont expect many devices to register (famous last words :),
+ * so until then it will suffice. At least its abstracted so we can change
+ * in one place.
+ */
+struct kvm_io_bus {
+	int dev_count;
+#define NR_IOBUS_DEVS 6
+	struct kvm_io_device *devs[NR_IOBUS_DEVS];
+};
+
+static inline void kvm_io_bus_init(struct kvm_io_bus *bus)
+{
+	memset(bus, 0, sizeof(*bus));
+}
+
+static inline struct kvm_io_device *kvm_io_bus_find_dev(struct kvm_io_bus *bus, gpa_t addr)
+{
+	int i;
+
+	for(i=0; i<bus->dev_count; i++) {
+		struct kvm_io_device *pos = bus->devs[i];
+
+		if (pos->in_range(pos, addr))
+			return pos;
+	}
+
+	return NULL;
+}
+
+static inline void kvm_io_bus_register_dev(struct kvm_io_bus *bus, struct kvm_io_device *dev)
+{
+	BUG_ON(bus->dev_count >= (NR_IOBUS_DEVS-1));
+
+	bus->devs[bus->dev_count++] = dev;
+}
+
 struct kvm_vcpu {
 	struct kvm *kvm;
 	union {
@@ -294,6 +344,7 @@ struct kvm_vcpu {
 	gpa_t mmio_phys_addr;
 	struct kvm_pio_request pio;
 	void *pio_data;
+	struct kvm_io_bus mmio_bus;
 
 	int sigset_active;
 	sigset_t sigset;
@@ -345,6 +396,7 @@ struct kvm {
 	unsigned long rmap_overflow;
 	struct list_head vm_list;
 	struct file *filp;
+	struct kvm_io_bus mmio_bus;
 };
 
 struct kvm_stat {
diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index 4473174..9ca0ad3 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -294,6 +294,7 @@ static struct kvm *kvm_create_vm(void)
 
 	spin_lock_init(&kvm->lock);
 	INIT_LIST_HEAD(&kvm->active_mmu_pages);
+	kvm_io_bus_init(&kvm->mmio_bus);
 	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
 		struct kvm_vcpu *vcpu = &kvm->vcpus[i];
 
@@ -302,6 +303,7 @@ static struct kvm *kvm_create_vm(void)
 		vcpu->kvm = kvm;
 		vcpu->mmu.root_hpa = INVALID_PAGE;
 		INIT_LIST_HEAD(&vcpu->free_pages);
+		kvm_io_bus_init(&vcpu->mmio_bus);
 	spin_lock(&kvm_lock);
 	list_add(&kvm->vm_list, &vm_list);
 	spin_unlock(&kvm_lock);
@@ -1015,12 +1017,28 @@ static int emulator_write_std(unsigned long addr,
 	return X86EMUL_UNHANDLEABLE;
 }
 
+static struct kvm_io_device* vcpu_find_mmio_dev(struct kvm_vcpu *vcpu,
+						gpa_t addr)
+{
+	struct kvm_io_device *mmio_dev;
+
+	/* First check the local CPU addresses */
+	mmio_dev = kvm_io_bus_find_dev(&vcpu->mmio_bus, addr);
+	if (!mmio_dev)
+		/* Then check the entire VM */
+		mmio_dev = kvm_io_bus_find_dev(&vcpu->kvm->mmio_bus, addr);
+
+	return mmio_dev;
+}
+
 static int emulator_read_emulated(unsigned long addr,
 				  unsigned long *val,
 				  unsigned int bytes,
 				  struct x86_emulate_ctxt *ctxt)
 {
-	struct kvm_vcpu *vcpu = ctxt->vcpu;
+	struct kvm_vcpu *vcpu = ctxt->vcpu;
+	struct kvm_io_device *mmio_dev;
+	gpa_t gpa;
 
 	if (vcpu->mmio_read_completed) {
 		memcpy(val, vcpu->mmio_data, bytes);
@@ -1029,18 +1047,24 @@ static int emulator_read_emulated(unsigned long addr,
 	} else if (emulator_read_std(addr, val, bytes, ctxt)
 		   == X86EMUL_CONTINUE)
 		return X86EMUL_CONTINUE;
-	else {
-		gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
 
-		if (gpa == UNMAPPED_GVA)
-			return X86EMUL_PROPAGATE_FAULT;
-		vcpu->mmio_needed = 1;
-		vcpu->mmio_phys_addr = gpa;
-		vcpu->mmio_size = bytes;
-		vcpu->mmio_is_write = 0;
+	gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
+	if (gpa == UNMAPPED_GVA)
+		return X86EMUL_PROPAGATE_FAULT;
 
-	return X86EMUL_UNHANDLEABLE;
+	/* Is this MMIO handled locally? */
+	mmio_dev = vcpu_find_mmio_dev(vcpu, gpa);
+	if (mmio_dev) {
+		*val = mmio_dev->read(mmio_dev, gpa, bytes);
+		return X86EMUL_CONTINUE;
 	}
+
+	vcpu->mmio_needed = 1;
+	vcpu->mmio_phys_addr = gpa;
+	vcpu->mmio_size = bytes;
+	vcpu->mmio_is_write = 0;
+
+	return X86EMUL_UNHANDLEABLE;
 }
 
 static int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
@@ -1068,8 +1092,9 @@ static int emulator_write_emulated(unsigned long addr,
 				   unsigned int bytes,
 				   struct x86_emulate_ctxt *ctxt)
 {
-	struct kvm_vcpu *vcpu = ctxt->vcpu;
+	struct kvm_vcpu *vcpu = ctxt->vcpu;
+	struct kvm_io_device *mmio_dev;
+	gpa_t gpa = vcpu->mmu.gva_to_gpa(vcpu, addr);
 
 	if (gpa == UNMAPPED_GVA)
 		return X86EMUL_PROPAGATE_FAULT;
@@ -1077,6 +1102,13 @@ static int emulator_write_emulated(unsigned long addr,
 	if (emulator_write_phys(vcpu, gpa, val, bytes))
 		return X86EMUL_CONTINUE;
 
+	/* Is this MMIO handled locally? */
+	mmio_dev = vcpu_find_mmio_dev(vcpu, gpa);
+	if (mmio_dev) {
+		mmio_dev->write(mmio_dev, gpa, bytes, val);
+		return X86EMUL_CONTINUE;
+	}
+
 	vcpu->mmio_needed = 1;
 	vcpu->mmio_phys_addr = gpa;
 	vcpu->mmio_size = bytes;

--=__PartBA9DE48F.0__=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
kvm-devel mailing list
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
https://lists.sourceforge.net/lists/listinfo/kvm-devel
--=__PartBA9DE48F.0__=--