From: Stefan Hajnoczi
Date: Wed, 24 Oct 2012 12:15:17 +0200
Subject: Re: [Qemu-devel] [RFC PATCH v3 05/19] Implement dimm device abstraction
To: liu ping fan
Cc: blauwirbel@gmail.com, kvm@vger.kernel.org, gleb@redhat.com,
 seabios@seabios.org, qemu-devel@nongnu.org, Vasilis Liaskovitis,
 kevin@koconnor.net, avi@redhat.com, anthony@codemonkey.ws,
 imammedo@redhat.com, Paolo Bonzini, kraxel@redhat.com
References: <1348226255-4226-1-git-send-email-vasilis.liaskovitis@profitbricks.com>
 <1348226255-4226-6-git-send-email-vasilis.liaskovitis@profitbricks.com>
 <20121023122532.GE19977@stefanha-thinkpad.redhat.com>

On Wed, Oct 24, 2012 at 10:06 AM, liu ping fan wrote:
> On Tue, Oct 23, 2012 at 8:25 PM, Stefan Hajnoczi wrote:
>> On Fri, Sep 21, 2012 at 01:17:21PM +0200, Vasilis Liaskovitis wrote:
>>> +static void dimm_populate(DimmDevice *s)
>>> +{
>>> +    DeviceState *dev= (DeviceState*)s;
>>> +    MemoryRegion *new = NULL;
>>> +
>>> +    new = g_malloc(sizeof(MemoryRegion));
>>> +    memory_region_init_ram(new, dev->id, s->size);
>>> +    vmstate_register_ram_global(new);
>>> +    memory_region_add_subregion(get_system_memory(),
>>> +                                s->start, new);
>>> +    s->mr = new;
>>> +}
>>> +
>>> +static void dimm_depopulate(DimmDevice *s)
>>> +{
>>> +    assert(s);
>>> +    vmstate_unregister_ram(s->mr, NULL);
>>> +    memory_region_del_subregion(get_system_memory(), s->mr);
>>> +    memory_region_destroy(s->mr);
>>> +    s->mr = NULL;
>>> +}
>>
>> How is dimm hot unplug protected against callers who currently have RAM
>> mapped (from cpu_physical_memory_map())?
>>
>> Emulated devices call cpu_physical_memory_map() directly or indirectly
>> through DMA emulation code. The RAM pointer may be held for arbitrary
>> lengths of time, across main loop iterations, etc.
>>
>> It's not clear to me that it is safe to unplug a DIMM that has network
>> or disk I/O buffers, for example. We also need to be robust against
>> malicious guests who abuse the hotplug lifecycle. QEMU should never be
>> left with dangling pointers.
>>
> Not sure about the block layer. But I think those thread are already
> out of big lock, so there should be a MemoryListener to catch the
> RAM-unplug event, and if needed, bdrv_flush.

Here is the detailed scenario:

1. Emulated device does cpu_physical_memory_map() and gets a pointer
   to guest RAM.
2. Return to vcpu or iothread, continue processing...
3. Hot unplug of RAM causes the guest RAM to disappear.
4. Pending I/O completes and overwrites memory through the dangling
   guest RAM pointer.

Any I/O device that does zero-copy I/O in QEMU faces this problem:

 * The block layer is affected.
 * The net layer is unaffected because it doesn't do zero-copy tx/rx
   across returns to the main loop (#2 above).
 * Not sure about other device classes (e.g. USB).

How should the MemoryListener callback work? For block I/O it may not
be possible to cancel pending I/O asynchronously - if you try to
cancel then your thread may block until the I/O completes.
Synchronous cancel behavior is not workable since it can lead to poor
latency or hangs in the guest.

Stefan
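
For illustration, the four-step scenario above and one possible
non-blocking way out of it can be sketched in plain C. This is only a
sketch under stated assumptions: Region, region_new(), region_map(),
region_unmap() and region_unplug() are hypothetical names, not QEMU
APIs, and nothing in the patch under review works this way. The idea
is that a mapping pins the region, and an unplug request that arrives
while the region is pinned is deferred until the last mapping is
released, rather than blocking the caller or freeing memory under the
pending I/O.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hot-pluggable RAM region.
 * This is NOT QEMU's MemoryRegion; all names here are illustrative. */
typedef struct {
    unsigned char *host;   /* backing memory, NULL once it has been freed */
    size_t size;           /* size of the backing memory in bytes */
    unsigned refs;         /* number of outstanding mappings (pins) */
    bool unplug_pending;   /* unplug was requested while mappings existed */
} Region;

Region *region_new(size_t size)
{
    Region *r = calloc(1, sizeof(*r));
    if (!r) {
        return NULL;
    }
    r->host = calloc(1, size);
    if (!r->host) {
        free(r);
        return NULL;
    }
    r->size = size;
    return r;
}

/* Step 1 analogue: hand out a host pointer and pin the region.
 * New mappings are refused once an unplug is pending or done. */
unsigned char *region_map(Region *r)
{
    if (r->unplug_pending || !r->host) {
        return NULL;
    }
    r->refs++;
    return r->host;
}

/* Step 4 analogue: the I/O has completed and drops its pin.
 * If an unplug was deferred, the last unmap carries it out. */
void region_unmap(Region *r)
{
    assert(r->refs > 0);
    r->refs--;
    if (r->refs == 0 && r->unplug_pending) {
        free(r->host);
        r->host = NULL;
        r->unplug_pending = false;
    }
}

/* Step 3 analogue: request an unplug without ever blocking.
 * Returns true if the memory is gone now, false if it was deferred. */
bool region_unplug(Region *r)
{
    if (!r->host) {
        return true;               /* already unplugged */
    }
    if (r->refs > 0) {
        r->unplug_pending = true;  /* finish later, in region_unmap() */
        return false;
    }
    free(r->host);
    r->host = NULL;
    return true;
}
```

With this shape the unplug request never waits for I/O, and a pointer
handed out by region_map() stays valid until the matching
region_unmap(). The cost is that a guest which keeps a mapping alive
can delay the unplug indefinitely, so a real design would still need a
policy for that case.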