From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35864) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dKU3t-0004gw-J0 for qemu-devel@nongnu.org; Mon, 12 Jun 2017 14:27:38 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dKU3n-0006I8-FS for qemu-devel@nongnu.org; Mon, 12 Jun 2017 14:27:37 -0400 Received: from mx1.redhat.com ([209.132.183.28]:54170) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1dKU3n-0006HT-6P for qemu-devel@nongnu.org; Mon, 12 Jun 2017 14:27:31 -0400 References: <1497097821-32754-1-git-send-email-mark.cave-ayland@ilande.co.uk> <1497097821-32754-2-git-send-email-mark.cave-ayland@ilande.co.uk> <20170612132034.1d15ab04@nial.brq.redhat.com> From: Laszlo Ersek Message-ID: <0d1ddd4f-d008-29df-cb6d-38a8da84afb5@redhat.com> Date: Mon, 12 Jun 2017 20:27:24 +0200 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 1/6] fw_cfg: move initialisation of FWCfgState into instance_init List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Mark Cave-Ayland , Igor Mammedov Cc: ehabkost@redhat.com, mst@redhat.com, somlo@cmu.edu, qemu-devel@nongnu.org, rjones@redhat.com, pbonzini@redhat.com Hi Mark, On 06/12/17 13:45, Mark Cave-Ayland wrote: > On 12/06/17 12:20, Igor Mammedov wrote: > >> On Sat, 10 Jun 2017 13:30:16 +0100 >> Mark Cave-Ayland wrote: >> >>> Signed-off-by: Mark Cave-Ayland >>> --- >>> hw/nvram/fw_cfg.c | 14 ++++++++++---- >>> 1 file changed, 10 insertions(+), 4 deletions(-) >>> >>> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c >>> index 316fca9..144e0c6 100644 >>> --- a/hw/nvram/fw_cfg.c >>> +++ b/hw/nvram/fw_cfg.c >>> @@ -1017,6 +1017,15 @@ FWCfgState *fw_cfg_find(void) >>> return FW_CFG(object_resolve_path(FW_CFG_PATH, NULL)); >>> } >>> >>> +static void fw_cfg_init(Object *obj) >>> +{ >>> + FWCfgState *s = FW_CFG(obj); >>> + >>> + s->entries[0] = g_new0(FWCfgEntry, fw_cfg_max_entry(s)); >>> + s->entries[1] = g_new0(FWCfgEntry, fw_cfg_max_entry(s)); >>> + s->entry_order = g_new0(int, fw_cfg_max_entry(s)); >> it doesn't seem right, >> >> consider, >> object_new(fwcfg) >> 1: fw_cfg_init -> g_new0(FWCfgEntry, fw_cfg_max_entry(s) -> FW_CFG_FILE_SLOTS_DFLT); >> 2: set property x-file-slots >> 3: realize -> fw_cfg_file_slots_allocate() >> >>> @@ -1052,10 +1062,6 @@ static void fw_cfg_file_slots_allocate(FWCfgState *s, Error **errp) >>> file_slots_max); >>> return; >>> } >>> - >>> - s->entries[0] = g_new0(FWCfgEntry, fw_cfg_max_entry(s)); >>> - s->entries[1] = g_new0(FWCfgEntry, fw_cfg_max_entry(s)); >>> - s->entry_order = g_new0(int, fw_cfg_max_entry(s)); >> opps, s->entries doesn't account for new values of x-file-slots >> >> So question is why this patch is needed at all (it needs some reasoning in commit message)? >> >> So for now NACK and I'd suggest to drop this patch. > > Right I missed the x-file-slots property when I was looking at this, > mainly because all of the existing callers went through > fw_cfg_init_mem() and fw_cfg_init_io(). I do see now though that the > property is referenced in compat.h. > > The reason for moving the initialisation is that if you apply patch 2 on > its own then you get hit by the following assert on qemu-system-sparc64 > due to uninitialised data: > > Program received signal SIGSEGV, Segmentation fault. > 0x000000000062f81d in fw_cfg_add_bytes_read_callback (s=0x1e85410, > key=0, callback=0, callback_opaque=0x0, data=0x9f6c28, len=4, > read_only=true) at hw/nvram/fw_cfg.c:633 > 633 assert(s->entries[arch][key].data == NULL); /* avoid key > conflict */ > (gdb) bt > #0 0x000000000062f81d in fw_cfg_add_bytes_read_callback (s=0x1e85410, > key=0, callback=0, callback_opaque=0x0, data=0x9f6c28, len=4, > read_only=true) at hw/nvram/fw_cfg.c:633 > #1 0x000000000062fac8 in fw_cfg_add_bytes (s=0x1e85410, key=0, > data=0x9f6c28, len=4) at hw/nvram/fw_cfg.c:664 > #2 0x00000000006307a8 in fw_cfg_init1 (dev=0x1e85410) at > hw/nvram/fw_cfg.c:922 > #3 0x00000000006308fe in fw_cfg_init_io_dma (iobase=1296, dma_iobase=0, > dma_as=0x0) at hw/nvram/fw_cfg.c:948 > #4 0x00000000006309bc in fw_cfg_init_io (iobase=1296) at > hw/nvram/fw_cfg.c:968 > #5 0x00000000004f31e9 in sun4uv_init (address_space_mem=0x132df20, > machine=0x132c8c0, hwdef=0x8bf400) at > /home/build/src/qemu/git/qemu/hw/sparc64/sun4u.c:515 > #6 0x00000000004f3403 in sun4u_init (machine=0x132c8c0) at > /home/build/src/qemu/git/qemu/hw/sparc64/sun4u.c:565 > #7 0x00000000005c0d26 in machine_run_board_init (machine=0x132c8c0) at > hw/core/machine.c:760 > > > #8 0x0000000000532e56 in main (argc=2, argv=0x7fffffffeaa8, > envp=0x7fffffffeac0) at vl.c:4594 > > Based upon this what do you think the best solution would be? if you apply patch #2 without patch #1, then the above SIGSEGV will hit on all fw_cfg using targets / machine types, not just qemu-system-sparc64. The reason is that, after patch #2 (without patch #1 applied) you have fw_cfg_init1() ... fw_cfg_add_bytes_read_callback() s->entries[arch][key].data qdev_init_nofail() fw_cfg_file_slots_allocate() IOW, the s->entries array is now dynamically allocated (after the introduction of the x-file-slots property): FWCfgEntry *entries[2]; and it is allocated in fw_cfg_file_slots_allocate(), called from realize. But with patch #2 applied in isolation, you perform the first dereference (from fw_cfg_init1(), indirectly) before allocation, that's why it crashes. Ultimately we need the following order: (1) set properties (from device defaults, from device compat settings, and from the command line; and also directly from code, such as in fw_cfg_init_io_dma()) (2) call qdev_init_nofail() -- this will consume the properties from (1), including the x-file-slots property, for allocating "s->entries" in fw_cfg_file_slots_allocate() (3) call fw_cfg_add_*() -- now that the device has been realized and is usable. This means that - patch #1 should be dropped, - and in patch #2, you have to push the rest of fw_cfg_init1() -- everything that is after the original qdev_init_nofail() call -- to the end of the realize functions. Alternatively, in patch #2, you have to split fw_cfg_init1() into two parts, fw_cfg_init1() and fw_cfg_init2(). fw_cfg_init1() would keep everything that's *before* the original qdev_init_nofail() call, and fw_cfg_init2() would get everything *after*. Then, in fw_cfg_init_io_dma() and fw_cfg_init_mem_wide(), you'd do fw_cfg_init1(dev); qdev_init_nofail(dev); fw_cfg_init2(dev); In short, you cannot reorder steps (2) and (3) against each other -- you cannot add fw_cfg entries before you realize the device -- and the crash happens because that's exactly what patch #2 does at the moment. And, patch #1 is not the right fix (you can allocate s->entries only in realize, because the size depends on a property, which you can only access in realize). The right fix is to drop patch #1 and to split fw_cfg_init1() like described above. ... Hmmm, wait a second, while the above approach would fix patch #2, in fact I think patch #2 doesn't have the right *goal*. I'll comment under patch #2. Thanks Laszlo