From mboxrd@z Thu Jan 1 00:00:00 1970
From: Qing He
Subject: Re: IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)
Date: Thu, 22 Oct 2009 13:10:48 +0800
Message-ID: <20091022051048.GA775@ub-qhe2>
References: <706158FABBBA044BAD4FE898A02E4BC201C9BD8CED@pdsmsx503.ccr.corp.intel.com> <2B044E14371DA244B71F8BF2514563F503FC081B@cosmail03.lsi.com> <706158FABBBA044BAD4FE898A02E4BC201C9BD914F@pdsmsx503.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="ikeVEW9yuYc//A+q"
Return-path:
Content-Disposition: inline
In-Reply-To: <706158FABBBA044BAD4FE898A02E4BC201C9BD914F@pdsmsx503.ccr.corp.intel.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: "Zhang, Xiantao"
Cc: "Cinco, Dante" , "xen-devel@lists.xensource.com" , keir.fraser@eu.citrix.com
List-Id: xen-devel@lists.xenproject.org

--ikeVEW9yuYc//A+q
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, 2009-10-22 at 09:58 +0800, Zhang, Xiantao wrote:
> > (XEN) traps.c:1626: guest_io_write::pci_conf_write data=0x40ba
>
> This should be written by dom0(likely to be Qemu). And if it does
> exist, we may have to prohibit such unsafe writings about MSI in
> Qemu.

Yes, that is the case. The problem happens in Qemu; the algorithm looks
like this:

    pt_pci_write_config(new_value)
    {
        dev_value = pci_read_block();
        value = msi_write_handler(dev_value, new_value);
        pci_write_block(value);
    }

    msi_write_handler(dev_value, new_value)
    {
        HYPERVISOR_bind_pt_irq();  // updates MSI binding
        return dev_value;          // it decides not to change it
    }

The problem lies here: when bind_pt_irq is called, the real physical
data/address is updated by the hypervisor. No problem was exposed
before because at that time the hypervisor used a universal vector, so
the MSI data/address remained unchanged.


But this is no longer the case with per-CPU vectors. The
pci_write_block in Qemu is now undesirable: it writes the stale value
back into the register and invalidates any modification made by the
hypervisor. Clearly, if Qemu decides to hand the management of these
registers to the hypervisor, it shouldn't touch them again.

Here is a patch to fix this by introducing a no_wb flag. Can you give
it a try?

Thanks,
Qing

--ikeVEW9yuYc//A+q
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="qemu-msi-no-wb.patch"

diff --git a/hw/pass-through.c b/hw/pass-through.c
index 8d80755..b1a3b09 100644
--- a/hw/pass-through.c
+++ b/hw/pass-through.c
@@ -626,6 +626,7 @@ static struct pt_reg_info_tbl pt_emu_reg_msi_tbl[] = {
         .init_val   = 0x00000000,
         .ro_mask    = 0x00000003,
         .emu_mask   = 0xFFFFFFFF,
+        .no_wb      = 1,
         .init       = pt_common_reg_init,
         .u.dw.read  = pt_long_reg_read,
         .u.dw.write = pt_msgaddr32_reg_write,
@@ -638,6 +639,7 @@ static struct pt_reg_info_tbl pt_emu_reg_msi_tbl[] = {
         .init_val   = 0x00000000,
         .ro_mask    = 0x00000000,
         .emu_mask   = 0xFFFFFFFF,
+        .no_wb      = 1,
         .init       = pt_msgaddr64_reg_init,
         .u.dw.read  = pt_long_reg_read,
         .u.dw.write = pt_msgaddr64_reg_write,
@@ -650,6 +652,7 @@ static struct pt_reg_info_tbl pt_emu_reg_msi_tbl[] = {
         .init_val   = 0x0000,
         .ro_mask    = 0x0000,
         .emu_mask   = 0xFFFF,
+        .no_wb      = 1,
         .init       = pt_msgdata_reg_init,
         .u.w.read   = pt_word_reg_read,
         .u.w.write  = pt_msgdata_reg_write,
@@ -662,6 +665,7 @@ static struct pt_reg_info_tbl pt_emu_reg_msi_tbl[] = {
         .init_val   = 0x0000,
         .ro_mask    = 0x0000,
         .emu_mask   = 0xFFFF,
+        .no_wb      = 1,
         .init       = pt_msgdata_reg_init,
         .u.w.read   = pt_word_reg_read,
         .u.w.write  = pt_msgdata_reg_write,
@@ -1550,10 +1554,12 @@ static void pt_pci_write_config(PCIDevice *d, uint32_t address, uint32_t val,
     val >>= ((address & 3) << 3);
 
 out:
-    ret = pci_write_block(pci_dev, address, (uint8_t *)&val, len);
+    if (!reg->no_wb) {
+        ret = pci_write_block(pci_dev, address, (uint8_t *)&val, len);
 
-    if (!ret)
-        PT_LOG("Error: pci_write_block failed. return value[%d].\n", ret);
+        if (!ret)
+            PT_LOG("Error: pci_write_block failed. return value[%d].\n", ret);
+    }
 
     if (pm_state != NULL && pm_state->flags & PT_FLAG_TRANSITING)
         /* set QEMUTimer */
diff --git a/hw/pass-through.h b/hw/pass-through.h
index 028a03e..3c79885 100644
--- a/hw/pass-through.h
+++ b/hw/pass-through.h
@@ -364,6 +364,8 @@ struct pt_reg_info_tbl {
     uint32_t ro_mask;
     /* reg emulate field mask (ON:emu, OFF:passthrough) */
     uint32_t emu_mask;
+    /* no write back allowed */
+    uint32_t no_wb;
     /* emul reg initialize method */
     conf_reg_init init;
     union {

--ikeVEW9yuYc//A+q
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

--ikeVEW9yuYc//A+q--