From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Yu, Zhang" Subject: Re: [PATCH v4 1/2] Differentiate IO/mem resources tracked by ioreq server Date: Fri, 14 Aug 2015 17:30:27 +0800 Message-ID: <55CDB533.70602@linux.intel.com> References: <1439460355-10481-1-git-send-email-yu.c.zhang@linux.intel.com> <1439460355-10481-2-git-send-email-yu.c.zhang@linux.intel.com> <9AAE0902D5BC7E449B7C8E4E778ABCD02F58134F@AMSPEX01CL01.citrite.net> <55CC73D7.1000705@linux.intel.com> <55CD9B7D.2040208@linux.intel.com> <9AAE0902D5BC7E449B7C8E4E778ABCD02F582526@AMSPEX01CL01.citrite.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD02F582526@AMSPEX01CL01.citrite.net> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Paul Durrant , "xen-devel@lists.xen.org" , Ian Jackson , Stefano Stabellini , Ian Campbell , Wei Liu , "Keir (Xen.org)" , "jbeulich@suse.com" , Andrew Cooper Cc: Kevin Tian , "zhiyuan.lv@intel.com" List-Id: xen-devel@lists.xenproject.org On 8/14/2015 4:46 PM, Paul Durrant wrote: >> -----Original Message----- >> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com] >> Sent: 14 August 2015 08:41 >> To: Paul Durrant; xen-devel@lists.xen.org; Ian Jackson; Stefano Stabellini; Ian >> Campbell; Wei Liu; Keir (Xen.org); jbeulich@suse.com; Andrew Cooper >> Cc: Kevin Tian; zhiyuan.lv@intel.com >> Subject: Re: [Xen-devel] [PATCH v4 1/2] Differentiate IO/mem resources >> tracked by ioreq server >> >> Hi Paul, >> >> On 8/13/2015 6:39 PM, Yu, Zhang wrote: >>> >>> >>> On 8/13/2015 6:16 PM, Paul Durrant wrote: >>>>> -----Original Message----- >>>>> From: Yu Zhang [mailto:yu.c.zhang@linux.intel.com] >>>>> Sent: 13 August 2015 11:06 >>>>> To: xen-devel@lists.xen.org; Paul Durrant; Ian Jackson; Stefano >>>>> Stabellini; Ian >>>>> Campbell; Wei Liu; Keir (Xen.org); jbeulich@suse.com; Andrew Cooper >>>>> Cc: Kevin Tian; zhiyuan.lv@intel.com >>>>> Subject: [PATCH v4 1/2] Differentiate IO/mem resources tracked by >> ioreq >>>>> server >>>>> >>>>> Currently in ioreq server, guest write-protected ram pages are >>>>> tracked in the same rangeset with device mmio resources. Yet >>>>> unlike device mmio, which can be in big chunks, the guest write- >>>>> protected pages may be discrete ranges with 4K bytes each. >>>> >>>> Would the interfaces be better using xen_pfn_t rather than using >>>> uint64_t to pass addresses round where the bottom 12 bits will always >>>> be zero? >>>> >>>> Paul >>> >>> Thank you, Paul. >>> Well, I'm not quite sure if the bottom 12 bits of the address are zero. >>> I had thought these addresses are just guest physical ones causing >>> the ept violation, and not aligned down to its page address, like the >>> MMIO. So, if our handler code has already done that, using xen_pfn_t >>> could be better(my developing machine encountered some hardware >>> problem, will check the code tomorrow) :) >>> >>> Yu >> >> I just checked the code in hvm_select_ioreq_server(), and found value >> of address is not aligned down, and the lower 12 bits are not zero. >> So I wonder, is it necessary to do the shift to get a gfn, and what's >> the benefit? >> > > Well, you can only make whole pages mmio_dm_write and I was assuming an emulator that did so would be interested in writes anywhere in the page. Maybe that assumption is wrong? > > Paul Thank you, Paul. That makes sense. I'll try use the xen_pfn_t. :-) Yu > >> Thanks >> Yu >>>> >>>>> >>>>> This patch uses a seperate rangeset for the guest ram pages. >>>>> And a new ioreq type, IOREQ_TYPE_WP_MEM, is defined. >>>>> >>>>> Note: Previously, a new hypercall or subop was suggested to map >>>>> write-protected pages into ioreq server. However, it turned out >>>>> handler of this new hypercall would be almost the same with the >>>>> existing pair - HVMOP_[un]map_io_range_to_ioreq_server, and there's >>>>> already a type parameter in this hypercall. So no new hypercall >>>>> defined, only a new type is introduced. >>>>> >>>>> Signed-off-by: Yu Zhang >>>>> --- >>>>> tools/libxc/include/xenctrl.h | 31 ++++++++++++++++++++++ >>>>> tools/libxc/xc_domain.c | 55 >>>>> ++++++++++++++++++++++++++++++++++++++++ >>>>> xen/arch/x86/hvm/hvm.c | 26 ++++++++++++++++++- >>>>> xen/include/asm-x86/hvm/domain.h | 4 +-- >>>>> xen/include/public/hvm/hvm_op.h | 1 + >>>>> xen/include/public/hvm/ioreq.h | 1 + >>>>> 6 files changed, 115 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/tools/libxc/include/xenctrl.h >>>>> b/tools/libxc/include/xenctrl.h >>>>> index de3c0ad..53f196d 100644 >>>>> --- a/tools/libxc/include/xenctrl.h >>>>> +++ b/tools/libxc/include/xenctrl.h >>>>> @@ -2010,6 +2010,37 @@ int >>>>> xc_hvm_unmap_io_range_from_ioreq_server(xc_interface *xch, >>>>> int is_mmio, >>>>> uint64_t start, >>>>> uint64_t end); >>>>> +/** >>>>> + * This function registers a range of write-protected memory for >>>>> emulation. >>>>> + * >>>>> + * @parm xch a handle to an open hypervisor interface. >>>>> + * @parm domid the domain id to be serviced >>>>> + * @parm id the IOREQ Server id. >>>>> + * @parm start start of range >>>>> + * @parm end end of range (inclusive). >>>>> + * @return 0 on success, -1 on failure. >>>>> + */ >>>>> +int xc_hvm_map_mem_range_to_ioreq_server(xc_interface *xch, >>>>> + domid_t domid, >>>>> + ioservid_t id, >>>>> + uint64_t start, >>>>> + uint64_t end); >>>>> + >>>>> +/** >>>>> + * This function deregisters a range of write-protected memory for >>>>> emulation. >>>>> + * >>>>> + * @parm xch a handle to an open hypervisor interface. >>>>> + * @parm domid the domain id to be serviced >>>>> + * @parm id the IOREQ Server id. >>>>> + * @parm start start of range >>>>> + * @parm end end of range (inclusive). >>>>> + * @return 0 on success, -1 on failure. >>>>> + */ >>>>> +int xc_hvm_unmap_mem_range_from_ioreq_server(xc_interface >> *xch, >>>>> + domid_t domid, >>>>> + ioservid_t id, >>>>> + uint64_t start, >>>>> + uint64_t end); >>>>> >>>>> /** >>>>> * This function registers a PCI device for config space emulation. >>>>> diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c >>>>> index 2ee26fb..42aeba9 100644 >>>>> --- a/tools/libxc/xc_domain.c >>>>> +++ b/tools/libxc/xc_domain.c >>>>> @@ -1552,6 +1552,61 @@ int >>>>> xc_hvm_unmap_io_range_from_ioreq_server(xc_interface *xch, >> domid_t >>>>> domid, >>>>> return rc; >>>>> } >>>>> >>>>> +int xc_hvm_map_mem_range_to_ioreq_server(xc_interface *xch, >>>>> domid_t domid, >>>>> + ioservid_t id, uint64_t >>>>> start, uint64_t end) >>>>> +{ >>>>> + DECLARE_HYPERCALL; >>>>> + DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg); >>>>> + int rc; >>>>> + >>>>> + arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg)); >>>>> + if ( arg == NULL ) >>>>> + return -1; >>>>> + >>>>> + hypercall.op = __HYPERVISOR_hvm_op; >>>>> + hypercall.arg[0] = HVMOP_map_io_range_to_ioreq_server; >>>>> + hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg); >>>>> + >>>>> + arg->domid = domid; >>>>> + arg->id = id; >>>>> + arg->type = HVMOP_IO_RANGE_WP_MEM; >>>>> + arg->start = start; >>>>> + arg->end = end; >>>>> + >>>>> + rc = do_xen_hypercall(xch, &hypercall); >>>>> + >>>>> + xc_hypercall_buffer_free(xch, arg); >>>>> + return rc; >>>>> +} >>>>> + >>>>> +int xc_hvm_unmap_mem_range_from_ioreq_server(xc_interface >> *xch, >>>>> domid_t domid, >>>>> + ioservid_t id, uint64_t >>>>> start, uint64_t end) >>>>> +{ >>>>> + DECLARE_HYPERCALL; >>>>> + DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg); >>>>> + int rc; >>>>> + >>>>> + arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg)); >>>>> + if ( arg == NULL ) >>>>> + return -1; >>>>> + >>>>> + hypercall.op = __HYPERVISOR_hvm_op; >>>>> + hypercall.arg[0] = HVMOP_unmap_io_range_from_ioreq_server; >>>>> + hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg); >>>>> + >>>>> + arg->domid = domid; >>>>> + arg->id = id; >>>>> + arg->type = HVMOP_IO_RANGE_WP_MEM; >>>>> + arg->start = start; >>>>> + arg->end = end; >>>>> + >>>>> + rc = do_xen_hypercall(xch, &hypercall); >>>>> + >>>>> + xc_hypercall_buffer_free(xch, arg); >>>>> + return rc; >>>>> + >>>>> +} >>>>> + >>>>> int xc_hvm_map_pcidev_to_ioreq_server(xc_interface *xch, domid_t >>>>> domid, >>>>> ioservid_t id, uint16_t segment, >>>>> uint8_t bus, uint8_t device, >>>>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c >>>>> index c957610..0389c0a 100644 >>>>> --- a/xen/arch/x86/hvm/hvm.c >>>>> +++ b/xen/arch/x86/hvm/hvm.c >>>>> @@ -938,8 +938,9 @@ static int >> hvm_ioreq_server_alloc_rangesets(struct >>>>> hvm_ioreq_server *s, >>>>> >>>>> rc = asprintf(&name, "ioreq_server %d %s", s->id, >>>>> (i == HVMOP_IO_RANGE_PORT) ? "port" : >>>>> - (i == HVMOP_IO_RANGE_MEMORY) ? "memory" : >>>>> + (i == HVMOP_IO_RANGE_MEMORY) ? "mmio" : >>>>> (i == HVMOP_IO_RANGE_PCI) ? "pci" : >>>>> + (i == HVMOP_IO_RANGE_WP_MEM) ? "wp-ed memory" : >>>>> ""); >>>>> if ( rc ) >>>>> goto fail; >>>>> @@ -1258,6 +1259,7 @@ static int >>>>> hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id, >>>>> case HVMOP_IO_RANGE_PORT: >>>>> case HVMOP_IO_RANGE_MEMORY: >>>>> case HVMOP_IO_RANGE_PCI: >>>>> + case HVMOP_IO_RANGE_WP_MEM: >>>>> r = s->range[type]; >>>>> break; >>>>> >>>>> @@ -1309,6 +1311,7 @@ static int >>>>> hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t >> id, >>>>> case HVMOP_IO_RANGE_PORT: >>>>> case HVMOP_IO_RANGE_MEMORY: >>>>> case HVMOP_IO_RANGE_PCI: >>>>> + case HVMOP_IO_RANGE_WP_MEM: >>>>> r = s->range[type]; >>>>> break; >>>>> >>>>> @@ -2523,6 +2526,8 @@ struct hvm_ioreq_server >>>>> *hvm_select_ioreq_server(struct domain *d, >>>>> uint32_t cf8; >>>>> uint8_t type; >>>>> uint64_t addr; >>>>> + p2m_type_t p2mt; >>>>> + struct page_info *ram_page; >>>>> >>>>> if ( list_empty(&d->arch.hvm_domain.ioreq_server.list) ) >>>>> return NULL; >>>>> @@ -2565,6 +2570,17 @@ struct hvm_ioreq_server >>>>> *hvm_select_ioreq_server(struct domain *d, >>>>> { >>>>> type = p->type; >>>>> addr = p->addr; >>>>> + >>>>> + if ( p->type == IOREQ_TYPE_COPY ) >>>>> + { >>>>> + ram_page = get_page_from_gfn(d, p->addr >> PAGE_SHIFT, >>>>> + &p2mt, P2M_UNSHARE); >>>>> + if ( p2mt == p2m_mmio_write_dm ) >>>>> + type = IOREQ_TYPE_WP_MEM; >>>>> + >>>>> + if ( ram_page ) >>>>> + put_page(ram_page); >>>>> + } >>>>> } >>>>> >>>>> list_for_each_entry ( s, >>>>> @@ -2582,6 +2598,7 @@ struct hvm_ioreq_server >>>>> *hvm_select_ioreq_server(struct domain *d, >>>>> BUILD_BUG_ON(IOREQ_TYPE_PIO != HVMOP_IO_RANGE_PORT); >>>>> BUILD_BUG_ON(IOREQ_TYPE_COPY != >> HVMOP_IO_RANGE_MEMORY); >>>>> BUILD_BUG_ON(IOREQ_TYPE_PCI_CONFIG != >>>>> HVMOP_IO_RANGE_PCI); >>>>> + BUILD_BUG_ON(IOREQ_TYPE_WP_MEM != >>>>> HVMOP_IO_RANGE_WP_MEM); >>>>> r = s->range[type]; >>>>> >>>>> switch ( type ) >>>>> @@ -2609,6 +2626,13 @@ struct hvm_ioreq_server >>>>> *hvm_select_ioreq_server(struct domain *d, >>>>> } >>>>> >>>>> break; >>>>> + >>>>> + case IOREQ_TYPE_WP_MEM: >>>>> + end = addr + (p->size * p->count) - 1; >>>>> + if ( rangeset_contains_range(r, addr, end) ) >>>>> + return s; >>>>> + >>>>> + break; >>>>> } >>>>> } >>>>> >>>>> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm- >>>>> x86/hvm/domain.h >>>>> index 992d5d1..b2e4234 100644 >>>>> --- a/xen/include/asm-x86/hvm/domain.h >>>>> +++ b/xen/include/asm-x86/hvm/domain.h >>>>> @@ -48,8 +48,8 @@ struct hvm_ioreq_vcpu { >>>>> bool_t pending; >>>>> }; >>>>> >>>>> -#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1) >>>>> -#define MAX_NR_IO_RANGES 256 >>>>> +#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_WP_MEM + 1) >>>>> +#define MAX_NR_IO_RANGES 8192 >>>>> >>>>> struct hvm_ioreq_server { >>>>> struct list_head list_entry; >>>>> diff --git a/xen/include/public/hvm/hvm_op.h >>>>> b/xen/include/public/hvm/hvm_op.h >>>>> index 014546a..c02c91a 100644 >>>>> --- a/xen/include/public/hvm/hvm_op.h >>>>> +++ b/xen/include/public/hvm/hvm_op.h >>>>> @@ -331,6 +331,7 @@ struct xen_hvm_io_range { >>>>> # define HVMOP_IO_RANGE_PORT 0 /* I/O port range */ >>>>> # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */ >>>>> # define HVMOP_IO_RANGE_PCI 2 /* PCI segment/bus/dev/func >> range */ >>>>> +# define HVMOP_IO_RANGE_WP_MEM 3 /* Write-protected memory >>>>> range */ >>>>> uint64_aligned_t start, end; /* IN - inclusive start and end of >>>>> range */ >>>>> }; >>>>> typedef struct xen_hvm_io_range xen_hvm_io_range_t; >>>>> diff --git a/xen/include/public/hvm/ioreq.h >>>>> b/xen/include/public/hvm/ioreq.h >>>>> index 2e5809b..2b712ac 100644 >>>>> --- a/xen/include/public/hvm/ioreq.h >>>>> +++ b/xen/include/public/hvm/ioreq.h >>>>> @@ -35,6 +35,7 @@ >>>>> #define IOREQ_TYPE_PIO 0 /* pio */ >>>>> #define IOREQ_TYPE_COPY 1 /* mmio ops */ >>>>> #define IOREQ_TYPE_PCI_CONFIG 2 >>>>> +#define IOREQ_TYPE_WP_MEM 3 >>>>> #define IOREQ_TYPE_TIMEOFFSET 7 >>>>> #define IOREQ_TYPE_INVALIDATE 8 /* mapcache */ >>>>> >>>>> -- >>>>> 1.9.1 >>>> >>>> >>>> _______________________________________________ >>>> Xen-devel mailing list >>>> Xen-devel@lists.xen.org >>>> http://lists.xen.org/xen-devel >>>> >>>> >>> >>> _______________________________________________ >>> Xen-devel mailing list >>> Xen-devel@lists.xen.org >>> http://lists.xen.org/xen-devel >>> >>> > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel > >