From: "Yu, Zhang" <yu.c.zhang@linux.intel.com>
To: Paul Durrant <Paul.Durrant@citrix.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
Ian Jackson <Ian.Jackson@citrix.com>,
Stefano Stabellini <Stefano.Stabellini@citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com>,
Wei Liu <wei.liu2@citrix.com>, "Keir (Xen.org)" <keir@xen.org>,
"jbeulich@suse.com" <jbeulich@suse.com>,
Andrew Cooper <Andrew.Cooper3@citrix.com>
Cc: Kevin Tian <kevin.tian@intel.com>,
"zhiyuan.lv@intel.com" <zhiyuan.lv@intel.com>
Subject: Re: [PATCH v5 1/2] Differentiate IO/mem resources tracked by ioreq server
Date: Wed, 19 Aug 2015 17:09:36 +0800 [thread overview]
Message-ID: <55D447D0.7040307@linux.intel.com> (raw)
In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD02F589FE0@AMSPEX01CL01.citrite.net>
On 8/18/2015 9:25 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: 17 August 2015 22:34
>> To: Paul Durrant; xen-devel@lists.xen.org; Ian Jackson; Stefano Stabellini; Ian
>> Campbell; Wei Liu; Keir (Xen.org); jbeulich@suse.com; Andrew Cooper
>> Cc: Kevin Tian; zhiyuan.lv@intel.com
>> Subject: Re: [Xen-devel] [PATCH v5 1/2] Differentiate IO/mem resources
>> tracked by ioreq server
>>
>>
>>
>> On 8/14/2015 9:51 PM, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Yu Zhang [mailto:yu.c.zhang@linux.intel.com]
>>>> Sent: 14 August 2015 13:03
>>>> To: xen-devel@lists.xen.org; Paul Durrant; Ian Jackson; Stefano Stabellini;
>> Ian
>>>> Campbell; Wei Liu; Keir (Xen.org); jbeulich@suse.com; Andrew Cooper
>>>> Cc: Kevin Tian; zhiyuan.lv@intel.com
>>>> Subject: [PATCH v5 1/2] Differentiate IO/mem resources tracked by ioreq
>>>> server
>>>>
>>>> Currently in ioreq server, guest write-protected ram pages are
>>>> tracked in the same rangeset as device mmio resources. Yet
>>>> unlike device mmio, which can come in large chunks, the guest
>>>> write-protected pages may be discrete ranges of 4K bytes each.
>>>>
>>>> This patch uses a separate rangeset for the guest ram pages,
>>>> and defines a new ioreq type, IOREQ_TYPE_WP_MEM.
>>>>
>>>> Note: Previously, a new hypercall or subop was suggested to map
>>>> write-protected pages into the ioreq server. However, it turned out
>>>> that the handler of this new hypercall would be almost the same as
>>>> that of the existing pair - HVMOP_[un]map_io_range_to_ioreq_server -
>>>> and this hypercall already takes a type parameter. So no new
>>>> hypercall is defined; only a new type is introduced.
>>>>
>>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>>> ---
>>>> tools/libxc/include/xenctrl.h | 31 ++++++++++++++++++++++
>>>> tools/libxc/xc_domain.c | 55 ++++++++++++++++++++++++++++++++++++++++
>>>> xen/arch/x86/hvm/hvm.c | 25 +++++++++++++++++-
>>>> xen/include/asm-x86/hvm/domain.h | 4 +--
>>>> xen/include/public/hvm/hvm_op.h | 1 +
>>>> xen/include/public/hvm/ioreq.h | 1 +
>>>> 6 files changed, 114 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
>>>> index de3c0ad..3c276b7 100644
>>>> --- a/tools/libxc/include/xenctrl.h
>>>> +++ b/tools/libxc/include/xenctrl.h
>>>> @@ -2010,6 +2010,37 @@ int xc_hvm_unmap_io_range_from_ioreq_server(xc_interface *xch,
>>>> int is_mmio,
>>>> uint64_t start,
>>>> uint64_t end);
>>>> +/**
>>>> + * This function registers a range of write-protected memory for emulation.
>>>> + *
>>>> + * @parm xch a handle to an open hypervisor interface.
>>>> + * @parm domid the domain id to be serviced
>>>> + * @parm id the IOREQ Server id.
>>>> + * @parm start start of range
>>>> + * @parm end end of range (inclusive).
>>>> + * @return 0 on success, -1 on failure.
>>>> + */
>>>> +int xc_hvm_map_mem_range_to_ioreq_server(xc_interface *xch,
>>>> + domid_t domid,
>>>> + ioservid_t id,
>>>> + xen_pfn_t start,
>>>> + xen_pfn_t end);
>>>> +
>>>> +/**
>>>> + * This function deregisters a range of write-protected memory for emulation.
>>>> + *
>>>> + * @parm xch a handle to an open hypervisor interface.
>>>> + * @parm domid the domain id to be serviced
>>>> + * @parm id the IOREQ Server id.
>>>> + * @parm start start of range
>>>> + * @parm end end of range (inclusive).
>>>> + * @return 0 on success, -1 on failure.
>>>> + */
>>>> +int xc_hvm_unmap_mem_range_from_ioreq_server(xc_interface *xch,
>>>> + domid_t domid,
>>>> + ioservid_t id,
>>>> + xen_pfn_t start,
>>>> + xen_pfn_t end);
>>>>
>>>> /**
>>>> * This function registers a PCI device for config space emulation.
>>>> diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
>>>> index 2ee26fb..9db05b3 100644
>>>> --- a/tools/libxc/xc_domain.c
>>>> +++ b/tools/libxc/xc_domain.c
>>>> @@ -1552,6 +1552,61 @@ int xc_hvm_unmap_io_range_from_ioreq_server(xc_interface *xch, domid_t domid,
>>>> return rc;
>>>> }
>>>>
>>>> +int xc_hvm_map_mem_range_to_ioreq_server(xc_interface *xch, domid_t domid,
>>>> + ioservid_t id, xen_pfn_t start, xen_pfn_t end)
>>>> +{
>>>> + DECLARE_HYPERCALL;
>>>> + DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
>>>> + int rc;
>>>> +
>>>> + arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg));
>>>> + if ( arg == NULL )
>>>> + return -1;
>>>> +
>>>> + hypercall.op = __HYPERVISOR_hvm_op;
>>>> + hypercall.arg[0] = HVMOP_map_io_range_to_ioreq_server;
>>>> + hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
>>>> +
>>>> + arg->domid = domid;
>>>> + arg->id = id;
>>>> + arg->type = HVMOP_IO_RANGE_WP_MEM;
>>>> + arg->start = start;
>>>> + arg->end = end;
>>>> +
>>>> + rc = do_xen_hypercall(xch, &hypercall);
>>>> +
>>>> + xc_hypercall_buffer_free(xch, arg);
>>>> + return rc;
>>>> +}
>>>> +
>>>> +int xc_hvm_unmap_mem_range_from_ioreq_server(xc_interface *xch, domid_t domid,
>>>> + ioservid_t id, xen_pfn_t start, xen_pfn_t end)
>>>> +{
>>>> + DECLARE_HYPERCALL;
>>>> + DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
>>>> + int rc;
>>>> +
>>>> + arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg));
>>>> + if ( arg == NULL )
>>>> + return -1;
>>>> +
>>>> + hypercall.op = __HYPERVISOR_hvm_op;
>>>> + hypercall.arg[0] = HVMOP_unmap_io_range_from_ioreq_server;
>>>> + hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
>>>> +
>>>> + arg->domid = domid;
>>>> + arg->id = id;
>>>> + arg->type = HVMOP_IO_RANGE_WP_MEM;
>>>> + arg->start = start;
>>>> + arg->end = end;
>>>> +
>>>> + rc = do_xen_hypercall(xch, &hypercall);
>>>> +
>>>> + xc_hypercall_buffer_free(xch, arg);
>>>> + return rc;
>>>> +
>>>> +}
>>>> +
>>>> int xc_hvm_map_pcidev_to_ioreq_server(xc_interface *xch, domid_t domid,
>>>> ioservid_t id, uint16_t segment,
>>>> uint8_t bus, uint8_t device,
>>>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>>>> index c957610..5eb7864 100644
>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>> @@ -938,8 +938,9 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
>>>>
>>>> rc = asprintf(&name, "ioreq_server %d %s", s->id,
>>>> (i == HVMOP_IO_RANGE_PORT) ? "port" :
>>>> - (i == HVMOP_IO_RANGE_MEMORY) ? "memory" :
>>>> + (i == HVMOP_IO_RANGE_MEMORY) ? "mmio" :
>>>> (i == HVMOP_IO_RANGE_PCI) ? "pci" :
>>>> + (i == HVMOP_IO_RANGE_WP_MEM) ? "wp-ed memory" :
>>>> "");
>>>> if ( rc )
>>>> goto fail;
>>>> @@ -1258,6 +1259,7 @@ static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
>>>> case HVMOP_IO_RANGE_PORT:
>>>> case HVMOP_IO_RANGE_MEMORY:
>>>> case HVMOP_IO_RANGE_PCI:
>>>> + case HVMOP_IO_RANGE_WP_MEM:
>>>> r = s->range[type];
>>>> break;
>>>>
>>>> @@ -1309,6 +1311,7 @@ static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>>>> case HVMOP_IO_RANGE_PORT:
>>>> case HVMOP_IO_RANGE_MEMORY:
>>>> case HVMOP_IO_RANGE_PCI:
>>>> + case HVMOP_IO_RANGE_WP_MEM:
>>>> r = s->range[type];
>>>> break;
>>>>
>>>> @@ -2523,6 +2526,8 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>>> uint32_t cf8;
>>>> uint8_t type;
>>>> uint64_t addr;
>>>> + p2m_type_t p2mt;
>>>> + struct page_info *ram_page;
>>>>
>>>> if ( list_empty(&d->arch.hvm_domain.ioreq_server.list) )
>>>> return NULL;
>>>> @@ -2565,6 +2570,17 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>>> {
>>>> type = p->type;
>>>> addr = p->addr;
>>>> +
>>>> + if ( p->type == IOREQ_TYPE_COPY )
>>>> + {
>>>> + ram_page = get_page_from_gfn(d, p->addr >> PAGE_SHIFT,
>>>> + &p2mt, P2M_UNSHARE);
>>>> + if ( p2mt == p2m_mmio_write_dm )
>>>> + type = IOREQ_TYPE_WP_MEM;
>>>> +
>>>> + if ( ram_page )
>>>> + put_page(ram_page);
>>>> + }
>>>> }
>>>>
>>>> list_for_each_entry ( s,
>>>> @@ -2582,6 +2598,7 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>>> BUILD_BUG_ON(IOREQ_TYPE_PIO != HVMOP_IO_RANGE_PORT);
>>>> BUILD_BUG_ON(IOREQ_TYPE_COPY != HVMOP_IO_RANGE_MEMORY);
>>>> BUILD_BUG_ON(IOREQ_TYPE_PCI_CONFIG != HVMOP_IO_RANGE_PCI);
>>>> + BUILD_BUG_ON(IOREQ_TYPE_WP_MEM != HVMOP_IO_RANGE_WP_MEM);
>>>
>>> I wonder whether it's time to break this identity relationship. Is there really
>> a need for the emulator to know the difference between MMIO and
>> WP_MEM? If not then, IOREQ_TYPE_COPY would do for both but it does
>> mean you'd need to index below using HVMOP_IO_RANGE_XXX values
>> rather than the raw ioreq type itself.
>>>
>>> Paul
>>
>> Well, that sounds reasonable. The identity relationship is not so
>> necessary as long as hvm_select_ioreq_server() can decide which
>> rangeset to search.
>> Maybe we can also remove the IOREQ_TYPE_PCI_CONFIG? With a separate
>> patch as the first one in this patch series? :)
>
> No, you can't remove that type. That is actually passed to emulators. See upstream QEMU for usage.
>
> Paul
Thank you, Paul. You are right. I have kept IOREQ_TYPE_PCI_CONFIG, and
added another patch to the version 6 series which uses the
HVMOP_IO_RANGE_XXX values to index the rangesets in
hvm_select_ioreq_server().

Yu
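
For illustration, the decoupling could look like a small translation
from ioreq type to rangeset index, so that nothing relies on the
IOREQ_TYPE_* values being numerically identical to HVMOP_IO_RANGE_*.
This is only a sketch of the idea; the helper name and shape below are
mine, not necessarily what v6 does:

```c
/* Values from xen/include/public/hvm/{ioreq.h,hvm_op.h}. */
#define IOREQ_TYPE_PIO         0
#define IOREQ_TYPE_COPY        1
#define IOREQ_TYPE_PCI_CONFIG  2

#define HVMOP_IO_RANGE_PORT    0
#define HVMOP_IO_RANGE_MEMORY  1
#define HVMOP_IO_RANGE_PCI     2
#define HVMOP_IO_RANGE_WP_MEM  3

/* Illustrative helper: map an ioreq type to the rangeset index to
 * search, instead of using the raw type as the index.  With such a
 * mapping, both MMIO and write-protected memory could be reported to
 * emulators as IOREQ_TYPE_COPY, with only the internal rangeset
 * choice (driven by the p2m type of the address) differing. */
static int ioreq_type_to_range_idx(int type, int is_wp_mem)
{
    switch (type) {
    case IOREQ_TYPE_PIO:
        return HVMOP_IO_RANGE_PORT;
    case IOREQ_TYPE_COPY:
        return is_wp_mem ? HVMOP_IO_RANGE_WP_MEM
                         : HVMOP_IO_RANGE_MEMORY;
    case IOREQ_TYPE_PCI_CONFIG:
        return HVMOP_IO_RANGE_PCI;
    default:
        return -1;
    }
}
```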
>
>>
>> Thanks
>> Yu
>>>
>>>> r = s->range[type];
>>>>
>>>> switch ( type )
>>>> @@ -2609,6 +2626,12 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>>> }
>>>>
>>>> break;
>>>> +
>>>> + case IOREQ_TYPE_WP_MEM:
>>>> + if ( rangeset_contains_singleton(r, addr >> PAGE_SHIFT) )
>>>> + return s;
>>>> +
>>>> + break;
>>>> }
>>>> }
>>>>
>>>> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
>>>> index 992d5d1..b2e4234 100644
>>>> --- a/xen/include/asm-x86/hvm/domain.h
>>>> +++ b/xen/include/asm-x86/hvm/domain.h
>>>> @@ -48,8 +48,8 @@ struct hvm_ioreq_vcpu {
>>>> bool_t pending;
>>>> };
>>>>
>>>> -#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
>>>> -#define MAX_NR_IO_RANGES 256
>>>> +#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_WP_MEM + 1)
>>>> +#define MAX_NR_IO_RANGES 8192
>>>>
>>>> struct hvm_ioreq_server {
>>>> struct list_head list_entry;
>>>> diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
>>>> index 014546a..c02c91a 100644
>>>> --- a/xen/include/public/hvm/hvm_op.h
>>>> +++ b/xen/include/public/hvm/hvm_op.h
>>>> @@ -331,6 +331,7 @@ struct xen_hvm_io_range {
>>>> # define HVMOP_IO_RANGE_PORT 0 /* I/O port range */
>>>> # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */
>>>> # define HVMOP_IO_RANGE_PCI 2 /* PCI segment/bus/dev/func range */
>>>> +# define HVMOP_IO_RANGE_WP_MEM 3 /* Write-protected memory range */
>>>> uint64_aligned_t start, end; /* IN - inclusive start and end of range */
>>>> };
>>>> typedef struct xen_hvm_io_range xen_hvm_io_range_t;
>>>> diff --git a/xen/include/public/hvm/ioreq.h b/xen/include/public/hvm/ioreq.h
>>>> index 2e5809b..2b712ac 100644
>>>> --- a/xen/include/public/hvm/ioreq.h
>>>> +++ b/xen/include/public/hvm/ioreq.h
>>>> @@ -35,6 +35,7 @@
>>>> #define IOREQ_TYPE_PIO 0 /* pio */
>>>> #define IOREQ_TYPE_COPY 1 /* mmio ops */
>>>> #define IOREQ_TYPE_PCI_CONFIG 2
>>>> +#define IOREQ_TYPE_WP_MEM 3
>>>> #define IOREQ_TYPE_TIMEOFFSET 7
>>>> #define IOREQ_TYPE_INVALIDATE 8 /* mapcache */
>>>>
>>>> --
>>>> 1.9.1
>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@lists.xen.org
>>> http://lists.xen.org/xen-devel
>>>
>>>
>
Thread overview: 8+ messages
2015-08-14 12:02 [PATCH v5 0/2] Refactor ioreq server for better performance Yu Zhang
2015-08-14 12:02 ` [PATCH v5 1/2] Differentiate IO/mem resources tracked by ioreq server Yu Zhang
2015-08-14 13:51 ` Paul Durrant
2015-08-18 5:33 ` Yu, Zhang
2015-08-18 13:25 ` Paul Durrant
2015-08-19 9:09 ` Yu, Zhang [this message]
2015-08-14 12:02 ` [PATCH v5 2/2] Refactor rangeset structure for better performance Yu Zhang
2015-08-14 13:32 ` Paul Durrant