xen-devel.lists.xenproject.org archive mirror
From: "Yu, Zhang" <yu.c.zhang@linux.intel.com>
To: Paul Durrant <Paul.Durrant@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	Ian Jackson <Ian.Jackson@citrix.com>,
	Stefano Stabellini <Stefano.Stabellini@citrix.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Wei Liu <wei.liu2@citrix.com>, "Keir (Xen.org)" <keir@xen.org>,
	"jbeulich@suse.com" <jbeulich@suse.com>,
	Andrew Cooper <Andrew.Cooper3@citrix.com>
Cc: Kevin Tian <kevin.tian@intel.com>,
	"zhiyuan.lv@intel.com" <zhiyuan.lv@intel.com>
Subject: Re: [PATCH v4 1/2] Differentiate IO/mem resources tracked by ioreq server
Date: Fri, 14 Aug 2015 17:30:27 +0800
Message-ID: <55CDB533.70602@linux.intel.com>
In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD02F582526@AMSPEX01CL01.citrite.net>



On 8/14/2015 4:46 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Yu, Zhang [mailto:yu.c.zhang@linux.intel.com]
>> Sent: 14 August 2015 08:41
>> To: Paul Durrant; xen-devel@lists.xen.org; Ian Jackson; Stefano Stabellini;
>> Ian Campbell; Wei Liu; Keir (Xen.org); jbeulich@suse.com; Andrew Cooper
>> Cc: Kevin Tian; zhiyuan.lv@intel.com
>> Subject: Re: [Xen-devel] [PATCH v4 1/2] Differentiate IO/mem resources
>> tracked by ioreq server
>>
>> Hi Paul,
>>
>> On 8/13/2015 6:39 PM, Yu, Zhang wrote:
>>>
>>>
>>> On 8/13/2015 6:16 PM, Paul Durrant wrote:
>>>>> -----Original Message-----
>>>>> From: Yu Zhang [mailto:yu.c.zhang@linux.intel.com]
>>>>> Sent: 13 August 2015 11:06
>>>>> To: xen-devel@lists.xen.org; Paul Durrant; Ian Jackson; Stefano Stabellini;
>>>>> Ian Campbell; Wei Liu; Keir (Xen.org); jbeulich@suse.com; Andrew Cooper
>>>>> Cc: Kevin Tian; zhiyuan.lv@intel.com
>>>>> Subject: [PATCH v4 1/2] Differentiate IO/mem resources tracked by
>>>>> ioreq server
>>>>>
>>>>> Currently in the ioreq server, guest write-protected ram pages are
>>>>> tracked in the same rangeset as device mmio resources. Yet unlike
>>>>> device mmio, which can come in big chunks, the guest write-protected
>>>>> pages may be discrete 4K ranges.
>>>>
>>>> Would the interfaces be better using xen_pfn_t rather than
>>>> uint64_t to pass addresses around, where the bottom 12 bits will
>>>> always be zero?
>>>>
>>>>     Paul
>>>
>>> Thank you, Paul.
>>> Well, I'm not quite sure the bottom 12 bits of the address are zero.
>>> I had thought these addresses are just the guest physical ones causing
>>> the ept violation, not aligned down to their page addresses, like the
>>> MMIO. So, if our handler code has already done that, using xen_pfn_t
>>> could be better (my development machine hit a hardware problem; I will
>>> check the code tomorrow). :)
>>>
>>> Yu
>>
>> I just checked the code in hvm_select_ioreq_server(), and found that
>> the value of the address is not aligned down; the lower 12 bits are
>> not zero. So I wonder: is it necessary to do the shift to get a gfn,
>> and what is the benefit?
>>
>
> Well, you can only make whole pages mmio_dm_write, and I was assuming an emulator that did so would be interested in writes anywhere in the page. Maybe that assumption is wrong?
>
>    Paul

Thank you, Paul. That makes sense. I'll try to use xen_pfn_t. :-)
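
For the record, a rough sketch of what I mean to try - only
illustrative, assuming Xen's PFN_DOWN() as the shift helper; the final
prototypes may differ:

    int xc_hvm_map_mem_range_to_ioreq_server(xc_interface *xch,
                                             domid_t domid,
                                             ioservid_t id,
                                             xen_pfn_t start, /* first gfn */
                                             xen_pfn_t end);  /* last gfn, inclusive */

and in hvm_select_ioreq_server(), the WP_MEM case would shift the byte
address down to frame numbers before the rangeset lookup:

    case IOREQ_TYPE_WP_MEM:
        end = addr + (p->size * p->count) - 1;
        if ( rangeset_contains_range(r, PFN_DOWN(addr), PFN_DOWN(end)) )
            return s;
        break;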

Yu
>
>> Thanks
>> Yu
>>>>
>>>>>
>>>>> This patch uses a separate rangeset for the guest ram pages,
>>>>> and a new ioreq type, IOREQ_TYPE_WP_MEM, is defined.
>>>>>
>>>>> Note: Previously, a new hypercall or subop was suggested to map
>>>>> write-protected pages into the ioreq server. However, it turned out
>>>>> the handler of this new hypercall would be almost the same as the
>>>>> existing pair - HVMOP_[un]map_io_range_to_ioreq_server - and there is
>>>>> already a type parameter in this hypercall. So no new hypercall is
>>>>> defined; only a new type is introduced.
>>>>>
>>>>> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
>>>>> ---
>>>>>    tools/libxc/include/xenctrl.h    | 31 ++++++++++++++++++++++
>>>>>    tools/libxc/xc_domain.c          | 55 ++++++++++++++++++++++++++++++++++++++++
>>>>>    xen/arch/x86/hvm/hvm.c           | 26 ++++++++++++++++++-
>>>>>    xen/include/asm-x86/hvm/domain.h |  4 +--
>>>>>    xen/include/public/hvm/hvm_op.h  |  1 +
>>>>>    xen/include/public/hvm/ioreq.h   |  1 +
>>>>>    6 files changed, 115 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
>>>>> index de3c0ad..53f196d 100644
>>>>> --- a/tools/libxc/include/xenctrl.h
>>>>> +++ b/tools/libxc/include/xenctrl.h
>>>>> @@ -2010,6 +2010,37 @@ int xc_hvm_unmap_io_range_from_ioreq_server(xc_interface *xch,
>>>>>                                                int is_mmio,
>>>>>                                                uint64_t start,
>>>>>                                                uint64_t end);
>>>>> +/**
>>>>> + * This function registers a range of write-protected memory for emulation.
>>>>> + *
>>>>> + * @parm xch a handle to an open hypervisor interface.
>>>>> + * @parm domid the domain id to be serviced
>>>>> + * @parm id the IOREQ Server id.
>>>>> + * @parm start start of range
>>>>> + * @parm end end of range (inclusive).
>>>>> + * @return 0 on success, -1 on failure.
>>>>> + */
>>>>> +int xc_hvm_map_mem_range_to_ioreq_server(xc_interface *xch,
>>>>> +                                        domid_t domid,
>>>>> +                                        ioservid_t id,
>>>>> +                                        uint64_t start,
>>>>> +                                        uint64_t end);
>>>>> +
>>>>> +/**
>>>>> + * This function deregisters a range of write-protected memory for emulation.
>>>>> + *
>>>>> + * @parm xch a handle to an open hypervisor interface.
>>>>> + * @parm domid the domain id to be serviced
>>>>> + * @parm id the IOREQ Server id.
>>>>> + * @parm start start of range
>>>>> + * @parm end end of range (inclusive).
>>>>> + * @return 0 on success, -1 on failure.
>>>>> + */
>>>>> +int xc_hvm_unmap_mem_range_from_ioreq_server(xc_interface *xch,
>>>>> +                                            domid_t domid,
>>>>> +                                            ioservid_t id,
>>>>> +                                            uint64_t start,
>>>>> +                                            uint64_t end);
>>>>>
>>>>>    /**
>>>>>     * This function registers a PCI device for config space emulation.
>>>>> diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
>>>>> index 2ee26fb..42aeba9 100644
>>>>> --- a/tools/libxc/xc_domain.c
>>>>> +++ b/tools/libxc/xc_domain.c
>>>>> @@ -1552,6 +1552,61 @@ int xc_hvm_unmap_io_range_from_ioreq_server(xc_interface *xch, domid_t domid,
>>>>>        return rc;
>>>>>    }
>>>>>
>>>>> +int xc_hvm_map_mem_range_to_ioreq_server(xc_interface *xch, domid_t domid,
>>>>> +                                        ioservid_t id, uint64_t start, uint64_t end)
>>>>> +{
>>>>> +    DECLARE_HYPERCALL;
>>>>> +    DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
>>>>> +    int rc;
>>>>> +
>>>>> +    arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg));
>>>>> +    if ( arg == NULL )
>>>>> +        return -1;
>>>>> +
>>>>> +    hypercall.op     = __HYPERVISOR_hvm_op;
>>>>> +    hypercall.arg[0] = HVMOP_map_io_range_to_ioreq_server;
>>>>> +    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
>>>>> +
>>>>> +    arg->domid = domid;
>>>>> +    arg->id = id;
>>>>> +    arg->type = HVMOP_IO_RANGE_WP_MEM;
>>>>> +    arg->start = start;
>>>>> +    arg->end = end;
>>>>> +
>>>>> +    rc = do_xen_hypercall(xch, &hypercall);
>>>>> +
>>>>> +    xc_hypercall_buffer_free(xch, arg);
>>>>> +    return rc;
>>>>> +}
>>>>> +
>>>>> +int xc_hvm_unmap_mem_range_from_ioreq_server(xc_interface *xch, domid_t domid,
>>>>> +                                            ioservid_t id, uint64_t start, uint64_t end)
>>>>> +{
>>>>> +    DECLARE_HYPERCALL;
>>>>> +    DECLARE_HYPERCALL_BUFFER(xen_hvm_io_range_t, arg);
>>>>> +    int rc;
>>>>> +
>>>>> +    arg = xc_hypercall_buffer_alloc(xch, arg, sizeof(*arg));
>>>>> +    if ( arg == NULL )
>>>>> +        return -1;
>>>>> +
>>>>> +    hypercall.op     = __HYPERVISOR_hvm_op;
>>>>> +    hypercall.arg[0] = HVMOP_unmap_io_range_from_ioreq_server;
>>>>> +    hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
>>>>> +
>>>>> +    arg->domid = domid;
>>>>> +    arg->id = id;
>>>>> +    arg->type = HVMOP_IO_RANGE_WP_MEM;
>>>>> +    arg->start = start;
>>>>> +    arg->end = end;
>>>>> +
>>>>> +    rc = do_xen_hypercall(xch, &hypercall);
>>>>> +
>>>>> +    xc_hypercall_buffer_free(xch, arg);
>>>>> +    return rc;
>>>>> +
>>>>> +}
>>>>> +
>>>>>    int xc_hvm_map_pcidev_to_ioreq_server(xc_interface *xch, domid_t
>>>>> domid,
>>>>>                                          ioservid_t id, uint16_t segment,
>>>>>                                          uint8_t bus, uint8_t device,
>>>>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>>>>> index c957610..0389c0a 100644
>>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>>> @@ -938,8 +938,9 @@ static int hvm_ioreq_server_alloc_rangesets(struct hvm_ioreq_server *s,
>>>>>
>>>>>            rc = asprintf(&name, "ioreq_server %d %s", s->id,
>>>>>                          (i == HVMOP_IO_RANGE_PORT) ? "port" :
>>>>> -                      (i == HVMOP_IO_RANGE_MEMORY) ? "memory" :
>>>>> +                      (i == HVMOP_IO_RANGE_MEMORY) ? "mmio" :
>>>>>                          (i == HVMOP_IO_RANGE_PCI) ? "pci" :
>>>>> +                      (i == HVMOP_IO_RANGE_WP_MEM) ? "wp-ed memory" :
>>>>>                          "");
>>>>>            if ( rc )
>>>>>                goto fail;
>>>>> @@ -1258,6 +1259,7 @@ static int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
>>>>>                case HVMOP_IO_RANGE_PORT:
>>>>>                case HVMOP_IO_RANGE_MEMORY:
>>>>>                case HVMOP_IO_RANGE_PCI:
>>>>> +            case HVMOP_IO_RANGE_WP_MEM:
>>>>>                    r = s->range[type];
>>>>>                    break;
>>>>>
>>>>> @@ -1309,6 +1311,7 @@ static int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
>>>>>                case HVMOP_IO_RANGE_PORT:
>>>>>                case HVMOP_IO_RANGE_MEMORY:
>>>>>                case HVMOP_IO_RANGE_PCI:
>>>>> +            case HVMOP_IO_RANGE_WP_MEM:
>>>>>                    r = s->range[type];
>>>>>                    break;
>>>>>
>>>>> @@ -2523,6 +2526,8 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>>>>        uint32_t cf8;
>>>>>        uint8_t type;
>>>>>        uint64_t addr;
>>>>> +    p2m_type_t p2mt;
>>>>> +    struct page_info *ram_page;
>>>>>
>>>>>        if ( list_empty(&d->arch.hvm_domain.ioreq_server.list) )
>>>>>            return NULL;
>>>>> @@ -2565,6 +2570,17 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>>>>        {
>>>>>            type = p->type;
>>>>>            addr = p->addr;
>>>>> +
>>>>> +        if ( p->type == IOREQ_TYPE_COPY )
>>>>> +        {
>>>>> +            ram_page = get_page_from_gfn(d, p->addr >> PAGE_SHIFT,
>>>>> +                                         &p2mt, P2M_UNSHARE);
>>>>> +            if ( p2mt == p2m_mmio_write_dm )
>>>>> +                type = IOREQ_TYPE_WP_MEM;
>>>>> +
>>>>> +            if ( ram_page )
>>>>> +                put_page(ram_page);
>>>>> +        }
>>>>>        }
>>>>>
>>>>>        list_for_each_entry ( s,
>>>>> @@ -2582,6 +2598,7 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>>>>            BUILD_BUG_ON(IOREQ_TYPE_PIO != HVMOP_IO_RANGE_PORT);
>>>>>            BUILD_BUG_ON(IOREQ_TYPE_COPY != HVMOP_IO_RANGE_MEMORY);
>>>>>            BUILD_BUG_ON(IOREQ_TYPE_PCI_CONFIG != HVMOP_IO_RANGE_PCI);
>>>>> +        BUILD_BUG_ON(IOREQ_TYPE_WP_MEM != HVMOP_IO_RANGE_WP_MEM);
>>>>>            r = s->range[type];
>>>>>
>>>>>            switch ( type )
>>>>> @@ -2609,6 +2626,13 @@ struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>>>>                }
>>>>>
>>>>>                break;
>>>>> +
>>>>> +        case IOREQ_TYPE_WP_MEM:
>>>>> +            end = addr + (p->size * p->count) - 1;
>>>>> +            if ( rangeset_contains_range(r, addr, end) )
>>>>> +                return s;
>>>>> +
>>>>> +            break;
>>>>>            }
>>>>>        }
>>>>>
>>>>> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
>>>>> index 992d5d1..b2e4234 100644
>>>>> --- a/xen/include/asm-x86/hvm/domain.h
>>>>> +++ b/xen/include/asm-x86/hvm/domain.h
>>>>> @@ -48,8 +48,8 @@ struct hvm_ioreq_vcpu {
>>>>>        bool_t           pending;
>>>>>    };
>>>>>
>>>>> -#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_PCI + 1)
>>>>> -#define MAX_NR_IO_RANGES  256
>>>>> +#define NR_IO_RANGE_TYPES (HVMOP_IO_RANGE_WP_MEM + 1)
>>>>> +#define MAX_NR_IO_RANGES  8192
>>>>>
>>>>>    struct hvm_ioreq_server {
>>>>>        struct list_head       list_entry;
>>>>> diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
>>>>> index 014546a..c02c91a 100644
>>>>> --- a/xen/include/public/hvm/hvm_op.h
>>>>> +++ b/xen/include/public/hvm/hvm_op.h
>>>>> @@ -331,6 +331,7 @@ struct xen_hvm_io_range {
>>>>>    # define HVMOP_IO_RANGE_PORT   0 /* I/O port range */
>>>>>    # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */
>>>>>    # define HVMOP_IO_RANGE_PCI    2 /* PCI segment/bus/dev/func range */
>>>>> +# define HVMOP_IO_RANGE_WP_MEM 3 /* Write-protected memory range */
>>>>>        uint64_aligned_t start, end; /* IN - inclusive start and end of range */
>>>>>    typedef struct xen_hvm_io_range xen_hvm_io_range_t;
>>>>> diff --git a/xen/include/public/hvm/ioreq.h b/xen/include/public/hvm/ioreq.h
>>>>> index 2e5809b..2b712ac 100644
>>>>> --- a/xen/include/public/hvm/ioreq.h
>>>>> +++ b/xen/include/public/hvm/ioreq.h
>>>>> @@ -35,6 +35,7 @@
>>>>>    #define IOREQ_TYPE_PIO          0 /* pio */
>>>>>    #define IOREQ_TYPE_COPY         1 /* mmio ops */
>>>>>    #define IOREQ_TYPE_PCI_CONFIG   2
>>>>> +#define IOREQ_TYPE_WP_MEM       3
>>>>>    #define IOREQ_TYPE_TIMEOFFSET   7
>>>>>    #define IOREQ_TYPE_INVALIDATE   8 /* mapcache */
>>>>>
>>>>> --
>>>>> 1.9.1
>>>>
>>>>
>>>
>

Thread overview: 11+ messages
2015-08-13 10:05 [PATCH v4 0/2] Refactor ioreq server for better performance Yu Zhang
2015-08-13 10:05 ` [PATCH v4 1/2] Differentiate IO/mem resources tracked by ioreq server Yu Zhang
2015-08-13 10:16   ` Paul Durrant
2015-08-13 10:39     ` Yu, Zhang
2015-08-14  7:40       ` Yu, Zhang
2015-08-14  8:46         ` Paul Durrant
2015-08-14  9:30           ` Yu, Zhang [this message]
2015-08-13 10:05 ` [PATCH v4 2/2] Refactor rangeset structure for better performance Yu Zhang
2015-08-13 10:33   ` Paul Durrant
2015-08-13 10:46     ` Yu, Zhang
2015-08-13 11:29       ` Paul Durrant
