From: Yu Zhang <yu.c.zhang@linux.intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>,
George Dunlap <george.dunlap@eu.citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Tim Deegan <tim@xen.org>,
xen-devel@lists.xen.org, Paul Durrant <paul.durrant@citrix.com>,
zhiyuan.lv@intel.com, Jun Nakajima <jun.nakajima@intel.com>
Subject: Re: [PATCH v7 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server.
Date: Sat, 11 Mar 2017 16:42:07 +0800 [thread overview]
Message-ID: <58C3B85F.6060205@linux.intel.com> (raw)
In-Reply-To: <58C2D462020000780014203E@prv-mh.provo.novell.com>
On 3/10/2017 11:29 PM, Jan Beulich wrote:
>>>> On 08.03.17 at 16:33, <yu.c.zhang@linux.intel.com> wrote:
>> changes in v7:
>> - Use new ioreq server interface - XEN_DMOP_map_mem_type_to_ioreq_server.
>> - According to comments from George: removed domain_pause/unpause() in
>> hvm_map_mem_type_to_ioreq_server(), because it's too expensive,
>> and we can avoid the:
>> a> deadlock between p2m lock and ioreq server lock by using these locks
>> in the same order - solved in patch 4;
> That is, until patch 4 there is deadlock potential? I think you want to
> re-order the patches if so. Or was it that the type can't really be used
> until the last patch of the series? (I'm sorry, it's been quite a while
> since the previous version.)
Oh. There's no deadlock potential in this version patch set. But in v6,
there was, and I used
domain_pause/unpause() to avoid this. Later on, I realized that if I use
different locks in the
same order, the deadlock potential can be avoid and we do not need
domain_pause/unpause
in this version.
>> @@ -365,6 +383,24 @@ static int dm_op(domid_t domid,
>> break;
>> }
>>
>> + case XEN_DMOP_map_mem_type_to_ioreq_server:
>> + {
>> + const struct xen_dm_op_map_mem_type_to_ioreq_server *data =
>> + &op.u.map_mem_type_to_ioreq_server;
>> +
>> + rc = -EINVAL;
>> + if ( data->pad )
>> + break;
>> +
>> + /* Only support for HAP enabled hvm. */
>> + if ( !hap_enabled(d) )
>> + break;
> Perhaps better to give an error other than -EINVAL in this case?
> If so, then the same error should likely also be used in your
> set_mem_type() addition.
How about -ENOTSUP?
>> --- a/xen/arch/x86/hvm/emulate.c
>> +++ b/xen/arch/x86/hvm/emulate.c
>> @@ -99,6 +99,7 @@ static int hvmemul_do_io(
>> uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
>> {
>> struct vcpu *curr = current;
>> + struct domain *currd = curr->domain;
>> struct hvm_vcpu_io *vio = &curr->arch.hvm_vcpu.hvm_io;
>> ioreq_t p = {
>> .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
>> @@ -140,7 +141,7 @@ static int hvmemul_do_io(
>> (p.dir != dir) ||
>> (p.df != df) ||
>> (p.data_is_ptr != data_is_addr) )
>> - domain_crash(curr->domain);
>> + domain_crash(currd);
> If you mean to do this transformation here, then please do so
> consistently for the entire function.
OK. Thanks.
>> @@ -177,8 +178,65 @@ static int hvmemul_do_io(
>> break;
>> case X86EMUL_UNHANDLEABLE:
>> {
>> - struct hvm_ioreq_server *s =
>> - hvm_select_ioreq_server(curr->domain, &p);
>> + /*
>> + * Xen isn't emulating the instruction internally, so see if
>> + * there's an ioreq server that can handle it. Rules:
>> + *
>> + * - PIO and "normal" mmio run through hvm_select_ioreq_server()
>> + * to choose the ioreq server by range. If no server is found,
>> + * the access is ignored.
>> + *
>> + * - p2m_ioreq_server accesses are handled by the current
>> + * ioreq_server for the domain, but there are some corner
>> + * cases:
> Who or what is "the current ioreq_server for the domain"?
It means "the current ioreq_server which maps the p2m_ioreq_server type
for this domain"...
I'd like to use a succinct phrase, but now seems not accurate enough.
Any preference?
>> + * - If the domain ioreq_server is NULL, assume this is a
>> + * race between the unbinding of ioreq server and guest fault
>> + * so re-try the instruction.
>> + *
>> + * - If the IOREQ_MEM_ACCESS_WRITE flag is not set, treat it
>> + * like a normal PIO or MMIO that doesn't have an ioreq
>> + * server (i.e., by ignoring it).
>> + */
>> + struct hvm_ioreq_server *s = NULL;
>> + p2m_type_t p2mt = p2m_invalid;
>> +
>> + if ( is_mmio )
>> + {
>> + unsigned long gmfn = paddr_to_pfn(addr);
>> +
>> + (void) get_gfn_query_unlocked(currd, gmfn, &p2mt);
> Stray cast.
OK. Will remove it.
>
>> + if ( p2mt == p2m_ioreq_server )
>> + {
>> + unsigned int flags;
>> +
>> + s = p2m_get_ioreq_server(currd, &flags);
>> +
>> + /*
>> + * If p2mt is ioreq_server but ioreq_server is NULL,
>> + * we probably lost a race with unbinding of ioreq
>> + * server, just retry the access.
>> + */
>> + if ( s == NULL )
>> + {
>> + rc = X86EMUL_RETRY;
>> + vio->io_req.state = STATE_IOREQ_NONE;
>> + break;
>> + }
>> +
>> + /*
>> + * If the IOREQ_MEM_ACCESS_WRITE flag is not set,
>> + * we should set s to NULL, and just ignore such
>> + * access.
>> + */
>> + if ( !(flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE) )
>> + s = NULL;
> What is this about? You only allow WRITE registrations, so this looks
> to be dead code. Yet if it is meant to guard against future enabling
> of READ, then this clearly should not be done for reads.
It's to guard against future emulation of READ. We can remove it for now.
>> --- a/xen/arch/x86/mm/p2m-ept.c
>> +++ b/xen/arch/x86/mm/p2m-ept.c
>> @@ -131,6 +131,13 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
>> entry->r = entry->w = entry->x = 1;
>> entry->a = entry->d = !!cpu_has_vmx_ept_ad;
>> break;
>> + case p2m_ioreq_server:
>> + entry->r = 1;
>> + entry->w = !(p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);
> Along the lines of the previous comment - if you mean to have the
> code cope with READ, please also set ->r accordingly, or add a
> comment why this won't have the intended effect (yielding a not
> present EPTE).
How about we keep this code and do not support READ? I'll remove above
dead code in hvmemul_do_io().
>> @@ -92,8 +94,13 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn,
>> default:
>> return flags | _PAGE_NX_BIT;
>> case p2m_grant_map_ro:
>> - case p2m_ioreq_server:
>> return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
>> + case p2m_ioreq_server:
>> + flags |= P2M_BASE_FLAGS | _PAGE_RW | _PAGE_NX_BIT;
>> + if ( p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
>> + return flags & ~_PAGE_RW;
>> + else
>> + return flags;
> Stray else. But even better would imo be
>
> if ( p2m->ioreq.flags & XEN_DMOP_IOREQ_MEM_ACCESS_WRITE )
> flags &= ~_PAGE_RW;
> return flags;
Oh. Thanks. :)
>> +struct hvm_ioreq_server *p2m_get_ioreq_server(struct domain *d,
>> + unsigned int *flags)
>> +{
>> + struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> + struct hvm_ioreq_server *s;
>> +
>> + spin_lock(&p2m->ioreq.lock);
>> +
>> + s = p2m->ioreq.server;
>> + *flags = p2m->ioreq.flags;
>> +
>> + spin_unlock(&p2m->ioreq.lock);
>> + return s;
>> +}
> I'm afraid this question was asked before, but since there's no
> comment here or anywhere, I can't recall if there was a reason why
> s potentially being stale by the time the caller looks at it is not a
> problem.
Well, it is possibe that s is stale. I did not take it as a problem
because the device model
will later discard such io request. And I believe current
hvm_select_ioreq_server() also
has the same issue - the returned s should be considered to be stale, if
the MMIO/PIO
address is removed from the ioreq server's rangeset.
Another thought is, if you think it is inappropriate for device model to
do the check,
we can use spin_lock_recursive on ioreq_server.lock to protect all the
ioreq server select
and release the lock after the ioreq server is sent out.
>> --- a/xen/include/public/hvm/dm_op.h
>> +++ b/xen/include/public/hvm/dm_op.h
>> @@ -318,6 +318,30 @@ struct xen_dm_op_inject_msi {
>> uint64_aligned_t addr;
>> };
>>
>> +/*
>> + * XEN_DMOP_map_mem_type_to_ioreq_server : map or unmap the IOREQ Server <id>
>> + * to specific memroy type <type>
>> + * for specific accesses <flags>
>> + *
>> + * For now, flags only accept the value of XEN_DMOP_IOREQ_MEM_ACCESS_WRITE,
>> + * which means only write operations are to be forwarded to an ioreq server.
>> + * Support for the emulation of read operations can be added when an ioreq
>> + * server has such requirement in future.
>> + */
>> +#define XEN_DMOP_map_mem_type_to_ioreq_server 15
>> +
>> +struct xen_dm_op_map_mem_type_to_ioreq_server {
>> + ioservid_t id; /* IN - ioreq server id */
>> + uint16_t type; /* IN - memory type */
>> + uint16_t pad;
> Perhaps there was padding needed when this was a hvmop, but
> now the padding does exactly the wrong thing.
Right, padding is useless in this new structure. I will remove it if
other field does not change(I proposed
a change for flag field in reply to Andrew's comments on patch 5). Thank
you, Jan.
Yu
>
> Jan
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
next prev parent reply other threads:[~2017-03-11 8:42 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-08 15:33 [PATCH v7 0/5] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
2017-03-08 15:33 ` [PATCH v7 1/5] x86/ioreq server: Release the p2m lock after mmio is handled Yu Zhang
2017-03-08 15:33 ` [PATCH v7 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
2017-03-10 15:29 ` Jan Beulich
2017-03-11 8:42 ` Yu Zhang [this message]
2017-03-13 11:20 ` Jan Beulich
2017-03-14 7:28 ` Yu Zhang
2017-03-14 9:40 ` Paul Durrant
2017-03-14 9:52 ` Yu Zhang
2017-03-14 10:40 ` Paul Durrant
2017-03-14 12:03 ` Yu Zhang
2017-03-14 13:10 ` Jan Beulich
2017-03-14 13:28 ` Yu Zhang
2017-03-14 10:26 ` Jan Beulich
2017-03-08 15:33 ` [PATCH v7 3/5] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages Yu Zhang
2017-03-10 15:33 ` Jan Beulich
2017-03-11 8:42 ` Yu Zhang
2017-03-13 11:22 ` Jan Beulich
2017-03-14 7:28 ` Yu Zhang
2017-03-08 15:33 ` [PATCH v7 4/5] ix86/ioreq server: Asynchronously reset outstanding p2m_ioreq_server entries Yu Zhang
2017-03-10 16:03 ` Jan Beulich
2017-03-11 8:42 ` Yu Zhang
2017-03-13 11:24 ` Jan Beulich
2017-03-14 7:42 ` Yu Zhang
2017-03-14 10:49 ` Jan Beulich
2017-03-14 12:18 ` Yu Zhang
2017-03-14 13:11 ` Jan Beulich
2017-03-14 13:29 ` Yu Zhang
2017-03-08 15:33 ` [PATCH v7 5/5] x86/ioreq server: Synchronously reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Yu Zhang
2017-03-10 16:17 ` Jan Beulich
2017-03-11 8:42 ` Yu Zhang
2017-03-13 11:24 ` Jan Beulich
2017-03-10 16:59 ` Andrew Cooper
2017-03-11 8:42 ` Yu Zhang
2017-03-13 11:32 ` Jan Beulich
2017-03-14 7:42 ` Yu Zhang
2017-03-14 10:51 ` Jan Beulich
2017-03-14 12:22 ` Yu Zhang
2017-03-14 13:12 ` Jan Beulich
2017-03-14 13:29 ` Yu Zhang
-- strict thread matches above, loose matches on Subject: below --
2017-03-08 13:32 [PATCH v7 0/5] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
2017-03-08 13:32 ` [PATCH v7 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=58C3B85F.6060205@linux.intel.com \
--to=yu.c.zhang@linux.intel.com \
--cc=JBeulich@suse.com \
--cc=andrew.cooper3@citrix.com \
--cc=george.dunlap@eu.citrix.com \
--cc=jun.nakajima@intel.com \
--cc=kevin.tian@intel.com \
--cc=paul.durrant@citrix.com \
--cc=tim@xen.org \
--cc=xen-devel@lists.xen.org \
--cc=zhiyuan.lv@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).