All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paul Durrant <xadimgnik@gmail.com>
To: "'Julien Grall'" <julien@xen.org>, <xen-devel@lists.xenproject.org>
Cc: 'Stefano Stabellini' <sstabellini@kernel.org>,
	'Wei Liu' <wl@xen.org>,
	'Andrew Cooper' <andrew.cooper3@citrix.com>,
	'Ian Jackson' <ian.jackson@eu.citrix.com>,
	'George Dunlap' <george.dunlap@citrix.com>,
	'Jan Beulich' <jbeulich@suse.com>
Subject: RE: [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context
Date: Fri, 3 Apr 2020 16:55:43 +0100	[thread overview]
Message-ID: <002201d609d0$55a76690$00f633b0$@xen.org> (raw)
In-Reply-To: <5a26a89a-6422-b41d-daac-8f33a48ae23b@xen.org>

> -----Original Message-----
[snip]
> > +
> > +#include <xen/save.h>
> > +
> > +struct domain_context {
> > +    bool log;
> > +    struct domain_save_descriptor desc;
> > +    domain_copy_entry copy;
> 
> As your new framework is technically an extension of existing one, it
> would be good to explain why we diverge in the definitions.
> 

I don't follow. What is diverging? I explain in the commit comment that this is a parallel framework. Do I need to justify why it is not a carbon copy of the HVM one?

> > +    void *priv;
> > +};
> > +
> > +static struct {
> > +    const char *name;
> > +    bool per_vcpu;
> > +    domain_save_handler save;
> > +    domain_load_handler load;
> > +} handlers[DOMAIN_SAVE_CODE_MAX + 1];
> > +
> > +void __init domain_register_save_type(unsigned int tc, const char *name,
> > +                                      bool per_vcpu,
> > +                                      domain_save_handler save,
> > +                                      domain_load_handler load)
> > +{
> > +    BUG_ON(tc > ARRAY_SIZE(handlers));
> > +
> > +    ASSERT(!handlers[tc].save);
> > +    ASSERT(!handlers[tc].load);
> > +
> > +    handlers[tc].name = name;
> > +    handlers[tc].per_vcpu = per_vcpu;
> > +    handlers[tc].save = save;
> > +    handlers[tc].load = load;
> > +}
> > +
> > +int domain_save_entry(struct domain_context *c, unsigned int tc,
> > +                      const char *name, const struct vcpu *v, void *src,
> 
> I realize that 'src' is not const because how you define copy, however I
> would rather prefer if we rework the interface to avoid to keep the
> const in the definition. This may mean to have to define two callback
> rather than one.

That seems a bit ugly for the sake of a const but I guess I could create a union with a copy_in and copy_out. I'll see how that looks.

> 
> > +                      size_t src_len)
> 
> On 64-bit architecture, size_t would be 64-bit. But the record is only
> storing 32-bit. Would it make sense to add some sanity check in the code?
> 

True. Given this is new I think I'll just bump the length to 64-bits.

> > +{
> > +    int rc;
> > +
> > +    if ( c->log && tc != DOMAIN_SAVE_CODE(HEADER) &&
> > +         tc != DOMAIN_SAVE_CODE(END) )
> > +        gdprintk(XENLOG_INFO, "%pv save: %s (%lu)\n", v, name, src_len);
> > +
> > +    if ( !IS_ALIGNED(src_len, 8) )
> 
> Why not adding an implicit padding? This would avoid to chase error
> during saving because the len wasn't a multiple of 8.
> 

I don't think implicit padding is worth it. This error should only happen if a badly defined save record type is added to the code so perhaps I ought to add an ASSERT here as well as deal with the error.

> > +        return -EINVAL;
> > +
> > +    BUG_ON(tc != c->desc.typecode);
> > +    BUG_ON(v->vcpu_id != c->desc.instance);
> 
> Does the descriptor really need to be filled by domain_save()? Can't it
> be done here, so we can avoid the two BUG_ON()s?
> 

Yes it can but this serves as a sanity check that the save code is actually doing what it should. Hence why these are BUG_ON()s and not error exits.

> > +    c->desc.length = src_len;
> > +
> > +    rc = c->copy(c->priv, &c->desc, sizeof(c->desc));
> > +    if ( rc )
> > +        return rc;
> > +
> > +    return c->copy(c->priv, src, src_len);
> > +}
> > +
> > +int domain_save(struct domain *d, domain_copy_entry copy, void *priv,
> > +                unsigned long mask, bool dry_run)
> > +{
> > +    struct domain_context c = {
> > +        .copy = copy,
> > +        .priv = priv,
> > +        .log = !dry_run,
> > +    };
> > +    struct domain_save_header h = {
> > +        .magic = DOMAIN_SAVE_MAGIC,
> > +        .version = DOMAIN_SAVE_VERSION,
> > +    };
> > +    struct domain_save_header e;
> > +    unsigned int i;
> > +    int rc;
> > +
> > +    ASSERT(d != current->domain);
> > +
> > +    if ( d->is_dying )
> > +        return -EINVAL;
> > +
> > +    domain_pause(d);
> > +
> > +    c.desc.typecode = DOMAIN_SAVE_CODE(HEADER);
> > +
> > +    rc = DOMAIN_SAVE_ENTRY(HEADER, &c, d->vcpu[0], &h, sizeof(h));
> > +    if ( rc )
> > +        goto out;
> > +
> > +    for ( i = 0; i < ARRAY_SIZE(handlers); i++ )
> 
> AFAIU, with this solution, if there are dependency between records, the
> dependencies will have to a lower "index". I think we may have some
> dependency with guest transparent migration such as we need to restore
> the event ABI before restoring the event channels.
> 

Is that just a downstream implementation detail though? I would hope that there are no ordering dependencies between records.

> Should we rely on the index for the dependency?
>

If we do need ordering dependencies then I guess it would need to be explicit.
 
> In any case, I think we want to document it.
>

I can certainly document that save handlers are invoked in code order.

> > +    {
> > +        domain_save_handler save = handlers[i].save;
> > +
> > +        if ( (mask && !test_bit(i, &mask)) || !save )
> 
> The type of mask suggests it is not possible to have more than 32
> different types of record if we wanted to be arch agnostic. Even if we
> focus on 64-bit arch, this is only 64 records.
> 
> This is not very future proof, but might be ok if this is not exposed
> outside of the hypervisor (I haven't looked at the rest of the series
> yet). However, we at least need a BUILD_BUG_ON() in place. So please
> make sure  DOMAIN_SAVE_CODE_MAX is always less than 64.
> 
> Also what if there is a bit set in the mask and the record is not
> existing? Shouldn't we return an error?
> 

TBH I think 32 will be enough... I would not expect this context space to grow very much, if at all, once transparent migration is working. I think I'll just drop the mask for now; it's not actually clear to me we'll make use of it... just seemed like a nice-to-have.

> > +            continue;
> > +
> > +        memset(&c.desc, 0, sizeof(c.desc));
> > +        c.desc.typecode = i;
> > +
> > +        if ( handlers[i].per_vcpu )
> > +        {
> > +            struct vcpu *v;
> > +
> > +            for_each_vcpu ( d, v )
> > +            {
> > +                c.desc.instance = v->vcpu_id;
> > +
> > +                rc = save(v, &c, dry_run);
> > +                if ( rc )
> > +                    goto out;
> > +            }
> > +        }
> > +        else
> > +        {
> > +            rc = save(d->vcpu[0], &c, dry_run);
> > +            if ( rc )
> > +                goto out;
> > +        }
> > +    }
> > +
> > +    memset(&c.desc, 0, sizeof(c.desc));
> > +    c.desc.typecode = DOMAIN_SAVE_CODE(END);
> > +
> > +    rc = DOMAIN_SAVE_ENTRY(END, &c, d->vcpu[0], &e, 0);
> > +
> > + out:
> > +    domain_unpause(d);
> > +
> > +    return rc;
> > +}
> > +
> > +int domain_load_entry(struct domain_context *c, unsigned int tc,
> > +                      const char *name, const struct vcpu *v, void *dst,
> > +                      size_t dst_len, bool exact)
> > +{
> > +    int rc;
> > +
> > +    if ( c->log && tc != DOMAIN_SAVE_CODE(HEADER) &&
> > +         tc != DOMAIN_SAVE_CODE(END) )
> > +        gdprintk(XENLOG_INFO, "%pv load: %s (%lu)\n", v, name, dst_len);
> > +
> > +    BUG_ON(tc != c->desc.typecode);
> > +    BUG_ON(v->vcpu_id != c->desc.instance);
> 
> Is it really warrant to crash the host? What would happen if the values
> were wrong?
> 

It would mean the code is fairly seriously buggy in that the load handler is trying to load a record other than the type it registered for, or for a vcpu other than the one it was passed.

> > +
> > +    if ( (exact ?
> > +          (dst_len != c->desc.length) : (dst_len < c->desc.length)) ||
> 
> Using ternary in if is really confusing. How about:
> 
> dst_len < c->desc.length || (exact && dst_len != c->desc.length) ||
> 
> I understand that there would be two check for the exact case but I
> think it is better than a ternary.

I'm going to re-work this I think.

> 
> However what is the purpose of the param 'exact'? If you set it to false
> how do you know the size you read?

The point of the exact parameter is that whether the caller can only consume a record of the correct length, or whether it can handle an undersize one which gets zero-padded. (It's the same as the zeroextend option in HVM records).

> 
> > +         !IS_ALIGNED(c->desc.length, 8) )
> > +        return -EINVAL;
> > +
> > +    rc = c->copy(c->priv, dst, c->desc.length);
> > +    if ( rc )
> > +        return rc;
> > +
> > +    if ( !exact )
> > +    {
> > +        dst += c->desc.length;
> > +        memset(dst, 0, dst_len - c->desc.length);
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +int domain_load(struct domain *d,  domain_copy_entry copy, void *priv,
> > +                unsigned long mask)
> > +{
> > +    struct domain_context c = {
> > +        .copy = copy,
> > +        .priv = priv,
> > +        .log = true,
> > +    };
> > +    struct domain_save_header h, e;
> > +    int rc;
> > +
> > +    ASSERT(d != current->domain);
> > +
> > +    if ( d->is_dying )
> > +        return -EINVAL;
> 
> What does protect you against the doing dying right after this check?
> 

Nothing. It's just to avoid doing pointless work if we can.

> > +
> > +    rc = c.copy(c.priv, &c.desc, sizeof(c.desc));
> > +    if ( rc )
> > +        return rc;
> > +
> > +    if ( c.desc.typecode != DOMAIN_SAVE_CODE(HEADER) ||
> > +         c.desc.instance != 0 )
> > +        return -EINVAL;
> > +
> > +    rc = DOMAIN_LOAD_ENTRY(HEADER, &c, d->vcpu[0], &h, sizeof(h), true);
> > +    if ( rc )
> > +        return rc;
> > +
> > +    if ( h.magic != DOMAIN_SAVE_MAGIC || h.version != DOMAIN_SAVE_VERSION )
> > +        return -EINVAL;
> > +
> > +    domain_pause(d);
> > +
> > +    for (;;)
> > +    {
> > +        unsigned int i;
> > +        domain_load_handler load;
> > +        struct vcpu *v;
> > +
> > +        rc = c.copy(c.priv, &c.desc, sizeof(c.desc));
> > +        if ( rc )
> > +            break;
> > +
> > +        if ( c.desc.typecode == DOMAIN_SAVE_CODE(END) ) {
> > +            rc = DOMAIN_LOAD_ENTRY(END, &c, d->vcpu[0], &e, 0, true);
> > +            break;
> > +        }
> > +
> > +        rc = -EINVAL;
> > +        if ( c.desc.typecode >= ARRAY_SIZE(handlers) ||
> > +             c.desc.instance >= d->max_vcpus )
> 
> Nothing in the documention suggests that c.desc.instance should be 0
> when the record is for the domain.
> 

Ok. I'll put a comment somewhere.

> > +            break;
> > +
> > +        i = c.desc.typecode;
> > +        load = handlers[i].load;
> > +        v = d->vcpu[c.desc.instance];
> > +
> > +        if ( mask && !test_bit(i, &mask) )
> > +        {
> > +            /* Sink the data */
> > +            rc = c.copy(c.priv, NULL, c.desc.length);
> > +            if ( rc )
> > +                break;
> > +
> > +            continue;
> > +        }
> > +
> > +        rc = load ? load(v, &c) : -EOPNOTSUPP;
> > +        if ( rc )
> > +            break;
> > +    }
> > +
> > +    domain_unpause(d);
> > +
> > +    return rc;
> > +}
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * tab-width: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/xen/include/public/save.h b/xen/include/public/save.h
> > new file mode 100644
> > index 0000000000..84981cd0f6
> > --- /dev/null
> > +++ b/xen/include/public/save.h
> > @@ -0,0 +1,74 @@
> > +/*
> > + * save.h
> > + *
> > + * Structure definitions for common PV/HVM domain state that is held by
> > + * Xen and must be saved along with the domain's memory.
> > + *
> > + * Copyright Amazon.com Inc. or its affiliates.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a copy
> > + * of this software and associated documentation files (the "Software"), to
> > + * deal in the Software without restriction, including without limitation the
> > + * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
> > + * sell copies of the Software, and to permit persons to whom the Software is
> > + * furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> > + * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> > + * DEALINGS IN THE SOFTWARE.
> > + */
> > +
> > +#ifndef __XEN_PUBLIC_SAVE_H__
> > +#define __XEN_PUBLIC_SAVE_H__
> > +
> > +#include "xen.h"
> > +
> > +/* Each entry is preceded by a descriptor */
> > +struct domain_save_descriptor {
> > +    uint16_t typecode;
> > +    uint16_t instance;
> > +    /*
> > +     * Entry length not including this descriptor. Entries must be padded
> > +     * to a multiple of 8 bytes to make sure descriptors remain correctly
> > +     * aligned.
> > +     */
> > +    uint32_t length;
> > +};
> > +
> > +/*
> > + * Each entry has a type associated with it. DECLARE_DOMAIN_SAVE_TYPE
> > + * binds these things together.
> > + */
> > +#define DECLARE_DOMAIN_SAVE_TYPE(_x, _code, _type) \
> > +    struct __DOMAIN_SAVE_TYPE_##_x { _type t; char c[_code]; };
> > +
> > +#define DOMAIN_SAVE_TYPE(_x) \
> > +    typeof (((struct __DOMAIN_SAVE_TYPE_##_x *)(0))->t)
> > +#define DOMAIN_SAVE_CODE(_x) \
> > +    (sizeof (((struct __DOMAIN_SAVE_TYPE_##_x *)(0))->c))
> > +#define DOMAIN_SAVE_MASK(_x) (1u << DOMAIN_SAVE_CODE(_x))
> > +
> > +/* Terminating entry */
> > +struct domain_save_end {};
> > +DECLARE_DOMAIN_SAVE_TYPE(END, 0, struct domain_save_end);
> > +
> > +#define DOMAIN_SAVE_MAGIC   0x53415645
> > +#define DOMAIN_SAVE_VERSION 0x00000001
> > +
> > +/* Initial entry */
> > +struct domain_save_header {
> > +    uint32_t magic;             /* Must be DOMAIN_SAVE_MAGIC */
> > +    uint32_t version;           /* Save format version */
> 
> Would it make sense to have the version of Xen in the stream?
> 

Why? How would it help?

  Paul



  parent reply	other threads:[~2020-04-03 15:56 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-27 18:50 [Xen-devel] [PATCH 0/5] domain context infrastructure Paul Durrant
2020-03-27 18:50 ` [Xen-devel] [PATCH 1/5] xen/common: introduce a new framework for save/restore of 'domain' context Paul Durrant
2020-04-01 12:00   ` Julien Grall
2020-04-01 12:07     ` Jan Beulich
2020-04-01 12:16       ` Julien Grall
2020-04-01 12:23         ` Jan Beulich
2020-04-03 15:55     ` Paul Durrant [this message]
2020-04-03 17:24       ` Julien Grall
2020-04-06  8:27         ` Paul Durrant
2020-04-06  9:08           ` Julien Grall
2020-04-06  9:18             ` Paul Durrant
2020-04-06  9:50               ` Julien Grall
2020-04-06 10:34                 ` Paul Durrant
2020-04-06 12:43                   ` Jan Beulich
2020-04-01 14:50   ` Jan Beulich
2020-04-02  9:58     ` Paul Durrant
2020-04-02 11:08       ` Jan Beulich
2020-04-02 14:00         ` Paul Durrant
2020-04-03  8:38           ` Jan Beulich
2020-03-27 18:50 ` [Xen-devel] [PATCH 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext Paul Durrant
2020-04-01 13:42   ` Julien Grall
2020-04-06  9:07     ` Paul Durrant
2020-04-06 12:46       ` Jan Beulich
2020-04-01 13:46   ` Julien Grall
2020-03-27 18:50 ` [Xen-devel] [PATCH 3/5] tools/misc: add xen-ctx to present domain context Paul Durrant
2020-03-30 10:54   ` Jan Beulich
2020-04-03 15:20     ` Paul Durrant
2020-04-03 15:29       ` Jan Beulich
2020-04-03 16:08         ` Paul Durrant
2020-03-27 18:50 ` [Xen-devel] [PATCH 4/5] common/domain: add a domain context record for shared_info Paul Durrant
2020-04-01 14:27   ` Julien Grall
2020-03-27 18:50 ` [Xen-devel] [PATCH 5/5] tools/libxc: make use of domain context SHARED_INFO record Paul Durrant

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='002201d609d0$55a76690$00f633b0$@xen.org' \
    --to=xadimgnik@gmail.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=jbeulich@suse.com \
    --cc=julien@xen.org \
    --cc=paul@xen.org \
    --cc=sstabellini@kernel.org \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.