xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Dario Faggioli <dario.faggioli@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Juergen Gross <juergen.gross@ts.fujitsu.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	xen-devel <xen-devel@lists.xen.org>,
	Daniel De Graaf <dgdegra@tycho.nsa.gov>
Subject: Re: [PATCH 1/4] xen: report how much memory a domain has on each NUMA node
Date: Wed, 5 Mar 2014 17:13:23 +0100	[thread overview]
Message-ID: <1394036003.16409.20.camel@Solace> (raw)
In-Reply-To: <53174AF40200007800121357@nat28.tlf.novell.com>


[-- Attachment #1.1: Type: text/plain, Size: 5012 bytes --]

On mer, 2014-03-05 at 15:04 +0000, Jan Beulich wrote:
> >>> On 05.03.14 at 15:36, Dario Faggioli <dario.faggioli@citrix.com> wrote:
> > --- a/xen/common/domctl.c
> > +++ b/xen/common/domctl.c
> > @@ -574,6 +574,51 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
> >      }
> >      break;
> >  
> > +    case XEN_DOMCTL_numainfo:
> > +    {
> > +        uint32_t node, max_node_index, last_online_node;
> > +        xen_domctl_numainfo_t *ni = &op->u.numainfo;
> > +        uint64_t *memkb_on_node;
> > +        struct page_info *page;
> > +
> > +        /*
> > +         * We report back info about the min number of nodes between how
> > +         * much of them the caller can handle and the number of them that
> > +         * are actually online.
> > +         */
> > +        last_online_node = last_node(node_online_map);
> > +        max_node_index = min_t(uint32_t, ni->max_node_index, last_online_node);
> > +        ni->max_node_index = max_node_index;
> > +
> > +        ret = -ENOMEM;
> > +        memkb_on_node = xzalloc_array(uint64_t, max_node_index);
> 
> max_node_index + 1
> 
Ouch. Will fix, thanks.

> > +        if ( !memkb_on_node )
> > +            break;
> > +
> > +        spin_lock(&d->page_alloc_lock);
> > +        page_list_for_each(page, &d->page_list)
> 
> No new non-preemptable potentially long running loops like this,
> please. See XSA-77.
> 
Not super familiar with XSAs, but do you perhaps 45 ("Several long
latency operations are not preemptible",
xenbits.xenproject.org/xsa/advisory-45.html)? 77 seems to be about
"Hypercalls exposed to privilege rings 1 and 2 of HVM guests"...

In any case, I see what you mean, and Juergen was also pointing out that
this is going to take a lot. I of course agree, but was, when
implementing, not sure about what the best solution was.

I initially thought of, as Juergen says, instrumenting page allocation
and adding the accounting there. However, I think that means doing
something very similar to having a MAX_NUMNODES big array for each
domain (or are there other ways?).

I think such information may turn out handy for other things in the
future but, for now, it'd be adding it for the sake of this hypercall
only... Is that acceptable? Any other ideas? Am I missing something?

> > +        {
> > +            node = phys_to_nid((paddr_t)page_to_mfn(page) << PAGE_SHIFT);
> 
> Please don't open-code pfn_to_paddr()/page_to_maddr().
> 
Ok. BTW, for this, I looked at what dump_numa() in xen/arch/x86/numa.c
does.

Should we fix that to --and I mean wrt both open coding, and also the
non-preemptability? Or it doesn't matter in this case, as it's just a
debug key handler?

> > +            /* For nodes that are offline, don't touch the counter */
> > +            if ( node <= max_node_index && node_online(node) )
> 
> How can a node a running domain has memory on be offline?
> Looks like you want an assertion here instead.
> 
Right. I did put this here to be on the safe side, and I agree that an
ASSERT() is much more correct and effective for that. Will do.

> > +                memkb_on_node[node]++;
> > +        }
> > +        spin_unlock(&d->page_alloc_lock);
> > +
> > +        for ( node = 0; node <= max_node_index; node++ )
> > +        {
> > +            memkb_on_node[node] <<= (PAGE_SHIFT-10);
> > +            if ( copy_to_guest_offset(ni->memkb_on_node, node,
> > +                                      &memkb_on_node[node], 1) )
> > +                break;
> 
> I'd suggest to do the copying in one go after the loop.
> 
Also ok.

> > --- a/xen/include/public/domctl.h
> > +++ b/xen/include/public/domctl.h
> > @@ -315,6 +315,26 @@ typedef struct xen_domctl_max_vcpus 
> > xen_domctl_max_vcpus_t;
> >  DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
> >  
> >  
> > +/* XEN_DOMCTL_numainfo */
> > +struct xen_domctl_numainfo {
> > +    /*
> > +     * IN: maximum addressable entry in the caller-provided arrays.
> > +     * OUT: minimum between the maximum addressable entry in the
> > +     *      caller-provided arrays and largest online node identifier
> > +     *      in the system.
> > +     */
> > +    uint32_t max_node_index;
> 
> With that IN/OUT specification (and the matching implementation
> above), how would the caller know it passed too small a value to
> fit all information?
> 
It doesn't. I designed it to be similar to XEN_SYSCTL_numainfo, which
retrieves _system_wide_ NUMA information. Should I use a new, OUT only,
parameter for the required number of elements (or index of the largest),
so that the caller can compare the twos and figure things out?

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  reply	other threads:[~2014-03-05 16:13 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-05 14:36 [PATCH 0/4] report how much memory a domain has on each NUMA node Dario Faggioli
2014-03-05 14:36 ` [PATCH 1/4] xen: " Dario Faggioli
2014-03-05 14:50   ` Juergen Gross
2014-03-05 16:31     ` Dario Faggioli
2014-03-05 16:49       ` Jan Beulich
2014-03-05 17:14         ` Dario Faggioli
2014-03-05 15:04   ` Jan Beulich
2014-03-05 16:13     ` Dario Faggioli [this message]
2014-03-05 16:44       ` Jan Beulich
2014-03-05 14:36 ` [PATCH 2/4] libxc: " Dario Faggioli
2014-03-05 15:05   ` Andrew Cooper
2014-03-05 15:40     ` Dario Faggioli
2014-03-10 16:39   ` Ian Jackson
2014-03-10 17:07     ` Dario Faggioli
2014-03-10 17:09       ` Andrew Cooper
2014-03-10 17:20       ` Ian Jackson
2014-03-10 17:35         ` Dario Faggioli
2014-03-11 11:15           ` Ian Jackson
2014-03-11 17:37             ` Dario Faggioli
2014-03-11 18:16               ` Ian Jackson
2014-03-11 19:04                 ` Dario Faggioli
2014-03-13 11:54                   ` George Dunlap
2014-03-05 14:36 ` [PATCH 3/4] libxl: " Dario Faggioli
2014-03-10 16:40   ` Ian Jackson
2014-03-10 17:28     ` Dario Faggioli
2014-03-13 17:26       ` Ian Jackson
2014-03-05 14:36 ` [PATCH 4/4] xl: " Dario Faggioli
2014-03-10 16:42   ` Ian Jackson
2014-03-10 17:09     ` Dario Faggioli
2014-03-05 14:40 ` [PATCH 0/4] " Juergen Gross
2014-03-05 14:44   ` Dario Faggioli
2014-03-10 16:37 ` Ian Jackson
2014-03-10 17:12   ` Dario Faggioli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1394036003.16409.20.camel@Solace \
    --to=dario.faggioli@citrix.com \
    --cc=Ian.Campbell@citrix.com \
    --cc=Ian.Jackson@eu.citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=dgdegra@tycho.nsa.gov \
    --cc=juergen.gross@ts.fujitsu.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).