From: Chao Peng <chao.p.peng@linux.intel.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>,
JBeulich@suse.com, Wei Liu <wei.liu2@citrix.com>
Cc: Ian.Jackson@eu.citrix.com, xen-devel@lists.xen.org, keir@xen.org,
Ian.Campbell@citrix.com, stefano.stabellini@eu.citrix.com
Subject: Re: [PATCH 4/4] tools: add total/local memory bandwith monitoring
Date: Thu, 15 Jan 2015 16:46:23 +0800
Message-ID: <20150115084623.GA7376@pengc-linux.bj.intel.com>
In-Reply-To: <54ABB8FE.3010801@citrix.com>
On Tue, Jan 06, 2015 at 10:29:18AM +0000, Andrew Cooper wrote:
> On 06/01/15 10:09, Chao Peng wrote:
> > On Mon, Jan 05, 2015 at 12:39:42PM +0000, Wei Liu wrote:
> >> On Tue, Dec 23, 2014 at 04:54:39PM +0800, Chao Peng wrote:
> >> [...]
> >>> +static int libxl__psr_cmt_get_mem_bandwidth(libxl__gc *gc, uint32_t domid,
> >>> + xc_psr_cmt_type type, uint32_t socketid, uint32_t *bandwidth)
> >>> +{
> >>> + uint64_t sample1, sample2;
> >>> + uint32_t upscaling_factor;
> >>> + int rc;
> >>> +
> >>> + rc = libxl__psr_cmt_get_l3_monitoring_data(gc, domid,
> >>> + type, socketid, &sample1);
> >>> + if (rc < 0)
> >>> + return ERROR_FAIL;
> >>> +
> >>> + usleep(10000);
> >>> +
> >>> + rc = libxl__psr_cmt_get_l3_monitoring_data(gc, domid,
> >>> + type, socketid, &sample2);
> >>> + if (rc < 0)
> >>> + return ERROR_FAIL;
> >>> +
> >>> + if (sample2 < sample1) {
> >>> + LOGE(ERROR, "event counter overflowed between two samplings");
> >>> + return ERROR_FAIL;
> >>> + }
> >>> +
> >> What's the likelihood of counter overflows? Can we handle this more
> >> gracefully? Say, retry (with maximum retry cap) when counter overflows?
> > The likelihood is very small here. Hardware guarantees the counter will
> > not overflow in one second even under maximum platform bandwidth conditions.
> > And we only sleep 0.01 second here.
> >
> > I'd like to adopt your suggestion to retry another time once that happens.
> > But only one retry and it should correct the overflow.
> >
> > Thanks,
> > Chao
>
> You have no possible way of guaranteeing that the actual elapsed time
> between the two samples is less than 1 second. On a very heavily loaded
> system, even regular task scheduling could cause an actual elapsed time
> of more than one second in that snippet of code.
>
On further thought, this cannot be made correct if it is implemented only
in the tool stack, because the duration between the two samples can't be
guaranteed. Even if sample2 > sample1 here, the data may still be wrong,
as the hardware counter may have overflowed more than once during this
period.
What the hardware guarantees is that at most one overflow can happen
(which software can correct for) when the duration between two samples
is less than 1 second. So only data derived from two samples taken less
than 1 second apart is valid.
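To make the single-overflow correction concrete, here is a minimal sketch
(not part of the patch). counter_delta() and its width parameter are
hypothetical names, and the real counter width would have to be obtained
from the hardware. The result is only correct under the assumption stated
above, namely that the counter wrapped at most once between the two reads.

```c
#include <stdint.h>

/*
 * Hypothetical helper: difference between two raw counter reads,
 * assuming the counter is `width` bits wide and wrapped at most once
 * between the two reads.
 */
static uint64_t counter_delta(uint64_t s1, uint64_t s2, unsigned int width)
{
    uint64_t mask = (width >= 64) ? UINT64_MAX
                                  : ((UINT64_C(1) << width) - 1);

    /* Modular subtraction gives the right delta for zero or one wrap. */
    return (s2 - s1) & mask;
}
```

With more than one wrap the result is silently too small, which is why
the elapsed time has to be bounded as well.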
The duration must therefore be checked before the data is used, which
means something must be done in the hypervisor.
My initial solution is to add a new hypercall that returns both the
counter value and the timestamp at that moment (the two reads should be
atomic). Adding this to the existing resource_op hypercall does not look
like a good fit.
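As a rough sketch of how the tool stack could consume such data (all names
here are hypothetical, and no such hypercall or structure exists yet):
each sample carries the counter value together with the time it was read,
and a pair of samples is rejected unless they are less than 1 second
apart. The nanosecond timestamp unit and the width parameter are
assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical: one counter read paired with the time it was taken. */
struct mbm_sample {
    uint64_t counter;   /* raw event counter value */
    uint64_t time_ns;   /* timestamp of the read, in nanoseconds */
};

#define MAX_INTERVAL_NS UINT64_C(1000000000)   /* the 1 second bound */

/*
 * Stores the counter units per second in *rate and returns 0 if the two
 * samples are less than 1 second apart; returns -1 otherwise, because
 * more than one overflow may then have happened.
 */
static int mbm_rate(const struct mbm_sample *a, const struct mbm_sample *b,
                    unsigned int width, double *rate)
{
    uint64_t mask = (width >= 64) ? UINT64_MAX
                                  : ((UINT64_C(1) << width) - 1);
    uint64_t elapsed, delta;

    if (b->time_ns <= a->time_ns)
        return -1;                  /* no usable interval */

    elapsed = b->time_ns - a->time_ns;
    if (elapsed >= MAX_INTERVAL_NS)
        return -1;                  /* outside the guaranteed window */

    /* At most one wrap is possible here, so modular subtraction is safe. */
    delta = (b->counter - a->counter) & mask;
    *rate = (double)delta * 1e9 / (double)elapsed;
    return 0;
}
```

On a -1 return the caller would simply take a fresh pair of samples.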
Any suggestions?
Thanks,
Chao
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel