From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Julian Stecklina <jsteckli@os.inf.tu-dresden.de>, jbeulich@suse.com
Cc: kvm@vger.kernel.org, xen-devel@lists.xen.org
Subject: Re: [Xen-devel] [PATCH] KVM, XEN: Fix potential race in pvclock code
Date: Tue, 28 Jan 2014 10:16:46 -0500 [thread overview]
Message-ID: <20140128151646.GA4308@phenom.dumpdata.com> (raw)
In-Reply-To: <52E651FD.2020608@os.inf.tu-dresden.de>
On Mon, Jan 27, 2014 at 01:33:01PM +0100, Julian Stecklina wrote:
> On 01/24/2014 07:08 PM, Konrad Rzeszutek Wilk wrote:
> > On Thu, Jan 16, 2014 at 03:13:44PM +0100, Julian Stecklina wrote:
> >> The paravirtualized clock used in KVM and Xen uses a version field to
> >> allow the guest to see when the shared data structure is inconsistent.
> >> The code reads the version field twice (before and after the data
> >> structure is copied) and checks whether they haven't
> >> changed and that the lowest bit is not set. As the second access is not
> >> synchronized, the compiler could generate code that accesses version
> >> more than two times and you end up with inconsistent data.
> >
> > Could you paste in the code that the 'bad' compiler generates
> > vs the compiler that generate 'good' code please?
>
> At least 4.8 and probably older compilers compile this as intended. The
> point is that the standard does not guarantee the indented behavior,
> i.e. the code is wrong.
Perhaps I misunderstood Jan's response but it sounded to me like
that the compiler was not adhering to the standard?
>
> I can refer to this lwn article:
> https://lwn.net/Articles/508991/
>
> The whole point of ACCESS_ONCE is to avoid time bombs like that. There
> are lots of place where ACCESS_ONCE is used in the kernel:
>
> http://lxr.free-electrons.com/ident?i=ACCESS_ONCE
>
> See for example the check_zero function here:
> http://lxr.free-electrons.com/source/arch/x86/kernel/kvm.c#L559
>
In other words, you don't have a sample of 'bad' compiler code.
> Julian
>
> >
> >>
> >> An example using pvclock_get_time_values:
> >>
> >> host starts updating data, sets src->version to 1
> >> guest reads src->version (1) and stores it into dst->version.
> >> guest copies inconsistent data
> >> guest reads src->version (1) and computes xor with dst->version.
> >> host finishes updating data and sets src->version to 2
> >> guest reads src->version (2) and checks whether lower bit is not set.
> >> while loop exits with inconsistent data!
> >>
> >> AFAICS the compiler is allowed to optimize the given code this way.
> >>
> >> Signed-off-by: Julian Stecklina <jsteckli@os.inf.tu-dresden.de>
> >> ---
> >> arch/x86/kernel/pvclock.c | 10 +++++++---
> >> 1 file changed, 7 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c
> >> index 42eb330..f62b41c 100644
> >> --- a/arch/x86/kernel/pvclock.c
> >> +++ b/arch/x86/kernel/pvclock.c
> >> @@ -55,6 +55,8 @@ static u64 pvclock_get_nsec_offset(struct pvclock_shadow_time *shadow)
> >> static unsigned pvclock_get_time_values(struct pvclock_shadow_time *dst,
> >> struct pvclock_vcpu_time_info *src)
> >> {
> >> + u32 nversion;
> >> +
> >> do {
> >> dst->version = src->version;
> >> rmb(); /* fetch version before data */
> >> @@ -64,7 +66,8 @@ static unsigned pvclock_get_time_values(struct pvclock_shadow_time *dst,
> >> dst->tsc_shift = src->tsc_shift;
> >> dst->flags = src->flags;
> >> rmb(); /* test version after fetching data */
> >> - } while ((src->version & 1) || (dst->version != src->version));
> >> + nversion = ACCESS_ONCE(src->version);
> >> + } while ((nversion & 1) || (dst->version != nversion));
> >>
> >> return dst->version;
> >> }
> >> @@ -135,7 +138,7 @@ void pvclock_read_wallclock(struct pvclock_wall_clock *wall_clock,
> >> struct pvclock_vcpu_time_info *vcpu_time,
> >> struct timespec *ts)
> >> {
> >> - u32 version;
> >> + u32 version, nversion;
> >> u64 delta;
> >> struct timespec now;
> >>
> >> @@ -146,7 +149,8 @@ void pvclock_read_wallclock(struct pvclock_wall_clock *wall_clock,
> >> now.tv_sec = wall_clock->sec;
> >> now.tv_nsec = wall_clock->nsec;
> >> rmb(); /* fetch time before checking version */
> >> - } while ((wall_clock->version & 1) || (version != wall_clock->version));
> >> + nversion = ACCESS_ONCE(wall_clock->version);
> >> + } while ((nversion & 1) || (version != nversion));
> >>
> >> delta = pvclock_clocksource_read(vcpu_time); /* time since system boot */
> >> delta += now.tv_sec * (u64)NSEC_PER_SEC + now.tv_nsec;
> >> --
> >> 1.8.4.2
> >>
> >>
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel@lists.xen.org
> >> http://lists.xen.org/xen-devel
>
>
next prev parent reply other threads:[~2014-01-28 15:16 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-01-16 14:13 [PATCH] KVM, XEN: Fix potential race in pvclock code Julian Stecklina
2014-01-16 15:04 ` Jan Beulich
2014-01-16 15:04 ` [Xen-devel] " Jan Beulich
2014-01-16 16:04 ` Julian Stecklina
2014-01-17 9:41 ` Jan Beulich
2014-01-17 9:41 ` [Xen-devel] " Jan Beulich
2014-01-17 9:50 ` Julian Stecklina
2014-01-17 9:50 ` Julian Stecklina
2014-01-27 17:47 ` [Xen-devel] " Paolo Bonzini
2014-01-28 15:25 ` Konrad Rzeszutek Wilk
2014-01-28 15:25 ` [Xen-devel] " Konrad Rzeszutek Wilk
2014-01-27 17:47 ` Paolo Bonzini
2014-01-16 16:04 ` Julian Stecklina
2014-01-24 18:08 ` Konrad Rzeszutek Wilk
2014-01-24 18:08 ` [Xen-devel] " Konrad Rzeszutek Wilk
2014-01-27 12:33 ` Julian Stecklina
2014-01-27 12:33 ` [Xen-devel] " Julian Stecklina
2014-01-28 15:16 ` Konrad Rzeszutek Wilk
2014-01-28 15:16 ` Konrad Rzeszutek Wilk [this message]
2014-01-28 16:06 ` [Xen-devel] " Julian Stecklina
2014-01-28 16:06 ` Julian Stecklina
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140128151646.GA4308@phenom.dumpdata.com \
--to=konrad.wilk@oracle.com \
--cc=jbeulich@suse.com \
--cc=jsteckli@os.inf.tu-dresden.de \
--cc=kvm@vger.kernel.org \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.