* First release candidate for 3.2.0
@ 2007-12-10 11:09 Keir Fraser
  2007-12-10 17:43 ` Stefan de Konink
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Keir Fraser @ 2007-12-10 11:09 UTC (permalink / raw)
  To: xen-devel

Folks,

The first release candidate for Xen 3.2.0 is available at
http://xenbits.xensource.com/xen-unstable.hg, tagged as '3.2.0-rc1'.

Please test!

 -- Keir

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-10 11:09 First release candidate for 3.2.0 Keir Fraser
@ 2007-12-10 17:43 ` Stefan de Konink
  2007-12-10 19:56   ` Marco Sinhoreli
  2007-12-11  3:42 ` John Levon
  2007-12-13  2:03 ` John Levon
  2 siblings, 1 reply; 20+ messages in thread
From: Stefan de Konink @ 2007-12-10 17:43 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

Keir Fraser wrote:
> The first release candidate for Xen 3.2.0 is available at
> http://xenbits.xensource.com/xen-unstable.hg, tagged as '3.2.0-rc1'.

Would it be possible to publish a tarball on the website?


Stefan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-10 17:43 ` Stefan de Konink
@ 2007-12-10 19:56   ` Marco Sinhoreli
  2007-12-11  9:20     ` Ian Campbell
                       ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Marco Sinhoreli @ 2007-12-10 19:56 UTC (permalink / raw)
  To: xen-devel

Hello all,

I tried to run 'make world' in xen-3.2.0rc1 and it returned this error:

------------------ code -------------------
make -f buildconfigs/mk.linux-2.6-xen build
set -e ; \
        if [ ! -e linux-2.6.18-xen.hg/.hg ] ; then \
            __repo=$(sh buildconfigs/select-repository linux-2.6.18-xen.hg .:..) ; \
            if [ -d ${__repo} ] ; then \
                echo "Linking ${__repo} to linux-2.6.18-xen.hg." ; \
                ln -s ${__repo} linux-2.6.18-xen.hg ; \
            else \
                echo "Cloning ${__repo} to linux-2.6.18-xen.hg." ; \
                hg clone ${__repo#file://} linux-2.6.18-xen.hg ; \
            fi ; \
        else \
            __parent=$(hg -R linux-2.6.18-xen.hg path default) ; \
            echo "Pulling changes from ${__parent} into linux-2.6.18-xen.hg." ; \
            hg -R linux-2.6.18-xen.hg pull ${__parent} ; \
        fi
select-repository: Searching `.:..' for linux-2.6.18-xen.hg
select-repository: Ignoring `.'
hg: unknown command 'default'
select-repository: Unable to determine Xen repository parent.
make: *** [linux-2.6.18-xen.hg/.valid-src] Error 1
-------------------------- end code --------------------------------

Regards,

-- 
Marco Sinhoreli

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-10 11:09 First release candidate for 3.2.0 Keir Fraser
  2007-12-10 17:43 ` Stefan de Konink
@ 2007-12-11  3:42 ` John Levon
  2007-12-11 10:18   ` Keir Fraser
  2007-12-13  2:03 ` John Levon
  2 siblings, 1 reply; 20+ messages in thread
From: John Levon @ 2007-12-11  3:42 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

On Mon, Dec 10, 2007 at 11:09:45AM +0000, Keir Fraser wrote:

> The first release candidate for Xen 3.2.0 is available at
> http://xenbits.xensource.com/xen-unstable.hg, tagged as '3.2.0-rc1'.

I used current tip instead. This was the first time we've run Solaris
past 3.1*. As expected the results were not good.

There's some terrible nastiness in the rdmsr emulation path:

(XEN) traps.c:2046:d0 Domain attempted RDMSR 000000000000008b ret to EIP fffffffffb85a0d4 ESP fffffffffbc7b408.
(XEN) traps.c:2053:d0 succeeded, eax now 000000000000003a, edx now 0000000000000000
(XEN) traps.c:2061:d0 value at rsp was fffffffffb82acde now fffffffffb82acde
(XEN) traps.c:2046:d0 Domain attempted RDMSR 00000000c0010015 ret to EIP fffffffffb85a0d4 ESP fffffffffbc7b408.
(XEN) traps.c:2053:d0 succeeded, eax now 0000000010000040, edx now 0000000000000000
(XEN) traps.c:2061:d0 value at rsp was fffffffffb82acde now fffffffffb82acde
panic[cpu0]/thread=fffffffffbc48da0: BAD TRAP: type=e (#pf Page fault) rp=fffffffffbc7b320 addr=10000040 occurred in module "unix" due to a NULL pointer dereference

#pf Page fault
Bad kernel fault at addr=0x10000040

Something has been spewing zeroes all over our text:

checked_rdmsr:                  pushq  %rbp
checked_rdmsr+1:                movq   %rsp,%rbp
checked_rdmsr+4:                subq   $0x18,%rsp
checked_rdmsr+8:                pushq  %r12
checked_rdmsr+0xa:              movq   %rdi,-0x8(%rbp)
checked_rdmsr+0xe:              movq   %rsi,-0x10(%rbp)
checked_rdmsr+0x12:             movq   %rsi,%r12
checked_rdmsr+0x15:             cmpl   $0x0,+0x434218(%rip)     <disable_msrs>
checked_rdmsr+0x1c:             jne    +0x18    <checked_rdmsr+0x36>
checked_rdmsr+0x1e:             movl   +0x3d62fc(%rip),%eax     <x86_feature>
checked_rdmsr+0x24:             andl   $0x4,%eax
checked_rdmsr+0x27:             je     +0xd     <checked_rdmsr+0x36>
checked_rdmsr+0x29:             call   +0x2f3f2 <rdmsr>
checked_rdmsr+0x2e:             addb   %al,(%rax)
checked_rdmsr+0x30:             .byte   0
checked_rdmsr+0x31:             .byte   0
checked_rdmsr+0x32:             .byte   0
checked_rdmsr+0x33:             .byte   0
checked_rdmsr+0x34:             .byte   0
checked_rdmsr+0x35:             .byte   0
checked_rdmsr+0x36:             .byte   0
checked_rdmsr+0x37:             .byte   0
checked_rdmsr+0x38:             .byte   0
checked_rdmsr+0x39:             .byte   0
checked_rdmsr+0x3a:             .byte   0
checked_rdmsr+0x3b:             .byte   0
checked_rdmsr+0x3c:             .byte   0
checked_rdmsr+0x3d:             .byte   0
checked_rdmsr+0x3e:             ret

[0]> rdmsr::dis
rdmsr:                          movl   %edi,%ecx
rdmsr+2:                        rdmsr
rdmsr+4:                        shlq   $0x20,%rdx
rdmsr+8:                        orq    %rdx,%rax
rdmsr+0xb:                      ret

So, it's possible that we have somehow mangled things, but also quite unlikely
- this is based on code that works with 3.1.0. If I set 'disable_msrs', then I
get much further in Solaris boot until we die with a corrupted mutex.

I've had a look through changes in entry.S but I can't really make much sense
of them (I never can). I tried backing out Jan's sysenter changes, as they 
looked scary, but that didn't seem to help.

The only thing that comes to mind is somewhere the "disables_events" path got
broken again - any ideas Keir? Unfortunately I only have a limited time to look
at this as I'm focused on finishing up 3.1 now.

regards
john

* keeping dom0 up to date is a thankless task

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-10 19:56   ` Marco Sinhoreli
@ 2007-12-11  9:20     ` Ian Campbell
  2007-12-11 11:48       ` Marco Sinhoreli
  2007-12-11  9:24     ` Keir Fraser
  2007-12-11 15:27     ` Pradeep Singh
  2 siblings, 1 reply; 20+ messages in thread
From: Ian Campbell @ 2007-12-11  9:20 UTC (permalink / raw)
  To: Marco Sinhoreli; +Cc: xen-devel


On Mon, 2007-12-10 at 17:56 -0200, Marco Sinhoreli wrote: 
> Hello all,
> 
> I tried to run 'make world' in xen-3.2.0rc1 and it has returned this error:

It works for me here. What is your environment (distro, make and
Mercurial versions, etc.)? Did you clone directly from
http://xenbits.xensource.com/xen-unstable.hg or do you use a local
staging repository?

Can you try with this debug patch:

diff -r 4054cd60895b buildconfigs/select-repository
--- a/buildconfigs/select-repository	Mon Dec 10 13:49:22 2007 +0000
+++ b/buildconfigs/select-repository	Tue Dec 11 09:16:55 2007 +0000
@@ -1,6 +1,10 @@
 #!/bin/sh
 
 ME=$(basename $0)
+
+echo "$ME: XEN_ROOT is ${XEN_ROOT}" 1>&2
+
+set -x
 
 if [ $# -lt 1 ] || [ $# -gt 2 ] ; then
     echo "usage: $ME <repository-name> [search-path]" 1>&2

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-10 19:56   ` Marco Sinhoreli
  2007-12-11  9:20     ` Ian Campbell
@ 2007-12-11  9:24     ` Keir Fraser
  2007-12-11 15:27     ` Pradeep Singh
  2 siblings, 0 replies; 20+ messages in thread
From: Keir Fraser @ 2007-12-11  9:24 UTC (permalink / raw)
  To: Marco Sinhoreli, xen-devel

What version of mercurial are you running ('hg --version')? It may be too
old for that script.

You can work around it by manually cloning
http://xenbits.xensource.com/linux-2.6.18-xen.hg into a directory adjacent
to your local xen-unstable.hg repository, i.e.:
 /path/to/xen-unstable.hg
 /path/to/linux-2.6.18-xen.hg

 -- Keir

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-11  3:42 ` John Levon
@ 2007-12-11 10:18   ` Keir Fraser
  2007-12-12 13:02     ` John Levon
  0 siblings, 1 reply; 20+ messages in thread
From: Keir Fraser @ 2007-12-11 10:18 UTC (permalink / raw)
  To: John Levon; +Cc: xen-devel

Is Solaris easy to grab and build, or does it need a Solaris environment?
Alternatively, can I grab a pre-built binary (with symbols) from somewhere?

 -- Keir

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-11  9:20     ` Ian Campbell
@ 2007-12-11 11:48       ` Marco Sinhoreli
  2007-12-11 11:59         ` Ian Campbell
  2007-12-11 12:02         ` Marco Sinhoreli
  0 siblings, 2 replies; 20+ messages in thread
From: Marco Sinhoreli @ 2007-12-11 11:48 UTC (permalink / raw)
  To: xen-devel

Hello Keir, Ian,

I'm using Debian Etch and the Mercurial version is 0.9.1. I cloned
the repository with this syntax:
# hg clone -r '3.2.0-rc1'  http://xenbits.xensource.com/xen-unstable.hg

I'll try to clone manually.

Best regards,

-- 
Marco Sinhoreli

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-11 11:48       ` Marco Sinhoreli
@ 2007-12-11 11:59         ` Ian Campbell
  2007-12-11 12:02         ` Marco Sinhoreli
  1 sibling, 0 replies; 20+ messages in thread
From: Ian Campbell @ 2007-12-11 11:59 UTC (permalink / raw)
  To: Marco Sinhoreli; +Cc: xen-devel


On Tue, 2007-12-11 at 09:48 -0200, Marco Sinhoreli wrote:
> Hello Keir, Ian,
> 
> I'm using Debian Etch and the Mercurial version is 0.9.1.

That's pretty much identical to me...

> I cloned
> the repository with this syntax:
> # hg clone -r '3.2.0-rc1'  http://xenbits.xensource.com/xen-unstable.hg

Really?

        $ hg --version
        Mercurial Distributed SCM (version 0.9.1)
        [...]
        $ hg clone -r '3.2.0-rc1'  http://xenbits.xensource.com/xen-unstable.hg xen-3.2.0-rc1.hg
        abort: clone by revision not supported yet for remote repositories

Perhaps you are cloning the full tree to a local repository and then
cloning the 3.2.0-rc1 tag from that? e.g.
        hg clone http://xenbits.xensource.com/xen-unstable.hg
        hg clone -r '3.2.0-rc1' xen-unstable.hg xen-3.2.0-rc1.hg
???

In that case you will also need to clone the linux-2.6.18-xen.hg tree
next to your full clone.

> I'll try to clone manually.

Can you also try the debugging patch I sent you? It appears that
XEN_ROOT is not being exported into the environment of the
buildconfigs/select-repository script which is very strange.

Ian.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-11 11:48       ` Marco Sinhoreli
  2007-12-11 11:59         ` Ian Campbell
@ 2007-12-11 12:02         ` Marco Sinhoreli
  2007-12-11 12:08           ` Ian Campbell
  1 sibling, 1 reply; 20+ messages in thread
From: Marco Sinhoreli @ 2007-12-11 12:02 UTC (permalink / raw)
  To: xen-devel

Correcting the information:

On Dec 11, 2007 9:48 AM, Marco Sinhoreli <msinhore@gmail.com> wrote:
> Hello Keir, Ian,
>
> I'm using Debian Etch and the Mercurial version is 0.9.1. I cloned
> the repository with this syntax:
# hg clone http://xenbits.xensource.com/xen-unstable.hg
# hg clone -r '3.2.0-rc1'  xen-unstable.hg/ xen-3.2.0-rc1

>
> I'll try to clone manually.

-- 
Marco Sinhoreli

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-11 12:02         ` Marco Sinhoreli
@ 2007-12-11 12:08           ` Ian Campbell
  0 siblings, 0 replies; 20+ messages in thread
From: Ian Campbell @ 2007-12-11 12:08 UTC (permalink / raw)
  To: Marco Sinhoreli; +Cc: xen-devel


On Tue, 2007-12-11 at 10:02 -0200, Marco Sinhoreli wrote:
> Correcting the information:
> 
> On Dec 11, 2007 9:48 AM, Marco Sinhoreli <msinhore@gmail.com> wrote:
> > Hello Keir, Ian,
> >
> > I'm using Debian Etch and the Mercurial version is 0.9.1. I cloned
> > the repository with this syntax:
> # hg clone http://xenbits.xensource.com/xen-unstable.hg
> # hg clone -r '3.2.0-rc1'  xen-unstable.hg/ xen-3.2.0-rc1

Yep, you also need to do 
# hg clone http://xenbits.xensource.com/linux-2.6.18-xen.hg

Ian.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-10 19:56   ` Marco Sinhoreli
  2007-12-11  9:20     ` Ian Campbell
  2007-12-11  9:24     ` Keir Fraser
@ 2007-12-11 15:27     ` Pradeep Singh
  2 siblings, 0 replies; 20+ messages in thread
From: Pradeep Singh @ 2007-12-11 15:27 UTC (permalink / raw)
  To: Marco Sinhoreli; +Cc: xen-devel

On Mon, 10 Dec 2007 17:56:49 -0200
"Marco Sinhoreli" <msinhore@gmail.com> wrote:

> Hello all,
> 
> I tried to run 'make world' in xen-3.2.0rc1 and it has returned this
> error:

Do you have linux-2.6.18-xen.hg in the same directory that contains
xen-unstable.hg? I guess you need to download linux-2.6.18-xen.hg too;
did you?

Thanks
		pradeep

-- 
heh...people do try to read my signature.

http://eagain.wordpress.com
http://emptydomain.googlepages.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-11 10:18   ` Keir Fraser
@ 2007-12-12 13:02     ` John Levon
  2007-12-12 13:36       ` Keir Fraser
  0 siblings, 1 reply; 20+ messages in thread
From: John Levon @ 2007-12-12 13:02 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

On Tue, Dec 11, 2007 at 10:18:15AM +0000, Keir Fraser wrote:

> Is Solaris easy to grab and build, or does it need a Solaris environment?
> Alternatively, can I grab a pre-built binary (with symbols) from somewhere?

Hi Keir, this was a false alarm: I failed to notice that physinfo got a
new 'out' handle and wasn't initialising it (yes, we use sysctl() in our
kernel - for both good and bad reasons). When I fixed that I booted up.
I'll do some more testing and report back either way.
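
To illustrate the failure mode described above (an argument struct gains a new output handle and the caller leaves it uninitialised), here is a minimal sketch of the defensive pattern. The struct and function names are hypothetical and are not the real xen_sysctl_physinfo_t layout; the sketch also assumes the callee treats a null handle as "no output wanted", which should be checked against the interface actually in use.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a sysctl argument struct.  This is NOT the
 * real xen_sysctl_physinfo_t; the field names are for illustration only.
 */
struct demo_physinfo {
    uint32_t  cpu_khz;       /* OUT: filled in by the callee */
    uint64_t  total_pages;   /* OUT: filled in by the callee */
    uint32_t  max_cpu_id;    /* IN/OUT in this sketch */
    uint32_t *cpu_to_node;   /* newly added output buffer pointer */
};

/*
 * Reset every field before issuing the call.  Assigning a zeroed
 * compound literal guarantees that pointer members become null, so a
 * field added by a newer interface never carries stack garbage that the
 * callee might treat as a valid output address.
 */
static inline void demo_physinfo_init(struct demo_physinfo *pi)
{
    *pi = (struct demo_physinfo){ 0 };
}
```

The point of initialising the whole struct rather than individual fields is that the caller stays safe when a later interface revision appends more fields.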

sorry about that,
john

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-12 13:02     ` John Levon
@ 2007-12-12 13:36       ` Keir Fraser
  2007-12-12 13:59         ` John Levon
  0 siblings, 1 reply; 20+ messages in thread
From: Keir Fraser @ 2007-12-12 13:36 UTC (permalink / raw)
  To: John Levon; +Cc: xen-devel

On 12/12/07 13:02, "John Levon" <levon@movementarian.org> wrote:

>> Is Solaris easy to grab and build, or does it need a Solaris environment?
>> Alternatively, can I grab a pre-built binary (with symbols) from somewhere?
> 
> Hi Keir, this was a false alarm: I failed to notice that physinfo got a
> new 'out' handle and wasn't initialising it (yes, we use sysctl() in our
> kernel - for both good and bad reasons). When I fixed that I booted up.
> I'll do some more testing and report back either way.

If you tell us what you need it for (apart from the memory locking for tools
operations, which is just unfortunate, and I wonder if there is perhaps a
better way to address that problem) then perhaps we can extend platform_op()
to provide the same functionality?

At least you can interrogate CPUID(0x40000001) to find the Xen version and
hence have a kernel that is portable for both 3.1 and 3.2 sysctl
interfaces...
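
As a rough sketch of that version check: the Xen version leaf returns the major version in the upper 16 bits of EAX and the minor version in the lower 16 bits. The helper names below are hypothetical, and actually executing CPUID (including locating the hypervisor leaf base) is left out; the code only decodes an EAX value that has already been read.

```c
#include <stdint.h>

/* Hypothetical container for the decoded version; not a Xen type. */
struct demo_xen_version {
    uint16_t major;
    uint16_t minor;
};

/* Split the EAX value of the Xen version leaf into major and minor. */
static inline struct demo_xen_version demo_decode_xen_version(uint32_t eax)
{
    struct demo_xen_version v;

    v.major = (uint16_t)(eax >> 16);
    v.minor = (uint16_t)(eax & 0xffffu);
    return v;
}

/* Return 1 if the running version is at least major.minor, else 0. */
static inline int demo_xen_at_least(uint32_t eax, uint16_t major,
                                    uint16_t minor)
{
    struct demo_xen_version v = demo_decode_xen_version(eax);

    return v.major > major || (v.major == major && v.minor >= minor);
}
```

A kernel could then pick the 3.1 or 3.2 sysctl layout with something like demo_xen_at_least(eax, 3, 2).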

 -- Keir

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-12 13:36       ` Keir Fraser
@ 2007-12-12 13:59         ` John Levon
  2007-12-12 14:25           ` Keir Fraser
  0 siblings, 1 reply; 20+ messages in thread
From: John Levon @ 2007-12-12 13:59 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

On Wed, Dec 12, 2007 at 01:36:38PM +0000, Keir Fraser wrote:

> >> Is Solaris easy to grab and build, or does it need a Solaris environment?
> >> Alternatively, can I grab a pre-built binary (with symbols) from somewhere?
> > 
> > Hi Keir, this was a false alarm: I failed to notice that physinfo got a
> > new 'out' handle and wasn't initialising it (yes, we use sysctl() in our
> > kernel - for both good and bad reasons). When I fixed that I booted up.
> > I'll do some more testing and report back either way.
> 
> If you tell us what you need it for (apart from the memory locking for tools
> operations, which is just unfortunate, and I wonder if there is perhaps a
> better way to address that problem) then perhaps we can extend platform_op()
> to provide the same functionality?

The first big one, as you say, is our privcmd driver.  We have to decode the
hypercall and essentially do a copy_from/to_user() on all the buffers passed
in. The best way to fix this is to do it in userspace: make xc_solaris.c use a
new ioctl() that passes in buffer address+size information as well as the
structs themselves. To do this cleanly means pushing a lot of do_domctl() etc.
into xc_$OS.c, and we just haven't had time - though I'd certainly be
interested to know how you felt about a change like that.
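
To make the idea concrete, here is a minimal sketch of what such an ioctl() argument might look like. Every name here is hypothetical; nothing like this exists in libxc or the privcmd driver as described in this thread. The library side would describe each buffer explicitly, so the driver can validate and copy or lock them without decoding the hypercall.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical direction flags for one user buffer. */
enum demo_buf_dir { DEMO_BUF_IN = 1, DEMO_BUF_OUT = 2 };

/* One user buffer referenced by the hypercall argument struct. */
struct demo_hcall_buf {
    void    *addr;   /* user address of the buffer */
    size_t   len;    /* length in bytes */
    unsigned dir;    /* DEMO_BUF_IN, DEMO_BUF_OUT, or both */
};

/* The struct itself plus explicit descriptions of everything it points at. */
struct demo_hcall_req {
    void                        *op;      /* e.g. the domctl/sysctl struct */
    size_t                       op_len;
    const struct demo_hcall_buf *bufs;
    size_t                       nbufs;
};

/*
 * Driver-side sanity pass: sum the bytes that would have to be copied or
 * locked.  Returns 0 on success, -1 on bad flags or arithmetic overflow.
 */
static inline int demo_hcall_total_len(const struct demo_hcall_req *req,
                                       size_t *total)
{
    size_t sum = req->op_len;
    size_t i;

    for (i = 0; i < req->nbufs; i++) {
        const struct demo_hcall_buf *b = &req->bufs[i];

        if (b->dir == 0 ||
            (b->dir & ~(unsigned)(DEMO_BUF_IN | DEMO_BUF_OUT)) != 0)
            return -1;              /* unknown direction flags */
        if (b->len > SIZE_MAX - sum)
            return -1;              /* total would overflow */
        sum += b->len;
    }
    *total = sum;
    return 0;
}
```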

I actually want to extend this further for least privilege reasons and have the
ioctl()s be much more "semantic", but that's a much wider fix.

Then we have three remaining uses during boot. They all use
xen_sysctl_physinfo_t.

Number of total physical pages

	- we ask about this for the benefit of our Xen crash dump
	  support. This could easily be a platform op I think?

Number of real CPUs

	- we use this in our errata checking code. It just seems plain
	  wrong in the Xen case for us to be checking machine errata,
	  but we've never found time to go through them and verify that
	  the hypervisor does the same checks and fixes. If you're
	  interested I list the errata that need this value below.

cpu_khz

	- this is an old, old change during bringup which is very
	  possibly fixed; once again, we need to verify we can remove
	  it. As the code says:

        /*
         * During dom0 bringup, it was noted that on at least one older
         * Intel HT machine, the hypervisor initially gives a tsc_to_system_mul
         * value that is quite wrong (the 3.06GHz clock was reported
         * as 4.77GHz)
         *
         * The curious thing is, that if you stop the kernel at entry,
         * breakpoint here and inspect the value with kmdb, the value
         * is correct - but if you don't stop and simply enable the
         * printf statement (below), you can see the bad value printed
         * here.  Almost as if something kmdb did caused the hypervisor to
         * figure it out correctly.  And, note that the hypervisor
         * eventually -does- figure it out correctly ... if you look at
         * the field later in the life of dom0, it is correct.
         *
         * For now, on dom0, we employ a slightly cheesy workaround of
         * using the DOM0_PHYSINFO hypercall.
         */
        if (DOMAIN_IS_INITDOMAIN(xen_info) && xpv_cpufreq_workaround) {
                xen_sysctl_physinfo_t pi;
                int ret;

                if ((ret = xen_get_physinfo(&pi)) != 0)
                        panic("xen_get_physinfo() failed: %d\n", ret);

                cpu_hz = 1000 * (uint64_t)pi.cpu_khz;
        } else {
                cpu_hz = (UINT64_C(1000000000) << 32) / vti->tsc_to_system_mul;

                if (vti->tsc_shift < 0)
                        cpu_hz <<= -vti->tsc_shift;
                else
                        cpu_hz >>= vti->tsc_shift;
        }
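
As a worked illustration of the else-branch above: Xen's per-VCPU time info scales a TSC delta to nanoseconds as ((delta << tsc_shift) * tsc_to_system_mul) >> 32, with a right shift when tsc_shift is negative. Inverting that relation gives the TSC frequency. The standalone sketch below uses a hypothetical helper name, demo_tsc_to_hz, and is not the Solaris source; the zero-multiplier guard is an addition for the sketch.

```c
#include <stdint.h>

/*
 * Recover the TSC frequency in Hz from the scaling pair.
 *
 *   ns_per_tick = mul * 2^shift / 2^32
 *   hz          = 1e9 / ns_per_tick = (1e9 << 32) / (mul * 2^shift)
 *
 * Returns 0 if the multiplier is 0 (added here to avoid dividing by zero).
 */
static inline uint64_t demo_tsc_to_hz(uint32_t tsc_to_system_mul,
                                      int8_t tsc_shift)
{
    uint64_t hz;

    if (tsc_to_system_mul == 0)
        return 0;

    hz = (UINT64_C(1000000000) << 32) / tsc_to_system_mul;

    if (tsc_shift < 0)
        hz <<= -tsc_shift;   /* delta is shifted right, so each tick counts for less */
    else
        hz >>= tsc_shift;    /* delta is shifted left, so each tick counts for more */

    return hz;
}
```

For example, a multiplier of 0x80000000 with a shift of 0 means each tick is worth half a nanosecond, which corresponds to a 2 GHz TSC.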

Errata:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/i86pc/os/mp_startup.c#627

AMD 122: TLB Flush Filter May Cause Coherency Problem in Multiprocessor Systems
AMD 131: Multiprocessor Systems with Four or More Cores May Deadlock Waiting for a Probe Response
AMD: Disable C1-Clock ramping on multi-core/multi-processor K8 platforms to guard against TSC drift.
  (I do remember that Xen does this, am I right?)
Plus another one around lfence around line 1000-1040

regards,
john

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-12 13:59         ` John Levon
@ 2007-12-12 14:25           ` Keir Fraser
  2007-12-12 14:28             ` Keir Fraser
  2007-12-12 14:50             ` John Levon
  0 siblings, 2 replies; 20+ messages in thread
From: Keir Fraser @ 2007-12-12 14:25 UTC (permalink / raw)
  To: John Levon; +Cc: xen-devel

On 12/12/07 13:59, "John Levon" <levon@movementarian.org> wrote:

> The first big one, as you say, is our privcmd driver.  We have to decode the
> hypercall and essentially do a copy_from/to_user() on all the buffers passed
> in. The best way to fix this is to do it in userspace: make xc_solaris.c use a
> new ioctl() that passes in buffer address+size information as well as the
> structs themselves. To do this cleanly means pushing a lot of do_domctl() etc.
> into xc_$OS.c, and we just haven't had time - though I'd certainly be
> interested to know how you felt about a change like that.

I'm happy to see OS-specific changes in libxc, and even some restructuring
if absolutely necessary. I'd much rather that than have kernel dependencies
on domctl and sysctl. libxc is not in its current form due to some grand
plan. ;-)

> Number of total physical pages
> 
> - we ask about this for the benefit of our Xen crash dump
>  support. This could easily be a platform op I think?

Would XENMEM_maximum_ram_page be as good or better? If you *really* mean
number of physical RAM pages then we could add that I suppose. How is it
useful?

> Number of real CPUs
> 
> - we use this in our errata checking code. It just seems plain
>  wrong in the Xen case for us to be checking machine errata,
>  but we've never found time to go through them and verify that
>  the hypervisor does the same checks and fixes. If you're
>  interested I list the errata that need this value below.

The TLB flush and C1 ramping errata we have worked around since at least Xen
3.0.2. Erratum 131 should be worked around by the BIOS (in fact, really all
these errata should be worked around by an up-to-date BIOS). If you insist
on fixing up 131 yourselves then you can do it regardless of number of CPUs
-- the manual does not specify that number of CPUs affects this bug, also it
explicitly states that applying the workaround does not affect performance.

> cpu_khz
> 
> - this is an old, old change during bringup which is very
>  possibly fixed; once again, we need to verify we can remove
>  it. As the code says:

I hope it's fixed. If not then you could calibrate the TSC yourself against
the RTC. You know you won't get preempted much during dom0 kernel boot. IRQ
handling is very quick in Xen.

 -- Keir

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-12 14:25           ` Keir Fraser
@ 2007-12-12 14:28             ` Keir Fraser
  2007-12-12 14:50             ` John Levon
  1 sibling, 0 replies; 20+ messages in thread
From: Keir Fraser @ 2007-12-12 14:28 UTC (permalink / raw)
  To: John Levon; +Cc: xen-devel

On 12/12/07 14:25, "Keir Fraser" <Keir.Fraser@cl.cam.ac.uk> wrote:

>> The first big one, as you say, is our privcmd driver.  We have to decode the
>> hypercall and essentially do a copy_from/to_user() on all the buffers passed
>> in. The best way to fix this is to do it in userspace: make xc_solaris.c use
>> a
>> new ioctl() that passes in buffer address+size information as well as the
>> structs themselves. To do this cleanly means pushing a lot of do_domctl()
>> etc.
>> into xc_$OS.c, and we just haven't had time - though I'd certainly be
>> interested to know how you felt about a change like that.
> 
> I'm happy to see OS-specific changes in libxc, and even some restructuring
> if absolutely necessary. I'd much rather that than have kernel dependencies
> on domctl and sysctl. libxc is not in its current form due to some grand
> plan. ;-)

I'd like to see a brief proposal before you really go to town though...

 -- Keir

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: First release candidate for 3.2.0
  2007-12-12 14:25           ` Keir Fraser
  2007-12-12 14:28             ` Keir Fraser
@ 2007-12-12 14:50             ` John Levon
  2007-12-12 15:29               ` Keir Fraser
  1 sibling, 1 reply; 20+ messages in thread
From: John Levon @ 2007-12-12 14:50 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

On Wed, Dec 12, 2007 at 02:25:39PM +0000, Keir Fraser wrote:

> > Number of total physical pages
> > 
> > - we ask about this for the benefit of our Xen crash dump
> >  support. This could easily be a platform op I think?
> 
> Would XENMEM_maximum_ram_page be as good or better? If you *really* mean
> number of physical RAM pages then we could add that I suppose. How is it
> useful?

If you remember Nils' presentation, the way we handle Xen failures is to
hook Xen panics into Solaris crash dumps. We need to estimate the
maximum possible size of the dump: since we include the hypervisor pages
themselves, we use the number of physical pages in the system as an
upper bound. It sounds like maximum_ram_page would work fine. Will
investigate.

> The TLB flush and C1 ramping errata we have worked around since at least Xen

Suspected as much.

> 3.0.2. Erratum 131 should be worked around by the BIOS (in fact, really all

Indeed, we just warn about it strongly. The point of these checks is
mainly to stop people trying to use totally broken machines. We only do
the CPUs check because we don't want to warn for the UP case where it
doesn't matter.

Though it does occur to me that checking the number of VCPUs would
actually be good enough here. Or would you take a patch to print a
warning from inside Xen itself?

thanks,
john


* Re: First release candidate for 3.2.0
  2007-12-12 14:50             ` John Levon
@ 2007-12-12 15:29               ` Keir Fraser
  0 siblings, 0 replies; 20+ messages in thread
From: Keir Fraser @ 2007-12-12 15:29 UTC (permalink / raw)
  To: John Levon; +Cc: xen-devel

On 12/12/07 14:50, "John Levon" <levon@movementarian.org> wrote:

>> Would XENMEM_maximum_ram_page be as good or better? If you *really* mean
>> number of physical RAM pages then we could add that I suppose. How is it
>> useful?
> 
> If you remember Nils' presentation, the way we handle Xen failures is to
> hook Xen panics into Solaris crash dumps. We need to estimate the
> maximum possible size of the dump: since we include the hypervisor pages
> themselves, we use the number of physical pages in the system as an
> upper bound. It sounds like maximum_ram_page would work fine. Will
> investigate.

Well, you may over-estimate total RAM by up to about a gigabyte, depending
on the size of the I/O hole below 4GB. Perhaps that is good enough? It
sounds like you grossly over-estimate anyway. Oh, also there is
XENMEM_machine_memory_map, which returns the physical e820 map. You can
easily parse that to get total RAM.
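
A rough sketch of that parse, under stated assumptions: struct e820_entry below is a hypothetical stand-in for one entry of the returned map (the real layout comes from the Xen public headers), and the two helper names are made up for illustration. The first helper sums only RAM-typed regions; the second gives the highest RAM page frame number, which is the kind of bound a maximum-page query yields and which also spans any I/O hole.

```c
#include <stddef.h>
#include <stdint.h>

#define E820_RAM   1u    /* conventional e820 type code for usable RAM */
#define PAGE_SHIFT 12    /* 4 KiB pages assumed */

/* Hypothetical mirror of one e820 map entry, for illustration only. */
struct e820_entry {
    uint64_t addr;   /* start of region, in bytes */
    uint64_t size;   /* length of region, in bytes */
    uint32_t type;   /* E820_RAM, or some other (reserved, ACPI, ...) type */
};

/* Total bytes of RAM: add up every region whose type is E820_RAM. */
static uint64_t e820_total_ram_bytes(const struct e820_entry *map, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        if (map[i].type == E820_RAM)
            total += map[i].size;
    return total;
}

/* Highest page frame number covered by any RAM region (0 if there is none). */
static uint64_t e820_max_ram_pfn(const struct e820_entry *map, size_t n)
{
    uint64_t max_pfn = 0;
    for (size_t i = 0; i < n; i++) {
        if (map[i].type != E820_RAM || map[i].size == 0)
            continue;
        uint64_t last_pfn = (map[i].addr + map[i].size - 1) >> PAGE_SHIFT;
        if (last_pfn > max_pfn)
            max_pfn = last_pfn;
    }
    return max_pfn;
}
```

For a made-up map with 3GB of RAM below a 1GB hole and another 1GB of RAM above 4GB, the summed total is 4GB, while (max_pfn + 1) pages comes to 5GB: the bound from the highest page over-estimates by the size of the hole.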

>> 3.0.2. Erratum 131 should be worked around by the BIOS (in fact, really all
> 
> Indeed, we just warn about it strongly. The point of these checks is
> mainly to stop people trying to use totally broken machines. We only do
> the CPUs check because we don't want to warn for the UP case where it
> doesn't matter.
> 
> Though it does occur to me that checking the number of VCPUs would
> actually be good enough here. Or would you take a patch to print a
> warning from inside Xen itself?

I think we'd be warning on quite a wide range of AMD CPUs. Number of VCPUs
will work fine unless someone has specified dom0_max_vcpus=1 on the Xen
command line. Alternatively there is now XENPF_getidletime: this can be
(ab)used to find out whether there is more than one physical CPU.

 -- Keir


* Re: First release candidate for 3.2.0
  2007-12-10 11:09 First release candidate for 3.2.0 Keir Fraser
  2007-12-10 17:43 ` Stefan de Konink
  2007-12-11  3:42 ` John Levon
@ 2007-12-13  2:03 ` John Levon
  2 siblings, 0 replies; 20+ messages in thread
From: John Levon @ 2007-12-13  2:03 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

On Mon, Dec 10, 2007 at 11:09:45AM +0000, Keir Fraser wrote:

> The first release candidate for Xen 3.2.0 is available at
> http://xenbits.xensource.com/xen-unstable.hg, tagged as '3.2.0-rc1'.

I've done some very basic testing ("does it boot" kind of thing) with
Solaris dom0/domU based upon current unstable bits, and it all seems OK
(that is, no worse than 3.1.2)

cheers,
john


Thread overview: 20+ messages
2007-12-10 11:09 First release candidate for 3.2.0 Keir Fraser
2007-12-10 17:43 ` Stefan de Konink
2007-12-10 19:56   ` Marco Sinhoreli
2007-12-11  9:20     ` Ian Campbell
2007-12-11 11:48       ` Marco Sinhoreli
2007-12-11 11:59         ` Ian Campbell
2007-12-11 12:02         ` Marco Sinhoreli
2007-12-11 12:08           ` Ian Campbell
2007-12-11  9:24     ` Keir Fraser
2007-12-11 15:27     ` Pradeep Singh
2007-12-11  3:42 ` John Levon
2007-12-11 10:18   ` Keir Fraser
2007-12-12 13:02     ` John Levon
2007-12-12 13:36       ` Keir Fraser
2007-12-12 13:59         ` John Levon
2007-12-12 14:25           ` Keir Fraser
2007-12-12 14:28             ` Keir Fraser
2007-12-12 14:50             ` John Levon
2007-12-12 15:29               ` Keir Fraser
2007-12-13  2:03 ` John Levon
