* osstest going offline for a bit due to database server move
@ 2015-03-11 16:28 Ian Jackson
  2015-03-14 11:02 ` Ian Campbell
  0 siblings, 1 reply; 7+ messages in thread
From: Ian Jackson @ 2015-03-11 16:28 UTC
  To: xen-devel; +Cc: Ian Campbell, Jan Beulich

The db server for the production osstest instance needs to be
physically moved.  This is planned to take place on Friday.

I have dropped in a `stop' file, which will stop osstest from taking on
new work.  The db server will be moved on Friday morning (and anything
still running then will be killed).

We expect to resume service some time on Friday.

Ian.


* Re: osstest going offline for a bit due to database server move
  2015-03-11 16:28 osstest going offline for a bit due to database server move Ian Jackson
@ 2015-03-14 11:02 ` Ian Campbell
  2015-03-16 12:41   ` Ian Campbell
  0 siblings, 1 reply; 7+ messages in thread
From: Ian Campbell @ 2015-03-14 11:02 UTC
  To: Ian Jackson; +Cc: xen-devel, Jan Beulich

On Wed, 2015-03-11 at 16:28 +0000, Ian Jackson wrote:
> The db server for the production osstest instance needs to be
> physically moved.  This is planned to take place on Friday.
> 
> I have dropped in a `stop' file, which will stop osstest from taking on
> new work.  The db server will be moved on Friday morning (and anything
> still running then will be killed).
> 
> We expect to resume service some time on Friday.

TL;DR: The move took place successfully (?) but osstest is still
unavailable. :-/

Since the move of the DB server there seem to have been some issues
with the (unrelated) filer which supplies storage to the server hosting
the controller VM. These are causing the VM's rootfs to go read-only
about once a day. I've rebooted it again this morning but TBH, based on
recent history, I don't expect it to survive until tomorrow.

Separately there seems to be some sort of new issue with the PDU control
software (which may or may not relate to the db move since PDU control
is via the DB), so when the VM is up and running it isn't able to
actually do much which is useful.

I'll be looking into both of those with some urgency on Monday since
there isn't much I can do from here. In the meantime I don't think there
will be much in the way of useful test results.

Ian.


* Re: osstest going offline for a bit due to database server move
  2015-03-14 11:02 ` Ian Campbell
@ 2015-03-16 12:41   ` Ian Campbell
  2015-03-17 10:28     ` Ian Campbell
  0 siblings, 1 reply; 7+ messages in thread
From: Ian Campbell @ 2015-03-16 12:41 UTC
  To: Ian Jackson; +Cc: xen-devel, Jan Beulich

On Sat, 2015-03-14 at 11:02 +0000, Ian Campbell wrote:
> On Wed, 2015-03-11 at 16:28 +0000, Ian Jackson wrote:
> > The db server for the production osstest instance needs to be
> > physically moved.  This is planned to take place on Friday.
> > 
> > I have dropped in a `stop' file, which will stop osstest from taking on
> > new work.  The db server will be moved on Friday morning (and anything
> > still running then will be killed).
> > 
> > We expect to resume service some time on Friday.
> 
> TL;DR: The move took place successfully (?) but osstest is still
> unavailable. :-/
> 
> Since the move of the DB server there seem to have been some issues
> with the (unrelated) filer which supplies storage to the server hosting
> the controller VM. These are causing the VM's rootfs to go read-only
> about once a day. I've rebooted it again this morning but TBH, based on
> recent history, I don't expect it to survive until tomorrow.
> 
> Separately there seems to be some sort of new issue with the PDU control
> software (which may or may not relate to the db move since PDU control
> is via the DB), so when the VM is up and running it isn't able to
> actually do much which is useful.
> 
> I'll be looking into both of those with some urgency on Monday since
> there isn't much I can do from here. In the meantime I don't think there
> will be much in the way of useful test results.

The PDU issue is fixed.

We've not yet tracked down the source of the mysterious filer reboots,
and there was another earlier today. We've fiddled with a few things to
see if we can track them down.

osstest is doing stuff now, fingers crossed.

Ian.


* Re: osstest going offline for a bit due to database server move
  2015-03-16 12:41   ` Ian Campbell
@ 2015-03-17 10:28     ` Ian Campbell
  2015-03-17 14:25       ` Ian Campbell
  2015-03-19  9:31       ` Ian Campbell
  0 siblings, 2 replies; 7+ messages in thread
From: Ian Campbell @ 2015-03-17 10:28 UTC
  To: Ian Jackson; +Cc: xen-devel, Jan Beulich

On Mon, 2015-03-16 at 12:41 +0000, Ian Campbell wrote:
> We've not yet tracked down the source of the mysterious filer reboots,
> and there was another earlier today. We've fiddled with a few things to
> see if we can track them down.
> 
> osstest is doing stuff now, fingers crossed.

There were some more reboots overnight. We've made another config change
which we hope will resolve things. If not we will look at moving the
controller VM to another filer tomorrow.

In the meantime, in an attempt to keep some of the more important
branches flowing with the limited bandwidth between reboots, I've stopped
a bunch of stuff:

$ cd ~/testing.git
$ touch bisect.stop
$ for i in linux-{2.6.39,3.4,3.10,3.16} linux-arm-xen linux-linus \
    linux-next ovmf seabios xen-4.0-testing xen-4.1-testing qemu-mainline \
    qemu-upstream-4.2-testing ; do touch $i.stop; done
$ touch rumpuserxen.stop
$ ls *.stop
bisect.stop        linux-3.10.stop  linux-3.4.stop      linux-linus.stop  ovmf.stop           qemu-upstream-4.2-testing.stop  seabios.stop          xen-4.1-testing.stop
linux-2.6.39.stop  linux-3.16.stop  linux-arm-xen.stop  linux-next.stop   qemu-mainline.stop  rumpuserxen.stop                xen-4.0-testing.stop

Since Jan is trying to get 4.3.x and 4.4.x out the door I've left those
stable branches going but otherwise I've stopped all the kernels which
aren't actually feeding other flights, and some other stuff I reckoned
we could live without for now.

Once we've had 24 hours of uninterrupted operation I'll remove those
again.

I've also killed some sg-execute-flights corresponding to the above.
Bisects:
36506 -- [linux-3.4 real-bisect]
36507 -- [qemu-upstream-unstable real-bisect]
36489 -- [rumpuserxen real-bisect]

Real:
 flight |          branch           | intended 
--------+---------------------------+----------
  36491 | qemu-mainline             | real
  36495 | linux-3.16                | real
  36497 | linux-3.4                 | real
  36498 | linux-3.10                | real
  36500 | ovmf                      | real
  36501 | seabios                   | real
  36505 | linux-next                | real

Ian.


* Re: osstest going offline for a bit due to database server move
  2015-03-17 10:28     ` Ian Campbell
@ 2015-03-17 14:25       ` Ian Campbell
  2015-03-19  9:31       ` Ian Campbell
  1 sibling, 0 replies; 7+ messages in thread
From: Ian Campbell @ 2015-03-17 14:25 UTC
  To: Ian Jackson; +Cc: xen-devel, Jan Beulich, Stefano Stabellini

On Tue, 2015-03-17 at 10:28 +0000, Ian Campbell wrote:
> On Mon, 2015-03-16 at 12:41 +0000, Ian Campbell wrote:
> > We've not yet tracked down the source of the mysterious filer reboots,
> > and there was another earlier today. We've fiddled with a few things to
> > see if we can track them down.
> > 
> > osstest is doing stuff now, fingers crossed.
> 
> There were some more reboots overnight. We've made another config change
> which we hope will resolve things. If not we will look at moving the
> controller VM to another filer tomorrow.
> 
> In the meantime, in an attempt to keep some of the more important
> branches flowing with the limited bandwidth between reboots, I've stopped
> a bunch of stuff:
[...]

After discussion with Stefano I've also stopped the qemu-upstream stuff
for 4.2, 4.3, 4.4 and 4.5. AIUI the tags to be used for the 4.3.x and
4.4.x branches are already in the tested branch and everything after
that is targeting the next point release.

$ for i in 4.2 4.3 4.4 4.5 ; do
> touch qemu-upstream-$i-testing
> done

and killed these flights:

 flight |  blessing   |          branch           | intended 
--------+-------------+---------------------------+----------
  36492 | running     | qemu-upstream-4.5-testing | real
  36494 | running     | qemu-upstream-4.3-testing | real
  36499 | running     | qemu-upstream-4.4-testing | real

Ian.


* Re: osstest going offline for a bit due to database server move
  2015-03-17 10:28     ` Ian Campbell
  2015-03-17 14:25       ` Ian Campbell
@ 2015-03-19  9:31       ` Ian Campbell
  2015-03-19 15:52         ` Ian Campbell
  1 sibling, 1 reply; 7+ messages in thread
From: Ian Campbell @ 2015-03-19  9:31 UTC
  To: Ian Jackson; +Cc: xen-devel, Jan Beulich

On Tue, 2015-03-17 at 10:28 +0000, Ian Campbell wrote:
> Once we've had 24 hours of uninterrupted operation I'll remove those
> again.

The filer seems to have survived the night, and we've even had a few
pushes happen.

I've done:

$ rm ovmf.stop seabios.stop qemu-upstream-4.2-testing.stop \
    qemu-mainline.stop xen-4.0-testing.stop xen-4.1-testing.stop bisect.stop

which leaves:

$ ls *.stop
linux-2.6.39.stop  linux-3.16.stop  linux-arm-xen.stop  linux-next.stop
linux-3.10.stop    linux-3.4.stop   linux-linus.stop    rumpuserxen.stop
$

Assuming all remains well I'll drop those either later this afternoon or
tomorrow AM.

Ian.


* Re: osstest going offline for a bit due to database server move
  2015-03-19  9:31       ` Ian Campbell
@ 2015-03-19 15:52         ` Ian Campbell
  0 siblings, 0 replies; 7+ messages in thread
From: Ian Campbell @ 2015-03-19 15:52 UTC
  To: Ian Jackson; +Cc: xen-devel, Jan Beulich

On Thu, 2015-03-19 at 09:31 +0000, Ian Campbell wrote:
> Assuming all remains well I'll drop those either later this afternoon or
> tomorrow AM.

I've dropped all the remaining stop files -- fingers crossed!

Ian.
