linuxppc-dev.lists.ozlabs.org archive mirror
From: "Linas Vepstas" <linasvepstas@gmail.com>
To: "Olof Johansson" <olof@lixom.net>
Cc: lkessler@us.ibm.com, linuxppc-dev@ozlabs.org,
	Nathan Lynch <ntl@pobox.com>,
	mahuja@us.ibm.com, strosake@us.ibm.com
Subject: Re: [PATCH 1/8] pseries: phyp dump: Docmentation
Date: Mon, 14 Jan 2008 09:21:22 -0600
Message-ID: <3ae3aa420801140721g3381b3d3u28b7b30fbaebe88@mail.gmail.com>
In-Reply-To: <20080114052402.GA23786@lixom.net>

On 13/01/2008, Olof Johansson <olof@lixom.net> wrote:
>
> How do you expect to have it in full production if you don't have all
> resources available for it? It's not until the dump has finished that you
> can return all memory to the production environment and use it.

With the PHYP dump, each chunk of RAM is returned for
general use immediately after being dumped, so it's not
an all-or-nothing proposition.  Production systems don't
often hit 100% RAM use right out of the gate; they often
take hours or days to get there, so again, there should
be time to dump.

> This can very easily be argued in both directions, with no clear winner:
> If the crash is stress-induced (say a slashdotted website), for those
> cases it seems more rational to take the time, collect _good data_ even
> if it takes a little longer, and then go back into production. Especially
> if the alternative is to go back into production immediately, collect
> about half of the data, and then crash again. Rinse and repeat.

Again, the mode of operation for the phyp dump is that you'll
always have all of the data from the *first* crash, even if there
are multiple crashes. That's because the as-yet undumped
RAM is not put back into production.
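
To make the ordering concrete, here is a toy model of that behaviour.
It is only a sketch of the idea described above, not the phyp dump
code: the DumpState class, the chunk ids and the function names are
all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DumpState:
    """Toy model: RAM preserved from the first crash, split into chunks."""
    reserved: list                                  # chunks still holding first-crash data
    dumped: list = field(default_factory=list)      # chunks already written out
    released: list = field(default_factory=list)    # chunks back in general use


def dump_next_chunk(state):
    """Dump one reserved chunk, then hand it back for general use."""
    if not state.reserved:
        return None
    chunk = state.reserved.pop(0)
    state.dumped.append(chunk)      # stand-in for writing the chunk to disk
    state.released.append(chunk)    # only now may production reuse it
    return chunk


def crash_again(state):
    """Second crash: released RAM is lost, undumped chunks were never reused."""
    state.released.clear()
    return list(state.reserved)     # first-crash data that is still intact


state = DumpState(reserved=[0, 1, 2, 3])
dump_next_chunk(state)
dump_next_chunk(state)
survivors = crash_again(state)      # chunks 2 and 3 still hold first-crash data
```

Chunks 0 and 1 are already on disk and chunks 2 and 3 were never
handed back, so between the two sets nothing from the first crash
is lost.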

> really surprises me that there's no way to reset a device through PHYP
> though. Seems like such a fundamental feature.

I don't know who said that; that's not right. The EEH function
certainly does allow you to halt/restart PCI traffic to a particular
device and also to reset the device.  So, yes, the pSeries
kexec code should call into the eeh subsystem to rationalize
the device state.

> I think people are overly optimistic if they think it'll be possible
> to do all of this reliably (as in with consistent performance) without
> a second reboot though.

The NUMA issues do concern me. But then, the whole virtualized,
fractional-cpu, tickless operation stuff sounds like a performance
tuning nightmare to begin with.

> At least without similar amounts of work being
> done as it would have taken to fix kdump's reliability in the first place.

:-)


> Speaking of reboots. PHYP isn't known for being quick at rebooting a
> partition, it used to take in the order of minutes even on a small
> machine. Has that been fixed?

Dunno.  Probably not.

>  If not, the avoiding an extra reboot
> argument hardly seems like a benefit versus kdump+kexec, which reboots
> nearly instantly and without involvement from PHYP.

OK, let me tell you what I'm up against right now.
I'm dealing with sporadic corruption on my home box.

About a month ago, I bought a whizzy ASUS M2NE
motherboard & an AMD64 2-core cpu, and two sticks
of RAM, 1GB per stick. I have one new hard drive,
SATA, and one old hard drive, from my old machine,
the PATA.  The two disks are mirrored in a RAID-1
config. Running Ubuntu.

During install/upgrade a month ago, I noticed some of
the install files seemed to have gotten corrupted, but
that downloading them again got me a working version.
This put a serious frown on my face: maybe a bad ethernet
card or connection !?

Two weeks ago, gcc stopped working one morning, although
it worked fine the night before. I'd done nothing in the interim
but sleep. Reinstalling it made it work again. Yesterday,
something else stopped working.  I found the offending
library, I compared file checksums against a known-good
version, and they were off. (!!!) Disk corruption?

Then apt-get stopped working. The /var/lib/dpkg/status file
had randomly corrupted single bytes. It's ASCII, so I repaired
it by hand; it had maybe 10 bad bytes out of 2MB total size.
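
For what it's worth, this kind of damage can be listed mechanically
when a known-good copy is at hand. The sketch below is an assumption
about how one might do it, not what I actually ran: diff_bytes and
the sample data are invented for illustration. It reports each
mismatching offset and how many bits differ; as a rough heuristic
only, lots of single-bit differences would point more toward memory
than toward the disk.

```python
def diff_bytes(good: bytes, suspect: bytes):
    """Return (offset, good_byte, bad_byte, flipped_bits) for each mismatch.

    Only the common prefix is compared; a length difference is not reported.
    """
    mismatches = []
    for offset, (g, s) in enumerate(zip(good, suspect)):
        if g != s:
            flipped = bin(g ^ s).count("1")   # number of differing bits
            mismatches.append((offset, g, s, flipped))
    return mismatches


# Made-up sample standing in for a known-good file and a damaged copy.
good = b"Package: gcc\nStatus: install ok installed\n"
suspect = bytearray(good)
suspect[3] ^= 0x04    # simulate a single flipped bit
suspect[20] ^= 0x81   # simulate a two-bit error

report = diff_bytes(good, bytes(suspect))
for offset, g, s, flipped in report:
    print(f"offset {offset}: {g:#04x} -> {s:#04x} ({flipped} bit(s) differ)")
```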

I installed tripwire. Between the first run of tripwire and the
second, less than an hour later, it reported that several dozen
files had changed checksums. Manual inspection of some
of these files against known-good versions shows that, at least
this morning, that's no longer the case.

System hasn't crashed in a month, since first boot.  So
what's going on? Is it possible that one of the two disks
is serving up bad data, which explains the funny checksum
behaviour? Or maybe it's bad RAM, so that a fresh disk
read shows good data?  If it's bad RAM, why doesn't the
system crash?  I forced fsck last night, fsck came back
spotless.
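
One cheap experiment is to hash the same file several times and see
whether the reads agree. The sketch below assumes nothing beyond the
Python standard library; sha256_of and read_is_stable are names made
up here, and the temporary file only stands in for a suspect one.
Note that repeated reads are normally served from the page cache, so
to compare against what is really on disk the cache would have to be
dropped between rounds (on Linux, by writing to
/proc/sys/vm/drop_caches as root), which this sketch does not do.

```python
import hashlib
import os
import tempfile


def sha256_of(path, bufsize=1 << 20):
    """Hash a file in fixed-size blocks so large files need little memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(bufsize), b""):
            h.update(block)
    return h.hexdigest()


def read_is_stable(path, rounds=3):
    """Hash the same file several times; True if every read agrees."""
    digests = {sha256_of(path) for _ in range(rounds)}
    return len(digests) == 1


# Demo on a throwaway file; point read_is_stable() at a suspect file instead.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 4096)
    demo_path = tmp.name

stable = read_is_stable(demo_path)
os.remove(demo_path)
print("reads agree" if stable else "reads disagree")
```

If the digests disagree between rounds with nothing writing to the
file, the storage or memory path is suspect; if they agree but differ
from a known-good checksum, the damage is at least consistent.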

So ... moral of the story: If phyp is doing some sort of
hardware checks and validation, that's great. I wish I could
afford a pSeries system for my home computer, because
my impression is that they are very stable, and don't do
things like data corruption.  I'm such a friggin cheapskate
that I can't bear to spend many thousands instead of many
hundreds of dollars. However, I will trade a longer boot
for the dream of higher reliability.

--linas
