From: Chris Wedgwood <cw@f00f.org>
To: Peter Rival <frival@zk3.dec.com>
Cc: Anton Blanchard <anton@samba.org>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] CPU hot swap for 2.4.3 + s390 support
Date: Sun, 6 May 2001 14:02:54 +1200
Message-ID: <20010506140254.C11201@tapu.f00f.org>
In-Reply-To: <20010505063726.A32232@va.samba.org> <3AF4118F.330C3E86@zk3.dec.com> <20010506033746.A30690@metastasis.f00f.org> <3AF4961B.F23C9948@zk3.dec.com>
In-Reply-To: <3AF4961B.F23C9948@zk3.dec.com>; from frival@zk3.dec.com on Sat, May 05, 2001 at 08:08:59PM -0400
On Sat, May 05, 2001 at 08:08:59PM -0400, Peter Rival wrote:
> Hrmm... I agree this is a hard problem. I know people smarter
> than I have been thinking about this type of problem at Compaq.

It's hard with current memory allocation and management paradigms. If
we wanted to abstract things more and make (break) certain rules, I'm
sure it can be made to work -- the only thing is, we would lose
_MUCH_ speed and efficiency (and waste much more space), so much so I
doubt anyone would seriously want to know about it.
We would also have to violate certain assumptions of RT applications.

> While I haven't talked to them directly, my only guess would be
> that we'd have to hand-rewrite some page tables after copying the
> page contents to a new area.

That in itself isn't too bad, but if the pages are mlocked it gets
nasty: you have to block all access to the page during the copy, which
is very bad for RT stuff.
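
To make the copy-then-rewrite sequence concrete, here is a toy sketch in
Python. It is only an illustration under stated assumptions, not kernel
code: the ToyMemory class, its per-page locks and the migrate() helper are
all hypothetical names, and a real implementation would also have to deal
with TLB shootdown, DMA and kernel allocations.

```python
import threading


class ToyMemory:
    """Toy model: a page table mapping virtual pages to physical frames."""

    def __init__(self, num_frames, page_size=16):
        self.frames = [bytearray(page_size) for _ in range(num_frames)]
        self.page_table = {}   # virtual page number -> frame index
        self.page_locks = {}   # virtual page number -> lock guarding access

    def map_page(self, vpage, frame):
        self.page_table[vpage] = frame
        self.page_locks[vpage] = threading.Lock()

    def write(self, vpage, offset, value):
        # Every access takes the page lock, so it stalls during a migration.
        with self.page_locks[vpage]:
            self.frames[self.page_table[vpage]][offset] = value

    def read(self, vpage, offset):
        with self.page_locks[vpage]:
            return self.frames[self.page_table[vpage]][offset]

    def migrate(self, vpage, new_frame):
        """Move a page to new_frame so the old frame can be taken offline."""
        with self.page_locks[vpage]:             # 1. block all access
            old_frame = self.page_table[vpage]
            # 2. copy the page contents to the new area
            self.frames[new_frame][:] = self.frames[old_frame]
            # 3. rewrite the page table entry
            self.page_table[vpage] = new_frame
        return old_frame                         # 4. old frame is now unused


mem = ToyMemory(num_frames=4)
mem.map_page(vpage=0, frame=1)
mem.write(0, 3, 0xAB)
freed = mem.migrate(0, new_frame=2)
```

Any reader or writer that arrives while migrate() holds the lock simply
waits, which is exactly the unbounded stall that hurts RT applications.
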
Not only that, what if the pages themselves have kernel allocations in
them? We cannot find these (at present) let alone have _any_ idea how
to move them. I guess it could be fudged to work using the MMU if we
were allowed to _COMPLETELY_ lock the system during the removal phase
from all interrupts and such like... seems pretty horrible to me.

> It's late Saturday and I really haven't thought this through
> fully, so I'm not even sure that would work, but it's something
> like how we support replicated text segments on our GS
> series...don't know why it wouldn't work here. *shrug*

There have been demonstrations in the past of this sort of thing. I
think HP may have done one. Not with a commodity OS though.
Actually, I just thought of a killer: what about platforms that have
physically mapped page-tables? This makes life even harder as you have
to move them :(

> It's the IBM technology that works around bad memory by detecting
> single-bit errors and removing the chip that caused it from use.

I think Solaris claims to do this right now... no idea if it works, I
know of at least one Solaris 7 machine with a dicky memory bit and it
keeps moaning about parity corrections so I guess it doesn't lock it
out. Maybe later versions (8) do?

> I'd think of this as a big hammer version of that in software.

It's much easier to do in hardware with current OS design I should
think.

> Besides, eventually you'll want to replace the DIMM that has the
> bad chip, and what better way than while the system is still
> running (as long as it's stable, of course ;) I'm just thinking
> out loud, so someone can correct me if I'm being loopy...

Again, you could do this in hardware... have the hardware route writes
to the memory elsewhere and only take reads from the old memory until
the 'refresh' cycles have copied all the data over. Hmm, when I think
about this, doing this in the memory controller chipset seems much
easier. I wonder if someone hasn't actually done it...
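
A rough simulation of that idea, as a Python sketch rather than a
description of any real chipset. The MirroringController class and its
methods are hypothetical, and it makes one assumption the text above
leaves open: while the copy is in flight, writes go to both the old and
the new module, so that reads served from the old module never see stale
data.

```python
class MirroringController:
    """Toy memory controller that drains an old module into a new one."""

    def __init__(self, old, new):
        assert len(old) == len(new)
        self.old = old
        self.new = new
        self.next_row = 0   # next row the background copy will handle

    @property
    def done(self):
        return self.next_row >= len(self.old)

    def write(self, addr, value):
        # Assumption: writes are mirrored to both modules during the copy.
        self.new[addr] = value
        if not self.done:
            self.old[addr] = value

    def read(self, addr):
        # Reads come from the old module until everything is copied over.
        return self.new[addr] if self.done else self.old[addr]

    def refresh_tick(self):
        # Piggyback on a refresh cycle: copy one row from old to new.
        if not self.done:
            self.new[self.next_row] = self.old[self.next_row]
            self.next_row += 1


old = [10, 20, 30, 40]
new = [0, 0, 0, 0]
ctl = MirroringController(old, new)
ctl.refresh_tick()        # row 0 has been copied
ctl.write(2, 99)          # a write arrives in the middle of the copy
mid_read = ctl.read(2)    # still served from the old module
while not ctl.done:
    ctl.refresh_tick()
```

Once done is true, nothing touches the old module any more and it could
be pulled, with no page tables rewritten and no access blocked.
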
--cw