public inbox for linux-kernel@vger.kernel.org
* SMP Theory (was: Re: Interesting analysis of linux kernel threading by IBM)
@ 2000-01-24 22:46 dg50
  2000-01-24 23:56 ` Jamie Lokier
  2000-01-25  0:54 ` Larry McVoy
  0 siblings, 2 replies; 7+ messages in thread
From: dg50 @ 2000-01-24 22:46 UTC (permalink / raw)
  To: linux-kernel

I've been reading the SMP thread and this is a truly educational and
fascinating discussion. How SMP works, and how much of a benefit it
provides has always been a bit of a mystery to me - and I think the light
is slowly coming on.

But I have a couple of (perhaps dumb) questions.

OK, if you have an n-way SMP box, then you have n processors with n (local)
caches sharing a single block of main system memory. If you then run a
threaded program (like a renderer) with a thread per processor, you wind up
with n threads all looking at a single block of shared memory - right?

OK, if a thread accesses (I assume writes, reading isn't destructive, is
it?) a memory location that another processor is "interested" in, then
you've invalidated that processor's local cache - so it has to be flushed
and refreshed. Have enough cross-talk between threads, and you can achieve
the worst-case scenario where every memory access flushes the cache of
every processor, totally defeating the purpose of the cache, and perhaps
even adding nontrivial cache-flushing overhead.

If this is indeed the case (please correct any misconceptions I have) then
it strikes me that perhaps the hardware design of SMP is broken. That
instead of sharing main memory, each processor should have its own main
memory. You connect the various main memory chunks to the "primary" CPU via
some sort of very wide, very fast memory bus, and then when you spawn a
thread, you instead do something more like a fork - copy the relevant
process and data to the child cpu's private main memory (perhaps via some
sort of blitter) over this bus, and then let that CPU go play in its own
sandbox for a while.

Which really is more like the "array of uni-processor boxen joined by a
network" model than it is current SMP - just with a REALLY fast&wide
network pipe that just happens to be in the same physical box.

Comments? Please feel free to reply private-only if this is just too
entry-level for general discussion.

DG


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/

* Re: SMP Theory (was: Re: Interesting analysis of linux kernel threading  by IBM)
@ 2000-01-25 19:39 Iain McClatchie
       [not found] ` <388DFF0F.8E7784A1@timpanogas.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Iain McClatchie @ 2000-01-25 19:39 UTC (permalink / raw)
  To: Larry McVoy; +Cc: linux-kernel

One of the problems with this forum is that you can't hear the murmur
of assent ripple through the hardware design crowd when Larry rants
about this stuff.  Larry has had his head out of the box for a long
time.

Look at the ASCI project.  The intention was for SGI to build an
Origin with around 1000 CPUs.  That Origin had extra cache coherence
directory RAM and special encodings in that RAM so that the hardware
could actually keep the memory across all 1000 CPUs coherent.  We
added extra physical address bits to the R10K to make this machine
possible.

Last I heard, the machine was mostly programmed with message passing.

I remember having a talk with an O/S guy who was implementing some
sort of message delivery utility inside the O/S.  This was when
Cellular IRIX was in development, and they were investigating having
the various O/S images talk to each other with messages across the
shared memory.  Then someone found out the O/S images could signal
each other FASTER through the HIPPI connections than they could
through shared memory.  That is, this machine had a HIPPI port local
to each O/S image, and all those HIPPI ports were connected together
via a HIPPI switch.

Those HIPPI connections were built with the _same_physical_link_ as
the shared memory - an 800 MB/s source-synchronous channel.  But if
you're sending a message, it's better to have the I/O system just
send the bits one way than have the shared memory system do two round
trips, one to invalidate the mailbox buffer for writing and another to
process the remote cache miss to receive the message.

-Iain McClatchie
www.10xinc.com
iain@10xinc.com
650-364-0520 voice
650-364-0530 FAX



Thread overview: 7+ messages
2000-01-24 22:46 SMP Theory (was: Re: Interesting analysis of linux kernel threading by IBM) dg50
2000-01-24 23:56 ` Jamie Lokier
2000-01-25  2:38   ` Ralf Baechle
2000-01-25  0:54 ` Larry McVoy
     [not found] <000f01bf66da$872a6730$021d85d1@youwant.to>
2000-01-25 10:47 ` Davide Libenzi
  -- strict thread matches above, loose matches on Subject: below --
2000-01-25 19:39 Iain McClatchie
     [not found] ` <388DFF0F.8E7784A1@timpanogas.com>
2000-01-25 21:26   ` Iain McClatchie
