Nehalem


* Nehalem
@ 2009-03-24 18:05 Robert Hyatt
  2009-03-25 12:32 ` Nehalem Bill Davidsen
  0 siblings, 1 reply; 3+ messages in thread
From: Robert Hyatt @ 2009-03-24 18:05 UTC
  To: linux-smp

I ran into an issue that may or may not be on the radar.  Here goes:

1.  The old hyper-threading fix works well on an old dual-socket PIV with 
hyper-threading: with two sockets and 4 logical processors, the 
compute-bound processes get balanced across the sockets, which fixed the 
original hyper-threading bug everyone talked about.

2.  I now have a dual-socket Nehalem box, 4 cores per socket.  Someone 
wanted to test hyper-threading, which I had disabled, and I found an 
issue.

It appears that the current process scheduling works fine for balancing 
compute-bound processes across the two sockets to optimize cache usage.  
But with hyper-threading enabled, things go wrong.  If I run 4 
compute-bound processes on this box, they will run two per socket just 
fine.  But on any one chip, it is probable that the two processes will 
land on the same physical core while another core sits idle, which is 
not good.
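
One way to check for this, and to work around it by hand, is to read the 
CPU topology from sysfs and pin each process to a different physical core.  
The following is a minimal Python sketch, not a scheduler fix.  It assumes 
the kernel exposes physical_package_id and core_id under 
/sys/devices/system/cpu/cpuN/topology/; the helper names read_topology and 
one_cpu_per_core, and the CPU numbering in EXAMPLE, are made up for 
illustration.

```python
import glob
import os
import re


def read_topology(root="/sys/devices/system/cpu"):
    """Return {logical_cpu: (package_id, core_id)} as reported by sysfs.

    Assumes the Linux sysfs topology files are present; CPUs whose
    topology files cannot be read (e.g. offline CPUs) are skipped.
    """
    topo = {}
    for path in glob.glob(os.path.join(root, "cpu[0-9]*")):
        match = re.fullmatch(r"cpu(\d+)", os.path.basename(path))
        if not match:
            continue
        try:
            with open(os.path.join(path, "topology", "physical_package_id")) as f:
                package = int(f.read())
            with open(os.path.join(path, "topology", "core_id")) as f:
                core = int(f.read())
        except OSError:
            continue
        topo[int(match.group(1))] = (package, core)
    return topo


def one_cpu_per_core(topology):
    """Pick one logical CPU for each distinct (package, core) pair.

    The lowest-numbered logical CPU of each physical core is kept, so
    hyper-thread siblings never both appear in the result.
    """
    chosen = {}
    for cpu in sorted(topology):
        chosen.setdefault(topology[cpu], cpu)
    return sorted(chosen.values())


# Hypothetical numbering for 2 sockets x 4 cores x 2 threads: sockets
# alternate by CPU number, and CPUs 8-15 are the siblings of CPUs 0-7.
# Real numbering depends on the BIOS and kernel.
EXAMPLE = {cpu: (cpu % 2, (cpu // 2) % 4) for cpu in range(16)}
```

Each compute-bound process could then be pinned to one entry of the 
returned list, for example with os.sched_setaffinity(pid, {cpu}) from 
Python or with taskset -c from the shell.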

My first thought was this needs a hierarchical approach:  one big run 
queue per socket, then N run queues per socket, one per physical core.

Now the load can be balanced across the two sockets / chips using the 
"high-level" pair of queues, and then balanced across the physical cores 
on each socket using the low-level queues, to avoid running two processes 
on one physical core, and none on another.
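
The two-level idea above can be sketched as a toy placement routine: pick 
the least-loaded socket first, then the least-loaded physical core on that 
socket.  This is only an illustration in Python of the proposed hierarchy, 
not how the 2.6.28 scheduler is implemented; the data layout (a dict of 
per-socket dicts of per-core task counts) and the name place are 
assumptions.

```python
def place(loads):
    """Place one new task and return the (socket, core) it landed on.

    loads maps socket -> {core: number of runnable tasks}.  The high
    level compares total load per socket; the low level compares load
    per physical core within the chosen socket.  Ties go to the lowest
    id.  The chosen core's count is incremented in place.
    """
    socket = min(loads, key=lambda s: (sum(loads[s].values()), s))
    core = min(loads[socket], key=lambda c: (loads[socket][c], c))
    loads[socket][core] += 1
    return socket, core


def make_box(sockets=2, cores_per_socket=4):
    """Build an idle machine: every physical core starts with 0 tasks."""
    return {s: {c: 0 for c in range(cores_per_socket)} for s in range(sockets)}
```

With four tasks on an idle dual-socket, four-core-per-socket box, this 
puts two tasks on each socket and never two on the same physical core, 
which is the behavior I would like to see with hyper-threading on.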

Is a fix already in the works for this, or is this a new issue?  I am 
running 2.6.28.8 on this box.  I am also not so happy with turbo-boost, 
as it is giving some erratic timing data, which I don't like for my 
benchmark-and-tweak software development.  But that's another issue, not 
kernel-related.

Robert M. Hyatt, Ph.D.          Computer and Information Sciences
hyatt@uab.edu                   University of Alabama at Birmingham
(205) 934-2213                  136A Campbell Hall
(205) 934-5473 FAX              Birmingham, AL 35294-1170


* Re: Nehalem
  2009-03-24 18:05 Nehalem Robert Hyatt
@ 2009-03-25 12:32 ` Bill Davidsen
  2009-03-25 13:58   ` Nehalem Robert Hyatt
  0 siblings, 1 reply; 3+ messages in thread
From: Bill Davidsen @ 2009-03-25 12:32 UTC
  To: Robert Hyatt; +Cc: linux-smp


This might be an issue for me as well.  I've just ordered parts to build 
several servers based on the i7 architecture, so I will have four cores 
+ HT, although they will all be in a single socket.  I don't have any idea 
how well this will work.  I suppose HT can be turned off if needed, 
and it will run as well as the Q6600 system these will replace.

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc

"You are disgraced professional losers. And by the way, give us our money back."
    - Representative Earl Pomeroy,  Democrat of North Dakota
on the A.I.G. executives who were paid bonuses  after a federal bailout.




* Re: Nehalem
  2009-03-25 12:32 ` Nehalem Bill Davidsen
@ 2009-03-25 13:58   ` Robert Hyatt
  0 siblings, 0 replies; 3+ messages in thread
From: Robert Hyatt @ 2009-03-25 13:58 UTC
  To: Bill Davidsen; +Cc: linux-smp


I don't think you will have the problem with one socket.  The problem I am 
seeing deals with two sockets where balancing the process load across both 
sockets (for cache usage) and across the physical cores is a bit more than 
the current scheduler can deal with.  I run hyper-threading on a dual PIV 
and the thing does great, making sure that each physical processor gets a 
load comparable to the other one.

Nehalem looks pretty good, and with hyper-threading turned off all looks 
good.  Whether hyper-threading will be of any benefit or not is unknown. 
We are going to buy another cluster from Dell with this dual-socket i7 
type of node, so we can probably do just fine with hyper-threading off...



Robert M. Hyatt, Ph.D.          Computer and Information Sciences
hyatt@uab.edu                   University of Alabama at Birmingham
(205) 934-2213                  136A Campbell Hall
(205) 934-5473 FAX              Birmingham, AL 35294-1170


