kernelnewbies.kernelnewbies.org archive mirror
From: fjohnber@zoho.com (Fredrick)
To: kernelnewbies@lists.kernelnewbies.org
Subject: Best way to debug an Intel Core i5 hang - likely graphics (possibly power) related
Date: Mon, 30 Jan 2012 22:14:03 -0800
Message-ID: <4F2786AB.8080408@zoho.com>
In-Reply-To: <CALButC+fDvb842=vOA9Hdd3PObrAAQMLMZt=LaVqPWpjSfCJaQ@mail.gmail.com>

On 01/30/2012 08:52 PM, Graeme Russ wrote:
> Hi Mulyadi,
>
> On Tue, Jan 31, 2012 at 3:25 PM, Mulyadi Santosa
> <mulyadi.santosa@gmail.com>  wrote:
>> Hi :)
>>
>> On Tue, Jan 31, 2012 at 10:00, Graeme Russ<graeme.russ@gmail.com>  wrote:
>>> I _think_ I've solved the problem - SDRAM Voltage
>>
>> You got my respect man, you're really stubborn :)
>>
>>> The SDRAM I am using has a rated operating voltage of 1.5V +/- 0.075.
>>> It looked like the motherboard BIOS had decided to use the upper limit
>>> of 1.575V when set to 'Auto'. I changed it to 'Manual' and set the
>>> SDRAM voltage to 1.5V and it's been running stably for the longest
>>> time it ever has.
>>
>> Thanks (again) for sharing. So this indeed has a tight relationship with
>> RAM "misbehaviour". How did you know? Did you inspect every piece of
>> your hardware? I am curious to know (maybe others are too).
>
> The first symptom was that the screen would cycle through solid colours, so
> naturally the video 'card' was the first to be blamed. Of course, the i5
> has the video built into the CPU, so the likelihood of a fault there is
> probably minimal, so the graphics driver was next in line.
>
> So I installed an nVidia 8600GT and ran the nouveau driver (now I did get
> a glitch using this combo, but it wasn't a hang so I set that aside as a
> driver bug as well... could be related)
>
> I then installed an nVidia G210 (it's a much smaller and quieter card). I
> experienced one hang with this combination (right, now things are getting
> interesting...)
>
> In the meantime, I had tried fiddling with the IGPU voltage offset - no
> luck of course
>
> I removed my Linux hard drives and installed a spare hard drive and
> proceeded to install Windows 7 (using the on-chip Intel graphics). The
> machine hung once before the Windows 7 drivers were installed (promising)
>
> I then installed the Windows 7 drivers and started downloading 3DMark 2006
>
> ...Off to Australia Day Lunch with friends, back later...
>
> OK, so 3DMark downloaded OK and the machine was still running some 6 hours
> later :(
>
> Before getting a chance to install 3DMark, I had some other things to
> attend to... Glancing over: bright flashing colours!!! Linux had been
> exonerated :)
>
> So I took it back to the shop I bought it from (long argument about voiding
> the warranty by taking off the cover blah blah blah). They ran a stress
> test without failure. I suggested they run memtest which was met by 'Ah,
> yeah, I should have thought of that first' (and _I_ voided the warranty!)
>
> So memtest failed, they put in another pair of memory modules and memtest
> failed again. Now the plot thickens... They put the old memory back and
> memtest passed! (what the!) Then they put the new memory in and, you guessed
> it, memtest passed! So the old memory goes back in and more stress testing
> begins.
>
> It was run all day, no failure. So I went in and picked up the machine to
> take back home on the assumption that the problem was the seating of the
> memory modules - well I couldn't really fault that analysis (another
> argument about voiding warranty, 'parts still in warranty, labour to run
> the tests not', and 'Oh, it failed under Linux, must be software related,
> not covered by warranty' Me: 'It failed before I opened the case',
> Them: 'doesn't matter, you opened the case') - Anyway, I got it back
> without paying anything mumbling 'idiots' under my breath...
>
> So I put my Linux drives back in and ran it overnight. It survived and so
> I thought the problem was solved but alas, it failed ten minutes after
> waking it up in the morning... bugger!
>
> So RAM modules not the problem, that leaves CPU, Motherboard and PSU...
>
> So I switched out the PSU - Fail (really quickly this time... interesting)
>
> So that's when I decided to look at the SDRAM voltage - I looked up the
> datasheet for the RAM and compared it to the BIOS setting... Hmm, right
> at the upper limit of the spec'd DIMM voltage, so I set it to 1.5V
> manually.
>
> Since then it has not skipped a beat (only been ~18 hours, but that's way
> longer than previously)
>
> Now if it fails again, I'm just going to buy another motherboard. If that
> works, I'm going to have a _very_ interesting time with the shop I
> bought it from (after all, the parts are under warranty hardy, har har!)
>
>> NB: it could be a good lesson that a system lockup might have
>> absolutely nothing to do with the kernel.
>
> Verily :)
>
> Regards,
>
> Graeme
>

Thank you Graeme for sharing this experience. Amazing persistence! I 
would not have gone this far. :) Sometimes you have to doubt even the 
nuts and bolts :)
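
For anyone following along, the voltage check Graeme describes comes down
to simple arithmetic. Here is a minimal sketch in Python, assuming only the
figures quoted above (1.5V nominal, +/- 0.075V tolerance, 1.575V chosen by
the BIOS on 'Auto'). The function and variable names are made up for
illustration and do not come from any real tool.

```python
# Illustrative sketch only: the names are hypothetical, the numbers are
# the ones quoted in Graeme's mail.
def voltage_headroom_mv(nominal_mv, tolerance_mv, setting_mv):
    """Return (low, high, headroom) in millivolts for a DIMM voltage setting.

    headroom is the distance from the setting to the nearest spec limit:
    zero means right at a limit, negative means out of spec.
    """
    low = nominal_mv - tolerance_mv
    high = nominal_mv + tolerance_mv
    headroom = min(setting_mv - low, high - setting_mv)
    return low, high, headroom

# Integer millivolts avoid floating-point rounding surprises.
for label, setting in (("Auto", 1575), ("Manual", 1500)):
    low, high, headroom = voltage_headroom_mv(1500, 75, setting)
    print(f"{label}: {setting} mV, spec {low}-{high} mV, headroom {headroom} mV")
```

On 'Auto' the setting sits exactly on the upper limit, so there is no
margin left. Whether that alone explains the hangs is only a guess, but it
matches what Graeme saw after switching to 'Manual'.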

-Fredrick

