intel-gfx.lists.freedesktop.org archive mirror
* System freeze apparently due to GPU memory exhaustion - why?
@ 2016-02-23  1:55 Adam Nielsen
  2016-02-23  9:47 ` Joonas Lahtinen
  0 siblings, 1 reply; 7+ messages in thread
From: Adam Nielsen @ 2016-02-23  1:55 UTC
  To: intel-gfx

Hi all,

I'm an end user and I'm having problems which I believe are ultimately
caused by an issue with the Intel kernel driver.

When I am running programs that use a lot of images (e.g. GIMP, Firefox
with YouTube and Google Maps) then after a short while my whole machine
will grind to a halt, to the point where even my audio buffers don't
get updated so I hear the same 500ms of audio repeated in a loop until
the system comes back to life again.

Typically the system will stop responding for anywhere between 30
seconds and a minute, and usually it returns to service after the kernel
OOM killer has ended a process - typically Firefox as it uses the most
system memory.

Of course it's not a system OOM problem, because the system has 16GB of
RAM and the problem still happens even if less than 4GB is in use.

Looking at dmesg, the problem always starts with a message like this:

  Purging GPU memory, 69632 bytes freed, 48553984 bytes still pinned.

Since system memory is fine, I can only assume this means that there is
some limit on the amount of memory the video driver can access, and it
has reached that limit.
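
As a rough aid for reading that dmesg line, the byte counts can be
converted to MiB. This is only a sketch in Python: the regular
expression is written against the exact message text quoted above, and
the helper names (parse_purge_lines, to_mib) are made up for
illustration; they are not part of any driver tooling.

```python
import re

# Matches the i915 message exactly as quoted in this email.
PURGE_RE = re.compile(
    r"Purging GPU memory, (\d+) bytes freed, (\d+) bytes still pinned"
)


def parse_purge_lines(dmesg_text):
    """Return a list of (freed_bytes, pinned_bytes) found in dmesg text."""
    return [(int(freed), int(pinned))
            for freed, pinned in PURGE_RE.findall(dmesg_text)]


def to_mib(n_bytes):
    """Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


sample = "  Purging GPU memory, 69632 bytes freed, 48553984 bytes still pinned."
for freed, pinned in parse_purge_lines(sample):
    print(f"freed {to_mib(freed):.2f} MiB, still pinned {to_mib(pinned):.2f} MiB")
```

For the line above this works out to roughly 0.07 MiB freed and
46.30 MiB still pinned.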

Here is the top of the backtrace, if it's relevant:

  Call Trace:
   [<ffffffff812c0dc9>] dump_stack+0x4b/0x72
   [<ffffffff811d6682>] dump_header+0x87/0x21e
   [<ffffffffa0646c01>] ? i915_gem_shrinker_oom+0x1a1/0x200 [i915]
   [<ffffffff8116214b>] oom_kill_process+0x34b/0x3b0
   [<ffffffff8116241f>] out_of_memory+0x21f/0x490
   [<ffffffff811680f8>] __alloc_pages_nodemask+0x8c8/0x960
   [<ffffffff811683ab>] alloc_kmem_pages_node+0x7b/0x150

Can anyone shed any light on what's happening here?  Is there a limit
to how much video memory the system can access?  I thought that was
done away with once AGP disappeared but perhaps not.

Is there any way to alleviate the problems this issue produces?  It's
very annoying to be zooming around in Google Maps one moment and then
dumped back to the console the next because the kernel killed X11.

My system is an Intel DH87MC w/ i7-4770K, kernel 4.3.3.  Three
monitors, DVI + HDMI + DisplayPort, total res 4960x1600.

Many thanks,
Adam.


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: System freeze apparently due to GPU memory exhaustion - why?
  2016-02-23  1:55 System freeze apparently due to GPU memory exhaustion - why? Adam Nielsen
@ 2016-02-23  9:47 ` Joonas Lahtinen
  2016-02-23 10:58   ` Adam Nielsen
  0 siblings, 1 reply; 7+ messages in thread
From: Joonas Lahtinen @ 2016-02-23  9:47 UTC
  To: Adam Nielsen, intel-gfx

Hi,

On Tue, 2016-02-23 at 11:55 +1000, Adam Nielsen wrote:
> Hi all,
> 
> I'm an end user and I'm having problems which I believe are ultimately
> caused by an issue with the Intel kernel driver.
> 
> When I am running programs that use a lot of images (e.g. GIMP, Firefox
> with YouTube and Google Maps) then after a short while my whole machine
> will grind to a halt, to the point where even my audio buffers don't
> get updated so I hear the same 500ms of audio repeated in a loop until
> the system comes back to life again.
> 
> Typically the system will stop responding for anywhere between 30
> seconds and a minute, and usually it returns to service after the kernel
> OOM killer has ended a process - typically Firefox as it uses the most
> system memory.
> 
> Of course it's not a system OOM problem, because the system has 16GB of
> RAM and the problem still happens even if less than 4GB is in use.
> 
> Looking at dmesg, the problem always starts with a message like this:
> 
>   Purging GPU memory, 69632 bytes freed, 48553984 bytes still pinned.
> 
> Since system memory is fine, I can only assume this means that there is
> some limit on the amount of memory the video driver can access, and it
> has reached that limit.
> 
> Here is the top of the backtrace, if it's relevant:
> 
>   Call Trace:
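>    [<ffffffff812c0dc9>] dump_stack+0x4b/0x72
>    [<ffffffff811d6682>] dump_header+0x87/0x21e
>    [<ffffffffa0646c01>] ? i915_gem_shrinker_oom+0x1a1/0x200 [i915]
>    [<ffffffff8116214b>] oom_kill_process+0x34b/0x3b0
>    [<ffffffff8116241f>] out_of_memory+0x21f/0x490
>    [<ffffffff811680f8>] __alloc_pages_nodemask+0x8c8/0x960
>    [<ffffffff811683ab>] alloc_kmem_pages_node+0x7b/0x150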
> 
> Can anyone shed any light on what's happening here?  Is there a limit
> to how much video memory the system can access?  I thought that was
> done away with once AGP disappeared but perhaps not.
> 

Can you attach a full dmesg from boot until the problem appears?

Regards, Joonas

> Is there any way to alleviate the problems this issue produces?  It's
> very annoying to be zooming around in Google Maps one moment and then
> dumped back to the console the next because the kernel killed X11.
> 
> My system is an Intel DH87MC w/ i7-4770K, kernel 4.3.3.  Three
> monitors, DVI + HDMI + DisplayPort, total res 4960x1600.
> 
> Many thanks,
> Adam.
> 
> 
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


* Re: System freeze apparently due to GPU memory exhaustion - why?
  2016-02-23  9:47 ` Joonas Lahtinen
@ 2016-02-23 10:58   ` Adam Nielsen
  2016-02-26 10:30     ` Joonas Lahtinen
  0 siblings, 1 reply; 7+ messages in thread
From: Adam Nielsen @ 2016-02-23 10:58 UTC
  To: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 599 bytes --]

> Can you attach a full dmesg from boot until the problem appears?

Attached, thanks for your reply.

You can ignore the problem at T=1032000, that was a broken floppy disk
in a USB floppy drive.  The first possibly-GPU-related problem starts
at T=2121945 then the same problem happens immediately following in the
log at T=2128002.

It may be of relevance to note that after a reboot it can take a while
for the problem to surface, but if X crashes due to this issue and I
restart X immediately, then it can be only a matter of minutes to
hours until the problem happens again.

Many thanks,
Adam.

[-- Attachment #2: dmesg.txt.gz --]
[-- Type: application/gzip, Size: 38214 bytes --]



* Re: System freeze apparently due to GPU memory exhaustion - why?
  2016-02-23 10:58   ` Adam Nielsen
@ 2016-02-26 10:30     ` Joonas Lahtinen
  2016-04-04 12:54       ` Adam Nielsen
  0 siblings, 1 reply; 7+ messages in thread
From: Joonas Lahtinen @ 2016-02-26 10:30 UTC
  To: Adam Nielsen, intel-gfx

On Tue, 2016-02-23 at 20:58 +1000, Adam Nielsen wrote:
> > 
> > Can you attach a full dmesg from boot until the problem appears?
> Attached, thanks for your reply.
> 
> You can ignore the problem at T=1032000, that was a broken floppy disk
> in a USB floppy drive.  The first possibly-GPU-related problem starts
> at T=2121945 then the same problem happens immediately following in the
> log at T=2128002.
> 

That seems like a legit bug. If you can reproduce it with
drm-intel-nightly, could you please open a bug at freedesktop.org bugzilla?

Have you tried running the I-G-T testing suite on your hardware?

Regards, Joonas

> It may be of relevance to note that after a reboot it can take a while
> for the problem to surface, but if X crashes due to this issue and I
> restart X immediately, then it can be only a matter of minutes to
> hours until the problem happens again.
> 
> Many thanks,
> Adam.
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


* Re: System freeze apparently due to GPU memory exhaustion - why?
  2016-02-26 10:30     ` Joonas Lahtinen
@ 2016-04-04 12:54       ` Adam Nielsen
  2016-04-04 13:13         ` Adam Nielsen
  0 siblings, 1 reply; 7+ messages in thread
From: Adam Nielsen @ 2016-04-04 12:54 UTC
  To: Joonas Lahtinen; +Cc: intel-gfx

> That seems like a legit bug. If you can reproduce it with
> drm-intel-nightly, could you please open a bug at freedesktop.org bugzilla?

Just had this happen after running drm-intel-nightly for 18 days, so I
have opened a bug here:

https://bugs.freedesktop.org/show_bug.cgi?id=94814

> Have you tried running the I-G-T testing suite on your hardware?

No I haven't - do I just install intel-gpu-tools and find some test
program to run?

Since it takes at least two weeks for this issue to appear, I'm not
sure the tests would fail unless I run them after the issue has first
appeared.  But I'm happy to try this.

Many thanks,
Adam.


* Re: System freeze apparently due to GPU memory exhaustion - why?
  2016-04-04 12:54       ` Adam Nielsen
@ 2016-04-04 13:13         ` Adam Nielsen
  2016-04-07 16:25           ` Marius Vlad
  0 siblings, 1 reply; 7+ messages in thread
From: Adam Nielsen @ 2016-04-04 13:13 UTC
  To: Joonas Lahtinen; +Cc: intel-gfx

> > Have you tried running the I-G-T testing suite on your hardware?  
> 
> No I haven't - do I just install intel-gpu-tools and find some test
> program to run?

I cloned the git repo for this and tried to run the tests as best I
could understand from the readme, but no luck:

  intel-gpu-tools/tests$ ../scripts/run-tests.sh
  Fatal Error: Cannot overwrite existing folder without the -o/--overwrite option being set.

  intel-gpu-tools/tests$ ../scripts/run-tests.sh -o
  Unknown option: -o

So I tried running piglit - the README says "./piglit" but for me
it was "./piglit/piglit" instead:

  intel-gpu-tools$ ./piglit/piglit
  Traceback (most recent call last):
    File "./piglit/piglit", line 165, in <module>
      main()
    File "./piglit/piglit", line 160, in main
      returncode = parsed.func(args)
  AttributeError: 'Namespace' object has no attribute 'func'

Thinking it might not be Python 3 compatible, I tried as Python 2:

  intel-gpu-tools$ python2 ./piglit/piglit run igt output
  ./piglit/framework/test/base.py:76: UserWarning: Timeouts are not available
    warnings.warn('Timeouts are not available')

  Traceback (most recent call last):
    File "./piglit/piglit", line 165, in <module>
      main()
    File "./piglit/piglit", line 160, in main
      returncode = parsed.func(args)
    File "./piglit/framework/exceptions.py", line 50, in _inner
      func(*args, **kwargs)
    File "./piglit/framework/programs/run.py", line 280, in run
      profile = framework.profile.merge_test_profiles(args.test_profile)
    File "./piglit/framework/profile.py", line 444, in merge_test_profiles
      profile = load_test_profile(profiles.pop())
    File "./piglit/framework/profile.py", line 422, in load_test_profile
      os.path.splitext(os.path.basename(filename))[0]))
    File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
      __import__(name)
    File "./piglit/tests/igt.py", line 177, in <module>
      populate_profile()
    File "./piglit/tests/igt.py", line 174, in populate_profile
      add_subtest_cases(test)
    File "./piglit/tests/igt.py", line 150, in add_subtest_cases
      universal_newlines=True)
    File "/usr/lib/python2.7/subprocess.py", line 566, in check_output
      process = Popen(stdout=PIPE, *popenargs, **kwargs)
    File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
      errread, errwrite)
    File "/usr/lib/python2.7/subprocess.py", line 1231, in _execute_child
      self.pid = os.fork()
  OSError: [Errno 12] Cannot allocate memory

Not sure why fork() is throwing an error - the system isn't *that*
broken...
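
One way to gather more information when fork() fails with ENOMEM
(errno 12) is to look at the kernel's memory accounting in
/proc/meminfo. The following is a sketch only, not a diagnosis of this
bug: the helper names are hypothetical and the sample numbers are
invented for illustration. MemTotal, MemAvailable, CommitLimit and
Committed_AS are standard /proc/meminfo fields on recent Linux kernels.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict.

    Values are the leading integer on each line (kB for most fields).
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def commit_headroom_kb(info):
    """CommitLimit minus Committed_AS, in kB.

    The kernel only enforces CommitLimit under strict overcommit
    (vm.overcommit_memory = 2); otherwise treat this as a rough hint.
    """
    return info["CommitLimit"] - info["Committed_AS"]


# Invented sample values; on a real system read the text with
# open("/proc/meminfo").read() instead.
sample = """MemTotal:       16000000 kB
MemAvailable:   11000000 kB
CommitLimit:    12000000 kB
Committed_AS:    9000000 kB
"""

info = parse_meminfo(sample)
for key in ("MemTotal", "MemAvailable", "CommitLimit", "Committed_AS"):
    print(key, info[key], "kB")
print("commit headroom:", commit_headroom_kb(info), "kB")
```

Capturing these values right after the failure, alongside dmesg, would
at least show whether the kernel itself thinks memory is short.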

Any suggestions what to try next, assuming you want these tests run
after the GPU error has occurred?

Thanks,
Adam.


* Re: System freeze apparently due to GPU memory exhaustion - why?
  2016-04-04 13:13         ` Adam Nielsen
@ 2016-04-07 16:25           ` Marius Vlad
  0 siblings, 0 replies; 7+ messages in thread
From: Marius Vlad @ 2016-04-07 16:25 UTC
  To: Adam Nielsen; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 4190 bytes --]

Hi Adam,

   If you have cloned piglit into the i-g-t directory, go into the
   scripts/ directory and run it from there (./run-tests.sh -t basic -s).

   It might be that a results/ directory already exists so remove the
   directory before this. There's currently some discussion on how to
   handle this with piglit.
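
   The two steps above (remove a stale results directory, then run the
   script from scripts/) could be wrapped up as follows. This is a
   minimal sketch: prepare_igt_run is a made-up helper, and the
   location of the results directory is an assumption the caller has
   to supply. Only the command line itself comes from this email.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_igt_run(igt_root, results_path, args=("-t", "basic", "-s")):
    """Delete a stale results directory, then return (command, working dir).

    results_path is an assumption supplied by the caller; this thread does
    not say where run-tests.sh writes its results.
    """
    results = Path(results_path)
    if results.is_dir():
        shutil.rmtree(results)
    scripts_dir = Path(igt_root) / "scripts"
    return ["./run-tests.sh", *args], scripts_dir


# Demonstration against a throwaway directory tree (nothing is executed).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "scripts").mkdir()
    (root / "results").mkdir()
    cmd, cwd = prepare_igt_run(root, root / "results")
    print(cmd, cwd.name, (root / "results").exists())
```

   To actually launch the tests, the returned values can be passed to
   subprocess.run(cmd, cwd=cwd, check=True).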

   I might be wrong but I think piglit is using python3 now...

   One more thing: if you can't seem to reproduce it, it might be worth
   running all the tests, but this might completely freeze your system.
   Running all the tests might also take a full night, depending on how
   fast your machine is, so I'd suggest doing it overnight if you can.

   Running them is the same (within scripts/, ./run-tests.sh -s; -s will
   generate a human-readable HTML report with the status of all tests).

   Do note that running the tests might reveal other issues and that
   your system will enter suspend/resume more than once and you can't
   run the tests with an already running drm client (X.org for instance).

On Mon, Apr 04, 2016 at 11:13:38PM +1000, Adam Nielsen wrote:
> > > Have you tried running the I-G-T testing suite on your hardware?  
> > 
> > No I haven't - do I just install intel-gpu-tools and find some test
> > program to run?
> 
> I cloned the git repo for this and tried to run the tests as best I
> could understand from the readme, but no luck:
> 
>   intel-gpu-tools/tests$ ../scripts/run-tests.sh
>   Fatal Error: Cannot overwrite existing folder without the -o/--overwrite option being set.
> 
>   intel-gpu-tools/tests$ ../scripts/run-tests.sh -o
>   Unknown option: -o
> 
> So I tried running piglit - the README says "./piglit" but for me
> it was "./piglit/piglit" instead:
> 
>   intel-gpu-tools$ ./piglit/piglit
>   Traceback (most recent call last):
>     File "./piglit/piglit", line 165, in <module>
>       main()
>     File "./piglit/piglit", line 160, in main
>       returncode = parsed.func(args)
>   AttributeError: 'Namespace' object has no attribute 'func'
> 
> Thinking it might not be Python 3 compatible, I tried as Python 2:
> 
>   intel-gpu-tools$ python2 ./piglit/piglit run igt output
>   ./piglit/framework/test/base.py:76: UserWarning: Timeouts are not available
>     warnings.warn('Timeouts are not available')
> 
>   Traceback (most recent call last):
>     File "./piglit/piglit", line 165, in <module>
>       main()
>     File "./piglit/piglit", line 160, in main
>       returncode = parsed.func(args)
>     File "./piglit/framework/exceptions.py", line 50, in _inner
>       func(*args, **kwargs)
>     File "./piglit/framework/programs/run.py", line 280, in run
>       profile = framework.profile.merge_test_profiles(args.test_profile)
>     File "./piglit/framework/profile.py", line 444, in merge_test_profiles
>       profile = load_test_profile(profiles.pop())
>     File "./piglit/framework/profile.py", line 422, in load_test_profile
>       os.path.splitext(os.path.basename(filename))[0]))
>     File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
>       __import__(name)
>     File "./piglit/tests/igt.py", line 177, in <module>
>       populate_profile()
>     File "./piglit/tests/igt.py", line 174, in populate_profile
>       add_subtest_cases(test)
>     File "./piglit/tests/igt.py", line 150, in add_subtest_cases
>       universal_newlines=True)
>     File "/usr/lib/python2.7/subprocess.py", line 566, in check_output
>       process = Popen(stdout=PIPE, *popenargs, **kwargs)
>     File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
>       errread, errwrite)
>     File "/usr/lib/python2.7/subprocess.py", line 1231, in _execute_child
>       self.pid = os.fork()
>   OSError: [Errno 12] Cannot allocate memory
> 
> Not sure why fork() is throwing an error - the system isn't *that*
> broken...
> 
> Any suggestions what to try next, assuming you want these tests run
> after the GPU error has occurred?
> 
> Thanks,
> Adam.

[-- Attachment #1.2: Digital signature --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

