Date: Thu, 08 Oct 2015 14:11:10 -0400
From: Dominique Bureau
To: "meta-freescale@yoctoproject.org"
Subject: Chromium file descriptor leak and crash issue on imx6 with Weston
List-Id: Usage and development list for the meta-fsl-* layers

Hello everyone,

We're currently developing a product that relies heavily on Chromium to display content in a loop (including WebGL and videos), and we are running into a few stability issues. At this point we're unsure whether they are WebGL related, video related, or both.

One thing we noticed is that the gpu-process seems to leak a lot of file descriptors. Whenever a video is playing, reloading the page, loading a new URL, or closing the tab seems to leak file descriptors, according to /proc/[gpu-process-pid]/fd. However, if we wait for the video to end completely before doing anything else, the leak does not seem to happen. I've been able to reproduce the issue on our custom hardware (imx6qdl based) using Freescale Yocto 1.6, as well as on a Nitrogen6X using the community's 1.7 and 1.8. We tried with both Chromium 38.0.2125.101 and 40.0.2214.91. Galcore version is 5.0.11:25762.

The simplest way to reproduce the issue is to start Chromium with the following video:
(Note: we run Chromium in incognito mode as it seems more stable, but running as a normal session behaves the same)

google-chrome --incognito http://techslides.com/demos/sample-videos/small.mp4

Once the page is loaded, playing the video a couple of times while monitoring the number of fds in /proc/[gpu-process-pid]/fd should show that there are 3 more fds while the video is playing, and that they disappear once the video ends. For example:
(while playing)
root@imx6qh120:~# ls -al /proc/2007/fd | wc -l
63
(video ended)
root@imx6qh120:~# ls -al /proc/2007/fd | wc -l
60

If you then start the video and hit reload while the video is playing, here is what you should see (the result should be similar if you close the tab instead):
(while playing)
root@imx6qh120:~# ls -al /proc/2007/fd | wc -l
70
(video ended)
root@imx6qh120:~# ls -al /proc/2007/fd | wc -l
68
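
A note on the counts above: `ls -al` also prints a `total` line plus the `.` and `..` entries, so each `wc -l` figure is 3 higher than the real number of open descriptors (the differences between runs are unaffected). Below is a minimal sketch of a helper that counts only the fd entries. The function name `count_fds` and the choice to pass the fd directory as an argument are illustrative and not part of the original report.

```shell
#!/bin/sh
# count_fds DIR: print the number of entries in an fd directory such as
# /proc/2007/fd. Unlike "ls -al | wc -l", this skips the "total" line and
# the "." and ".." entries, so the result is the actual descriptor count.
count_fds() {
    # "ls -A" lists every entry except "." and ".."; "tr" strips the
    # padding that some "wc" implementations add around the number.
    ls -A "$1" | wc -l | tr -d ' '
}
```

Calling `count_fds /proc/2007/fd` before, during, and after playback should then give the raw descriptor counts directly.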

Each time this happens, fds appear to be left trailing behind, looking like this:
(here, the chrome process flagged as "--type=gpu-process" is 1176)

ls -al /proc/1176/fd
[…snip]
lrwx------ 1 root root 64 Oct  8 12:03 50 -> /dev/shm/.org.chromium.Chromium.yqa9hs (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 51 -> /dev/shm/.org.chromium.Chromium.rxZ7gw (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 53 -> /dev/shm/.org.chromium.Chromium.wQhgYz (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 54 -> /dev/shm/.org.chromium.Chromium.Sd1bnl (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 55 -> /dev/shm/.org.chromium.Chromium.roolL0 (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 56 -> /dev/shm/.org.chromium.Chromium.8b1eGN (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 59 -> /dev/shm/.org.chromium.Chromium.k8DJ8O (deleted)
lr-x------ 1 root root 64 Oct  8 12:03 6 -> /dev/urandom
lrwx------ 1 root root 64 Oct  8 12:03 63 -> /dev/shm/.org.chromium.Chromium.dC0JSb (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 64 -> /dev/shm/.org.chromium.Chromium.kvYJfm (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 66 -> /dev/shm/.org.chromium.Chromium.3Uvica (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 69 -> /dev/shm/.org.chromium.Chromium.n0KK3C (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 7 -> socket:[9825]
lrwx------ 1 root root 64 Oct  8 12:03 70 -> /dev/shm/.org.chromium.Chromium.4frFLT (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 73 -> /dev/shm/.org.chromium.Chromium.T3zH8f (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 8 -> /run/user/root/weston-shared-mt7RJ3 (deleted)
lrwx------ 1 root root 64 Oct  8 12:06 84 -> socket:[45599]
lrwx------ 1 root root 64 Oct  8 12:06 88 -> socket:[47512]
lrwx------ 1 root root 64 Oct  8 12:03 9 -> /dev/fb0
lrwx------ 1 root root 64 Oct  8 12:06 92 -> /dev/shm/.org.chromium.Chromium.Vjrn54 (deleted)
lrwx------ 1 root root 64 Oct  8 12:03 77 -> /dev/shm/.org.chromium.Chromium.l7PDMZ (deleted)
[…snip…]
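
To pick out just the entries of interest from a listing like the one above, the link targets can be filtered for deleted Chromium shared-memory segments. The following is only a sketch: the function name `list_deleted_shm` is hypothetical, and it assumes a `readlink` utility is available on the target (it ships with coreutils and BusyBox).

```shell
#!/bin/sh
# list_deleted_shm DIR: print every fd in DIR (e.g. /proc/1176/fd) whose
# link target is a Chromium shared-memory segment marked "(deleted)".
list_deleted_shm() {
    for fd in "$1"/*; do
        # Entries in /proc/<pid>/fd are symlinks; skip anything else
        # (including the unexpanded glob when the directory is empty).
        [ -L "$fd" ] || continue
        target=$(readlink "$fd") || continue
        case $target in
            "/dev/shm/.org.chromium.Chromium."*" (deleted)")
                # Print the fd number (basename) and its target.
                printf '%s -> %s\n' "${fd##*/}" "$target"
                ;;
        esac
    done
}
```

Running `list_deleted_shm /proc/1176/fd | wc -l` before and after a reload during playback would show whether the number of such segments keeps growing.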


There are typically two crash outputs that we see frequently. This is the most common one:
----------------------------------------------------------------------
[5101:5101:0930/182607:ERROR:power_save_blocker_ozone.cc(32)] Not implemented reached in virtual content::PowerSaveBlockerImpl::~PowerSaveBlockerImpl()
[8314:8314:0930/182608:ERROR:sandbox_linux.cc(301)] InitializeSandbox() called with multiple threads in process gpu-process
[5101:5101:0930/182609:ERROR:command_buffer_proxy_impl.cc(150)] Could not send GpuCommandBufferMsg_Initialize.
[5101:5101:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(213)] CommandBufferProxy::Initialize failed.
[5101:5101:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(230)] Failed to initialize command buffer.
[8258:8265:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(274)] Failed to initialize GLES2Implementation.
[5152:5177:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(274)] Failed to initialize GLES2Implementation.
[8320:8320:0930/182609:ERROR:sandbox_linux.cc(301)] InitializeSandbox() called with multiple threads in process gpu-process
[5152:5177:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(274)] Failed to initialize GLES2Implementation.
[8258:8265:0930/182609:ERROR:command_buffer_proxy_impl.cc(150)] Could not send GpuCommandBufferMsg_Initialize.
[5101:5101:0930/182609:ERROR:command_buffer_proxy_impl.cc(150)] Could not send GpuCommandBufferMsg_Initialize.
[5101:5101:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(213)] CommandBufferProxy::Initialize failed.
[5101:5101:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(230)] Failed to initialize command buffer.
[8258:8265:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(213)] CommandBufferProxy::Initialize failed.
[8258:8265:0930/182609:ERROR:webgraphicscontext3d_command_buffer_impl.cc(230)] Failed to initialize command buffer.
[5101:5101:0930/182609:ERROR:surface_factory_ozone.cc(53)] Not implemented reached in virtual scoped_ptr<ui::SurfaceOzoneCanvas> ui::SurfaceFactoryOzone::CreateCanvasForWidget(gfx::AcceleratedWidget)
[5101:5101:0930/182609:FATAL:software_output_device_ozone.cc(22)] Failed to initialize canvas
----------------------------------------------------------------------

And this one:
----------------------------------------------------------------------
[16856:16856:1006/110817:ERROR:power_save_blocker_ozone.cc(29)] Not implemented reached in content::PowerSaveBlockerImpl::PowerSaveBlockerImpl(content::PowerSaveBlocker::PowerSaveBlockerType, const string&)
[17672:17672:1006/110822:ERROR:display.cc(87)] WaylandDisplay failed to initialize hardware
[17672:17672:1006/110822:FATAL:ozone_platform_wayland.cc(106)] failed to initialize display hardware
[16856:16856:1006/110822:ERROR:surface_factory_ozone.cc(53)] Not implemented reached in virtual scoped_ptr<ui::SurfaceOzoneCanvas> ui::SurfaceFactoryOzone::CreateCanvasForWidget(gfx::AcceleratedWidget)
[16856:16856:1006/110822:FATAL:software_output_device_ozone.cc(22)] Failed to initialize canvas
----------------------------------------------------------------------

Also, unless we perform the dreaded trick of dropping caches every now and then, we get the usual:
[742:742:1006/162906:ERROR:texture_manager.cc(1706)] [.GPU-VideoAccelerator-Offscreen-0x78323380]GL ERROR :GL_OUT_OF_MEMORY : glTexImage2D:
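
For readers unfamiliar with the cache-dropping workaround mentioned above, it presumably refers to writing to the kernel's /proc/sys/vm/drop_caches file. The sketch below is an assumption about how that is done here, since the report does not say which value is written; 3 asks the kernel to drop the page cache as well as dentries and inodes. The `DROP_CACHES_FILE` variable is a hypothetical override, added only so the function can be dry-run without root.

```shell
#!/bin/sh
# Hypothetical override so the function can be exercised without root;
# by default it targets the real kernel interface.
: "${DROP_CACHES_FILE:=/proc/sys/vm/drop_caches}"

# drop_caches [VALUE]: flush dirty pages, then ask the kernel to drop
# clean caches. 1 = page cache, 2 = dentries and inodes, 3 = both
# (3 is an assumed default, not taken from the report).
drop_caches() {
    sync
    echo "${1:-3}" > "$DROP_CACHES_FILE"
}
```

On the target this would be run as root, for example from a periodic job, by calling `drop_caches` with no arguments.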


I'm basically looking to see if anyone knows how "problematic" the file descriptor leaks can be to the system, and whether they could explain why we get so many crashes while loading content. Someone here did some tracing on a debug build when Chrome crashed, and the crash was in libvpu, but we couldn't get more useful info. From what we understand, libvpu maps CMA memory for the application, and we suspect that it's possibly being misused, leading to a crash.

Any pointers are appreciated.
Thanks!

--
Dominique Bureau
