From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from szxga01-in.huawei.com ([58.251.152.64]:20592 "EHLO
	szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751340AbbJHGR5 (ORCPT ); Thu, 8 Oct 2015 02:17:57 -0400
Subject: Re: [BUG] 3.4.109 - unable to handle kernel NULL pointer dereference at (null)
To: Cal Peake , Steven Rostedt
References: <20151001170756.2a75aa73@gandalf.local.home>
CC: stable , LKML ,
From: Zefan Li
Message-ID: <56160A7A.7060601@huawei.com>
Date: Thu, 8 Oct 2015 14:17:30 +0800
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset="iso-8859-15"; format=flowed
Content-Transfer-Encoding: 8bit
Sender: stable-owner@vger.kernel.org
List-ID:

(back from vacation)

On 2015/10/4 12:26, Cal Peake wrote:
> On Thu, 1 Oct 2015, Steven Rostedt wrote:
>
>>
>> I merged 3.4.109 into 3.4-rt, and it bugged. I then booted 3.4.109
>> vanilla and it bugged too. 3.4.108 is fine.
>>
>

I guess this is caused by the following commit, which has already been
reverted in the mainline kernel. I'll fix it in 3.4.110.

    drm/i915: Don't skip request retirement if the active list is empty

    commit 0aedb1626566efd72b369c01992ee7413c82a0c5 upstream.

> I'm getting a similar type bug here. I've bisected it down to this commit:
>
> commit 961bd13539b9e7ca5d2e667668141496b7a1d6bc
> Author: Michel Dänzer
> Date: Thu Apr 16 11:17:27 2015 +0900
>
>     drm/radeon: Use drm_calloc_ab for CS relocs
>
>     commit b421ed15d2c3039eb724680e4de1e4b2bd196a9a upstream.
>
>     The number of relocs is passed in by userspace and can be large. It has
>     been observed to cause kcalloc failures in the wild.
>
>
> Backing it out of vanilla 3.4.109 has so far eliminated the problem.
>

As you and Satoshi-san have already found the culprit, I'll just revert
it in 3.4.110.

There are 2 other commits in drivers/gpu/drm/radeon between 3.4.108 and
3.4.109, and "drm/radeon: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR handling"
has been partially reverted in the mainline kernel, so I'll fix this too.

> Steven, you look to be using i915 graphics instead of radeon, so it seems
> unlikely to me that we're hitting the same problem. Here's my oops for
> comparison though:
>
...
> [] ? do_select+0x333/0x5f0
> [] ? r600_cs_packet_parse+0x42/0x140 [radeon]
> [] ? __pollwait+0x110/0x110
> Oct 3 23:24:38 lancer last message repeated 7 times
> [] ? kmem_cache_free+0x86/0x90
> [] ? __dequeue_signal+0x102/0x190
> [] ? core_sys_select+0x20c/0x380
> [] ? set_current_blocked+0x38/0x60
> [] ? block_sigmask+0x3c/0x50
> [] ? do_signal+0x1d4/0x620
> [] ? ktime_get_ts+0x6d/0xe0
> [] ? sys_select+0x42/0x110
> [] ? system_call_fastpath+0x16/0x1b