From mboxrd@z Thu Jan 1 00:00:00 1970
From: Maarten Lankhorst
Subject: Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference
Date: Mon, 13 Jul 2015 09:58:31 +0200
Message-ID: <55A36FA7.7010707@linux.intel.com>
References: <20150713062222.GG3736@phenom.ffwll.local> <55A3678B.6080803@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Sender: linux-kernel-owner@vger.kernel.org
To: Jörg Otte
Cc: Linus Torvalds, David Airlie, DRI, Linux Kernel Mailing List
List-Id: dri-devel@lists.freedesktop.org

Op 13-07-15 om 09:42 schreef Jörg Otte:
> 2015-07-13 9:23 GMT+02:00 Maarten Lankhorst:
>> Op 13-07-15 om 08:22 schreef Daniel Vetter:
>>> On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
>>>> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte wrote:
>>>>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000009
>>>>> IP: [] 0xffffffffbd3447bb
>>>> Ugh. Please enable KALLSYMS to get sane symbols.
>>>>
>>>> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
>>>> so it's pretty clearly just that change from
>>>>
>>>> -	if (intel_crtc->active) {
>>>> +	if (crtc_state->base.active) {
>>>>
>>>> and "crtc_state" is NULL.
>>>>
>>>> And the code very much knows that crtc_state can be NULL, since it's
>>>> initialized with
>>>>
>>>>	crtc_state = state->base.state ?
>>>>		intel_atomic_get_crtc_state(state->base.state,
>>>>		intel_crtc) : NULL;
>>>>
>>>> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
>>>> ("drm/i915: Use crtc_state->active in primary check_plane func") for
>>>> now, or is there a better fix? Like just checking crtc_state for NULL?
>>> Indeed embarrassing. I've missed that we still have 1 caller left that's
>>> using the transitional helpers, and those don't fill out
>>> plane_state->state backpointers to the global atomic update since there is
>>> no global atomic update for transitional helpers. Below diff should fix
>>> this - we need to preferentially check crtc_state->active and if that's
>>> not set intel_crtc->active should yield the right result for the one
>>> remaining caller (it's in the crtc_disable paths).
>>>
>>> For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision
>>> to transition to a different QA team ("we'll do this in 1 week without
>>> upfront planning") I essentially don't have proper QA support for 1-2
>>> months by now. The other trouble in this area specifically is that this
>>> code is already completely changed in -next again, so any testing done on
>>> integration trees (like -next or drm-intel-nightly) won't test any patches
>>> for 4.2.
>>> -Daniel
>>>
>>> Oh and Signed-off-by: Daniel Vetter in case you
>>> decide to apply this right away.
>>>
>> Well your version has the benefit of compiling without errors. :-)
>>
>> Reviewed-by: Maarten Lankhorst
> Just noticed another problem:
> On each resume I get the following error:
> -----------[ cut here ]------------
> WARNING: CPU: 2 PID: 2663 at
> /data/kernel/linux/drivers/gpu/drm/i915/intel_display.c:6319
> 0xffffffff9a33d5e9()
> WARN_ON(!crtc->state->enable)
> CPU: 2 PID: 2663 Comm: kworker/u8:80 Not tainted 4.2.0-rc2 #15
> Hardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
> Workqueue: events_unbound 0xffffffff9a055750
> 0000000000000000 ffffffff9a98ea28 ffffffff9a6d84d2 0000000000000000
> ffffffff9a03c416 ffff88020951c4e0 0000000000000000 0000000000000000
> ffff8802141cb800 ffff88021630c000 ffffffff9a03c4d5 ffffffff9a9c3664
> Call Trace:
> [] ? 0xffffffff9a6d84d2
> [] ? 0xffffffff9a03c416
> [] ? 0xffffffff9a03c4d5
> [] ? 0xffffffff9a33d5e9
> [] ? 0xffffffff9a343ac3
> [] ? 0xffffffff9a34444a
> [] ? 0xffffffff9a345518
> [] ? 0xffffffff9a3246f0
> [] ? 0xffffffff9a2e1ce8
> [] ? 0xffffffff9a236170
> [] ? 0xffffffff9a38b28d
> [] ? 0xffffffff9a38b784
> [] ? 0xffffffff9a38baa4
> [] ? 0xffffffff9a05577d
> [] ? 0xffffffff9a04dc47
> [] ? 0xffffffff9a04dfab
> [] ? 0xffffffff9a04dea0
> [] ? 0xffffffff9a05331c
> [] ? 0xffffffff9a053260
> [] ? 0xffffffff9a6dfa0f
> [] ? 0xffffffff9a053260
> --[ end trace 1b6d28ee34071679 ]---
>
> Nevertheless resume works, so it doesn't hurt me.
>
>
> BTW: I also get up to 40..50! compile warnings like:
> i915/i915_drv.h: In function 'i915_debugfs_connector_add':
> i915/i915_drv.h:3119:53: warning: no return statement in function
> returning non-void [-Wreturn-type]
>
> which may cause yet uncovered troubles.
>
> Thanks, Jörg

kallsyms please!

Looks like intel_crtc_disable() is being called during a mode change on an
already disabled crtc. That path is gone in 4.3 because of the atomic rework.

Does something like below work?

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index ba9321998a41..725d2b727704 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -6315,9 +6315,6 @@ static void intel_crtc_disable(struct drm_crtc *crtc)
 	struct drm_connector *connector;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	/* crtc should still be enabled when we disable it. */
-	WARN_ON(!crtc->state->enable);
-
 	intel_crtc_disable_planes(crtc);
 	dev_priv->display.crtc_disable(crtc);
 	dev_priv->display.off(crtc);
@@ -12591,7 +12588,8 @@ static int __intel_set_mode(struct drm_crtc *modeset_crtc,
 			continue;
 
 		if (!crtc_state->enable) {
-			intel_crtc_disable(crtc);
+			if (crtc->state->enable)
+				intel_crtc_disable(crtc);
 		} else if (crtc->state->enable) {
 			intel_crtc_disable_planes(crtc);
 			dev_priv->display.crtc_disable(crtc);
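
For anyone following the thread, here is a minimal userspace sketch in C of the two guards discussed in this thread. It is not kernel code: struct base_state, struct crtc_state, struct crtc, active_fault_offset(), crtc_is_active() and set_mode_disable() are hypothetical stand-ins, and the struct layout is an assumption chosen so that "active" sits one byte past a pointer-sized field (offset 9 on a 64-bit build, matching the faulting address in the oops).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the base CRTC state: a pointer followed by two
 * flags, so "active" lands at sizeof(void *) + 1 (9 on a 64-bit build). */
struct base_state {
	void *crtc;
	bool enable;
	bool active;
};

/* Hypothetical stand-in for the driver-specific state wrapping the base. */
struct crtc_state {
	struct base_state base;
};

/* Hypothetical stand-in for the CRTC: legacy flag plus current state. */
struct crtc {
	bool active;               /* legacy flag, like intel_crtc->active */
	struct base_state *state;  /* current state, like crtc->state */
	int disable_calls;         /* counts how often the disable path ran */
};

/* Offset a NULL crtc_state pointer would fault at when reading base.active. */
static size_t active_fault_offset(void)
{
	return offsetof(struct crtc_state, base.active);
}

/* Prefer the atomic state when one exists, otherwise fall back to the
 * legacy flag instead of dereferencing NULL. */
static bool crtc_is_active(const struct crtc_state *crtc_state,
			   const struct crtc *crtc)
{
	assert(crtc != NULL);
	return crtc_state ? crtc_state->base.active : crtc->active;
}

/* Model of the disable path: marks the current state disabled. */
static void crtc_disable(struct crtc *crtc)
{
	crtc->state->enable = false;
	crtc->disable_calls++;
}

/* Mirrors the second hunk of the diff: when the new state is disabled,
 * only run the disable path if the CRTC is currently enabled. */
static void set_mode_disable(struct crtc *crtc,
			     const struct base_state *new_state)
{
	if (!new_state->enable) {
		if (crtc->state->enable)
			crtc_disable(crtc);
	}
}
```

With a NULL crtc_state, crtc_is_active() returns the legacy flag rather than reading from a near-NULL address, and calling set_mode_disable() twice with a disabled target state runs the disable path only once, which is the situation the removed WARN_ON was tripping over.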