* [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open in error path
@ 2013-03-26 19:56 Michal Hocko
  2013-03-30 22:26 ` Ilija Hadzic
  0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2013-03-26 19:56 UTC (permalink / raw)
  To: dri-devel
  Cc: David Airlie, Ilija Hadzic, Thomas Hellstrom, Marco Munderloh,
	linux-kernel

Hi,
the patch below fixes a NULL pointer dereference reported with OpenSUSE 12.3.
I am not familiar with the area so I have no idea whether this is the
right way to go, but after applying this patch the problem is not
reproducible anymore.
If the patch is correct then please mark it for stable (3.7+).

Thanks!
---
From a786a701bd6c277329e2b788fea9a69b1c3ced2e Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Tue, 26 Mar 2013 19:04:40 +0100
Subject: [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open
 in error path

Starting with fdb40a08 (drm: set dev_mapping before calling
drm_open_helper) inode and file mappings are set to old_mapping in the
error path. old_mapping can be NULL, however, which is handled by
initializing dev_mapping to default inode->i_data. old_mapping is left
intact, though, so both the inode's and the filp's mapping will still
point to NULL, which is unexpected and can result in crashes later on.

Marco Munderloh has reported such crashes:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
IP: [<ffffffff81190be4>] drop_pagecache_sb+0x74/0xe0
PGD 252bc1067 PUD 253d11067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: fuse af_packet xt_tcpudp xt_pkttype xt_LOG xt_limit bnep bluetooth ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq snd_hda_codec_hdmi mperf coretemp snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep kvm_intel snd_pcm arc4 snd_seq snd_timer snd_seq_device kvm iwldvm mac80211 snd uvcvideo crc32c_intel videobuf2_core videodev ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw videobuf2_vmalloc aes_x86_64 iTCO_wdt xts tpm_infineon mei r8169 videobuf2_memops iTCO_vendor_support sr_mod lpc_ich iwlwifi gf128mul sony_laptop rts_pstor(C) cdrom i2c_i801 tpm_tis tpm tpm_bios battery mfd_core soundcore snd_page_alloc cfg80211 rfkill ac sg microcode pcspkr autofs4 xhci_hcd ehci_hcd usbcore usb_common radeon i915 video ttm drm_kms_helper drm i2c_algo_bit thermal button processor thermal_sys scsi_dh_emc scsi_dh_rdac scsi_dh_hp_sw scsi_dh_alua scsi_dh
CPU 0
Pid: 1452, comm: bash Tainted: G C 3.7.10-1.1-default ation VPCSA4W9E/VAIO
RIP: 0010:[<ffffffff81190be4>] [<ffffffff81190be4>] drop_pagecache_sb+0x74/0xe0
RSP: 0018:ffff880252bc9e18 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88024ecb7db0 RCX: 0000000000000002
RDX: 0000000000000007 RSI: ffff88024f63a670 RDI: ffff88024ecb7e38
RBP: ffff88024ecb7e38 R08: dead000000200200 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000210 R12: ffff880254d588a0
R13: ffff88024fcb25e8 R14: ffffffff81190b70 R15: ffffffffffffffea
FS: 00007fad2b9ed700(0000) GS:ffff88025fa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000058 CR3: 0000000252ad2000 CR4: 00000000000407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process bash (pid: 1452, threadinfo ffff880252bc8000, task ffff880253d321c0)
Stack:
 0000000000000001 ffff880254d58800 ffff880254e94800 ffff880254d58868
 0000000000000000 ffffffff8116a499 0000000000000000 0000000000000001
 ffffffff81a228a0 ffff880252bc9f50 0000000000000002 ffffffff81190cce
Call Trace:
 [<ffffffff8116a499>] iterate_supers+0xd9/0xe0
 [<ffffffff81190cce>] drop_caches_sysctl_handler+0x7e/0x90
 [<ffffffff811d0e26>] proc_sys_call_handler.isra.10+0xc6/0xe0
 [<ffffffff81166fd7>] vfs_write+0xa7/0x180
 [<ffffffff81167321>] sys_write+0x51/0xa0
 [<ffffffff8154f2ed>] system_call_fastpath+0x1a/0x1f
 [<00007fad2ae959c0>] 0x7fad2ae959bf
Code: 01 00 00 49 39 c4 48 8d 98 00 ff ff ff 74 68 48 8d ab 88 00 00 00 48 89 ef e8 49 69 3b 00 f6 83 a0 00 00 00 38 75 d0 48 8b 43 30 <48> 83 78 58 00 74 c5 48 89 df e8 dd ef fe ff 66 83 45 00 01 66
RIP [<ffffffff81190be4>] drop_pagecache_sb+0x74/0xe0
 RSP <ffff880252bc9e18>
CR2: 0000000000000058

when dropping caches when inode with NULL i_mapping is encountered.
Or a different one when umounting devtmpfs:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
IP: [<ffffffff81122001>] shmem_evict_inode+0x11/0x130
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: xt_tcpudp xt_pkttype xt_LOG xt_limit af_packet ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack bnep bluetooth ip6table_filter ip6_tables cpufreq_conservative x_tables cpufreq_userspace cpufreq_powersave snd_hda_codec_hdmi snd_hda_codec_realtek acpi_cpufreq snd_hda_intel mperf snd_hda_codec coretemp snd_hwdep kvm_intel snd_pcm kvm arc4 snd_seq iwldvm mac80211 crc32c_intel ghash_clmulni_intel snd_timer aesni_intel snd_seq_device iTCO_wdt uvcvideo videobuf2_core iwlwifi videodev sony_laptop videobuf2_vmalloc videobuf2_memops ablk_helper iTCO_vendor_support cryptd cfg80211 tpm_infineon r8169 sr_mod cdrom mei snd lpc_ich battery lrw aes_x86_64 xts rfkill i2c_i801 pcspkr mfd_core tpm_tis ac gf128mul tpm tpm_bios soundcore snd_page_alloc sg microcode autofs4 xhci_hcd ehci_hcd radeon(-) i915 ttm drm_kms_helper usbcore usb_common drm thermal i2c_algo_bit video button processor thermal_sys scsi_dh_emc scsi_dh_rdac scsi_dh_hp_sw scsi_dh_alua scsi_dh
CPU 1
<4>[ 44.175256] Pid: 29, comm: kdevtmpfs Tainted: G W 3.7.10-1-default-patched #4 Sony Corporation VPCSA4W9E/VAIO
RIP: 0010:[<ffffffff81122001>] [<ffffffff81122001>] shmem_evict_inode+0x11/0x130
RSP: 0018:ffff880254ed3d18 EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffff88024fb185e8 RCX: 0000000000000034
RDX: 0000000000002433 RSI: 0000000000000c11 RDI: ffff88024fb185e8
RBP: ffff88024fb186e8 R08: 1038000000000000 R09: 024fb186881c0000
R10: fd924f0d6445a207 R11: 0000000000000000 R12: ffffffff8161b640
R13: ffff88024fb185e8 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88025fa40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000068 CR3: 0000000001a0c000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kdevtmpfs (pid: 29, threadinfo ffff880254ed2000, task ffff880254ed0080)
Stack:
 ffff88024fb185e8 ffff88024fb185e8 ffff88024fb186e8 ffffffff8161b640
 0000000000000000 ffffffff8117f5f3 ffff88024e453a80 ffff88024fb185e8
 0000000000000000 ffffffff8117b778 0000000000000000 ffff88024e453a80
Call Trace:
 [<ffffffff8117f5f3>] evict+0xa3/0x190
 [<ffffffff8117b778>] d_delete+0x148/0x180
 [<ffffffff81171d77>] vfs_unlink+0xf7/0x110
 [<ffffffff81386ab2>] handle_remove+0x202/0x250
 [<ffffffff81386de5>] devtmpfsd+0xd5/0x130
 [<ffffffff81066273>] kthread+0xb3/0xc0
 [<ffffffff81549c3c>] ret_from_fork+0x7c/0xb0
Code: 7b 30 b9 01 00 00 00 31 d2 4c 89 f6 e8 69 e3 00 00 e9 23 ff ff ff 0f 1f 40 00 41 55 49 89 fd 41 54 55 53 48 83 ec 08 48 8b 47 30 <48> 81 78 68 00 b7 61 81 74 75 48 8b 7f a8 4d 8d 65 90 e8 b8 1f
RIP [<ffffffff81122001>] shmem_evict_inode+0x11/0x130
 RSP <ffff880254ed3d18>
CR2: 0000000000000068

This patch fixes that by initializing old_mapping to the inode->i_data,
same as dev_mapping.

Reported-and-tested-by: Marco Munderloh <munderl@tnt.uni-hannover.de>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 drivers/gpu/drm/drm_fops.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index 133b413..62a5435 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -139,7 +139,7 @@ int drm_open(struct inode *inode, struct file *filp)
 	mutex_lock(&dev->struct_mutex);
 	old_mapping = dev->dev_mapping;
 	if (old_mapping == NULL)
-		dev->dev_mapping = &inode->i_data;
+		dev->dev_mapping = old_mapping = &inode->i_data;
 	/* ihold ensures nobody can remove inode with our i_data */
 	ihold(container_of(dev->dev_mapping, struct inode, i_data));
 	inode->i_mapping = dev->dev_mapping;
--
1.7.10.4

--
Michal Hocko
SUSE Labs

^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open in error path
  2013-03-26 19:56 [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open in error path Michal Hocko
@ 2013-03-30 22:26 ` Ilija Hadzic
  2013-03-31 10:34   ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: Ilija Hadzic @ 2013-03-30 22:26 UTC (permalink / raw)
  To: Michal Hocko
  Cc: dri-devel, David Airlie, Thomas Hellstrom, Marco Munderloh,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 9668 bytes --]

This looks a bit like a hack and it doesn't look right, conceptually. If
the call fails, it should restore things as if nothing has ever happened,
and overwriting old_mapping is not going to do the trick.

I think the right way to fix it would be to separately store the original
mapping for filp->f_mapping and inode->i_mapping and restore each from its
respective temporary variable if drm_open_helper or drm_setup fails.
Attached is a quick patch to show you what I have in mind. Can you please
test it? If it solves your problem, I'll send it to Dave.

By the way, what specific course of action reproduces the problem? It
requires drm_open to fail, but is there anything else that you do?

thanks,

Ilija

On Tue, Mar 26, 2013 at 3:56 PM, Michal Hocko <mhocko@suse.cz> wrote:
> Hi,
> the patch bellow fixes a nullptr dereference reported with OpenSUSE12.3.
> I am not familiar with the area so I have no idea whether this is the
> right way to go but after applying this patch the problem is not
> reproducible anymore.
> If the patch is correct then please mark it for stable (3.7+).
>
> Thanks!
> --- > From a786a701bd6c277329e2b788fea9a69b1c3ced2e Mon Sep 17 00:00:00 2001 > From: Michal Hocko <mhocko@suse.cz> > Date: Tue, 26 Mar 2013 19:04:40 +0100 > Subject: [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open > in error path > > Starting with fdb40a08 (drm: set dev_mapping before calling > drm_open_helper) inode and file mappings are set to old_mapping in the > error path. old_mapping can be NULL, however, which is handled by > initializing dev_mapping to default inode->i_data. old_mapping is left > intact though so the both inode's and filep's mapping will still point > to NULL which is unexpected and can it results in crashes later one. > > Marco Munderloh has reported such crashes: > BUG: unable to handle kernel NULL pointer dereference at 0000000000000058 > IP: [<ffffffff81190be4>] drop_pagecache_sb+0x74/0xe0 > PGD 252bc1067 PUD 253d11067 PMD 0 > Oops: 0000 [#1] SMP > Modules linked in: fuse af_packet xt_tcpudp xt_pkttype xt_LOG xt_limit bnep bluetooth ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 > ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq snd_hda_codec_hdmi mperf coretemp snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep kvm_intel snd_pcm arc4 snd_seq snd_timer snd_seq_device kvm iwldvm mac80211 snd uvcvideo crc32c_intel videobuf2_core videodev ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw videobuf2_vmalloc aes_x86_64 iTCO_wdt xts tpm_infineon mei r8169 videobuf2_memops iTCO_vendor_support sr_mod lpc_ich iwlwifi gf128mul sony_laptop rts_pstor(C) cdrom i2c_i801 tpm_tis tpm tpm_bios battery mfd_core soundcore snd_page_alloc cfg80211 rfkill ac sg microcode pcspkr autofs4 xhci_hcd ehci_hcd usbcore usb_common radeon i915 video ttm drm_kms_helper drm i2c_algo_bit thermal 
button processor thermal_sys scsi_dh_emc scsi_dh_rdac scsi_dh_hp_sw scsi_dh_alua > scsi_dh > CPU 0 > Pid: 1452, comm: bash Tainted: G C 3.7.10-1.1-default > ation VPCSA4W9E/VAIO > RIP: 0010:[<ffffffff81190be4>] [<ffffffff81190be4>] drop_pagecache_sb+0x74/0xe0 > RSP: 0018:ffff880252bc9e18 EFLAGS: 00010246 > RAX: 0000000000000000 RBX: ffff88024ecb7db0 RCX: 0000000000000002 > RDX: 0000000000000007 RSI: ffff88024f63a670 RDI: ffff88024ecb7e38 > RBP: ffff88024ecb7e38 R08: dead000000200200 R09: 0000000000000000 > R10: 0000000000000001 R11: 0000000000000210 R12: ffff880254d588a0 > R13: ffff88024fcb25e8 R14: ffffffff81190b70 R15: ffffffffffffffea > FS: 00007fad2b9ed700(0000) GS:ffff88025fa00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000000000058 CR3: 0000000252ad2000 CR4: 00000000000407f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process bash (pid: 1452, threadinfo ffff880252bc8000, task ffff880253d321c0) > Stack: > 0000000000000001 ffff880254d58800 ffff880254e94800 ffff880254d58868 > 0000000000000000 ffffffff8116a499 0000000000000000 0000000000000001 > ffffffff81a228a0 ffff880252bc9f50 0000000000000002 ffffffff81190cce > Call Trace: > [<ffffffff8116a499>] iterate_supers+0xd9/0xe0 > [<ffffffff81190cce>] drop_caches_sysctl_handler+0x7e/0x90 > [<ffffffff811d0e26>] proc_sys_call_handler.isra.10+0xc6/0xe0 > [<ffffffff81166fd7>] vfs_write+0xa7/0x180 > [<ffffffff81167321>] sys_write+0x51/0xa0 > [<ffffffff8154f2ed>] system_call_fastpath+0x1a/0x1f > [<00007fad2ae959c0>] 0x7fad2ae959bf > Code: 01 00 00 49 39 c4 48 8d 98 00 ff ff ff 74 68 48 8d ab 88 00 00 00 48 89 ef e8 49 69 3b 00 f6 83 a0 00 00 00 38 75 d0 48 8b 43 30 <48> 83 78 58 00 74 c5 48 89 df e8 dd ef fe ff 66 83 45 00 01 66 > RIP [<ffffffff81190be4>] drop_pagecache_sb+0x74/0xe0 > RSP <ffff880252bc9e18> > CR2: 0000000000000058 > > when dropping caches when inode with NULL 
i_mapping is encountered. Or a > different one when umounting devtmpfs: > BUG: unable to handle kernel NULL pointer dereference at 0000000000000068 > IP: [<ffffffff81122001>] shmem_evict_inode+0x11/0x130 > PGD 0 > Oops: 0000 [#1] SMP > Modules linked in: xt_tcpudp xt_pkttype xt_LOG xt_limit af_packet ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack bnep bluetooth ip6table_filter ip6_tables cpufreq_conservative x_tables cpufreq_userspace cpufreq_powersave snd_hda_codec_hdmi snd_hda_codec_realtek acpi_cpufreq snd_hda_intel mperf snd_hda_codec coretemp snd_hwdep kvm_intel snd_pcm kvm arc4 snd_seq iwldvm mac80211 crc32c_intel ghash_clmulni_intel snd_timer aesni_intel snd_seq_device iTCO_wdt uvcvideo videobuf2_core iwlwifi videodev sony_laptop videobuf2_vmalloc videobuf2_memops ablk_helper iTCO_vendor_support cryptd cfg80211 tpm_infineon r8169 sr_mod cdrom mei snd lpc_ich battery lrw aes_x86_64 xts rfkill i2c_i801 pcspkr mfd_core tpm_tis ac gf128mul tpm tpm_bios soundcore snd_page_alloc sg microcode autofs4 xhci_hcd ehci_hcd radeon(-) i915 ttm drm_kms_helper usbcore usb_common drm thermal i2c_algo_bit video button processor thermal_sys scsi_dh_emc scsi_dh_rdac scsi_dh_hp_sw scsi_dh_alua scsi_dh > CPU 1 <4>[ 44.175256] Pid: 29, comm: kdevtmpfs Tainted: G W 3.7.10-1-default-patched #4 Sony Corpora > tion VPCSA4W9E/VAIO > RIP: 0010:[<ffffffff81122001>] [<ffffffff81122001>] shmem_evict_inode+0x11/0x130 > RSP: 0018:ffff880254ed3d18 EFLAGS: 00010296 > RAX: 0000000000000000 RBX: ffff88024fb185e8 RCX: 0000000000000034 > RDX: 0000000000002433 RSI: 0000000000000c11 RDI: ffff88024fb185e8 > RBP: ffff88024fb186e8 R08: 1038000000000000 R09: 024fb186881c0000 > R10: fd924f0d6445a207 R11: 0000000000000000 R12: ffffffff8161b640 > R13: ffff88024fb185e8 R14: 0000000000000000 R15: 0000000000000000 > FS: 
0000000000000000(0000) GS:ffff88025fa40000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000000000068 CR3: 0000000001a0c000 CR4: 00000000000407e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process kdevtmpfs (pid: 29, threadinfo ffff880254ed2000, task ffff880254ed0080) > Stack: > ffff88024fb185e8 ffff88024fb185e8 ffff88024fb186e8 ffffffff8161b640 > 0000000000000000 ffffffff8117f5f3 ffff88024e453a80 ffff88024fb185e8 > 0000000000000000 ffffffff8117b778 0000000000000000 ffff88024e453a80 > Call Trace: > [<ffffffff8117f5f3>] evict+0xa3/0x190 > [<ffffffff8117b778>] d_delete+0x148/0x180 > [<ffffffff81171d77>] vfs_unlink+0xf7/0x110 > [<ffffffff81386ab2>] handle_remove+0x202/0x250 > [<ffffffff81386de5>] devtmpfsd+0xd5/0x130 > [<ffffffff81066273>] kthread+0xb3/0xc0 > [<ffffffff81549c3c>] ret_from_fork+0x7c/0xb0 > Code: 7b 30 b9 01 00 00 00 31 d2 4c 89 f6 e8 69 e3 00 00 e9 23 ff ff ff 0f 1f 40 00 41 55 49 89 fd 41 54 55 53 48 83 ec 08 48 8b 47 30 <48> 81 78 68 00 b7 61 81 74 75 48 8b 7f a8 4d 8d 65 90 e8 b8 1f > RIP [<ffffffff81122001>] shmem_evict_inode+0x11/0x130 > RSP <ffff880254ed3d18> > CR2: 0000000000000068 > > This patch fixes that by initializating old_mapping to the inode->i_data > same as dev_mapping. 
> > Reported-and-tested-by: Marco Munderloh <munderl@tnt.uni-hannover.de> > Signed-off-by: Michal Hocko <mhocko@suse.cz> > --- > drivers/gpu/drm/drm_fops.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c > index 133b413..62a5435 100644 > --- a/drivers/gpu/drm/drm_fops.c > +++ b/drivers/gpu/drm/drm_fops.c > @@ -139,7 +139,7 @@ int drm_open(struct inode *inode, struct file *filp) > mutex_lock(&dev->struct_mutex); > old_mapping = dev->dev_mapping; > if (old_mapping == NULL) > - dev->dev_mapping = &inode->i_data; > + dev->dev_mapping = old_mapping = &inode->i_data; > /* ihold ensures nobody can remove inode with our i_data */ > ihold(container_of(dev->dev_mapping, struct inode, i_data)); > inode->i_mapping = dev->dev_mapping; > -- > 1.7.10.4 > > -- > Michal Hocko > SUSE Labs [-- Attachment #2: 0001-drm-correctly-restore-mappings-if-drm_open-fails.patch --] [-- Type: application/octet-stream, Size: 2098 bytes --] From e5f79d996a8487cf5349c18df5e68abcbfa88768 Mon Sep 17 00:00:00 2001 From: Ilija Hadzic <ihadzic@research.bell-labs.com> Date: Sat, 30 Mar 2013 18:20:35 -0400 Subject: [PATCH] drm: correctly restore mappings if drm_open fails If first drm_open fails, the error patch will incorrectly restore inode's mapping to NULL. This can cause the crash later on. Fix by separately storing away all mapping pointers that drm_open can touch and restore each from its own respective variable if the call fails. 
Reference:
http://lists.freedesktop.org/archives/dri-devel/2013-March/036564.html

Reported-by: Marco Munderloh <munderl@tnt.uni-hannover.de>
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
---
 drivers/gpu/drm/drm_fops.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index 13fdcd1..247f44d 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -123,6 +123,8 @@ int drm_open(struct inode *inode, struct file *filp)
 	int retcode = 0;
 	int need_setup = 0;
 	struct address_space *old_mapping;
+	struct address_space *old_imapping;
+	struct address_space *old_fmapping;
 
 	minor = idr_find(&drm_minors_idr, minor_id);
 	if (!minor)
@@ -137,6 +139,8 @@ int drm_open(struct inode *inode, struct file *filp)
 	if (!dev->open_count++)
 		need_setup = 1;
 	mutex_lock(&dev->struct_mutex);
+	old_fmapping = filp->f_mapping;
+	old_imapping = inode->i_mapping;
 	old_mapping = dev->dev_mapping;
 	if (old_mapping == NULL)
 		dev->dev_mapping = &inode->i_data;
@@ -159,8 +163,8 @@ int drm_open(struct inode *inode, struct file *filp)
 
 err_undo:
 	mutex_lock(&dev->struct_mutex);
-	filp->f_mapping = old_mapping;
-	inode->i_mapping = old_mapping;
+	filp->f_mapping = old_fmapping;
+	inode->i_mapping = old_imapping;
 	iput(container_of(dev->dev_mapping, struct inode, i_data));
 	dev->dev_mapping = old_mapping;
 	mutex_unlock(&dev->struct_mutex);
--
1.8.1.5

^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open in error path
  2013-03-30 22:26 ` Ilija Hadzic
@ 2013-03-31 10:34   ` Michal Hocko
  2013-04-01 18:14     ` Ilija Hadzic
  0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2013-03-31 10:34 UTC (permalink / raw)
  To: Ilija Hadzic
  Cc: dri-devel, David Airlie, Thomas Hellstrom, Marco Munderloh,
	linux-kernel

On Sat 30-03-13 18:26:53, Ilija Hadzic wrote:
> This looks a bit like a hack and it doesn't look right,
> conceptually. If the call fails, it should restore things as if
> nothing has ever happened and overwriting old_mapping is not going to
> do the trick.

OK, I thought this is what the patch does as it falls back to
&inode->i_data which is the default mapping for all inodes or it uses
what used to be in device mapping.

I am obviously not familiar with the drm code but it feels a bit strange
that the device mapping can be different than inode's resp. file's one
and even more confusing that inode and file are saved separately.

> I think the right way to fix it would be to separately store the
> original mapping for filp->f_mapping and inode->i_mapping and restore
> it from their respective temporary variables if drm_open_helper or
> drm_setup fail. Attached is a quick patch to show you
[...]
> @@ -137,6 +139,8 @@ int drm_open(struct inode *inode, struct file *filp)
>  	if (!dev->open_count++)
>  		need_setup = 1;
>  	mutex_lock(&dev->struct_mutex);
> +	old_fmapping = filp->f_mapping;
> +	old_imapping = inode->i_mapping;

How can file and inode mappings be different?

>  	old_mapping = dev->dev_mapping;
>  	if (old_mapping == NULL)
>  		dev->dev_mapping = &inode->i_data;
> @@ -159,8 +163,8 @@ int drm_open(struct inode *inode, struct file *filp)
>
>  err_undo:
>  	mutex_lock(&dev->struct_mutex);
> -	filp->f_mapping = old_mapping;
> -	inode->i_mapping = old_mapping;
> +	filp->f_mapping = old_fmapping;
> +	inode->i_mapping = old_imapping;
>  	iput(container_of(dev->dev_mapping, struct inode, i_data));
>  	dev->dev_mapping = old_mapping;
>  	mutex_unlock(&dev->struct_mutex);
--
1.8.1.5

--
Michal Hocko
SUSE Labs

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open in error path
  2013-03-31 10:34   ` Michal Hocko
@ 2013-04-01 18:14     ` Ilija Hadzic
  2013-04-02  8:25       ` Michal Hocko
  2013-04-02 10:36       ` Marco Munderloh
  0 siblings, 2 replies; 9+ messages in thread
From: Ilija Hadzic @ 2013-04-01 18:14 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Ilija Hadzic, Thomas Hellstrom, Marco Munderloh, linux-kernel,
	dri-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2440 bytes --]

On Sun, 31 Mar 2013, Michal Hocko wrote:

> On Sat 30-03-13 18:26:53, Ilija Hadzic wrote:
>> This looks a bit like a hack and it doesn't look right,
>> conceptually. If the call fails, it should restore things as if
>> nothing has ever happened and overwriting old_mapping is not going to
>> do the trick.
>
> OK, I thought this is what the patch does as it falls back to
> &inode->i_data which is the default mapping for all inodes or it uses
> what used to be in device mapping.
>
> I am obviously not familiar with the drm code but it feels a bit strange
> that the device mapping can be different than inode's resp. file's one

The reason for this is explained in the commit message of 949c4a34.

In summary, the device's mapping is that of the inode associated with the
first opener. Before 949c4a34, subsequent openers had to come in through
exactly the same inode that the first opener came in through (otherwise
the open call would fail). So if a user did something like: start X,
remove the /dev/dri/cardN file, mknod the same file again, then the
applications started after such an action would stop working. Also, using
the GPU from a chroot-ed environment was not possible if there was another
opener from a different root.

Commit 949c4a34 removed this restriction but introduced a problem with
VMware GPU drivers, which fdb40a08 fixed. However, fdb40a08 introduced
the bug that you have reported.

The problem that I have with your proposed fix is that if the first
opener fails, it can set the device's mapping to that of an inode that
was never used and never opened (and could even be removed later down
the road).

> and even more confusing that inode and file are saved separately.
>

I was trying to quickly get out a patch that was safe in terms of not
introducing new breakage. So the "conservative" thing to do (without
having to think through all possible scenarios) was to restore each of
the three pointers from its own temporary variable. Thinking about it,
you are probably right that the file's and the inode's mapping pointers
are equal when the open call is entered, so we could use one variable.
However, you still need a separate variable to store the device's mapping
pointer because that one can be different.

Attached is a v2 of the patch, for reference. I would appreciate it if
the original reporter or you tested it in lieu of your proposed patch and
let me know if it fixes your issue.

-- Ilija

[-- Attachment #2: Type: TEXT/PLAIN, Size: 2250 bytes --]

From 7e3c832158e2552e5e106a588e2b9e61c35b68f2 Mon Sep 17 00:00:00 2001
From: Ilija Hadzic <ihadzic@research.bell-labs.com>
Date: Sat, 30 Mar 2013 18:20:35 -0400
Subject: [PATCH] drm: correctly restore mappings if drm_open fails

If first drm_open fails, the error-handling path will
incorrectly restore inode's mapping to NULL. This can
cause the crash later on. Fix by separately storing
away mapping pointers that drm_open can touch and
restore each from its own respective variable if the
call fails.

Reference:
http://lists.freedesktop.org/archives/dri-devel/2013-March/036564.html

v2: use one variable to store file and inode mapping
    since they are the same at the function entry; also
    fix spelling mistakes in commit message.

Reported-by: Marco Munderloh <munderl@tnt.uni-hannover.de>
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
---
 drivers/gpu/drm/drm_fops.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index 13fdcd1..429e07d 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -123,6 +123,7 @@ int drm_open(struct inode *inode, struct file *filp)
 	int retcode = 0;
 	int need_setup = 0;
 	struct address_space *old_mapping;
+	struct address_space *old_imapping;
 
 	minor = idr_find(&drm_minors_idr, minor_id);
 	if (!minor)
@@ -137,6 +138,7 @@ int drm_open(struct inode *inode, struct file *filp)
 	if (!dev->open_count++)
 		need_setup = 1;
 	mutex_lock(&dev->struct_mutex);
+	old_imapping = inode->i_mapping;
 	old_mapping = dev->dev_mapping;
 	if (old_mapping == NULL)
 		dev->dev_mapping = &inode->i_data;
@@ -159,8 +161,8 @@ int drm_open(struct inode *inode, struct file *filp)
 
 err_undo:
 	mutex_lock(&dev->struct_mutex);
-	filp->f_mapping = old_mapping;
-	inode->i_mapping = old_mapping;
+	filp->f_mapping = old_imapping;
+	inode->i_mapping = old_imapping;
 	iput(container_of(dev->dev_mapping, struct inode, i_data));
 	dev->dev_mapping = old_mapping;
 	mutex_unlock(&dev->struct_mutex);
--
1.7.4.1

^ permalink raw reply related [flat|nested] 9+ messages in thread
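
The error path that the v2 patch changes can be modelled in user space.
What follows is only a minimal sketch under stated assumptions, not kernel
code: the structures are toy stand-ins that keep just the fields named in
the thread, locking and the ihold()/iput() reference counting are left
out, and the names failed_open, use_v2_fix and
first_open_failure_leaves_null are made up for illustration. It shows why
restoring from old_mapping leaves the inode's mapping NULL when the first
open fails, and why restoring from a pointer saved at function entry does
not.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures (assumption: only the fields
 * discussed in the thread are modelled). */
struct address_space { int unused; };
struct inode { struct address_space i_data; struct address_space *i_mapping; };
struct file { struct address_space *f_mapping; };
struct device { struct address_space *dev_mapping; };

/* Hypothetical model of drm_open() in which the open helper always fails.
 * use_v2_fix selects between the pre-patch err_undo path and the v2 one. */
static void failed_open(struct device *dev, struct inode *inode,
                        struct file *filp, int use_v2_fix)
{
    struct address_space *old_mapping;
    struct address_space *old_imapping;

    assert(dev != NULL && inode != NULL && filp != NULL);

    old_imapping = inode->i_mapping;  /* v2: inode/file mapping at entry */
    old_mapping = dev->dev_mapping;   /* NULL for the very first opener */
    if (old_mapping == NULL)
        dev->dev_mapping = &inode->i_data;
    inode->i_mapping = dev->dev_mapping;
    filp->f_mapping = dev->dev_mapping;

    /* Pretend the open helper failed here; this is the err_undo path. */
    if (use_v2_fix) {
        filp->f_mapping = old_imapping;
        inode->i_mapping = old_imapping;
    } else {
        filp->f_mapping = old_mapping;   /* NULL on a failed first open */
        inode->i_mapping = old_mapping;
    }
    dev->dev_mapping = old_mapping;
}

/* Returns 1 if the inode is left with a NULL i_mapping after a failed
 * first open, 0 otherwise. */
static int first_open_failure_leaves_null(int use_v2_fix)
{
    struct device dev = { NULL };
    struct inode inode;
    struct file filp;

    inode.i_data.unused = 0;
    inode.i_mapping = &inode.i_data;  /* default mapping of an inode */
    filp.f_mapping = inode.i_mapping; /* equal at entry, as noted for v2 */

    failed_open(&dev, &inode, &filp, use_v2_fix);
    return inode.i_mapping == NULL;
}
```

Calling first_open_failure_leaves_null(0) models the behaviour before the
patch and first_open_failure_leaves_null(1) models v2. In both cases the
device's mapping goes back to NULL, which is the point Ilija makes against
overwriting old_mapping.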
* Re: [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open in error path
  2013-04-01 18:14     ` Ilija Hadzic
@ 2013-04-02  8:25       ` Michal Hocko
  2013-04-02 10:36       ` Marco Munderloh
  1 sibling, 0 replies; 9+ messages in thread
From: Michal Hocko @ 2013-04-02 8:25 UTC (permalink / raw)
  To: Ilija Hadzic, Marco Munderloh
  Cc: Ilija Hadzic, Thomas Hellstrom, linux-kernel, dri-devel

On Mon 01-04-13 13:14:50, Ilija Hadzic wrote:
>
>
> On Sun, 31 Mar 2013, Michal Hocko wrote:
>
> >On Sat 30-03-13 18:26:53, Ilija Hadzic wrote:
> >>This looks a bit like a hack and it doesn't look right,
> >>conceptually. If the call fails, it should restore things as if
> >>nothing has ever happened and overwriting old_mapping is not going to
> >>do the trick.
> >
> >OK, I thought this is what the patch does as it falls back to
> >&inode->i_data which is the default mapping for all inodes or it uses
> >what used to be in device mapping.
> >
> >I am obviously not familiar with the drm code but it feels a bit strange
> >that the device mapping can be different than inode's resp. file's one
>
> The reason for this is explained in commit message associated with
> 949c4a34.
>
> In summary, the device's mapping is that of the inode associated with the
> first opener. Before 949c4a34, subsequent openers would have to come in
> through exactly the same inode that the first opener came in
> (otherwise the open call would fail). So if a user did something
> like: start X, remove /dev/dri/cardN file, mknod the same file
> again, the applications started after such an action would stop
> working. Also, using the GPU from chroot-ed environment was not
> possible if there was another opener from different root.

Oh, I see. Thanks for the clarification.

> The 949c4a34, removed this restriction, but introduced a problem
> with VmWare GPU drivers, which fdb40a08. However, fdb40a08
> introduced the bug that you have reported.
>
> The problem that I have with your proposed fix is that if the first
> opener fails, it can set the device's mapping to that of the inode
> that was never used and never opened (and could even be removed
> later down the road).

Makes sense.

> >and even more confusing that inode and file are saved separately.
> >
>
> I was trying to quickly get out the patch that was safe in terms of
> introducing new breakage. So the "conservative" thing to do (without
> having to think through all possible scenarios) was to restore each
> of the three pointers from their own temporary variable. Thinking
> about it, you are probably right that file descriptor's and inode's
> mapping pointer are equal when open call is entered so we could use
> one variable. However, you still need a separate variable to store
> the device's mapping pointer because that one can be different.

Right.

> Attached is a v2 of the patch, for reference. I would appreciate if
> the original reporter or you tested it in lieu of your proposed
> patch and let me know if it fixes your issue.

OK, this is a call for Marco. I have attached this bug to our bugzilla
as well (just for reference:
https://bugzilla.novell.com/show_bug.cgi?id=807850)

>
> -- Ilija

> From 7e3c832158e2552e5e106a588e2b9e61c35b68f2 Mon Sep 17 00:00:00 2001
> From: Ilija Hadzic <ihadzic@research.bell-labs.com>
> Date: Sat, 30 Mar 2013 18:20:35 -0400
> Subject: [PATCH] drm: correctly restore mappings if drm_open fails
>
> If first drm_open fails, the error-handling path will
> incorrectly restore inode's mapping to NULL. This can
> cause the crash later on. Fix by separately storing
> away mapping pointers that drm_open can touch and
> restore each from its own respective variable if the
> call fails.
>
> Reference:
> http://lists.freedesktop.org/archives/dri-devel/2013-March/036564.html
>
> v2: use one variable to store file and inode mapping
>     since they are the same at the function entry; also
>     fix spelling mistakes in commit message.
>
> Reported-by: Marco Munderloh <munderl@tnt.uni-hannover.de>
> Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: stable@vger.kernel.org

Feel free to add
Reviewed-by: Michal Hocko <mhocko@suse.cz>

Thanks!

> Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com>
> ---
>  drivers/gpu/drm/drm_fops.c |    6 ++++--
>  1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
> index 13fdcd1..429e07d 100644
> --- a/drivers/gpu/drm/drm_fops.c
> +++ b/drivers/gpu/drm/drm_fops.c
> @@ -123,6 +123,7 @@ int drm_open(struct inode *inode, struct file *filp)
>  	int retcode = 0;
>  	int need_setup = 0;
>  	struct address_space *old_mapping;
> +	struct address_space *old_imapping;
>
>  	minor = idr_find(&drm_minors_idr, minor_id);
>  	if (!minor)
> @@ -137,6 +138,7 @@ int drm_open(struct inode *inode, struct file *filp)
>  	if (!dev->open_count++)
>  		need_setup = 1;
>  	mutex_lock(&dev->struct_mutex);
> +	old_imapping = inode->i_mapping;
>  	old_mapping = dev->dev_mapping;
>  	if (old_mapping == NULL)
>  		dev->dev_mapping = &inode->i_data;
> @@ -159,8 +161,8 @@ int drm_open(struct inode *inode, struct file *filp)
>
>  err_undo:
>  	mutex_lock(&dev->struct_mutex);
> -	filp->f_mapping = old_mapping;
> -	inode->i_mapping = old_mapping;
> +	filp->f_mapping = old_imapping;
> +	inode->i_mapping = old_imapping;
>  	iput(container_of(dev->dev_mapping, struct inode, i_data));
>  	dev->dev_mapping = old_mapping;
>  	mutex_unlock(&dev->struct_mutex);
> --
> 1.7.4.1
>

--
Michal Hocko
SUSE Labs

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open in error path
From: Marco Munderloh @ 2013-04-02 10:36 UTC
To: Ilija Hadzic
Cc: Michal Hocko, Ilija Hadzic, Thomas Hellstrom, linux-kernel, dri-devel

> Attached is a v2 of the patch, for reference. I would appreciate if
> the original reporter or you tested it in lieu of your proposed patch
> and let me know if it fixes your issue.

The patch works for me. echo 3 > /proc/sys/vm/drop_caches as well as rmmod radeon do not end up in a crash anymore. However, I still have no clue why one of these makes drm_open fail. On rmmod radeon I get the following log messages. I don't know if the 'unpin not necessary' has anything to do with it.

[drm] radeon: finishing device.
radeon 0000:01:00.0: ffff88024e526c00 unpin not necessary
radeon 0000:01:00.0: ffff88024f2f6000 unpin not necessary
radeon 0000:01:00.0: ffff88024f2f6000 unpin not necessary
[TTM] Finalizing pool allocator
[TTM] Finalizing DMA pool allocator
[TTM] Zone kernel: Used memory at exit: 0 kiB
[TTM] Zone dma32: Used memory at exit: 0 kiB
[drm] radeon: ttm finalized
vga_switcheroo: disabled
[drm] Module unloaded

By the way, sometimes my r8169 ethernet controller does not survive suspend/hibernation (does not detect link). rmmod/modprobe helps. I don't know if this is related.
* Re: [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open in error path
From: Marco Munderloh @ 2013-04-02 12:01 UTC
To: Ilija Hadzic
Cc: Ilija Hadzic, Michal Hocko, Thomas Hellstrom, LKML, dri-devel@lists.freedesktop.org

Hi Ilija,

> Thanks for testing. Other issues are probably unrelated, so I'll send
> the last version of the patch to Dave.

I came across another problem which seems related. rmmod radeon works; however, modprobe radeon afterwards results in a crash (divide error), see attachment.

Best, Marco

On 02.04.2013 13:23, Ilija Hadzic wrote:
>
> -- Ilija
>
> On Tue, Apr 2, 2013 at 6:36 AM, Marco Munderloh <munderl@tnt.uni-hannover.de> wrote:
>
>     Attached is a v2 of the patch, for reference. I would appreciate if
>     the original reporter or you tested it in lieu of your proposed
>     patch and let me know if it fixes your issue.
>
>
>     The patch works for me. echo 3 > /proc/sys/vm/drop_caches as well as
>     rmmod radeon do not end up in a crash anymore. However, I still have
>     no clue why one of these makes drm_open fail. On rmmod radeon I get
>     the following log messages. I don't know if the 'unpin not necessary'
>     has anything to do with it.
>
>     [drm] radeon: finishing device.
>     radeon 0000:01:00.0: ffff88024e526c00 unpin not necessary
>     radeon 0000:01:00.0: ffff88024f2f6000 unpin not necessary
>     radeon 0000:01:00.0: ffff88024f2f6000 unpin not necessary
>     [TTM] Finalizing pool allocator
>     [TTM] Finalizing DMA pool allocator
>     [TTM] Zone kernel: Used memory at exit: 0 kiB
>     [TTM] Zone dma32: Used memory at exit: 0 kiB
>     [drm] radeon: ttm finalized
>     vga_switcheroo: disabled
>     [drm] Module unloaded
>
>     By the way, sometimes my r8169 ethernet controller does not survive
>     suspend/hibernation (does not detect link). rmmod/modprobe helps. I
>     don't know if this is related.
>
>

--
Dipl.-Ing. Marco Munderloh                    Mail: munderl@tnt.uni-hannover.de
Institut für Informationsverarbeitung (TNT)   Phone: +49 511 762-19587
Leibniz Universitaet Hannover, Appelstr. 9a   Fax: +49 511 762- 5333
30167 Hannover, Germany                       Web: http://www.tnt.uni-hannover.de/~munderl

[-- Attachment #1.2: crash_modprobe_radeon.log --]
[-- Type: text/x-log; name="crash_modprobe_radeon.log", Size: 12139 bytes --]

2013-04-02T12:46:25.434028+02:00 apophis kernel: [ 1826.998301] [drm] radeon defaulting to kernel modesetting.
2013-04-02T12:46:25.434042+02:00 apophis kernel: [ 1826.998303] [drm] radeon kernel modesetting enabled.
2013-04-02T12:46:25.434044+02:00 apophis kernel: [ 1826.998316] VGA switcheroo: detected switching method \_SB_.PCI0.GFX0.ATPX handle
2013-04-02T12:46:25.434045+02:00 apophis kernel: [ 1826.998452] [drm] initializing kernel modesetting (TURKS 0x1002:0x6741 0x104D:0x907B).
2013-04-02T12:46:25.434046+02:00 apophis kernel: [ 1826.998476] [drm] register mmio base: 0xC8400000
2013-04-02T12:46:25.434047+02:00 apophis kernel: [ 1826.998477] [drm] register mmio size: 131072
2013-04-02T12:46:25.434047+02:00 apophis kernel: [ 1826.998478] vga_switcheroo: enabled
2013-04-02T12:46:25.434048+02:00 apophis kernel: [ 1826.998548] ATPX version 1
2013-04-02T12:46:26.290054+02:00 apophis kernel: [ 1827.852872] ATOM BIOS: Sony
2013-04-02T12:46:26.290096+02:00 apophis kernel: [ 1827.852895] radeon 0000:01:00.0: GPU softreset
2013-04-02T12:46:26.290102+02:00 apophis kernel: [ 1827.852900] radeon 0000:01:00.0: GRBM_STATUS=0xFFFFFFFF
2013-04-02T12:46:26.290106+02:00 apophis kernel: [ 1827.852905] radeon 0000:01:00.0: GRBM_STATUS_SE0=0xFFFFFFFF
2013-04-02T12:46:26.290109+02:00 apophis kernel: [ 1827.852909] radeon 0000:01:00.0: GRBM_STATUS_SE1=0xFFFFFFFF
2013-04-02T12:46:26.290112+02:00 apophis kernel: [ 1827.852914] radeon 0000:01:00.0: SRBM_STATUS=0xFFFFFFFF
2013-04-02T12:46:26.290115+02:00 apophis kernel: [ 1827.852918] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0xFFFFFFFF
2013-04-02T12:46:26.290118+02:00 apophis kernel: [ 1827.852923] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0xFFFFFFFF
2013-04-02T12:46:26.290121+02:00 apophis kernel: [ 1827.852928] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0xFFFFFFFF
2013-04-02T12:46:26.290124+02:00 apophis kernel: [ 1827.852932] radeon 0000:01:00.0: R_008680_CP_STAT = 0xFFFFFFFF
2013-04-02T12:46:27.262050+02:00 apophis kernel: [ 1828.824062] radeon 0000:01:00.0: Wait for MC idle timedout !
2013-04-02T12:46:27.262086+02:00 apophis kernel: [ 1828.824073] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B
2013-04-02T12:46:27.262091+02:00 apophis kernel: [ 1828.824178] radeon 0000:01:00.0: GRBM_STATUS=0xFFFFFFFF
2013-04-02T12:46:27.262095+02:00 apophis kernel: [ 1828.824182] radeon 0000:01:00.0: GRBM_STATUS_SE0=0xFFFFFFFF
2013-04-02T12:46:27.262099+02:00 apophis kernel: [ 1828.824186] radeon 0000:01:00.0: GRBM_STATUS_SE1=0xFFFFFFFF
2013-04-02T12:46:27.262103+02:00 apophis kernel: [ 1828.824191] radeon 0000:01:00.0: SRBM_STATUS=0xFFFFFFFF
2013-04-02T12:46:27.262106+02:00 apophis kernel: [ 1828.824195] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0xFFFFFFFF
2013-04-02T12:46:27.262109+02:00 apophis kernel: [ 1828.824200] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0xFFFFFFFF
2013-04-02T12:46:27.262112+02:00 apophis kernel: [ 1828.824204] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0xFFFFFFFF
2013-04-02T12:46:27.262115+02:00 apophis kernel: [ 1828.824209] radeon 0000:01:00.0: R_008680_CP_STAT = 0xFFFFFFFF
2013-04-02T12:46:27.994081+02:00 apophis kernel: [ 1829.553641] radeon 0000:01:00.0: limiting VRAM
2013-04-02T12:46:27.994095+02:00 apophis kernel: [ 1829.553653] radeon 0000:01:00.0: VRAM: 3584M 0x0000000000000000 - 0x00000000DFFFFFFF (3584M used)
2013-04-02T12:46:27.994096+02:00 apophis kernel: [ 1829.553659] radeon 0000:01:00.0: GTT: 512M 0x00000000E0000000 - 0x00000000FFFFFFFF
2013-04-02T12:46:27.994097+02:00 apophis kernel: [ 1829.553675] mtrr: no more MTRRs available
2013-04-02T12:46:27.994098+02:00 apophis kernel: [ 1829.553679] [drm] Detected VRAM RAM=3584M, BAR=256M
2013-04-02T12:46:27.994099+02:00 apophis kernel: [ 1829.553682] [drm] RAM width 128bits DDR
2013-04-02T12:46:27.994100+02:00 apophis kernel: [ 1829.553859] [TTM] Zone kernel: Available graphics memory: 4053020 kiB
2013-04-02T12:46:27.994100+02:00 apophis kernel: [ 1829.553866] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
2013-04-02T12:46:27.994101+02:00 apophis kernel: [ 1829.553870] [TTM] Initializing pool allocator
2013-04-02T12:46:27.994102+02:00 apophis kernel: [ 1829.553880] [TTM] Initializing DMA pool allocator
2013-04-02T12:46:27.994103+02:00 apophis kernel: [ 1829.553929] [drm] radeon: 3584M of VRAM memory ready
2013-04-02T12:46:27.994103+02:00 apophis kernel: [ 1829.553933] [drm] radeon: 512M of GTT memory ready.
2013-04-02T12:46:27.994104+02:00 apophis kernel: [ 1829.553967] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
2013-04-02T12:46:27.994105+02:00 apophis kernel: [ 1829.553970] [drm] Driver supports precise vblank timestamp query.
2013-04-02T12:46:27.994106+02:00 apophis kernel: [ 1829.554021] [drm] radeon: irq initialized.
2013-04-02T12:46:27.994106+02:00 apophis kernel: [ 1829.554030] [drm] GART: num cpu pages 131072, num gpu pages 131072
2013-04-02T12:46:27.994107+02:00 apophis kernel: [ 1829.555334] [drm] probing gen 2 caps for device 8086:101 = 2/0
2013-04-02T12:46:27.994108+02:00 apophis kernel: [ 1829.555339] [drm] PCIE gen 2 link speeds already enabled
2013-04-02T12:46:27.994108+02:00 apophis kernel: [ 1829.555439] [drm] Loading TURKS Microcode
2013-04-02T12:46:28.966010+02:00 apophis kernel: [ 1830.524411] radeon 0000:01:00.0: Wait for MC idle timedout !
2013-04-02T12:46:29.090044+02:00 apophis kernel: [ 1830.646315] radeon 0000:01:00.0: Wait for MC idle timedout !
2013-04-02T12:46:29.914583+02:00 apophis kernel: [ 1831.376370] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
2013-04-02T12:46:29.914616+02:00 apophis kernel: [ 1831.376440] divide error: 0000 [#1] SMP
2013-04-02T12:46:29.914617+02:00 apophis kernel: [ 1831.376510] Modules linked in: radeon(+) cpufreq_stats fuse af_packet xt_tcpudp xt_pkttype xt_LOG xt_limit bnep bluetooth ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf coretemp kvm_intel kvm snd_hda_codec_hdmi crc32c_intel ghash_clmulni_intel aesni_intel snd_hda_codec_realtek arc4 iwldvm mac80211 snd_hda_intel snd_hda_codec ablk_helper cryptd tpm_infineon lrw aes_x86_64 xts uvcvideo snd_hwdep gf128mul snd_pcm videobuf2_core videodev videobuf2_vmalloc iwlwifi videobuf2_memops tpm_tis i2c_i801 sony_laptop iTCO_wdt r8169 tpm iTCO_vendor_support mei lpc_ich mfd_core sr_mod snd_seq sg pcspkr snd_timer snd_seq_device battery microcode snd cfg80211 soundcore rfkill snd_page_alloc ac cdrom tpm_bios autofs4 xhci_hcd i915 ehci_hcd ttm usbcore drm_kms_helper usb_common drm i2c_algo_bit thermal video button processor thermal_sys scsi_dh_emc scsi_dh_rdac scsi_dh_hp_sw scsi_dh_alua scsi_dh [last unloaded: radeon]
2013-04-02T12:46:29.914619+02:00 apophis kernel: [ 1831.378143] CPU 3
2013-04-02T12:46:29.914620+02:00 apophis kernel: [ 1831.378174] Pid: 3034, comm: modprobe Not tainted 3.7.10-1-default-patched #5 Sony Corporation VPCSA4W9E/VAIO
2013-04-02T12:46:29.914621+02:00 apophis kernel: [ 1831.378281] RIP: 0010:[<ffffffffa07f4f60>] [<ffffffffa07f4f60>] r6xx_remap_render_backend+0x70/0xe0 [radeon]
2013-04-02T12:46:29.914622+02:00 apophis kernel: [ 1831.378449] RSP: 0018:ffff880254d89c40 EFLAGS: 00010246
2013-04-02T12:46:29.914622+02:00 apophis kernel: [ 1831.378509] RAX: 0000000000000004 RBX: 0000000000000000 RCX: 0000000000000000
2013-04-02T12:46:29.914623+02:00 apophis kernel: [ 1831.378585] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8802512b0000
2013-04-02T12:46:29.914624+02:00 apophis kernel: [ 1831.378661] RBP: 0000000000000002 R08: 00000000000000ff R09: 0000000000000000
2013-04-02T12:46:29.914634+02:00 apophis kernel: [ 1831.378737] R10: 0000000000000001 R11: 0000000000000008 R12: 0000000000000002
2013-04-02T12:46:29.914636+02:00 apophis kernel: [ 1831.378813] R13: 0000000000000004 R14: 00000000ffffffff R15: ffff8802506e6380
2013-04-02T12:46:29.914639+02:00 apophis kernel: [ 1831.378891] FS: 00007f2f1310f700(0000) GS:ffff88025fac0000(0000) knlGS:0000000000000000
2013-04-02T12:46:29.914642+02:00 apophis kernel: [ 1831.378978] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2013-04-02T12:46:29.914644+02:00 apophis kernel: [ 1831.379041] CR2: 00007f5ebd0ea000 CR3: 00000002513c8000 CR4: 00000000000407e0
2013-04-02T12:46:29.914647+02:00 apophis kernel: [ 1831.379117] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2013-04-02T12:46:29.914649+02:00 apophis kernel: [ 1831.379193] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
2013-04-02T12:46:29.914652+02:00 apophis kernel: [ 1831.379271] Process modprobe (pid: 3034, threadinfo ffff880254d88000, task ffff88024c31e580)
2013-04-02T12:46:29.914654+02:00 apophis kernel: [ 1831.379359] Stack:
2013-04-02T12:46:29.914657+02:00 apophis kernel: [ 1831.379384] ffff8802512b0000 00000000000000ff 0000000002010002 0000000040000000
2013-04-02T12:46:29.914659+02:00 apophis kernel: [ 1831.379479] ffffffffa080aaf9 0000000000000282 ffffffff00000018 ffff880254d89cc8
2013-04-02T12:46:29.914662+02:00 apophis kernel: [ 1831.379573] ffff880254d89c88 ffff8802512b0000 ffff8802512b0000 0000000000000000
2013-04-02T12:46:29.914664+02:00 apophis kernel: [ 1831.379668] Call Trace:
2013-04-02T12:46:29.914667+02:00 apophis kernel: [ 1831.379928] [<ffffffffa080aaf9>] evergreen_gpu_init+0x269/0xc20 [radeon]
2013-04-02T12:46:29.914669+02:00 apophis kernel: [ 1831.380226] [<ffffffffa080eff4>] evergreen_startup+0x1d4/0xa60 [radeon]
2013-04-02T12:46:29.914672+02:00 apophis kernel: [ 1831.380517] [<ffffffffa080f9f6>] evergreen_init+0x176/0x290 [radeon]
2013-04-02T12:46:29.914674+02:00 apophis kernel: [ 1831.380815] [<ffffffffa07b15a2>] radeon_device_init+0x532/0x620 [radeon]
2013-04-02T12:46:29.914677+02:00 apophis kernel: [ 1831.384920] [<ffffffffa07b2f54>] radeon_driver_load_kms+0x84/0x170 [radeon]
2013-04-02T12:46:29.914680+02:00 apophis kernel: [ 1831.389039] [<ffffffffa00a77e5>] drm_get_pci_dev+0x185/0x2a0 [drm]
2013-04-02T12:46:29.914682+02:00 apophis kernel: [ 1831.393077] [<ffffffff812d96c6>] local_pci_probe+0x46/0x80
2013-04-02T12:46:29.914685+02:00 apophis kernel: [ 1831.397045] [<ffffffff812d9912>] pci_device_probe+0x122/0x130
2013-04-02T12:46:29.914688+02:00 apophis kernel: [ 1831.401000] [<ffffffff813826bd>] driver_probe_device+0x7d/0x380
2013-04-02T12:46:29.914691+02:00 apophis kernel: [ 1831.404944] [<ffffffff81382a53>] __driver_attach+0x93/0xa0
2013-04-02T12:46:29.914693+02:00 apophis kernel: [ 1831.408876] [<ffffffff813808fd>] bus_for_each_dev+0x4d/0x80
2013-04-02T12:46:29.914696+02:00 apophis kernel: [ 1831.412796] [<ffffffff81381ce0>] bus_add_driver+0x180/0x280
2013-04-02T12:46:29.914698+02:00 apophis kernel: [ 1831.416727] [<ffffffff81383094>] driver_register+0x84/0x180
2013-04-02T12:46:29.914701+02:00 apophis kernel: [ 1831.420605] [<ffffffff810002ea>] do_one_initcall+0x12a/0x180
2013-04-02T12:46:29.914704+02:00 apophis kernel: [ 1831.424449] [<ffffffff810a4b02>] sys_init_module+0xb2/0x220
2013-04-02T12:46:29.914706+02:00 apophis kernel: [ 1831.428314] [<ffffffff81549bed>] system_call_fastpath+0x1a/0x1f
2013-04-02T12:46:29.914709+02:00 apophis kernel: [ 1831.432140] [<00007f2f12a2fe2a>] 0x7f2f12a2fe29
2013-04-02T12:46:29.914713+02:00 apophis kernel: [ 1831.435902] Code: 31 db 45 89 c1 66 0f 1f 44 00 00 44 89 cb 41 d1 e9 83 e3 01 41 01 db 83 ee 01 75 ef 89 c1 44 29 d9 41 39 cd 72 6b 31 d2 44 89 e8 <f7> f1 0f af c8 41 89 c1 44 89 e8 29 c8 83 bf c0 00 00 00 27 19
2013-04-02T12:46:29.914717+02:00 apophis kernel: [ 1831.444363] RIP [<ffffffffa07f4f60>] r6xx_remap_render_backend+0x70/0xe0 [radeon]
2013-04-02T12:46:29.914720+02:00 apophis kernel: [ 1831.448575] RSP <ffff880254d89c40>
2013-04-02T12:46:29.914722+02:00 apophis kernel: [ 1831.473249] ---[ end trace 8f29e167bc8c5823 ]---
* Re: [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open in error path
From: Ilija Hadzic @ 2013-04-02 13:31 UTC
To: Marco Munderloh
Cc: Ilija Hadzic, Michal Hocko, Thomas Hellstrom, LKML, dri-devel@lists.freedesktop.org

Marco,

What makes you think that the crash after the second modprobe is related to the mapping pointers in the DRM module? Can you actually establish a correlation between these patches and the crash, or are you just suspecting one because your other bug had something to do with module removal/insertion?

If it's the latter, then you may want to open another bug report here https://bugs.freedesktop.org/ (use DRI for product and pick DRM/radeon for component) and have this issue tracked and addressed separately.

The divide error that your log shows apparently happens at this line inside r6xx_remap_render_backend:

    pipe_rb_ratio = rendering_pipe_num / req_rb_num;

I would suspect that req_rb_num somehow evaluates to zero at the second modprobe. That variable seems to be derived from the last three arguments to r6xx_remap_render_backend. If I look at the caller (evergreen_gpu_init), the arguments in play here are all derived from the GPU's hardware registers (or are constants for a given GPU device). So I suspect that the GPU driver leaves some state in the GPU at module removal that later bites you.

-- Ilija

On Tue, 2 Apr 2013, Marco Munderloh wrote:

> Hi Ilija,
>
>> Thanks for testing. Other issues are probably unrelated, so I'll send the
>> last version of the patch to Dave.
>
> I came across another problem which seems related. rmmod radeon works,
> however, modprobe radeon afterwards results in a crash (divide error), see
> attachment.
>
> Best, Marco
>
> On 02.04.2013 13:23, Ilija Hadzic wrote:
>>
>> -- Ilija
>>
>> On Tue, Apr 2, 2013 at 6:36 AM, Marco Munderloh
>> <munderl@tnt.uni-hannover.de> wrote:
>>
>>     Attached is a v2 of the patch, for reference. I would appreciate if
>>     the original reporter or you tested it in lieu of your proposed
>>     patch and let me know if it fixes your issue.
>>
>>
>>     The patch works for me. echo 3 > /proc/sys/vm/drop_caches as well as
>>     rmmod radeon do not end up in a crash anymore. However, I still have
>>     no clue why one of these makes drm_open fail. On rmmod radeon I get
>>     the following log messages. I don't know if the 'unpin not necessary'
>>     has anything to do with it.
>>
>>     [drm] radeon: finishing device.
>>     radeon 0000:01:00.0: ffff88024e526c00 unpin not necessary
>>     radeon 0000:01:00.0: ffff88024f2f6000 unpin not necessary
>>     radeon 0000:01:00.0: ffff88024f2f6000 unpin not necessary
>>     [TTM] Finalizing pool allocator
>>     [TTM] Finalizing DMA pool allocator
>>     [TTM] Zone kernel: Used memory at exit: 0 kiB
>>     [TTM] Zone dma32: Used memory at exit: 0 kiB
>>     [drm] radeon: ttm finalized
>>     vga_switcheroo: disabled
>>     [drm] Module unloaded
>>
>>     By the way, sometimes my r8169 ethernet controller does not survive
>>     suspend/hibernation (does not detect link). rmmod/modprobe helps. I
>>     don't know if this is related.
>>
>>
>
> --
> Dipl.-Ing. Marco Munderloh                    Mail: munderl@tnt.uni-hannover.de
> Institut für Informationsverarbeitung (TNT)   Phone: +49 511 762-19587
> Leibniz Universitaet Hannover, Appelstr. 9a   Fax: +49 511 762- 5333
> 30167 Hannover, Germany                       Web: http://www.tnt.uni-hannover.de/~munderl
>
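Editorial illustration (not part of the thread): the suspected failure mode, a divisor that ends up as zero because it is computed from hardware-derived values, can be sketched in standalone C. This is not the radeon driver code. The helper names count_bits8 and pipe_rb_ratio_checked are hypothetical, and the way req_rb_num is derived here (total backends minus the number of bits set in a disabled mask) is an assumption made only for the example.

```c
#include <stdint.h>

/* Counts the bits set in the low 8 bits of a mask, assuming one bit per
 * render backend. */
unsigned int count_bits8(uint32_t mask)
{
	unsigned int n = 0;

	for (mask &= 0xff; mask != 0; mask >>= 1)
		n += mask & 1;
	return n;
}

/*
 * Hypothetical, simplified version of the ratio computation quoted in the
 * message above. Returns 0 and stores the ratio on success, or -1 when
 * no render backend is left enabled, instead of dividing by zero.
 */
int pipe_rb_ratio_checked(uint32_t rendering_pipe_num,
			  uint32_t total_max_rb_num,
			  uint32_t disabled_rb_mask,
			  uint32_t *ratio)
{
	uint32_t disabled = count_bits8(disabled_rb_mask);
	uint32_t req_rb_num;

	/* Every backend reported as disabled: the divisor would be zero. */
	if (disabled >= total_max_rb_num)
		return -1;

	req_rb_num = total_max_rb_num - disabled;
	*ratio = rendering_pipe_num / req_rb_num;
	return 0;
}
```

Under these assumptions, a mask that reads back as all ones (the log above shows several status registers returning 0xFFFFFFFF) leaves no enabled backend, so the function reports an error where an unchecked division would trap.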
* Re: [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open in error path
From: Alex Deucher @ 2013-04-02 13:48 UTC
To: Ilija Hadzic
Cc: Marco Munderloh, Michal Hocko, Thomas Hellstrom, dri-devel@lists.freedesktop.org, LKML

On Tue, Apr 2, 2013 at 9:31 AM, Ilija Hadzic <ihadzic@research.bell-labs.com> wrote:
>
> Marco,
>
> What makes you think that the crash after the second modprobe is related to
> the mapping pointers in the DRM module? Can you actually establish a
> correlation between these patches and the crash, or are you just suspecting
> one because your other bug had something to do with module
> removal/insertion?
>
> If it's the latter, then you may want to open another bug report here
> https://bugs.freedesktop.org/ (use DRI for product and pick DRM/radeon for
> component) and have this issue tracked and addressed separately.
>
> The divide error that your log shows apparently happens at this line
> inside r6xx_remap_render_backend:
>
>     pipe_rb_ratio = rendering_pipe_num / req_rb_num;
>
> I would suspect that req_rb_num somehow evaluates to zero at the second
> modprobe. That variable seems to be derived from the last three arguments
> to r6xx_remap_render_backend. If I look at the caller (evergreen_gpu_init),
> the arguments in play here are all derived from the GPU's hardware
> registers (or are constants for a given GPU device). So I suspect that the
> GPU driver leaves some state in the GPU at module removal that later bites
> you.

Newer kernels have a fix for this.
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f689e3acbd2e48cc4101e0af454193f81af4baaf

Alex

>
> -- Ilija
>
> On Tue, 2 Apr 2013, Marco Munderloh wrote:
>
>> Hi Ilija,
>>
>>> Thanks for testing. Other issues are probably unrelated, so I'll send the
>>> last version of the patch to Dave.
>>
>>
>> I came across another problem which seems related. rmmod radeon works,
>> however, modprobe radeon afterwards results in a crash (divide error), see
>> attachment.
>>
>> Best, Marco
>>
>> On 02.04.2013 13:23, Ilija Hadzic wrote:
>>>
>>>
>>> -- Ilija
>>>
>>> On Tue, Apr 2, 2013 at 6:36 AM, Marco Munderloh
>>> <munderl@tnt.uni-hannover.de> wrote:
>>>
>>>     Attached is a v2 of the patch, for reference. I would appreciate
>>>     if the original reporter or you tested it in lieu of your proposed
>>>     patch and let me know if it fixes your issue.
>>>
>>>
>>>     The patch works for me. echo 3 > /proc/sys/vm/drop_caches as well as
>>>     rmmod radeon do not end up in a crash anymore. However, I still have
>>>     no clue why one of these makes drm_open fail. On rmmod radeon I get
>>>     the following log messages. I don't know if the 'unpin not
>>>     necessary' has anything to do with it.
>>>
>>>     [drm] radeon: finishing device.
>>>     radeon 0000:01:00.0: ffff88024e526c00 unpin not necessary
>>>     radeon 0000:01:00.0: ffff88024f2f6000 unpin not necessary
>>>     radeon 0000:01:00.0: ffff88024f2f6000 unpin not necessary
>>>     [TTM] Finalizing pool allocator
>>>     [TTM] Finalizing DMA pool allocator
>>>     [TTM] Zone kernel: Used memory at exit: 0 kiB
>>>     [TTM] Zone dma32: Used memory at exit: 0 kiB
>>>     [drm] radeon: ttm finalized
>>>     vga_switcheroo: disabled
>>>     [drm] Module unloaded
>>>
>>>     By the way, sometimes my r8169 ethernet controller does not survive
>>>     suspend/hibernation (does not detect link). rmmod/modprobe helps. I
>>>     don't know if this is related.
>>>
>>>
>>
>> --
>> Dipl.-Ing. Marco Munderloh                    Mail: munderl@tnt.uni-hannover.de
>> Institut für Informationsverarbeitung (TNT)   Phone: +49 511 762-19587
>> Leibniz Universitaet Hannover, Appelstr. 9a   Fax: +49 511 762- 5333
>> 30167 Hannover, Germany                       Web: http://www.tnt.uni-hannover.de/~munderl
>
Thread overview: 9+ messages
2013-03-26 19:56 [PATCH] drm: fix i_mapping and f_mapping initialization in drm_open in error path Michal Hocko
2013-03-30 22:26 ` Ilija Hadzic
2013-03-31 10:34 ` Michal Hocko
2013-04-01 18:14 ` Ilija Hadzic
2013-04-02 8:25 ` Michal Hocko
2013-04-02 10:36 ` Marco Munderloh
[not found] ` <CA+4h6HkOSPUpfT-5Hwe+zRkmSdhURM6Tv4RxM+9PMCEvG+tjZw@mail.gmail.com>
2013-04-02 12:01 ` Marco Munderloh
2013-04-02 13:31 ` Ilija Hadzic
2013-04-02 13:48 ` Alex Deucher