From: Oded Gabbay <oded.gabbay@amd.com>
To: Arthur Marsh <arthur.marsh@internode.on.net>, linux-mm@kvack.org
Subject: Re: kernel BUG at mm/rmap.c:399! part 2
Date: Sun, 11 Jan 2015 13:36:36 +0200
Message-ID: <54B26044.6070602@amd.com>
In-Reply-To: <54B25F43.6020608@internode.on.net>

[-- Attachment #1: Type: text/plain, Size: 8064 bytes --]



On 01/11/2015 01:32 PM, Arthur Marsh wrote:
> This happened on an AMD64 machine running current Linus' git head in 64 bit
> mode straight after vlc crashed:
> 
> [    0.000000] Initializing cgroup subsys cpuset
> [    0.000000] Initializing cgroup subsys cpu
> [    0.000000] Initializing cgroup subsys cpuacct
> [    0.000000] Linux version 3.19.0-rc3+ (root@am64) (gcc version 4.9.2
> (Debian 4.9.2-10) ) #1452 SMP PREEMPT Sat Jan 10 17:48:48 ACDT 2015
> 
> [ 3292.010716] vlc[7254]: segfault at 7f5638087000 ip 00007f561393a232 sp
> 00007f5632fb47a0 error 6 in libvdpau_r600.so.1.0.0[7f5613879000+376000]
> [ 3412.519619] radeon 0000:01:00.0: ring 0 stalled for more than 10000msec
> [ 3412.519626] radeon 0000:01:00.0: GPU lockup (current fence id
> 0x000000000005fbde last fence id 0x000000000005fbe8 on ring 0)
> [ 3412.526917] radeon 0000:01:00.0: Saved 313 dwords of commands on ring 0.
> [ 3412.526932] radeon 0000:01:00.0: GPU softreset: 0x00000009
> [ 3412.526935] radeon 0000:01:00.0:   R_008010_GRBM_STATUS      = 0xE7733030
> [ 3412.526937] radeon 0000:01:00.0:   R_008014_GRBM_STATUS2     = 0x00FF0103
> [ 3412.526939] radeon 0000:01:00.0:   R_000E50_SRBM_STATUS      = 0x200400C0
> [ 3412.526941] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
> [ 3412.526943] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00008002
> [ 3412.526945] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00008086
> [ 3412.526947] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80018645
> [ 3412.526950] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
> [ 3412.590365] radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00007FEF
> [ 3412.590418] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
> [ 3412.592503] radeon 0000:01:00.0:   R_008010_GRBM_STATUS      = 0xA0003030
> [ 3412.592506] radeon 0000:01:00.0:   R_008014_GRBM_STATUS2     = 0x00000003
> [ 3412.592508] radeon 0000:01:00.0:   R_000E50_SRBM_STATUS      = 0x200480C0
> [ 3412.592510] radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
> [ 3412.592512] radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
> [ 3412.592514] radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
> [ 3412.592516] radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80100000
> [ 3412.592518] radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
> [ 3412.592524] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
> [ 3412.608224] [drm] PCIE gen 2 link speeds already enabled
> [ 3412.609356] [drm] PCIE GART of 512M enabled (table at 0x0000000000254000).
> [ 3412.609380] radeon 0000:01:00.0: WB enabled
> [ 3412.609384] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr
> 0x0000000020000c00 and cpu addr 0xffff880224008c00
> [ 3412.609753] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr
> 0x00000000000521d0 and cpu addr 0xffffc900101921d0
> [ 3412.640570] [drm] ring test on 0 succeeded in 0 usecs
> [ 3412.814984] [drm] ring test on 5 succeeded in 1 usecs
> [ 3412.814990] [drm] UVD initialized successfully.
> [ 3423.000575] radeon 0000:01:00.0: ring 0 stalled for more than 10204msec
> [ 3423.000583] radeon 0000:01:00.0: GPU lockup (current fence id
> 0x000000000005fbe0 last fence id 0x000000000005fbe8 on ring 0)
> [ 3423.000735] [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait failed
> (-35).
> [ 3423.000771] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed
> testing IB on GFX ring (-35).
> [ 3423.120222] vlc[7337]: segfault at 7f3bb41b6000 ip 00007f3bbd6ef232 sp
> 00007f3bd8c457a0 error 6 in libvdpau_r600.so.1.0.0[7f3bbd62e000+376000]
> [ 3432.509934] ------------[ cut here ]------------
> [ 3432.509971] kernel BUG at mm/rmap.c:399!
> [ 3432.509992] invalid opcode: 0000 [#1] PREEMPT SMP
> [ 3432.510022] Modules linked in: rfcomm arc4 ecb md4 hmac nls_utf8 cifs
> dns_resolver fscache bnep bluetooth nfc cpufreq_userspace
> cpufreq_conservative rfkill cpufreq_stats cpufreq_powersave binfmt_misc
> uinput max6650 fuse parport_pc ppdev lp parport ir_lirc_codec
> ir_sharp_decoder ir_mce_kbd_decoder ir_jvc_decoder ir_sanyo_decoder
> ir_xmp_decoder lirc_dev ir_rc5_decoder ir_sony_decoder ir_rc6_decoder
> ir_nec_decoder fc0012 snd_hda_codec_hdmi dvb_usb_rtl28xxu rtl2830 rtl2832
> i2c_mux dvb_usb_v2 dvb_core rc_core snd_hda_codec_realtek
> snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec
> snd_hwdep snd_pcm_oss kvm_amd snd_mixer_oss radeon kvm snd_pcm ttm psmouse
> snd_timer pcspkr snd soundcore evdev k10temp serio_raw drm_kms_helper
> sp5100_tco acpi_cpufreq drm i2c_algo_bit i2c_piix4 wmi processor
> [ 3432.510482]  thermal_sys asus_atk0110 button ext4 mbcache crc16 jbd2 sg
> sr_mod cdrom sd_mod ata_generic uas usb_storage ohci_pci ahci libahci
> pata_atiixp libata scsi_mod ohci_hcd ehci_pci ehci_hcd usbcore usb_common
> r8169 mii
> [ 3432.510620] CPU: 0 PID: 4880 Comm: JS GC Helper Not tainted 3.19.0-rc3+
> #1452
> [ 3432.510655] Hardware name: System manufacturer System Product Name/M3A78
> PRO, BIOS 1701    01/27/2011
> [ 3432.510698] task: ffff8800acc80790 ti: ffff8800b8478000 task.ti:
> ffff8800b8478000
> [ 3432.510734] RIP: 0010:[<ffffffff81172f35>]  [<ffffffff81172f35>]
> unlink_anon_vmas+0x195/0x210
> [ 3432.510780] RSP: 0000:ffff8800b847bb68  EFLAGS: 00010286
> [ 3432.510806] RAX: ffff880075fdfd10 RBX: ffff880206098760 RCX:
> 00000000ffffffff
> [ 3432.510839] RDX: ffffffff00000001 RSI: ffff880075fdfd00 RDI:
> ffff8802249e8160
> [ 3432.510873] RBP: ffff8800b847bba8 R08: 0000000000000000 R09:
> 0000000000000001
> [ 3432.510907] R10: 0000000000000000 R11: ffff880075fdfd20 R12:
> ffff8802249e8160
> [ 3432.510940] R13: ffff880206098760 R14: ffff880206098770 R15:
> ffff8802249e8160
> [ 3432.510974] FS:  00007f2f77eeb700(0000) GS:ffff88022fc00000(0000)
> knlGS:0000000000000000
> [ 3432.511012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3432.511040] CR2: 00000000004d8800 CR3: 0000000001a0e000 CR4:
> 00000000000007f0
> [ 3432.511073] Stack:
> [ 3432.511085]  ffff8800b847bb78 ffff8802060986f8 ffff8800b847bba8
> ffff88004c16f858
> [ 3432.511127]  00007f2ed3000000 0000000000000000 ffff8800b847bc18
> ffff8802060986f8
> [ 3432.511169]  ffff8800b847bbf8 ffffffff81164760 ffff8800b847bbf8
> 0000000000000000
> [ 3432.511210] Call Trace:
> [ 3432.511228]  [<ffffffff81164760>] free_pgtables+0xa0/0x120
> [ 3432.511256]  [<ffffffff8116f1be>] exit_mmap+0xae/0x170
> [ 3432.511283]  [<ffffffff8104f70d>] mmput+0x4d/0x110
> [ 3432.511308]  [<ffffffff8105530f>] do_exit+0x2af/0xb30
> [ 3432.511334]  [<ffffffff8153e38b>] ? _raw_spin_unlock_irq+0x2b/0x60
> [ 3432.511365]  [<ffffffff81055c1f>] do_group_exit+0x4f/0xe0
> [ 3432.511393]  [<ffffffff81061af6>] get_signal+0x2c6/0x7f0
> [ 3432.511421]  [<ffffffff8100251e>] do_signal+0x2e/0x760
> [ 3432.511447]  [<ffffffff810935be>] ? up_read+0x1e/0x40
> [ 3432.511473]  [<ffffffff8153e39c>] ? _raw_spin_unlock_irq+0x3c/0x60
> [ 3432.511504]  [<ffffffff8107610f>] ? finish_task_switch+0x8f/0x140
> [ 3432.511535]  [<ffffffff8153efb1>] ? sysret_signal+0x5/0x4a
> [ 3432.511562]  [<ffffffff81002cc8>] do_notify_resume+0x78/0xa0
> [ 3432.511591]  [<ffffffff8153f247>] int_signal+0x12/0x17
> [ 3432.511616] Code: 48 89 46 18 e8 ed 54 01 00 48 8b 43 10 48 8d 53 10 48
> 83 e8 10 49 39 d6 74 3c 48 8b 7b 08 48 89 de 8b 97 8c 00 00 00 85 d2 74 9b
> <0f> 0b 90 48 89 75 c8 e8 bf fd ff ff 48 8b 75 c8 eb 95 48 8b 45
> [ 3432.511831] RIP  [<ffffffff81172f35>] unlink_anon_vmas+0x195/0x210
> [ 3432.511863]  RSP <ffff8800b847bb68>
> [ 3432.511927] ---[ end trace ec049a8f8b1d1018 ]---
> [ 3432.511954] Fixing recursive fault but reboot is needed!
> 
> As before, I'm happy to supply further information or run tests to help
> isolate the problem.
> 
> Arthur.

See this thread:
http://marc.info/?l=linux-kernel&m=142097604508577&w=2

The patch from Konstantin Khlebnikov is attached.

	Oded

[-- Attachment #2: mm-fix-anon_vma-degree-counter-in-case-of-anon_vma-export --]
[-- Type: text/plain, Size: 1230 bytes --]

mm: fix anon_vma degree counter in case of anon_vma export

From: Konstantin Khlebnikov <koct9i@gmail.com>

anon_vma_clone() is usually called with dst being a copy of the source vma.
If the source vma has an anon_vma, it should already be in dst->anon_vma.
A NULL pointer in dst->anon_vma means the clone is called from
anon_vma_fork(), and anon_vma_clone() should try to reuse some old anon_vma.
vma_adjust() calls it differently and breaks the anon_vma degree counter logic.

This patch copies the anon_vma pointer before the call to satisfy
anon_vma_clone()'s expectations, and resets it to NULL if the clone fails.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Fixes: 7a3ef208e662 ("mm: prevent endless growth of anon_vma hierarchy")
---
 mm/mmap.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 7b36aa7..12616c5 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -778,10 +778,12 @@ again:			remove_next = 1 + (end > next->vm_end);
 		if (exporter && exporter->anon_vma && !importer->anon_vma) {
 			int error;
 
+			importer->anon_vma = exporter->anon_vma;
 			error = anon_vma_clone(importer, exporter);
-			if (error)
+			if (error) {
+				importer->anon_vma = NULL;
 				return error;
-			importer->anon_vma = exporter->anon_vma;
+			}
 		}
 	}
 
