From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <48D90C78.7090100@domain.hid>
Date: Tue, 23 Sep 2008 17:34:16 +0200
From: Jan Kiszka
MIME-Version: 1.0
References: <48D8FEE6.3090109@domain.hid> <48D90034.1070602@domain.hid> <48D903AA.5090404@domain.hid> <48D90ADC.2020406@domain.hid>
In-Reply-To: <48D90ADC.2020406@domain.hid>
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
Subject: Re: [Adeos-main] [BUG] vmalloc_sync_one complains about __ipipe_pin_range_globally
List-Id: General discussion about Adeos
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Gilles Chanteperdrix
Cc: adeos-main , Philippe Gerum

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Hi,
>>>>
>>>> any thoughts on this BUG? Happens with ipipe-2.0-07 on 2.6.24.7,
>>>> obviously during module loading.
>>>>
>>>> kernel BUG at arch/x86/mm/fault_64.c:258!
>>>> invalid opcode: 0000 [1] SMP
>>>> CPU 3
>>>> Modules linked in: ide_core ide_disk scsi_mod sd_mod serverworks libata
>>>> sata_svw scsi_transport_sas mptbase mptscsih mptsas sg fan edd
>>>> pata_serverworks jbd mbcache ext3 usbcore hwmon i2c_core k8temp
>>>> pci_hotplug i2c_piix4 shpchp ehci_hcd ohci_hcd rtc_lib rtc_core rtc_cmos
>>>> tg3
>>>> Pid: 1683, comm: modprobe Not tainted 2.6.24.7-xeno #1
>>>> RIP: 0010:[] []
>>>> vmalloc_sync_one+0x6f/0x197
>>>> RSP: 0018:ffff81023b0c1c98 EFLAGS: 00010287
>>>> RAX: 00003ffffffff000 RBX: ffff81023feeea88 RCX: ffff810000000000
>>>> RDX: ffff81023c423000 RSI: 000000023c423000 RDI: ffff81023b1e7c20
>>>> RBP: ffff81023b0c1cc8 R08: ffffffff80201c20 R09: 0000000000000800
>>>> R10: ffffffff8099a380 R11: 0000000000000002 R12: 0000000000000c20
>>>> R13: ffffc20001888000 R14: ffffc20001888000 R15: 0000000000000000
>>>> FS: 00002ac2367716d0(0000) GS:ffff81023c31d5c0(0000)
>>>> knlGS:0000000000000000
>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>> CR2: 00002ac236442000 CR3: 000000023b139000 CR4: 00000000000006e0
>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>> Process modprobe (pid: 1683, threadinfo ffff81023b0c0000, task
>>>> ffff81023c3ba7f0)
>>>> Stack: ffffc2000188bfff ffff81023feeea88 0000000000000c20
>>>> ffffc20001888000
>>>> ffffc2000188c000 0000000000000000 ffff81023b0c1d08 ffffffff802252ac
>>>> ffffc2000188c000 0000000000000000 ffffc2000188c000 ffff81013a1cf468
>>>> Call Trace:
>>>> [] __ipipe_pin_range_globally+0x9a/0xe4
>>>> [] map_vm_area+0x29f/0x2b0
>>>> [] __vmalloc_area_node+0x173/0x199
>>>> [] __vmalloc_node+0x5d/0x6a
>>>> [] __vmalloc+0x11/0x13
>>>> [] vmalloc+0x1d/0x1f
>>>> [] sys_init_module+0x71/0x18ba
>>>> [] mcount+0x4c/0x72
>>>> [] mcount+0x4c/0x72
>>>> [] __ipipe_syscall_root+0xc/0x197
>>>> [] __ipipe_syscall_root_thunk+0x35/0x6a
>>>> [] system_call+0x92/0x97
>>>>
>>>>
>>>> Code: 0f 0b eb fe 49 8b 00 4c 89 f2 49 bf 00 f0 ff ff ff 3f 00 00
>>>> RIP [] vmalloc_sync_one+0x6f/0x197
>>>> RSP
>>>>
>>>>
>>>> The relevant code in fault_64.c:
>>>>
>>>> static int vmalloc_sync_one(pgd_t *pgd, unsigned long address)
>>>> {
>>>>         pgd_t *pgd_ref;
>>>>         pud_t *pud, *pud_ref;
>>>>         pmd_t *pmd, *pmd_ref;
>>>>         pte_t *pte, *pte_ref;
>>>>
>>>>         /* Copy kernel mappings over when needed. This can also
>>>>            happen within a race in page table update. In the later
>>>>            case just flush. */
>>>>
>>>>         pgd_ref = pgd_offset_k(address);
>>>>         if (pgd_none(*pgd_ref))
>>>>                 return -1;
>>>>         if (pgd_none(*pgd))
>>>>                 set_pgd(pgd, *pgd_ref);
>>>>         else
>>>>                 BUG_ON(pgd_page_vaddr(*pgd) != pgd_page_vaddr(*pgd_ref));
>>>>
>>>> This one triggers.
>>> I think there is something missing in the I-pipe patch: when a vmalloc
>>> occurs we update all page directories, but when a vfree occurs, we do
>>> nothing. Is there any chance that the bug you observed is in fact a
>>> vmalloc which reuses an address which has been vfreed recently ?
>> Maybe. This happens during boot-up, probably while issuing modprobes in
>> a row where you also tend to release some temporary memory again. That
>> said, I cannot provide a precise test case. And according to the
>> reporter, this only happens fairly sporadically.
>
> Ok. Maybe printing pgd_page_vaddr(*pgd) and pgd_page_vaddr(*pgd_ref)
> would help ?

I don't see yet where you want to go. As I said, the issue is rare. I
rather think we need to approach it theoretically.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux