Received: from mail137.messagelabs.com (mail137.messagelabs.com [216.82.249.19]) by kanga.kvack.org (Postfix) with SMTP id AD1616B00A3; Thu, 28 Jan 2010 09:57:37 -0500 (EST)
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: [PATCH 00 of 31] Transparent Hugepage support #8
Date: Thu, 28 Jan 2010 15:33:14 +0100
From: Andrea Arcangeli
Sender: owner-linux-mm@kvack.org
To: linux-mm@kvack.org
Cc: Marcelo Tosatti, Adam Litke, Avi Kivity, Izik Eidus, Hugh Dickins, Nick Piggin, Rik van Riel, Mel Gorman, Dave Hansen, Benjamin Herrenschmidt, Ingo Molnar, Mike Travis, KAMEZAWA Hiroyuki, Christoph Lameter, Chris Wright, Andrew Morton, bpicco@redhat.com, Christoph Hellwig, KOSAKI Motohiro, Balbir Singh, Arnd Bergmann

Hello,

this is the latest version, covering all review feedback plus fixes for vm_normal_page in khugepaged. That path still produces a warning, generated by a pte that has the pte_special bit set and maps an mmio area in the 256M area of mst's graphics card, but with neither VM_PFNMAP nor VM_MIXEDMAP set; even worse, vm_file is null and vm_ops is null too. I still can't figure out how a special pte can be mapped in an area with vm_ops and vm_file both null (btw, on a side note, I doubt we ever have a case of vm_ops not null and vm_file null, or vm_ops null and vm_file not null).

It happens during the speculative readonly pass, where I intentionally tried to avoid taking the pt lock (it was later taken in collapse_huge_page of course). I wonder if that's the reason, so I added the pt lock in the speculative pass too, but I can't see how it can happen even without the lock (the pte can't go away under it because khugepaged holds the mmap_sem in read mode).
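To make the condition behind the warning concrete, here is a small userspace model of the decision described above. It is only a sketch, not kernel code: every model_* name and both flag values are made up for illustration, and it assumes the usual pte_special branch of vm_normal_page() (special pte in a VM_PFNMAP or VM_MIXEDMAP vma is expected; a special pte mapping the zero page is tolerated; anything else is reported as a bad pte).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel flags; the values are illustrative. */
#define MODEL_VM_PFNMAP   0x1u
#define MODEL_VM_MIXEDMAP 0x2u

/* Hypothetical, heavily reduced model of a vma. */
struct model_vma {
	unsigned int vm_flags;
	const void *vm_file;	/* NULL in the reported case */
	const void *vm_ops;	/* NULL in the reported case */
};

/* Hypothetical model of the two pte properties that matter here. */
struct model_pte {
	bool special;	/* models pte_special() being true */
	bool zero_page;	/* models a pte that maps the zero page */
};

enum model_result {
	MODEL_NORMAL_PAGE,	/* a struct page would be returned */
	MODEL_NO_PAGE,		/* NULL returned, expected for this vma */
	MODEL_BAD_PTE		/* NULL returned and the warning is printed */
};

/* Models the special-pte branch of the check; assumption, not kernel source. */
enum model_result model_vm_normal_page(const struct model_vma *vma,
				       struct model_pte pte)
{
	assert(vma != NULL);

	if (!pte.special)
		return MODEL_NORMAL_PAGE;
	if (vma->vm_flags & (MODEL_VM_PFNMAP | MODEL_VM_MIXEDMAP))
		return MODEL_NO_PAGE;
	if (pte.zero_page)
		return MODEL_NO_PAGE;
	return MODEL_BAD_PTE;	/* the case seen from khugepaged */
}
```

In this model the reported situation is a vma with vm_flags of 0 and both pointers NULL, combined with a special pte that does not map the zero page; that is the only combination that yields MODEL_BAD_PTE.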
I would imagine it could be a problem only if pte updates weren't atomic, as they are on 64bit (despite not being enforced with asm() constructs but relying on gcc), so taking the pt lock will help on that side. But even if gcc were doing partial writes to ptes, the result shouldn't always be pointing to the mmio of the graphics card.

I can't reproduce the khugepaged warning here, but it's only a warning: it can't affect stability if my hypothesis is true that the bug was already there and that I only exposed it with khugepaged. But if it's not my bug, then I wonder why munmap doesn't trip on it too.

The suspicious code at the moment is i915_gem, things like unmap_mapping_range etc. One of my theories is that unmap_mapping_range of i915_gem is not clearing all ptes, and then those pte_special ptes leak into newly allocated mappings as new ptes are allocated. But then I can't imagine this not having adverse effects (at the very least it should screw up the graphics card at boot, around the time these bugchecks trigger). Also, the bugchecks seem to go away after 80 sec of uptime, so maybe the corruption is cleared as the new mappings are torn down too (this time correctly through regular munmap and not ->close).

My primary focus is to understand that pte_special thing, because other than the above there is no other known issue so far. It is rock solid on all hardware where I deployed it, and I leave swapping storms running 24/7, in addition to khugepaged at 0 scan/defrag_sleep, to stress the smp safety of split_huge_page and collapse_huge_page. I rebooted the laptop and the test server only to upgrade to the #8 version, to include the new post-review code in the testing.

I suggest trying it (especially if you use i915_gem, as I need to know if anybody else can reproduce the khugepaged warning with pte_special set) and letting me know; the more testing the better.

Thanks,
Andrea