Date: Tue, 27 Sep 2016 18:05:29 +0200
From: Andrea Arcangeli
To: Shaun Tancheff
Cc: Andrew Morton, "Kirill A. Shutemov", Vlastimil Babka, Michal Hocko,
    Ingo Molnar, Dave Hansen, Dan Williams, Johannes Weiner, Joonsoo Kim,
    Konstantin Khlebnikov, Chen Gang, Andrey Ryabinin, Thomas Gleixner,
    Mel Gorman, Piotr Kwapulinski, linux-mm@kvack.org, LKML, Shaun Tancheff
Subject: Re: BUG Re: mm: vma_merge: fix vm_page_prot SMP race condition against rmap_walk
Message-ID: <20160927160529.GJ4618@redhat.com>

Hello,

On Tue, Sep 27, 2016 at 05:16:15AM -0500, Shaun Tancheff wrote:
> git bisect points at commit c9634dcf00c9c93b ("mm: vma_merge: fix
> vm_page_prot SMP race condition against rmap_walk")

I assume linux-next? I can't find the commit, but I should know what
this is.

>
> Last lines to console are [transcribed]:
>
> vma ffff8c3d989a7c78 start 00007fe02ed4c000 end 00007fe02ed52000
> next ffff8c3d96de0c38 prev ffff8c3d989a6e40 mm ffff8c3d071cbac0
> prot 8000000000000025 anon_vma ffff8c3d96fc9b28 vm_ops (null)
> pgoff 7fe02ed4c file (null) private_data (null)
> flags: 0x8100073(read|write|mayread|maywrite|mayexec|account|softdirty)

It's a false positive: you have DEBUG_VM_RB=y. You can disable it or
cherry-pick the fix:

https://git.kernel.org/cgit/linux/kernel/git/andrea/aa.git/commit/?id=74d8b44224f31153e23ca8a7f7f0700091f5a9b2

The assumption validate_mm_rb made is no longer valid in the new code
during __vma_unlink: after this change the validation code must skip
the next vma instead of the current one.

It's a bug in DEBUG_VM_RB=y; if you keep DEBUG_VM_RB=n there's no bug.

> Reproducer is an Ubuntu 16.04.1 LTS x86_64 running on a VM (VirtualBox).
> Symptom is a solid hang after boot and switch to starting gnome session.
>
> Hang at about 35s.
>
> kdbg traceback is all null entries.
>
> Let me know what additional information I can provide.

I already submitted the fix to Andrew last week:

https://marc.info/?l=linux-mm&m=147449253801920&w=2

I assume it's pending for merging in -mm. If you can test this patch
and confirm the problem goes away with DEBUG_VM_RB=y it'd be great.

Thanks,
Andrea
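
To illustrate why a consistency check can report a false positive when it skips the wrong entry, here is a toy model in plain C. It is not the kernel code: toy_vma, toy_gap, toy_validate and toy_demo are hypothetical names, the areas live in a flat array instead of an rbtree, and it assumes the check compares a cached "gap before this area" value against a freshly computed one.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a VMA. NOT the kernel's struct vm_area_struct. */
struct toy_vma {
    unsigned long start;
    unsigned long end;
    unsigned long cached_gap;   /* free space before this area, as last recorded */
};

/* Recompute the free space in front of areas[i] from its left neighbour. */
static unsigned long toy_gap(const struct toy_vma *areas, size_t i)
{
    unsigned long prev_end = (i == 0) ? 0 : areas[i - 1].end;

    assert(areas[i].start >= prev_end);   /* areas are sorted and do not overlap */
    return areas[i].start - prev_end;
}

/* Count areas whose cached gap is stale, skipping `ignore` (may be NULL). */
static int toy_validate(const struct toy_vma *areas, size_t n,
                        const struct toy_vma *ignore)
{
    int bugs = 0;

    for (size_t i = 0; i < n; i++) {
        if (&areas[i] == ignore)
            continue;
        if (areas[i].cached_gap != toy_gap(areas, i))
            bugs++;
    }
    return bugs;
}

/*
 * Hypothetical scenario: the start of "next" is reduced before the
 * "current" area is unlinked, so only next's cached gap is stale.
 */
static void toy_demo(int *bugs_ignoring_current, int *bugs_ignoring_next)
{
    struct toy_vma areas[3] = {
        { 0x1000, 0x2000, 0x1000 },
        { 0x3000, 0x4000, 0x1000 },   /* "current": about to be unlinked */
        { 0x5000, 0x6000, 0x1000 },   /* "next" */
    };

    areas[2].start = 0x4800;          /* cached_gap deliberately not refreshed */

    *bugs_ignoring_current = toy_validate(areas, 3, &areas[1]);
    *bugs_ignoring_next = toy_validate(areas, 3, &areas[2]);
}
```

In this sketch, validating while ignoring the current area flags the stale entry on "next" even though nothing is actually wrong, whereas ignoring "next" reports a clean state. That mirrors, under the assumptions above, the kind of mismatch described for validate_mm_rb during __vma_unlink.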