From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755229AbZHYPCa (ORCPT ); Tue, 25 Aug 2009 11:02:30 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1755070AbZHYPC3 (ORCPT ); Tue, 25 Aug 2009 11:02:29 -0400
Received: from mx1.redhat.com ([209.132.183.28]:20434 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755057AbZHYPC2 (ORCPT ); Tue, 25 Aug 2009 11:02:28 -0400
Date: Tue, 25 Aug 2009 16:58:32 +0200
From: Andrea Arcangeli
To: Hugh Dickins
Cc: Izik Eidus , Rik van Riel , Chris Wright , Nick Piggin ,
	Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 9/12] ksm: fix oom deadlock
Message-ID: <20090825145832.GP14722@random.random>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Aug 03, 2009 at 01:18:16PM +0100, Hugh Dickins wrote:
> tables which have been freed for reuse; and even do_anonymous_page
> and __do_fault need to check they're not being called by break_ksm
> to reinstate a pte after zap_pte_range has zapped that page table.

This deadlocks exit_mmap in an infinite loop when there's some region
locked. mlock calls gup and pretends to page fault successfully if
there's a vma existing on the region, but it doesn't page fault
anymore because of the mm_count being 0 already, so follow_page fails
and gup retries the page fault forever. And generally I don't like to
add those checks to the page fault fast path.

Given we check mm_users == 0 (ksm_test_exit) after taking mmap_sem in
unmerge_and_remove_all_rmap_items, why do we actually need to care
that a page fault happens?
We hold mmap_sem, so we're guaranteed to see mm_users == 0, and we
won't ever break COW on an mm with mm_users == 0. So I think those
troublesome checks in the page fault path can simply be removed.
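
To make the failure mode concrete, here is a rough userspace sketch of
the retry loop described above. It is a toy model, not kernel code:
struct toy_mm and every toy_* function are hypothetical names made up
for illustration, the single pte_present flag stands in for a real
page table, and the max_tries bound exists only so the sketch
terminates where the real loop would spin forever.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an mm; only the state discussed in this mail. */
struct toy_mm {
	int mm_users;            /* 0 once the mm is being torn down */
	bool pte_present;        /* is the one modelled page mapped? */
	bool ksm_check_in_fault; /* model the extra check in the fault path */
};

/* Same idea as ksm_test_exit(): has the mm lost its last user? */
static bool toy_test_exit(const struct toy_mm *mm)
{
	return mm->mm_users == 0;
}

/*
 * Modelled fault handler. With the extra check enabled and the mm
 * exiting, it reports success without installing a pte.
 */
static bool toy_handle_fault(struct toy_mm *mm)
{
	if (mm->ksm_check_in_fault && toy_test_exit(mm))
		return true; /* "success", but nothing was mapped */
	mm->pte_present = true;
	return true;
}

/* Modelled follow_page: succeeds only if the pte is there. */
static bool toy_follow_page(const struct toy_mm *mm)
{
	return mm->pte_present;
}

/*
 * Modelled gup loop: fault and retry until follow_page succeeds.
 * Returns the number of faults taken, or -1 if max_tries faults
 * made no progress (the case that would loop forever for real).
 */
static int toy_gup(struct toy_mm *mm, int max_tries)
{
	int faults = 0;

	while (!toy_follow_page(mm)) {
		if (faults == max_tries)
			return -1;
		toy_handle_fault(mm);
		faults++;
	}
	return faults;
}

/*
 * Modelled alternative: test for exit in the unmerge path instead.
 * The caller is assumed to hold mmap_sem, so the mm_users value it
 * sees is stable. Returns true if COW would be broken on this mm.
 */
static bool toy_unmerge_would_break_cow(const struct toy_mm *mm)
{
	if (toy_test_exit(mm))
		return false; /* exiting mm: leave it alone */
	return true;
}
```

With mm_users == 0 and the check in the fault path enabled, toy_gup
never sees the pte appear and gives up at max_tries. With the check
removed from the fault path, the same call completes after one fault,
and toy_unmerge_would_break_cow still refuses to touch the exiting mm.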