From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934816Ab0EEOuW (ORCPT ); Wed, 5 May 2010 10:50:22 -0400 Received: from mx1.redhat.com ([209.132.183.28]:52780 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757494Ab0EEOuT (ORCPT ); Wed, 5 May 2010 10:50:19 -0400 Date: Wed, 5 May 2010 16:48:13 +0200 From: Andrea Arcangeli To: Mel Gorman Cc: Andrew Morton , Christoph Lameter , Adam Litke , Avi Kivity , David Rientjes , Minchan Kim , KAMEZAWA Hiroyuki , KOSAKI Motohiro , Rik van Riel , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: Re: [PATCH] fix count_vm_event preempt in memory compaction direct reclaim Message-ID: <20100505144813.GI5835@random.random> References: <1271797276-31358-1-git-send-email-mel@csn.ul.ie> <1271797276-31358-13-git-send-email-mel@csn.ul.ie> <20100505121908.GA5835@random.random> <20100505125156.GM20979@csn.ul.ie> <20100505131112.GB5835@random.random> <20100505135537.GO20979@csn.ul.ie> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20100505135537.GO20979@csn.ul.ie> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, May 05, 2010 at 02:55:38PM +0100, Mel Gorman wrote: > I haven't seen this problem. The testing I'd have been doing with compaction > were stress tests allocating huge pages but not from the fault path. That explains it! But anything can call alloc_pages(order>0) with some semaphore held. > It's not mandatory but the LRU lists should be drained so they can be properly > isolated. It'd make a slight difference to success rates as there will be > pages that cannot be isolated because they are on some pagevec. Yes success rate will be slightly worse but this also applies to all regular vmscan paths that don't send IPI but they only flush the local queue with lru_add_drain, simply pages won't be freed until there will be some other cpu holding the refcount on them, it is not specific to compaction.c but it applies to vmscan.c and vmscan likely not wanting to send an IPI flood because it could too if it wanted. But I guess I should at least use lru_add_drain() in replacement of migrate_prep... > While true, is compaction density that high under normal workloads? I guess > it would be if a scanner was constantly trying to promote pages. If the > IPI load is out of hand, I'm ok with disabling in some cases. For example, > I'd be ok with it being skipped if it was part of a daemon doing speculative > promotion but I'd prefer it to still be used if the static hugetlbfs pool > was being resized if that was possible. I don't know if IPI is measurable, but it usually is... > > ----- > > Subject: disable migrate_prep() > > > > From: Andrea Arcangeli > > > > I get trouble from lockdep if I leave it enabled: > > > > ======================================================= > > [ INFO: possible circular locking dependency detected ] > > 2.6.34-rc3 #50 > > ------------------------------------------------------- > > largepages/4965 is trying to acquire lock: > > (events){+.+.+.}, at: [] flush_work+0x38/0x130 > > > > but task is already holding lock: > > (&mm->mmap_sem){++++++}, at: [] do_page_fault+0xd2/0x430 > > > > Hmm, I'm not seeing where in the fault path flush_work is getting called > from. Can you point it out to me please? lru_add_drain_all->schedule_on_each_cpu->flush_work > We already do some IPI work in the page allocator although it happens after > direct reclaim and only for high-order pages. What happens there and what > happens in migrate_prep are very similar so if there was a problem with IPI > and fault paths, I'd have expected to see it from hugetlbfs at some stage. Where? I never triggered other issues in the page allocator with lockdep, just this one pops up.