From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <53CE5494.3030708@suse.cz>
Date: Tue, 22 Jul 2014 14:09:56 +0200
From: Vlastimil Babka
Subject: Re: [PATCH 0/2] shmem: fix faulting into a hole while it's punched, take 3
To: Hugh Dickins, Sasha Levin
CC: Andrew Morton, Konstantin Khlebnikov, Johannes Weiner, Michel Lespinasse, Lukas Czerner, Dave Jones, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Michal Hocko
References: <53C7F55B.8030307@suse.cz> <53C7F5FF.7010006@oracle.com> <53C8FAA6.9050908@oracle.com> <53CDD961.1080006@oracle.com> <53CE37A6.2060000@suse.cz>
In-Reply-To: <53CE37A6.2060000@suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/22/2014 12:06 PM, Vlastimil Babka wrote:
> So if this is true, the change to TASK_UNINTERRUPTIBLE will avoid the
> problem, but it would be nicer to keep the KILLABLE state.
> I think it could be done by testing whether the wait queue still exists
> and is the same one before attempting finish_wait(). If it doesn't
> exist, the faulter can skip finish_wait() altogether, because it must
> already be TASK_RUNNING.
>
>     shmem_falloc = inode->i_private;
>     if (shmem_falloc && shmem_falloc->waitq == shmem_falloc_waitq)
>         finish_wait(shmem_falloc_waitq, &shmem_fault_wait);
>
> It might still be theoretically possible that although it has the same
> address, it's not the same wait queue, but that doesn't hurt
> correctness. I might be uselessly taking some other waitq's lock, but
> inode->i_lock still protects me from other faulters that are in the
> same situation. The puncher is already gone.

Actually, I was wrong: deleting from a different wait queue could corrupt
that queue's head. I don't know whether trinity would be able to trigger
this, but I wouldn't be comfortable knowing it's possible. Calling
fallocate() twice in quick succession from the same process could easily
end up with the wait queue head at the same address on the stack, no?

Another, also somewhat ugly, possibility is to make sure that the wait
queue is empty before the puncher quits, regardless of the running state
of the processes in the queue. I think the conditions here (serialization
by i_lock) might allow us to do that without risking that we e.g. leave
anyone sleeping. But it's bending the wait queue design...

> However, it's quite ugly, and if there is some wait queue debugging
> mode (I haven't checked) that e.g. verifies that wait queues and wait
> objects are empty before destruction, it wouldn't like this at all...