From: Steve French
Subject: filesystem behavior when low on memory and PF_MEMALLOC
Date: 27 Apr 2004 11:20:53 -0500
Message-ID: <1083082853.13165.15.camel@stevef95.austin.ibm.com>
To: linux-fsdevel@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

Does PF_MEMALLOC have a similar effect to setting SLAB_NOFS (and equivalent) on memory allocations, and does it prevent memory allocations in critical code paths from blocking?

Sergey Vlasov recently made a good suggestion about fixing a problem with hangs during very large file copies via the use of PF_MEMALLOC. He noted that shrink_caches can cause writepage (cifs_writepage in my case) to be invoked to write out dirty pages - but writepage needs to allocate memory both explicitly (for each 4.5K cifs write buffer) and implicitly as a result of using the sockets API (sock_sendmsg can allocate memory), and these allocations presumably can block. In addition, the cifs demultiplex thread needs to get an acknowledgement from the server before waking up the writepage thread - but the demultiplex thread can allocate memory in some cases.

His suggested solution was to add the PF_MEMALLOC flag to current->flags for the demultiplex thread, which makes sense and seems similar to what XFS and a few other filesystems do in some of their daemons. What was harder to evaluate, though, was how to fix the context of the process doing writepage - is it ok to temporarily set PF_MEMALLOC on entry to a filesystem's writepage and writepages routines? Or would this be redundant, since the linux/mm code should already be doing this in all low memory paths in the calling function? Is it ok to clear the flag - always clearing PF_MEMALLOC on exit from cifs_writepage (and eventually cifs_writepages when that is added)?

The alternative is to set SLAB_NOFS (and equivalent) on every memory allocation that cifs makes on behalf of writepages, which would probably be ok but would hit more code and make the codepaths trickier (e.g. figuring out whether an smb buffer allocation came from writepage). My initial observation was that there is a significant performance hit from setting SLAB_NOFS on all cifs buffer allocations (although I think that this is what at least one other filesystem basically does) - it seems like overkill when writepage (and possibly prepare_write/commit_write) are the ones that matter for performance during low memory situations as pages are being freed.
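
One point about "always clearing PF_MEMALLOC on exit" can be shown with a small user-space model: if the caller had already set the flag before calling writepage, an unconditional clear on exit would also drop the caller's setting, whereas saving the old state and restoring it would not. The sketch below is not kernel code and is only an assumption about how such a helper might look: struct task_model stands in for current->flags, the PF_MEMALLOC value is defined locally for illustration, and memalloc_enter, memalloc_exit and model_writepage are hypothetical names.

```c
#include <assert.h>

/* Local stand-in for the kernel's PF_MEMALLOC bit (illustrative value). */
#define PF_MEMALLOC 0x00000800u

/* Hypothetical user-space model of the current->flags field. */
struct task_model {
    unsigned int flags;
};

/* Set PF_MEMALLOC and return the state the bit had on entry. */
static unsigned int memalloc_enter(struct task_model *t)
{
    unsigned int was_set = t->flags & PF_MEMALLOC;

    t->flags |= PF_MEMALLOC;
    return was_set;
}

/* Put the bit back to its entry state instead of always clearing it. */
static void memalloc_exit(struct task_model *t, unsigned int was_set)
{
    assert(was_set == 0 || was_set == PF_MEMALLOC);
    t->flags = (t->flags & ~PF_MEMALLOC) | was_set;
}

/*
 * Hypothetical writepage body: the buffer allocation and socket send
 * would happen between enter and exit. Returns 1 if the flag was set
 * while that work would have run.
 */
static int model_writepage(struct task_model *t)
{
    unsigned int saved = memalloc_enter(t);
    int flag_seen = (t->flags & PF_MEMALLOC) != 0;

    /* ... allocate the write buffer and send it here ... */

    memalloc_exit(t, saved);
    return flag_seen;
}
```

With this pattern, a task that enters model_writepage without the flag leaves without it, and a task that enters with the flag already set keeps it, so the routine behaves the same whether or not the calling code has set the flag itself.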