From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 18 May 2017 14:20:33 +0100
From: Roman Gushchin
To: Michal Hocko
CC: Tetsuo Handa
Subject: Re: [PATCH] mm,oom: fix oom invocation issues
Message-ID: <20170518132033.GA12219@castle>
References: <1495034780-9520-1-git-send-email-guro@fb.com>
 <20170517161446.GB20660@dhcp22.suse.cz>
 <20170517194316.GA30517@castle>
 <201705180703.JGH95344.SOHJtFFMOQFLOV@I-love.SAKURA.ne.jp>
 <20170518084729.GB25462@dhcp22.suse.cz>
 <20170518090039.GC25462@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <20170518090039.GC25462@dhcp22.suse.cz>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 18, 2017 at 11:00:39AM +0200, Michal Hocko wrote:
> On Thu 18-05-17 10:47:29, Michal Hocko wrote:
> >
> > Hmm, I guess you are right. I haven't realized that pagefault_out_of_memory
> > can race and pick up another victim. For some reason I thought that the
> > page fault would break out on fatal signal pending but we don't do that (we
> > used to in the past). Now that I think about that more we should
> > probably remove out_of_memory out of pagefault_out_of_memory completely.
> > It is racy and it basically doesn't have any allocation context so we
> > might kill a task from a different domain. So can we do this instead?
> > There is a slight risk that somebody might have returned VM_FAULT_OOM
> > without doing an allocation but from my quick look nobody does that
> > currently.
>
> If this is considered too risky then we can do what Roman was proposing
> and check tsk_is_oom_victim in pagefault_out_of_memory and bail out.

Hi, Michal!

If we consider this approach, I've prepared a separate patch for this problem
(stripped all oom reaper list stuff).

Thanks!

>>From 317fad44a0fe79fb76e8e4fd6bd81c52ae1712e9 Mon Sep 17 00:00:00 2001
From: Roman Gushchin
Date: Tue, 16 May 2017 21:19:56 +0100
Subject: [PATCH] mm,oom: prevent OOM double kill from a pagefault handling path

During the debugging of some OOM-related stuff, I've noticed
that sometimes OOM kills two processes instead of one.
The problem can be easily reproduced on a vanilla kernel:

[   25.721494] allocate invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
[   25.725658] allocate cpuset=/ mems_allowed=0
[   25.727033] CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181
[   25.729215] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   25.729598] Call Trace:
[   25.729598]  dump_stack+0x63/0x82
[   25.729598]  dump_header+0x97/0x21a
[   25.729598]  ? do_try_to_free_pages+0x2d7/0x360
[   25.729598]  ? security_capable_noaudit+0x45/0x60
[   25.729598]  oom_kill_process+0x219/0x3e0
[   25.729598]  out_of_memory+0x11d/0x480
[   25.729598]  __alloc_pages_slowpath+0xc84/0xd40
[   25.729598]  __alloc_pages_nodemask+0x245/0x260
[   25.729598]  alloc_pages_vma+0xa2/0x270
[   25.729598]  __handle_mm_fault+0xca9/0x10c0
[   25.729598]  handle_mm_fault+0xf3/0x210
[   25.729598]  __do_page_fault+0x240/0x4e0
[   25.729598]  trace_do_page_fault+0x37/0xe0
[   25.729598]  do_async_page_fault+0x19/0x70
[   25.729598]  async_page_fault+0x28/0x30
< cut >
[   25.810868] oom_reaper: reaped process 492 (allocate), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
< cut >
[   25.817589] allocate invoked oom-killer: gfp_mask=0x0(), nodemask=(null), order=0, oom_score_adj=0
[   25.818821] allocate cpuset=/ mems_allowed=0
[   25.819259] CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181
[   25.819847] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   25.820549] Call Trace:
[   25.820733]  dump_stack+0x63/0x82
[   25.820961]  dump_header+0x97/0x21a
[   25.820961]  ? security_capable_noaudit+0x45/0x60
[   25.820961]  oom_kill_process+0x219/0x3e0
[   25.820961]  out_of_memory+0x11d/0x480
[   25.820961]  pagefault_out_of_memory+0x68/0x80
[   25.820961]  mm_fault_error+0x8f/0x190
[   25.820961]  ? handle_mm_fault+0xf3/0x210
[   25.820961]  __do_page_fault+0x4b2/0x4e0
[   25.820961]  trace_do_page_fault+0x37/0xe0
[   25.820961]  do_async_page_fault+0x19/0x70
[   25.820961]  async_page_fault+0x28/0x30
< cut >
[   25.863078] Out of memory: Kill process 233 (firewalld) score 10 or sacrifice child
[   25.863634] Killed process 233 (firewalld) total-vm:246076kB, anon-rss:20956kB, file-rss:0kB, shmem-rss:0kB

This happens if pagefault_out_of_memory() is called after the calling
process has already been selected as an OOM victim and killed. There is
a race with the oom reaper: if the process is reaped before it enters
out_of_memory(), the MMF_OOM_SKIP flag is set, and out_of_memory() will
not consider the process an eligible victim. That means that another
victim will be selected and killed.

Tetsuo Handa has noticed that this is a side effect of
commit 9a67f6488eca926f ("mm: consolidate GFP_NOFAIL checks in the
allocator slowpath").

To avoid this, out_of_memory() shouldn't be called from
pagefault_out_of_memory() if the current task has already been chosen
as an oom victim.

v2: dropped changes related to the oom_reaper synchronization,
    as it looks like a separate and minor issue;
    rebased on new mm;
    renamed, updated commit message.

Signed-off-by: Roman Gushchin
Cc: Michal Hocko
Cc: Tetsuo Handa
Cc: Johannes Weiner
Cc: Vladimir Davydov
Cc: kernel-team@fb.com
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/oom_kill.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 04c9143..9c643a3 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1068,6 +1068,9 @@ void pagefault_out_of_memory(void)
 	if (mem_cgroup_oom_synchronize(true))
 		return;
 
+	if (tsk_is_oom_victim(current))
+		return;
+
 	if (!mutex_trylock(&oom_lock))
 		return;
 	out_of_memory(&oc);
-- 
2.7.4