Date: Wed, 3 Aug 2016 20:36:48 +0300
From: Cyrill Gorcunov
To: Stanislav Kinsburskiy
Subject: Re: [RFC PATCH] sunrpc: do not allow process to freeze within RPC state machine
Message-ID: <20160803173648.GA10543@uranus>
References: <20160803165412.22407.47399.stgit@localhost.localdomain>
In-Reply-To: <20160803165412.22407.47399.stgit@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 03, 2016 at 08:54:50PM +0400, Stanislav Kinsburskiy wrote:
> Otherwise freezer cgroup state might never become "FROZEN".
>
> Here is a deadlock scheme for 2 processes in one freezer cgroup, which is
> freezing:
>
> CPU 0                                   CPU 1
> --------                                --------
> do_last
> inode_lock(dir->d_inode)
> vfs_create
> nfs_create
> ...
> __rpc_execute
> rpc_wait_bit_killable
> __refrigerator
>                                         do_last
>                                         inode_lock(dir->d_inode)
>
> So, the problem is that one process takes directory inode mutex, executes
> creation request and goes to refrigerator.
> Another one waits till directory lock is released, remains "thawed" and thus
> freezer cgroup state never becomes "FROZEN".
>
> Notes:
> 1) Interesting, that this is not a pure deadlock: one can thaw cgroup and then
> freeze it again.
> 2) The issue was introduced by commit d310310cbff18ec385c6ab4d58f33b100192a96a.
> 3) This patch is not aimed to fix the issue, but to show the problem root.
> Looks like this problem might be applicable to other hunks from the commit,
> mentioned above.
>
> Signed-off-by: Stanislav Kinsburskiy

I think it's worth adding a backtrace as well

---
=== pid: 708987 === (file_read)
[] __refrigerator+0x5b/0x190
[] rpc_wait_bit_killable+0x66/0x80 [sunrpc]
[] __rpc_execute+0x154/0x420 [sunrpc]
[] rpc_execute+0x5e/0xa0 [sunrpc]
[] rpc_run_task+0x70/0x90 [sunrpc]
[] rpc_call_sync+0x50/0xc0 [sunrpc]
[] nfs3_rpc_wrapper.constprop.10+0x6b/0xb0 [nfsv3]
[] nfs3_proc_setattr+0xbf/0x140 [nfsv3]
[] nfs3_proc_create+0x1a3/0x220 [nfsv3]
[] nfs_create+0x83/0x150 [nfs]
[] vfs_create+0x8c/0x110
[] do_last+0xc0d/0x11d0
[] path_openat+0xc2/0x460
[] do_filp_open+0x4b/0xb0
[] do_sys_open+0xf3/0x1f0
[] SyS_open+0x1e/0x20
[] system_call_fastpath+0x16/0x1b
[] 0xffffffffffffffff

=== pid: 708988 === (file_read)
[] do_last+0x283/0x11d0
[] path_openat+0xc2/0x460
[] do_filp_open+0x4b/0xb0
[] do_sys_open+0xf3/0x1f0
[] SyS_open+0x1e/0x20
[] system_call_fastpath+0x16/0x1b
[] 0xffffffffffffffff
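
For readers who want the scheme above in executable form, here is a toy
model of it. It is only a sketch and not kernel code: Task, cgroup_state
and simulate are hypothetical names made up for illustration, and the only
rule it assumes is that the cgroup reports "FROZEN" once every task in it
is frozen, while a task sleeping on the directory lock cannot enter the
refrigerator.

```python
from dataclasses import dataclass


@dataclass
class Task:
    pid: int
    state: str = "running"  # "running", "frozen" or "blocked_on_lock"


def cgroup_state(tasks):
    """Toy rule: FROZEN only when every task in the cgroup is frozen."""
    return "FROZEN" if all(t.state == "frozen" for t in tasks) else "FREEZING"


def simulate(freeze_inside_rpc_wait):
    """Replay the two-process scheme while the cgroup is being frozen."""
    a, b = Task(708987), Task(708988)

    # Task A: do_last -> inode_lock(dir->d_inode) -> ... -> __rpc_execute
    lock_owner = a.pid
    if freeze_inside_rpc_wait:
        # A enters the refrigerator from the RPC wait, still holding the lock.
        a.state = "frozen"
    else:
        # A finishes the RPC and drops the lock before it freezes.
        lock_owner = None
        a.state = "frozen"

    # Task B: do_last -> inode_lock(dir->d_inode)
    if lock_owner is not None:
        # B sleeps on the lock and never reaches a point where it can freeze.
        b.state = "blocked_on_lock"
    else:
        # The lock is free, so B takes it, releases it and freezes normally.
        b.state = "frozen"

    return cgroup_state([a, b])


if __name__ == "__main__":
    print("freeze inside RPC wait:", simulate(True))
    print("freeze after lock release:", simulate(False))
```

With freeze_inside_rpc_wait set, the model stays in "FREEZING" because the
second task is stuck on the lock held by the frozen one, which matches the
two backtraces: pid 708987 sits in __refrigerator and pid 708988 sits in
do_last.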