From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 3/3] blk-mq: Fix the queue freezing mechanism
To: Tejun Heo
CC: Ming Lei, Jens Axboe, Christoph Hellwig, linux-kernel@vger.kernel.org, Akinobu Mita
From: Bart Van Assche
Date: Thu, 24 Sep 2015 15:54:18 -0700
Message-ID: <56047F1A.1060804@sandisk.com>
In-Reply-To: <20150924181430.GD25415@mtj.duckdns.org>
References: <560323AB.80900@sandisk.com> <56032432.6080006@sandisk.com> <20150924112251.2ec061fd@tom-T450> <56042844.60603@sandisk.com> <20150924165354.GB25415@mtj.duckdns.org> <5604346D.3080000@sandisk.com> <20150924174938.GC25415@mtj.duckdns.org> <56043C5D.3090307@sandisk.com> <20150924181430.GD25415@mtj.duckdns.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/24/2015 11:14 AM, Tejun Heo wrote:
> On Thu, Sep 24, 2015 at 11:09:33AM -0700, Bart Van Assche wrote:
>> On 09/24/2015 10:49 AM, Tejun Heo wrote:
>>> Again, that doesn't happen.
>>
>> In case anyone would be interested, the backtraces for the lockup I had
>> observed are as follows:
>
> If this is happening and it's not caused by a hung in-flight request,
> it's either percpu_ref being buggy or the forementioned kill/reinit
> race screwing it up. percpu_ref_kill() is expected to disable
> tryget_live() in a finite amount of time regardless of concurrent
> tryget tries.

Hello Tejun,

Sorry that I had not yet made this clear, but I agree with the analysis
in your two most recent e-mails. I think I have found the cause of the
loop: for one reason or another the scsi_dh_alua driver was not loaded
automatically. I think that caused the SCSI core to return a retryable
error code, instead of a non-retryable one, for reads and writes sent
over paths in the SCSI ALUA state "standby", and that this in turn
caused the dm-mpath driver to enter an infinite loop. Loading the
scsi_dh_alua driver resolved the infinite loop.

Anyway, thank you for the feedback.

Bart.
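
For illustration only, here is a toy user-space Python model of the failure mode described above. It is not the dm-mpath or SCSI core code: `submit`, `multipath_io`, the `Outcome` values and the `max_attempts` cap are all hypothetical names made up for this sketch, and the behaviour on a standby path is simply assumed to match the diagnosis in this message (retryable error without the device handler, path failure with it).

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    RETRYABLE = "retryable"        # caller resubmits on the same path
    PATH_FAILURE = "path_failure"  # caller fails the path and switches


def submit(path_state, handler_loaded):
    """Toy stand-in for the answer the lower layer gives for I/O on one path."""
    if path_state == "active":
        return Outcome.OK
    # Standby path: assumed behaviour, taken from the diagnosis in this thread.
    return Outcome.PATH_FAILURE if handler_loaded else Outcome.RETRYABLE


def multipath_io(paths, handler_loaded, max_attempts=1000):
    """Return (completed, attempts); attempts are capped so the model terminates."""
    usable = list(paths)
    attempts = 0
    while usable and attempts < max_attempts:
        attempts += 1
        result = submit(usable[0], handler_loaded)
        if result is Outcome.OK:
            return True, attempts
        if result is Outcome.PATH_FAILURE:
            usable.pop(0)  # drop the failed path; the next pass tries another
        # RETRYABLE: nothing changes, so the same standby path is tried again
    return False, attempts
```

With one standby and one active path, the model completes after switching paths when the handler is treated as loaded, and spins on the standby path until the artificial cap when it is not. The real code has no such cap, which is why the loop appeared infinite.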