From mboxrd@z Thu Jan 1 00:00:00 1970
From: Joe Lawrence
To: Thomas Gleixner
CC: Borislav Petkov, LKML, Ingo Molnar, Peter Anvin, Jiang Liu, Jeremiah Mahler, Guenter Roeck
Subject: Re: [patch 00/14] x86/irq: Plug various vector cleanup races
Date: Mon, 18 Jan 2016 10:00:49 -0500
Message-ID: <569CFE21.9010104@stratus.com>
In-Reply-To: <569AB81D.9090904@stratus.com>
References: <20151231155849.772553760@linutronix.de> <568A9157.9070402@stratus.com> <20160114103326.GG8496@pd.tnic> <569AB81D.9090904@stratus.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/16/2016 04:37 PM, Joe Lawrence wrote:
> On 01/14/2016 05:33 AM, Borislav Petkov wrote:
>> On Thu, Jan 14, 2016 at 09:24:35AM +0100, Thomas Gleixner wrote:
>>> On Mon, 4 Jan 2016, Joe Lawrence wrote:
>>>> No issues running the same PCI device removal and stress tests against
>>>> the patchset.
>>>
>>> Thanks for testing!
>>>
>>> Though there is yet another long standing bug in that area. Fix below.
>>>
>>> Thanks,
>>>
>>> tglx
>>>
>>> 8<--------------------
>>>
> [ ... snip ... ]
>>
>> s/d//
>>
>> With those micro-changes:
>>
>> Tested-by: Borislav Petkov
>>
>> :-)
>
> Tests still running ok here (with same micro-change as Borislav).

Hi Thomas,

When logging in this morning and looking at the box running the 14
patches + additional patch, I see it hit a hung task timeout in xhci USB
code about 39 hours in. Stack trace below (looks to be waiting on a
completion that never comes).

I didn't see this when running only the *initial* 14 patches. Of
course, before these irq cleanup fixes my tests never ran this long :)
So it may or may not be related to the patchset, I'm still poking around
the generated vmcore. Let me know if there is anything you might be
interested in looking at from the wreckage.

-- Joe

INFO: task kworker/0:1:1506 blocked for more than 120 seconds.
      Tainted: P OE 4.3.0sra12+ #50
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/0:1 D 0000000000000000 0 1506 2 0x00000080
Workqueue: usb_hub_wq hub_event
 ffff8801e46dba58 0000000000000046 ffff8810375dac00 ffff881038430000
 ffff8801e46dc000 ffff88025ac20440 ffff88025ac20438 ffff881038430000
 0000000000000000 ffff8801e46dba70 ffffffff81659893 7fffffffffffffff
Call Trace:
 [] schedule+0x33/0x80
 [] schedule_timeout+0x200/0x2a0
 [] ? internal_add_timer+0x71/0xb0
 [] ? mod_timer+0x114/0x210
 [] wait_for_completion+0xf1/0x130
 [] ? wake_up_q+0x70/0x70
 [] xhci_discover_or_reset_device+0x1e1/0x540
 [] hub_port_reset+0x3c8/0x590
 [] hub_port_init+0x525/0xb00
 [] hub_port_connect+0x328/0x940
 [] hub_event+0x63c/0xb00
 [] process_one_work+0x14c/0x3c0
 [] worker_thread+0x114/0x470
 [] ? __schedule+0x2af/0x8b0
 [] ? rescuer_thread+0x310/0x310
 [] kthread+0xd8/0xf0
 [] ? kthread_park+0x60/0x60
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_park+0x60/0x60
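
For readers less familiar with the pattern in the trace: the worker is parked in wait_for_completion(), which has no timeout, so it sleeps until some other context signals the completion. The sketch below is only a userspace analogy in Python, not the xhci code; the names Completion and reset_device are hypothetical and exist purely for illustration. It shows the two outcomes: a waiter that is woken because the signal arrives, and one that would block indefinitely if it were not given a timeout.

```python
import threading


class Completion:
    """Userspace stand-in for a kernel-style completion (illustrative only)."""

    def __init__(self):
        self._done = threading.Event()

    def complete(self):
        # Wake the waiter. In the report above, the equivalent signal
        # apparently never arrived.
        self._done.set()

    def wait(self, timeout=None):
        # timeout=None blocks indefinitely, like an unbounded wait.
        # With a timeout, this returns False if nothing signalled in time.
        return self._done.wait(timeout)


def reset_device(completion, timeout):
    """Hypothetical caller: a command was issued, now wait for its completion."""
    if not completion.wait(timeout):
        return "timed out"
    return "completed"


# Case 1: the completing side runs shortly afterwards, so the waiter returns.
signalled = Completion()
threading.Timer(0.05, signalled.complete).start()
result_signalled = reset_device(signalled, timeout=2.0)

# Case 2: nobody ever calls complete(); only the timeout ends the wait.
lost = Completion()
result_lost = reset_device(lost, timeout=0.1)

print(result_signalled, result_lost)
```

With timeout=None in the second case, the call would never return, which is the userspace equivalent of the blocked kworker that the hung task detector reported after 120 seconds.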