From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v7 03/11] task_isolation: support PR_TASK_ISOLATION_STRICT mode
To: Andy Lutomirski
References: <1443453446-7827-1-git-send-email-cmetcalf@ezchip.com>
 <1443453446-7827-4-git-send-email-cmetcalf@ezchip.com>
CC: Gilad Ben Yossef, Steven Rostedt, Ingo Molnar, Peter Zijlstra,
 Andrew Morton, Rik van Riel, Tejun Heo, Frederic Weisbecker,
 Thomas Gleixner, "Paul E. McKenney", Christoph Lameter, Viresh Kumar,
 Catalin Marinas, Will Deacon, "linux-doc@vger.kernel.org", Linux API,
 "linux-kernel@vger.kernel.org"
From: Chris Metcalf
Message-ID: <5609B713.5020709@ezchip.com>
Date: Mon, 28 Sep 2015 17:54:27 -0400
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/28/2015 04:51 PM, Andy Lutomirski wrote:
> On Mon, Sep 28, 2015 at 11:17 AM, Chris Metcalf wrote:
>> With task_isolation mode, the task is in principle guaranteed not to
>> be interrupted by the kernel, but only if it behaves. In particular,
>> if it enters the kernel via system call, page fault, or any of a
>> number of other synchronous traps, it may be unexpectedly exposed
>> to long latencies. Add a simple flag that puts the process into
>> a state where any such kernel entry is fatal; this is defined as
>> happening immediately after the SECCOMP test.
> Why after seccomp? Seccomp is still an entry, and the code would be
> considerably simpler if it were before seccomp.

I could be convinced to do it either way. My initial thinking was
that a security violation was more interesting and more important to
report than a strict-mode task-isolation violation. But see my
comments in response to your email on patch 07/11.

>> @@ -35,8 +36,12 @@ static inline enum ctx_state exception_enter(void)
>>  		return 0;
>>
>>  	prev_ctx = this_cpu_read(context_tracking.state);
>> -	if (prev_ctx != CONTEXT_KERNEL)
>> -		context_tracking_exit(prev_ctx);
>> +	if (prev_ctx != CONTEXT_KERNEL) {
>> +		if (context_tracking_exit(prev_ctx)) {
>> +			if (task_isolation_strict())
>> +				task_isolation_exception();
>> +		}
>> +	}
>>
>>  	return prev_ctx;
>>  }
> x86 does not promise to call this function. In fact, x86 is rather
> likely to stop ever calling this function in the reasonably near
> future.

Yes, in which case we'd have to do it the same way we are doing
it for arm64 (see patch 09/11), by calling task_isolation_exception()
explicitly from within the relevant exception handlers. If we start
doing that, it's probably worth wrapping up the logic into a single
inline function to keep the added code short and sweet.
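As a rough illustration of what such a single inline wrapper might look like, here is a userspace model, not kernel code. The struct, the model_* names, and the from_user parameter are all hypothetical stand-ins; only task_isolation_strict() and task_isolation_exception() come from the patch, and they are modelled here rather than called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only: this struct and its fields are hypothetical
 * stand-ins for the per-task state the patch series keeps in the kernel.
 */
struct model_task {
	bool strict;     /* models task_isolation_strict() returning true */
	int violations;  /* counts calls to the modelled exception hook */
};

/* Models task_isolation_strict() for a given task. */
static bool model_task_isolation_strict(const struct model_task *t)
{
	return t->strict;
}

/* Models task_isolation_exception(); the real one warns and kills. */
static void model_task_isolation_exception(struct model_task *t)
{
	t->violations++;
}

/*
 * Hypothetical wrapper: the one call each exception handler would make.
 * from_user models "this exception interrupted user mode".
 */
static inline void model_isolation_check_exception(struct model_task *t,
						   bool from_user)
{
	assert(t != NULL);
	if (from_user && model_task_isolation_strict(t))
		model_task_isolation_exception(t);
}
```

With a helper of this shape, each handler would add one line instead of repeating the nested checks. A call with from_user false, or on a task without strict mode, leaves the violation count untouched.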
If in fact this might happen in the short term, it might be a good
idea to hook the individual exception handlers in x86 now, and not
hook the exception_enter() mechanism at all.

>> --- a/kernel/context_tracking.c
>> +++ b/kernel/context_tracking.c
>> @@ -144,15 +144,16 @@ NOKPROBE_SYMBOL(context_tracking_user_enter);
>>   * This call supports re-entrancy. This way it can be called from any exception
>>   * handler without needing to know if we came from userspace or not.
>>   */
>> -void context_tracking_exit(enum ctx_state state)
>> +bool context_tracking_exit(enum ctx_state state)
> This needs clear documentation of what the return value means.

Added:

  * Return: if called with state == CONTEXT_USER, the function returns
  * true if we were in fact previously in user mode.

>> +static void kill_task_isolation_strict_task(void)
>> +{
>> +	/* RCU should have been enabled prior to this point. */
>> +	RCU_LOCKDEP_WARN(!rcu_is_watching(), "kernel entry without RCU");
>> +
>> +	dump_stack();
>> +	current->task_isolation_flags &= ~PR_TASK_ISOLATION_ENABLE;
>> +	send_sig(SIGKILL, current, 1);
>> +}
> Wasn't this supposed to be configurable? Or is that something that
> happens later on in the series?

Yup, next patch.

>> +void task_isolation_exception(void)
>> +{
>> +	pr_warn("%s/%d: task_isolation strict mode violated by exception\n",
>> +		current->comm, current->pid);
>> +	kill_task_isolation_strict_task();
>> +}
> Should this say what exception?

I could modify it to take a string argument (and then use it for
the arm64 case at least). For the exception_enter() caller, we actually
don't have the information available to pass down, and it would
be hard to get it.

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
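The string-argument idea for task_isolation_exception() discussed above could be sketched roughly as follows. This is a userspace model that only formats the warning text. The function name, the what parameter, and the fallback to the generic word "exception" when no name is available are assumptions for illustration, not part of the posted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Userspace model: formats the warning that a string-taking variant of
 * task_isolation_exception() might print. The function name, the "what"
 * parameter and the fallback text are all hypothetical.
 */
static int model_format_violation(char *buf, size_t len, const char *comm,
				  int pid, const char *what)
{
	assert(buf != NULL && comm != NULL);

	/*
	 * A caller such as exception_enter() has no exception name to
	 * pass down, so fall back to the generic wording.
	 */
	if (what == NULL || strlen(what) == 0)
		what = "exception";

	return snprintf(buf, len,
			"%s/%d: task_isolation strict mode violated by %s\n",
			comm, pid, what);
}
```

A handler that knows what it is handling could pass a short name such as "page fault", while a generic caller could pass NULL and get the same message text as the current patch.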