From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: XSA-152 follow-up: log levels for guest related messages
Date: Fri, 30 Oct 2015 18:47:01 +0000
Message-ID: <5633BB25.4040905@citrix.com>
References: <5632325502000078000AFE4C@prv-mh.provo.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1ZsEhd-0001sF-W2 for xen-devel@lists.xenproject.org; Fri, 30 Oct 2015 18:47:06 +0000
In-Reply-To: <5632325502000078000AFE4C@prv-mh.provo.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich, xen-devel
List-Id: xen-devel@lists.xenproject.org

On 29/10/15 13:51, Jan Beulich wrote:
> All,
>
> in the course of auditing other code in the context of that
> security issue my attention was caught by the various printk()s
> issued by domain_crash() or around and alike. Within the security
> team we discussed this and decided that a few not rate limited
> messages per second (resulting from a possibly auto-restarting
> guest) are not really a security issue, yet I still wanted to bring
> this up for wider discussion: Would any such printk()s better use
> a guest log level, thus making them rate limited by default?

There are a number of actions which end up in a domain crash. Some are from guest actions, but some are from Xen failing to look after state properly.

XenServer has always had domain ratelimiting. If a VM crashes or reboots within 5s of starting, it will be shut down as opposed to rebooting.

xl doesn't have any such behaviour. As such, it is extraordinarily difficult to stop a runaway triple-faulting domain with xl. `xl destroy` doesn't work as the domid changes frequently and domain name appears and disappears, leading to TOCTOU races.
The best approach I found was to locate the xl process originally used to create the domain and kill it, then use `xl destroy` to clean up the remnants.

Rate limiting the domain_crash() printk()s by default runs the risk of hiding information pertinent to debugging a crash. IMO, if "rate of printk()s" is a concern, then the fix is to prevent the guest from wildly restarting, not to hide information.

~Andrew
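
For illustration, the restart limit described above (shut a VM down instead of rebooting it when it dies within 5s of starting) could be sketched roughly as follows. This is only a sketch of the policy, not XenServer or xl code: the names `DomainRun`, `action_on_exit` and `MIN_UPTIME_SECONDS` are hypothetical, and treating exactly 5s of uptime as "long enough" is an assumption.

```python
from dataclasses import dataclass

# Hypothetical threshold, taken from the "within 5s of starting" figure above.
MIN_UPTIME_SECONDS = 5.0


@dataclass
class DomainRun:
    """One lifetime of a domain, from creation to crash/reboot (hypothetical type)."""
    started_at: float  # seconds, from a monotonic clock
    ended_at: float    # seconds, same clock


def action_on_exit(run: DomainRun, min_uptime: float = MIN_UPTIME_SECONDS) -> str:
    """Decide what a toolstack should do when a domain crashes or reboots.

    A domain that died sooner than min_uptime after starting is assumed to be
    in a restart loop, so it is left shut down rather than restarted.
    """
    uptime = run.ended_at - run.started_at
    if uptime < min_uptime:
        return "shutdown"
    return "restart"
```

With a policy of this shape in the toolstack, a triple-faulting guest stops after its first short-lived run, so the crash messages stay in the log without repeating indefinitely.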