From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from relay.sgi.com (relay2.corp.sgi.com [137.38.102.29])
	by oss.sgi.com (Postfix) with ESMTP id 36C2B7F5A
	for ; Wed, 2 Dec 2015 18:18:16 -0600 (CST)
Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25])
	by relay2.corp.sgi.com (Postfix) with ESMTP id 2BAE6304053
	for ; Wed, 2 Dec 2015 16:18:12 -0800 (PST)
Received: from app1a.xlhost.de (mailout173.xlhost.de [84.200.252.173])
	by cuda.sgi.com with ESMTP id 2xKQBuwNVrVJrBjA
	for ; Wed, 02 Dec 2015 16:18:07 -0800 (PST)
Message-ID: <565F8A68.9040401@5t9.de>
Date: Thu, 03 Dec 2015 01:18:48 +0100
From: Lutz Vieweg
MIME-Version: 1.0
Subject: Re: automatic testing of cgroup writeback limiting
References: <5652F311.7000406@5t9.de> <20151125213500.GK26718@dastard>
	<565B70F9.8060707@5t9.de> <1711940.cDn6AztRgi@merkaba>
	<20151201163815.GB12922@mtj.duckdns.org>
In-Reply-To: <20151201163815.GB12922@mtj.duckdns.org>
List-Id: XFS Filesystem from SGI
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Errors-To: xfs-bounces@oss.sgi.com
Sender: xfs-bounces@oss.sgi.com
To: Tejun Heo , Martin Steigerwald
Cc: linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com

On 12/01/2015 05:38 PM, Tejun Heo wrote:
> As opposed to pages. cgroup ownership is tracked per inode, not per
> page, so if multiple cgroups write to the same inode at the same time,
> some IOs will be incorrectly attributed.

I can't think of use cases where this could become a problem.
If more than one user/container/VM is allowed to write to the same
file at any one time, isolation is probably absent anyway ;-)

> cgroup ownership is per-inode. IO throttling is per-device, so as
> long as multiple filesystems map to the same device, they fall under
> the same limit.
Good, that's why I thought it useful to include more than one filesystem
on the same device in the test scenario, just to see whether unexpected
issues arise when multiple filesystems utilize the same underlying
device.

>>>> Metadata IO not throttled - it is owned by the filesystem and hence
>>>> root cgroup.
>>>
>>> Ouch. That kind of defeats the purpose of limiting evil processes'
>>> ability to DOS other processes.
>
> cgroup isn't a security mechanism and has to make active tradeoffs
> between isolation and overhead. It doesn't provide protection against
> malicious users and in general it's a pretty bad idea to depend on
> cgroup for protection against hostile entities.

I wrote of "evil" processes for simplicity, but 99 out of 100 times it's
not intentional "evilness" that makes a process exhaust the I/O bandwidth
of some device shared with other users/containers/VMs. Usually it's just
bugs, inconsiderate programming, or inappropriate use that makes one
process write like crazy, making other users/containers/VMs suffer.

Wherever strict service level guarantees are relevant, and applications
require writing to storage, you currently cannot consolidate two or more
applications onto the same physical host, even if they run under separate
users/containers/VMs.

I understand there is no short- or medium-term solution that would allow
isolating processes that write to the same filesystem (because of the
metadata writing). But is it correct to say that at least VMs, which
only write into pre-allocated image files and thus do not allow the
virtual guest to cause extensive metadata writes on the physical host,
can be safely isolated by the new "buffered write accounting"?

If so, we'd have to stay away from user- or container-based isolation of
independently SLA'd applications, but could at least resort to VMs using
image files on a shared filesystem.
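For what it's worth, here is a minimal sketch of how I imagine such a
per-device write limit would be configured through the cgroup v2
interface. This is an assumption-laden illustration, not a tested
recipe: it presumes a unified hierarchy mounted at /sys/fs/cgroup, and
the cgroup name ("tenant1"), device numbers (8:16) and bandwidth value
are made up for the example.

```shell
# Sketch only: assumes cgroup v2 mounted at /sys/fs/cgroup and that the
# shared block device is 8:16 (e.g. /dev/sdb); adjust for your system.
# Must be run as root.

# Enable the io and memory controllers for child cgroups
# (writeback accounting ties the two together)
echo "+io +memory" > /sys/fs/cgroup/cgroup.subtree_control

# Create a cgroup for one tenant/VM
mkdir /sys/fs/cgroup/tenant1

# Cap writes to device 8:16 at 10 MB/s; since throttling is per-device,
# every filesystem residing on that device falls under this one limit
echo "8:16 wbps=10485760" > /sys/fs/cgroup/tenant1/io.max

# Move the workload (here: the current shell) into the cgroup
echo $$ > /sys/fs/cgroup/tenant1/cgroup.procs
```

If that is roughly right, buffered writes issued by processes in
"tenant1" would be throttled once the dirty pages attributed to that
cgroup are written back, which is exactly the behavior the test
scenario is meant to exercise.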
Regards,

Lutz Vieweg

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs