From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alberto Treviño
Subject: Re: Avoiding I/O bottlenecks between VM's
Date: Fri, 19 Sep 2008 12:53:30 -0600
Message-ID: <200809191253.30228.alberto@byu.edu>
References: <200809191126.09889.alberto@byu.edu> <20080919184146.GA12928@dmt.cnet>
In-Reply-To: <20080919184146.GA12928@dmt.cnet>
To: kvm@vger.kernel.org

On Friday 19 September 2008 12:41:46 pm you wrote:
> Are you using filesystem backed storage for the guest images or direct
> block device storage? I assume there's heavy write activity on the
> guests when these hangs happen?

Yes, they happen when one VM is doing heavy writes. I'm actually using a
whole stack of things:

OCFS2 on DRBD (Primary-Primary) on an LVM volume (contiguous) on a
LUKS-encrypted partition. Fun debugging that, heh?

In trying to figure out the problem, I tried to reconfigure DRBD to use
Protocol B instead of C. However, it failed to make the switch and both
nodes disconnected, so now I have a split brain. In trying to fix the split
brain I'm taking down all the VM's on one node one by one, copying the VM
drives from one node to the other, and starting them up on the other node
(old-fashioned migration). Yes, I'm having *lots* of fun! Perfect way to
end the week!

So, any ideas on how to solve the bottleneck? Isn't the CFQ scheduler
supposed to grant every process the same amount of I/O? Is there a way to
change something in /proc to avoid this situation?

-- 
Alberto Treviño
BYU Testing Center
Brigham Young University
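[Editor's note: the tuning the question gestures at lives in sysfs rather than /proc on modern kernels. A minimal sketch of checking which elevator a block device is using; the device name `sda`, the sample scheduler string, and the `<qemu_pid>` placeholder are assumptions for illustration, not from the thread.]

```shell
# The active elevator is the bracketed entry when you read the sysfs file, e.g.:
#   $ cat /sys/block/sda/queue/scheduler
#   noop anticipatory deadline [cfq]
# We hard-code a sample line here so the parsing is self-contained.
sched_line="noop anticipatory deadline [cfq]"

# Extract the bracketed (active) scheduler name.
active=$(printf '%s\n' "$sched_line" | sed 's/.*\[\([^]]*\)\].*/\1/')
echo "active scheduler: $active"

# On the real host you would switch the elevator like this (as root):
#   echo deadline > /sys/block/sda/queue/scheduler
# and, under CFQ, deprioritize one guest's qemu process with ionice
# (<qemu_pid> is a hypothetical placeholder):
#   ionice -c3 -p <qemu_pid>    # class 3 = idle: only gets disk time when idle
```

Note that CFQ's fairness is per process, so a single qemu process doing all of a guest's I/O competes as just one process regardless of how many guests it is starving.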