From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mike Wray
Subject: Re: still can't start domain
Date: Fri, 09 Jul 2004 18:31:32 +0100
Sender: xen-devel-admin@lists.sourceforge.net
Message-ID: <40EED674.5010006@hpl.hp.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Errors-To: xen-devel-admin@lists.sourceforge.net
To: Keir Fraser
Cc: Avery Pennarun, Mark Williamson, xen-devel@lists.sourceforge.net
List-Id: xen-devel@lists.xenproject.org

Keir Fraser wrote:
>>Clearer error messages than "Error: Cannot start domain" would probably be a
>>good start. Anyone could have figured out in short order that the domain
>>couldn't be started. What's much more important is *why* the domain
>>couldn't be started.

You should find that the error reason is being returned along with
the 'cannot start domain' message now. Up to now I've been letting xend
exceptions go uncaught so I can debug the resulting stacktrace, so it's
only recently that the errors have been caught and sent back to the
client (xm).

> This is absolutely true.
>
>
>>In general, error messages should always say why an operation failed, not
>>just that the operation failed. I find that this works wonders in my own
>>programs. In C programs, for example, always print out 'errno' in any error
>>message where it could possibly be relevant. (Of course, to avoid
>>confusion, definitely don't print out errno where it's not valid anymore :))
>>The errno equivalent in perl is $!. In python I guess you have to use
>>exceptions or something.

Errno is not usually that useful - it's usually something higher-level
that went wrong in most programs.
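To illustrate the point about reporting a higher-level reason rather
than a bare errno, here is a minimal Python sketch. It is not xend code:
DomainError, start_domain and handle_request are hypothetical names, and
the dictionary reply format is an assumption made only for this example.

```python
import os


class DomainError(Exception):
    """Hypothetical higher-level error that carries the reason for a failure."""


def start_domain(image_path):
    # Hypothetical stand-in for the real work. On failure, name the
    # high-level step that went wrong and keep the low-level OS detail.
    try:
        with open(image_path, "rb") as f:
            f.read(1)
    except OSError as e:
        raise DomainError(
            "cannot load kernel image %r: %s" % (image_path, os.strerror(e.errno))
        ) from e
    return "started"


def handle_request(image_path):
    # Server side: catch the exception and send the reason back to the
    # client instead of leaving it in a server-side stack trace.
    try:
        return {"ok": True, "result": start_domain(image_path)}
    except DomainError as e:
        return {"ok": False, "error": "Cannot start domain: %s" % e}
```

Calling handle_request() with a path that does not exist returns a reply
whose "error" field says both that the domain could not be started and
which step failed, which is the information the client needs to print.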
>
> The problem with the new toolset is that some of the context for
> building a useful error message at the client program is separated
> from the client in both space (its in a server, behind a socket) and
> in time (the failure may be due to some random asynchronous request
> made ages ago). A tedious problem but certainly surmountable -- this
> area definitely needs attention.
>
> The crux of the problem is that we need a domain controller capable of
> juggling multiple outstanding operations at the same time. In general
> you do that by using multiple threads or by decoupling requests from
> their eventual responses. Both can have their drawbacks, and
> asynchronous models in particular need great care to avoid runaway
> complexity.
>
> My preferred model would be lightweight language-level threads, but I
> don't know if anything suitable exists for Python. What I've seen of
> Twisted so far hasn't recommended it to me. :-)

The problem is not with Twisted - it's with the way that domains have
to be built and shut down. Twisted is merely a handy framework for
asynchronous programming - basically a big select loop.

Domain create used to be pretty simple - call create in xc with the
domain image, number of nics, disk spec and away you go. It's now much
more complex, involving at least 5 independent parties connected by
message streams: xen, xend, device driver domain(s), actual domain,
client (xm).

Once a domain memory image is created, all devices have to be set up
using messaging from xend to the device driver domain(s) and the new
domain. All this is asynchronous, and not order-preserving, so it's
intrinsically fairly complex. And the messaging is all multiplexed over
a single channel. And it's mixed with the console i/o too.

Xend has to keep lots of state - like all the devices - otherwise the
domain can't even be shut down. And since shutdown is intrinsically
asynchronous this is complicated too.
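The bookkeeping described above (requests whose responses come back in
any order, all on one channel shared with console output) can be
sketched in the event-driven style roughly as follows. This is only an
illustration: the Channel class, the message dictionaries and the "op"
names are hypothetical and do not reflect the actual xend message
format or any Twisted API.

```python
import itertools


class Channel:
    """Hypothetical demultiplexer for one shared message stream."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}   # request id -> callback awaiting the response
        self.console = []    # console output arriving on the same channel

    def request(self, op, on_reply):
        # Tag each request so its response can be matched later,
        # whatever order the responses arrive in.
        req_id = next(self._ids)
        self._pending[req_id] = on_reply
        return {"type": "request", "id": req_id, "op": op}

    def receive(self, msg):
        # Called by the event loop for every incoming message.
        if msg["type"] == "console":
            self.console.append(msg["data"])
        elif msg["type"] == "response":
            callback = self._pending.pop(msg["id"], None)
            if callback is None:
                raise KeyError("unexpected response id %r" % msg["id"])
            callback(msg)

    def outstanding(self):
        # Requests still waiting for a response: state that has to be
        # kept until every device has answered.
        return sorted(self._pending)
```

The pending table is exactly the kind of state that has to be held
between a request and its eventual response. A threaded version would
keep the same information on each thread's stack instead.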
This could be programmed 2 ways: event-driven (asynch), or using threads.
I've done both in my time and each has its advantages and disadvantages.
Neither is easy.

In my view we need to do some work on the state kept by device drivers,
domains, and xen so that things can be simplified and xend doesn't
need to keep so much fragile state and order information.

Mike