From mboxrd@z Thu Jan  1 00:00:00 1970
From: Corey Minyard <tcminyard@gmail.com>
Subject: Re: [Qemu-devel] Adding an IPMI BMC device to KVM
Date: Mon, 07 May 2012 15:47:02 -0500
Message-ID: <4FA834C6.8030502@acm.org>
References: <4FA429BA.3040006@acm.org> <4FA6788A.8080500@redhat.com> <4FA68C1E.3070503@codemonkey.ws> <4FA68D35.7060704@redhat.com> <4FA7DCA1.2010804@codemonkey.ws> <4FA7DFC7.4080603@redhat.com> <4FA7E253.30003@codemonkey.ws> <4FA7E61D.6000702@redhat.com> <4FA7E860.8010207@codemonkey.ws> <4FA80F71.30209@acm.org> <20120507194514.GK2437@redhat.com>
Reply-To: minyard@acm.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Anthony Liguori <anthony@codemonkey.ws>,
	Corey Minyard <tcminyard@gmail.com>,
	Avi Kivity <avi@redhat.com>, kvm@vger.kernel.org,
	qemu-devel <qemu-devel@nongnu.org>
To: Dave Allan <dallan@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mail-yw0-f46.google.com ([209.85.213.46]:54835 "EHLO
	mail-yw0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932146Ab2EGUrF (ORCPT <rfc822;kvm@vger.kernel.org>);
	Mon, 7 May 2012 16:47:05 -0400
Received: by yhmm54 with SMTP id m54so4606514yhm.19
        for <kvm@vger.kernel.org>; Mon, 07 May 2012 13:47:04 -0700 (PDT)
In-Reply-To: <20120507194514.GK2437@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On 05/07/2012 02:45 PM, Dave Allan wrote:
> FWIW, the idea of an IPMI interface to VMs was proposed for libvirt
> not too long ago.  See:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=815136

Well, it wouldn't be too hard to do.  I already have working emulation 
code that does the IPMI LAN interface (including the IPMI 2.0 stuff for 
more reasonable security).  I have a KCS interface and a minimal IPMI 
controller working in KVM, though I'm not quite sure of the best final 
way to hook it in.

Configuration is going to be the hardest part, but a minimal 
configuration for providing basic management would be easy.
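
To give a feel for how small a "minimal IPMI controller" can be, here is 
a rough Python sketch of the message handling only.  It is not the 
emulation code mentioned above.  The NetFn, command and completion code 
values are the ones from the IPMI spec; the class name, the handler 
names and the action strings are made up for illustration, and the 
watchdog handling is simplified (a real BMC rejects a reset of a 
watchdog that was never set up).

```python
# Illustrative sketch only: a tiny BMC that answers IPMI system-interface
# messages. Class, method and action names are hypothetical.

NETFN_CHASSIS = 0x00
NETFN_APP = 0x06

CMD_CHASSIS_CONTROL = 0x02   # NetFn Chassis
CMD_RESET_WATCHDOG = 0x22    # NetFn App

CC_OK = 0x00
CC_INVALID_COMMAND = 0xC1
CC_REQ_DATA_LEN_INVALID = 0xC7
CC_INVALID_DATA_FIELD = 0xCC

# Low nibble of the Chassis Control data byte -> action name (names made up).
CHASSIS_ACTIONS = {
    0x00: "power_down",
    0x01: "power_up",
    0x02: "power_cycle",
    0x03: "hard_reset",
}


class MinimalBMC:
    def __init__(self):
        self.watchdog_resets = 0
        self.actions = []  # a real device model would act on the VM here
        self._handlers = {
            (NETFN_APP, CMD_RESET_WATCHDOG): self._reset_watchdog,
            (NETFN_CHASSIS, CMD_CHASSIS_CONTROL): self._chassis_control,
        }

    def handle(self, request: bytes) -> bytes:
        """Take [NetFn<<2 | LUN, cmd, data...] and build the response."""
        if len(request) < 2:
            raise ValueError("request needs NetFn/LUN and command bytes")
        netfn, lun = request[0] >> 2, request[0] & 0x03
        cmd, data = request[1], request[2:]
        handler = self._handlers.get((netfn, cmd))
        if handler is None:
            cc, payload = CC_INVALID_COMMAND, b""
        else:
            cc, payload = handler(data)
        # The response NetFn is the request NetFn with the low bit set.
        return bytes([((netfn | 1) << 2) | lun, cmd, cc]) + payload

    def _reset_watchdog(self, data: bytes):
        # Simplified: just count resets instead of restarting a timer.
        self.watchdog_resets += 1
        return CC_OK, b""

    def _chassis_control(self, data: bytes):
        if len(data) < 1:
            return CC_REQ_DATA_LEN_INVALID, b""
        action = CHASSIS_ACTIONS.get(data[0] & 0x0F)
        if action is None:
            return CC_INVALID_DATA_FIELD, b""
        self.actions.append(action)
        return CC_OK, b""
```

Feeding it bytes([0x00, 0x02, 0x03]) (Chassis Control, hard reset) 
records "hard_reset" and returns a response with completion code 0x00.  
The interface in front of it (KCS, BT, ...) only has to move those 
bytes back and forth, which is why the two can be kept separate.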

-corey

> Dave
>
> On Mon, May 07, 2012 at 01:07:45PM -0500, Corey Minyard wrote:
>> I think we are getting a little out of hand here, and we are mixing
>> up concepts :).
>>
>> There are lots of things IPMI *can* do (including serial access, VGA
>> snooping, LAN access, etc.) but I don't see any value in that.  The
>> main thing here is to emulate the interface to the guest.  OOB
>> management is really more appropriately handled with libvirt.  How
>> the BMC integrates into the hardware varies a *lot* between systems,
>> but it's really kind of irrelevant.  (Well, almost irrelevant, BMCs
>> can provide a direct I2C messaging capability, and that may matter.)
>>
>> A guest can have one (or more) of a number of interfaces (that are
>> all fairly bad, unfortunately).  The standard ones are called "KCS",
>> "BT" and "SMIC" and they generally are directly on the ISA bus, but
>> are in memory on non-x86 boxes (and on some x86 boxes) and sometimes
>> on the PCI bus.  Some systems also have interfaces over I2C, but
>> that hasn't really caught on.  Others have interfaces over serial
>> ports, and that unfortunately has caught on in the ATCA world.  And
>> there are at least 3 different basic types of serial port interfaces
>> with sub-variants :(.  I'm not sure what the USB rndis device is,
>> but I'll look at it.  But there is no IPMI over USB.
>>
>> The big things a guest can do are sensor management, watchdog timer,
>> reset, and power control.  In complicated IPMI-based systems like
>> ATCA, a guest may want to send messages through its local IPMI
>> controller to other guests' IPMI controllers or to a central BMC
>> that runs an entire chassis of systems.  So that may need to be
>> supported, depending on what people want to do and how hard they
>> want to work on it.
>>
>> My proposal is to start small, with just a local interface, watchdog
>> timer, sensors and power control.  But have an architecture that
>> would allow external LAN access, tying BMCs in different qemu
>> instances together, perhaps serial over IPMI, and other things of
>> that nature.
>>
>> -corey
>>
>>
>> On 05/07/2012 10:21 AM, Anthony Liguori wrote:
>>> On 05/07/2012 10:11 AM, Avi Kivity wrote:
>>>> On 05/07/2012 05:55 PM, Anthony Liguori wrote:
>>>>>>> For all intents and purposes, the BMC/RSA is a separate physical
>>>>>>> machine.
>>>>>> That's true for any other card on a machine.
>>>>>
>>>>> It has a separate power source for all intents and purposes.  If you
>>>>> think of it in QOM terms, what connects the nodes together ultimately
>>>>> is the "Vcc" pin that travels across all devices.  The RTC is a little
>>>>> special because it has a battery backed CMOS/clock but it's also
>>>>> handled specially.
>>>> And we fail to emulate it correctly as well, wrt. alarms.
>>>>
>>>>> The BMC does not share Vcc.  It's no different than a separate
>>>>> physical box.  It just shares a couple buses.
>>>> It controls the main power plane, reset line, can read VGA and emulate
>>>> keyboard, seems pretty well integrated.
>>> Emulating the keyboard is done through USB.  How the VGA thing
>>> works is very vendor dependent.  The VGA snooping can happen as
>>> part of the display path (essentially connected via a VGA cable)
>>> or it can be a side-band using a special graphics adapter.  I
>>> think QEMU VNC emulation is a pretty good analogy actually.
>>>
>>>>>> That is one way to do it.  Figure out the interactions between two
>>>>>> different parts in a machine, define an interface for them to
>>>>>> communicate, and split them into two processes.  We don't usually do
>>>>>> that; I believe your motivation is that the two have different power
>>>>>> domains (but then so do NICs with wake-on-LAN support).
>>>>> The power still comes from the PCI bus.
>>>> Maybe.  But it's on when the rest of the machine is off.  So Vcc is not
>>>> shared.
>>> That's all plumbed through the PCI bus FWIW.
>>>
>>>>> Think of something like a blade center.  Each individual blade does
>>>>> not have its own BMC.  There's a single common BMC that provides an
>>>>> IPMI interface for all blades.  Yet each blade still sees an IPMI
>>>>> interface via a USB rndis device.
>>>>>
>>>>> You can rip out the memory, PCI devices, etc. from a box while the
>>>>> Power is in and the BMC will be unaffected.
>>>>>
>>>>>>> At any rate, you would have some sort of virtual hardware device that
>>>>>>> essentially spoke QMP to the slave instance.  You could just do
>>>>>>> virtio-serial and call it a day actually.
>>>>>> Sorry I lost you.  Which is the master and which is the slave?
>>>>> The BMC is the master, system being controlled is the slave.
>>>> Ah okay.  It also has to read the VGA output (say via vnc) and supply
>>>> keyboard input (via sendkey).
>>> Right, QMP + VNC is a pretty accurate analogy.
>>>
>>>>>>> It really boils down to what you are trying to do.  If you want to
>>>>>>> just get some piece of software working that expects to do IPMI, the
>>>>>>> easiest thing to do is run IPMI in the host and use a USB rndis
>>>>>>> interface to interact with it.
>>>>>> That would be most strange.  A remote client connecting to the IPMI
>>>>>> interface would control the power level of the host, not the guest.
>>>>> IPMI with a custom backend is what I mean.  That's what I mean by an
>>>>> IPMI -> libvirt bridge.  You could build a libvirt client that exposes
>>>>> an IPMI interface and when you issue IPMI commands, it translates them
>>>>> to libvirt operations.
>>>>>
>>>>> This can run as a normal process on the host and then network it to
>>>>> the guest via an emulated USB rndis device.  Existing software on the
>>>>> guest shouldn't be able to tell the difference as long as it doesn't
>>>>> try to use I2C to talk to the BMC.
>>>> I still like the single process solution, it is more in line with the
>>>> rest of qemu and handles live migration better.
>>> Two QEMU processes could be migrated in unison if you really
>>> wanted to support that...
>>>
>>> With qemu-system-mips/sh4 you could probably even run the real BMC
>>> software stack if you were so inclined :-)
>>>
>>>> But even better would
>>>> be not to do this at all, and satisfy the remote management requirements
>>>> using the existing tools.
>>> Right.
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>>