From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:44796)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1SRPLV-0000CT-0U
	for qemu-devel@nongnu.org; Mon, 07 May 2012 10:55:39 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1SRPLR-0003lV-Cn
	for qemu-devel@nongnu.org; Mon, 07 May 2012 10:55:28 -0400
Received: from mail-ob0-f173.google.com ([209.85.214.173]:64673)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1SRPLR-0003k4-6B
	for qemu-devel@nongnu.org; Mon, 07 May 2012 10:55:25 -0400
Received: by obbwd20 with SMTP id wd20so10158785obb.4
	for <qemu-devel@nongnu.org>; Mon, 07 May 2012 07:55:23 -0700 (PDT)
Message-ID: <4FA7E253.30003@codemonkey.ws>
Date: Mon, 07 May 2012 09:55:15 -0500
From: Anthony Liguori <anthony@codemonkey.ws>
MIME-Version: 1.0
References: <4FA429BA.3040006@acm.org> <4FA6788A.8080500@redhat.com>
	<4FA68C1E.3070503@codemonkey.ws> <4FA68D35.7060704@redhat.com>
	<4FA7DCA1.2010804@codemonkey.ws> <4FA7DFC7.4080603@redhat.com>
In-Reply-To: <4FA7DFC7.4080603@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Adding an IPMI BMC device to KVM
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Avi Kivity <avi@redhat.com>
Cc: minyard@acm.org, qemu-devel <qemu-devel@nongnu.org>, kvm@vger.kernel.org, Corey Minyard <tcminyard@gmail.com>

On 05/07/2012 09:44 AM, Avi Kivity wrote:
> On 05/07/2012 05:30 PM, Anthony Liguori wrote:
>> On 05/06/2012 09:39 AM, Avi Kivity wrote:
>>> On 05/06/2012 05:35 PM, Anthony Liguori wrote:
>>>> On 05/06/2012 08:11 AM, Avi Kivity wrote:
>>>> libvirt is essentially the BMC for a virtual guest.  I would suggest
>>>> looking at implementing an IPMI interface to libvirt and exposing it
>>>> to the guest through a USB RNDIS device.
>>>>
>>>
>>> That's the first option.  One unanswered question is what to do when the
>>> guest is down?  Someone should listen for IPMI events, but we can't make
>>> it libvirt unconditionally, since many instances of libvirt are active
>>> at any one time.
>>>
>>> Note the IPMI external interface needs to be migrated, like any other.
>>
>> For all intents and purposes, the BMC/RSA is a separate physical
>> machine.
>
> That's true for any other card on a machine.

The BMC has a separate power source for all intents and purposes.  If you think 
of it in QOM terms, what ultimately connects the nodes together is the "Vcc" pin 
that travels across all devices.  The RTC is a little special because it has a 
battery-backed CMOS/clock, but it's also handled specially.

The BMC does not share Vcc.  It's no different from a separate physical box.  It 
just shares a couple of buses.

>> If you really wanted to model it, you would launch two instances of
>> QEMU.  The BMC instance would have a virtual NIC and would share a USB
>> bus with the slave QEMU instance (probably via USBoIP).  The USB bus
>> is how the BMC exposes IPMI to the guest (via a USB rndis adapter),
>> remote media, etc.  I believe some BMC's also expose IPMI over i2c but
>> that's pretty low bandwidth.
>
> That is one way to do it.  Figure out the interactions between two
> different parts in a machine, define an interface for them to
> communicate, and split them into two processes.  We don't usually do
> that; I believe your motivation is that the two have different power
> domains (but then so do NICs with wake-on-LAN support).

For a wake-on-LAN NIC, the power still comes from the PCI bus.

Think of something like a blade center.  Each individual blade does not have 
its own BMC.  There's a single common BMC that provides an IPMI interface for 
all blades.  Yet each blade still sees an IPMI interface via a USB rndis device.

You can rip out the memory, PCI devices, etc. from a box while the power is 
on and the BMC will be unaffected.

>
>> At any rate, you would have some sort of virtual hardware device that
>> essentially spoke QMP to the slave instance.  You could just do
>> virtio-serial and call it a day actually.
>
> Sorry I lost you.  Which is the master and which is the slave?

The BMC is the master; the system being controlled is the slave.
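
As a rough sketch of what "speaking QMP to the slave instance" could look like 
from the BMC side, the snippet below only builds the JSON lines the master would 
write to the slave's QMP socket.  The action names and helper functions are 
made up for illustration; the QMP commands themselves (qmp_capabilities, 
system_powerdown, system_reset, query-status) are existing QMP commands.

```python
import json

# Hypothetical BMC-side action names mapped onto existing QMP commands.
QMP_FOR_ACTION = {
    "soft_shutdown": "system_powerdown",  # ACPI power button press
    "hard_reset": "system_reset",         # reset the slave machine
    "status": "query-status",             # ask whether the slave is running
}


def qmp_message(action):
    """Return the QMP JSON line for one BMC action (illustrative helper)."""
    return json.dumps({"execute": QMP_FOR_ACTION[action]})


def qmp_session(actions):
    """Return the lines the master would send, in order.

    QMP requires capability negotiation before any other command,
    so qmp_capabilities always goes first.
    """
    lines = [json.dumps({"execute": "qmp_capabilities"})]
    lines.extend(qmp_message(action) for action in actions)
    return lines
```

Transport (unix socket, virtio-serial, ...) and reply handling are left out on 
purpose; the point is only that the master/slave relationship is a thin command 
channel.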

>
>> It really boils down to what you are trying to do.  If you want to
>> just get some piece of software working that expects to do IPMI, the
>> easiest thing to do is run IPMI in the host and use a USB rndis
>> interface to interact with it.
>
> That would be most strange.  A remote client connecting to the IPMI
> interface would control the power level of the host, not the guest.

IPMI with a custom backend is what I mean.  That's what I mean by an IPMI -> 
libvirt bridge.  You could build a libvirt client that exposes an IPMI interface 
and, when you issue IPMI commands, translates them to libvirt operations.

This can run as a normal process on the host and then network it to the guest 
via an emulated USB rndis device.  Existing software on the guest shouldn't be 
able to tell the difference as long as it doesn't try to use I2C to talk to the BMC.
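
To make the bridge idea a bit more concrete, here is a minimal sketch of the 
translation step only, assuming the IPMI request has already been decoded.  The 
table and function names are hypothetical.  The method names are chosen to match 
libvirt-python's virDomain (create, destroy, reset, shutdown), but the code only 
relies on the object having those methods, so it is not tied to libvirt.  The 
option values are the low nibble of the data byte of the IPMI Chassis Control 
command.

```python
# IPMI Chassis Control option (low nibble of the request data byte) mapped to
# the domain method(s) to call.  Method names follow libvirt-python's
# virDomain; the mapping itself is an assumption made for this sketch.
CHASSIS_CONTROL = {
    0x0: ("destroy",),           # power down: hard stop
    0x1: ("create",),            # power up: start the defined domain
    0x2: ("destroy", "create"),  # power cycle: stop, then start again
    0x3: ("reset",),             # hard reset
    0x5: ("shutdown",),          # soft shutdown via ACPI
}

COMPLETION_OK = 0x00
COMPLETION_INVALID_DATA = 0xCC  # "invalid data field in request"


def handle_chassis_control(domain, data_byte):
    """Translate one Chassis Control request into domain operations.

    Returns an IPMI completion code.  Options this sketch does not map
    (e.g. 0x4, the diagnostic interrupt) are rejected to keep it short.
    """
    methods = CHASSIS_CONTROL.get(data_byte & 0x0F)
    if methods is None:
        return COMPLETION_INVALID_DATA
    for name in methods:
        getattr(domain, name)()
    return COMPLETION_OK
```

With a real connection, domain would be whatever libvirt hands back for the 
guest being managed; IPMI-over-LAN session handling, authentication and error 
mapping are all omitted here.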

>
>> I don't think there's a tremendous amount of value in QEMU making
>> itself look like an IBM IMM or whatever HP/Dell's equivalent is.  As I
>> said, these stacks are hugely complicated and there are better ways of
>> doing out of band management (like talk to libvirt directly).
>
> I have to agree here.
>

Regards,

Anthony Liguori