From mboxrd@z Thu Jan 1 00:00:00 1970
From: henanwxr
Subject: Re-design the architecture of Xen
Date: Mon, 23 May 2011 04:39:37 -0700 (PDT)
Message-ID: <1306150777633-4418793.post@n5.nabble.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

http://xen.1045712.n5.nabble.com/file/n4418793/6.bmp

We have researched virtualization for several years. With Xen as a reference, we have designed a new VMM architecture called the Cooperative model VMM, and have implemented a prototype system. We present its principle and some of its details here.

Part1 motivation

B. Domain0 problems

Domain0 has several features:
- It runs a modified operating system.
- It runs on the processor at privilege level 1.
- It runs in the form of a virtual machine.
- It is the single system managing the hardware.

These features of Domain0 bring the following issues:

1) tight coupling

From a performance point of view, the coordination mechanisms between Domain0 and the VMM (such as hypercalls, event channels and IO rings) can improve virtualization efficiency. However, they require more modification of the guest operating system, and the VMM needs to provide the corresponding interfaces. The tight coupling formed between Domain0 and the VMM means that the VMM implementation must take the characteristics of a third-party system into account, so its design lacks independence and flexibility.

2) privilege level switch

Domain0 runs on the processor at privilege level 1, so a context switch from the VMM to Domain0 triggers a processor privilege level switch.
If operations of this type are frequent (such as IO request handling for a virtual machine), they cause significant processor overhead and reduce the performance of the virtual machines.

3) overhead of management

Because Domain0 operates as a virtual machine, the VMM must provide it with the usual virtual machine management interfaces (creation, resource allocation, scheduling, destruction, and so on), which results in administrative overhead. Since Domain0 is the main provider of device access and its function is relatively fixed, this administrative overhead should be avoided to reduce the burden on the VMM.

4) scheduling delay

Domain0 and the other virtual machines all take part in VMM scheduling. Because scheduling rotates among them, Domain0 cannot guarantee timely delivery of its services, which causes a number of related issues. First, after the VMM receives an IO request from a virtual machine, Domain0 cannot be notified immediately; only an asynchronous notification, similar to a soft interrupt, can be used, and Domain0 checks for and processes the request the next time it runs. Second, the device model of Domain0 is provided by Qemu, which runs as a process of the guest OS. When Domain0 is not running, Qemu cannot handle IO requests from virtual machines, which delays the processing of those requests. Third, scheduling in the other virtual machines depends on virtual clock interrupts; having Domain0 simulate the virtual clock leads to problems with virtual clock synchronization, virtual machine scheduling, and clock synchronization between virtual cores (currently the implementation of the virtual clock has been migrated from Domain0 to the VMM).

5) IOPM bottleneck

When multiple virtual machines are running, IO requests become quite frequent. Because Domain0 is the only IOPM (IO process machine) in the entire system, all IO requests are handled through Domain0, which forms the IOPM bottleneck.
Furthermore, if the single IOPM fails and cannot be replaced in time by an alternative IOPM, the entire system can only be restarted, which delays or even brings down the services of the virtual machines.

The main cause of the Domain0 problems described above is that the IOPM is virtualized and acts as a subsidiary module of the VMM. Because the essential role of Domain0 is to provide device access services to the VMM, a possible solution is to separate the IOPM thoroughly from the VMM while Domain0 continues to provide those services. This involves four aspects:
- Weakening the coupling between the VMM and Domain0 to increase the independence of the VMM design.
- Reducing VMM interference with Domain0 so that Domain0 can operate independently.
- Establishing interaction between the VMM and Domain0 to ensure that Domain0 provides device access services to the VMM.
- Providing multiple IOPMs to achieve load balancing.

With these considerations, the operating system does not need to be modified much to implement an IOPM, and the IOPM interacts with the VMM through only a small number of interfaces. By controlling hardware resources directly, the IOPM changes from a subsidiary module of the VMM into a cooperating module of the VMM. The cooperative model VMM discussed below implements and verifies the IOPM described above.

Part2 Cooperative model VMM

A. Cooperative model description

With the popularity of multi-core processors and large-capacity memory, the hardware resources of a PC are no longer scarce.
In the 1960s, the IBM S/360 mainframe used a hardware partitioning approach to implement virtualization, which provides a useful inspiration for virtualization on the current PC platform.

To address the problems of a virtualized IOPM that is tightly coupled with the VMM in the Hybrid model, hardware division can be used to let the IOPM control a part of the hardware resources directly. The IOPM thus changes from a virtual machine into a privileged machine, forming a structure in which the IOPM and the VMM cooperate. The main control system consists of two parts: the VMM, which implements processor and memory virtualization, and the IOPM, which controls peripherals and provides the device model. More than one IOPM can exist; each IOPM controls one AP (application processor), while the VMM controls the BSP (bootstrap processor) and the rest of the APs, as shown in Fig 5. The cooperative model has the following characteristics:
- Elimination of tight coupling between the VMM and the IOPM, which interact through only a handful of interfaces.
- Independence of the IOPM from VMM monitoring and scheduling.
- Multiple IOPMs running in parallel for load balancing and failure replacement.

http://xen.1045712.n5.nabble.com/file/n4418793/1.bmp
Figure 5. Structure of cooperative VMM

B. Interrupt handling

1) IOPM controls right of interrupt reception

Assume that device interrupts are delivered directly to the IOPM. It then looks as if the device access path of the IOPM is shortened, as shown in Fig 6.

http://xen.1045712.n5.nabble.com/file/n4418793/2.bmp
Figure 6. IOPM controls right of interrupt reception

In this way, the IOPM holds the rights of both external interrupt reception and interrupt processing. However, consider the following three situations:
- The IOPM contains a large number of device drivers, whose stability affects the security of the IOPM and of the whole system. Suppose that the IOPM fails because of a device driver failure: the corresponding device interrupts can no longer be answered, so virtual machine IO requests cannot be processed.
- In some cases, a small number of special device drivers need to be integrated into the VMM, so that IO requests can be handled within the VMM without being delivered to the IOPM. This improves the efficiency of device access, for example for devices with a high interrupt frequency (clock, network card, etc.).
- To enhance the stability of the whole system, drivers should be distributed across multiple IOPMs to prevent a single IOPM failure from bringing down the entire system. In this case, the VMM needs to control the right of interrupt reception and deliver the interrupt to the other IOPMs.

The above analysis shows that letting the IOPM control the right of interrupt reception is a serious problem. Interrupt reception and interrupt handling need to be separated: the VMM receives interrupts, while the IOPM handles them. Having the VMM control the right of interrupt reception achieves device control at minimal expense.

2) VMM controls right of interrupt reception

To solve these problems, interrupt handling can be improved as follows: an external interrupt is first delivered to the VMM, which provides an interrupt routing function. Depending on the actual circumstances, the VMM either handles the interrupt directly or routes it to an appropriate IOPM, as shown in Fig 7.

http://xen.1045712.n5.nabble.com/file/n4418793/3.bmp
Figure 7. VMM controls right of interrupt reception

The improved VMM has the following characteristics in device processing:
- Interrupts are received and routed by the VMM, which improves the flexibility of interrupt handling.
- The VMM directly integrates some key device drivers to shorten the device access path.
- Device drivers are distributed across multiple IOPMs to achieve load balancing and failure replacement.

Part3 Model implementation

Implementing the cooperative VMM requires a division of hardware resources that eliminates conflicts over hardware control between the VMM and the IOPM. On this basis, an appropriate operating system is selected and transformed into an IOPM. Currently, the implementation of this model is based on a dual-processor platform with Intel VT-x, and the IOPM is based on Linux.

A. Hardware division

The hardware division between the IOPM and the VMM is shown in Table 1.

TABLE 1. HARDWARE DIVISION BETWEEN IOPM AND VMM
http://xen.1045712.n5.nabble.com/file/n4418793/4.bmp

1) Processor

The IOPM controls a single processor and cannot perform multi-processor-related operations. The BSP must run first after the machine starts and is controlled by the VMM; the VMM can then start an AP and run the IOPM on it at an appropriate time, so that the VMM and the IOPM run in parallel.

2) Memory

Physical memory is partitioned between the VMM and the IOPM, but the two can exchange data through shared memory.

3) IOAPIC

External interrupts must first be delivered to the BSP, on which the VMM runs; the VMM decides how each interrupt is handled.

4) Clock

Both the VMM and the IOPM need to schedule their internal programs. Since scheduling depends on clock interrupts, clock interrupts need to be delivered to the VMM and the IOPM at the same time.

5) IO Device

IO devices are controlled by the IOPM. IO requests from a virtual machine are submitted to the IOPM through the VMM, and the device is accessed with the help of the IOPM's device driver.

B. IOPM Implementation

Implementation of the IOPM involves four aspects:

1) Boot IOPM

Traditionally, Linux is loaded by a boot loader such as grub. The Linux kernel code is divided into two parts: real mode code and protected mode code. According to the Linux boot protocol, the boot loader must copy the real mode code to a location below 1M, and it parses the kernel header information in order to copy the protected mode code to the specified location. The boot loader then jumps to the real mode code, and the operating system takes control of the machine.

Booting the IOPM from the VMM needs to simulate this flow. The Linux real mode code is copied to a free area below 1M. Traditionally, the protected mode code is located at 1M, which is already occupied by the VMM, so the protected mode code is copied to another safe area. After laying out the IOPM code, the VMM boots the AP; the AP must be switched to real mode before it executes the IOPM, and it then jumps to the starting address of the real mode code. The flow is shown in Fig 8.

http://xen.1045712.n5.nabble.com/file/n4418793/5.bmp
Figure 8. Flow of booting IOPM

2) Physical memory isolation

To achieve spatial address isolation and data exchange between the VMM and the IOPM, the entire physical memory is divided into three parts: the VMM management zone, the IOPM management zone, and the shared zone. The management zones take part in dynamic allocation and reclamation by their memory managers; the shared zone can only be accessed and does not take part in allocation. The division of physical memory and its properties are shown in Fig 9.

http://xen.1045712.n5.nabble.com/file/n4418793/6.bmp
Figure 9. Division of physical memory and its properties

3) Communications between VMM and IOPM

The VMM and the IOPM generally communicate under two conditions. First, IO requests issued by a virtual machine are captured by the VMM and submitted to the IOPM, and the IOPM then returns the processing results to the VMM.
Second, the user issues requests to the VMM through a user interface provided by the IOPM in order to carry out virtual machine operations. The communication mechanism is built on IPIs and shared memory: IPIs are used for message notification between the IOPM and the VMM, and shared memory is used for temporary storage of the data exchanged.

4) Shared memory

Shared memory is used for temporary storage of the data exchanged between the VMM and the IOPM. To prevent buffer overflow, the shared memory must be organized. It is divided into four parts: the VMM control area, the IOPM control area, the VMM data area, and the IOPM data area. The public control pointers stored in the control areas are used to operate on the data packets in the data areas. Each data area is organized as a ring: the VMM data area temporarily stores data packets sent from the VMM to the IOPM, and the IOPM data area temporarily stores data packets sent from the IOPM to the VMM.

Others …..
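
The four-part shared memory organization described under "Shared memory" above can be illustrated with a small model. This is only a sketch under stated assumptions, not the prototype's code: the slot size, slot count, index width, the names `Ring` and `make_shared_zone`, and the choice to keep both indices of a ring in the sender's control area are all hypothetical. A real VMM/IOPM pair would also need memory barriers and an IPI to notify the peer, which this single-threaded Python model leaves out.

```python
import struct

# All sizes below are hypothetical, chosen only for illustration.
SLOT_SIZE = 64           # bytes per packet slot (2-byte length + payload)
NUM_SLOTS = 8            # slots per ring; a power of two so index wrap is safe
CTRL_SIZE = 8            # control area: two 32-bit indices (write, read)
MASK = 0xFFFFFFFF        # indices are free-running 32-bit counters


class Ring:
    """One traffic direction: a control area plus a ring-shaped data area."""

    def __init__(self, mem, ctrl_off, data_off):
        self.mem = mem              # the shared zone, modelled as a bytearray
        self.ctrl_off = ctrl_off    # offset of this ring's control pointers
        self.data_off = data_off    # offset of this ring's packet slots

    def _indices(self):
        return struct.unpack_from("<II", self.mem, self.ctrl_off)

    def _set_indices(self, write_idx, read_idx):
        struct.pack_into("<II", self.mem, self.ctrl_off, write_idx, read_idx)

    def put(self, payload):
        """Sender side: store one packet, or return False if the ring is full."""
        if len(payload) > SLOT_SIZE - 2:
            raise ValueError("payload does not fit in one slot")
        write_idx, read_idx = self._indices()
        if (write_idx - read_idx) & MASK == NUM_SLOTS:
            return False            # full: refuse rather than overwrite
        slot = self.data_off + (write_idx % NUM_SLOTS) * SLOT_SIZE
        struct.pack_into("<H", self.mem, slot, len(payload))
        self.mem[slot + 2:slot + 2 + len(payload)] = payload
        self._set_indices((write_idx + 1) & MASK, read_idx)
        return True

    def get(self):
        """Receiver side: return the oldest packet, or None if the ring is empty."""
        write_idx, read_idx = self._indices()
        if write_idx == read_idx:
            return None
        slot = self.data_off + (read_idx % NUM_SLOTS) * SLOT_SIZE
        (length,) = struct.unpack_from("<H", self.mem, slot)
        payload = bytes(self.mem[slot + 2:slot + 2 + length])
        self._set_indices(write_idx, (read_idx + 1) & MASK)
        return payload


def make_shared_zone():
    """Lay out: [VMM control | IOPM control | VMM data ring | IOPM data ring]."""
    data_size = SLOT_SIZE * NUM_SLOTS
    mem = bytearray(2 * CTRL_SIZE + 2 * data_size)
    vmm_to_iopm = Ring(mem, 0, 2 * CTRL_SIZE)
    iopm_to_vmm = Ring(mem, CTRL_SIZE, 2 * CTRL_SIZE + data_size)
    return mem, vmm_to_iopm, iopm_to_vmm


# Usage: the VMM queues an IO request, the IOPM picks it up and replies.
mem, vmm_to_iopm, iopm_to_vmm = make_shared_zone()
vmm_to_iopm.put(b"io-request")
request = vmm_to_iopm.get()
iopm_to_vmm.put(b"done:" + request)
print(iopm_to_vmm.get())
```

The full check in `put` is what keeps a fast sender from overwriting packets the receiver has not consumed yet, which is the overflow the organization of shared memory is meant to prevent. Giving each direction its own ring means each ring has exactly one writer and one reader.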