qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Linux CPU accounting with kqemu
@ 2008-06-06 17:57 Steve Fosdick
  2008-06-06 18:59 ` Laurent Vivier
  0 siblings, 1 reply; 4+ messages in thread
From: Steve Fosdick @ 2008-06-06 17:57 UTC
  To: qemu-devel

I have noticed that the Linux system CPU usage seems remarkably high 
when running Windows XP as a guest under qemu/kqemu, but not when 
running under qemu without kqemu.

Just today I was reading the kqemu API documentation and it occurred to 
me that, from the Linux perspective, the user/system CPU times may be 
backward with kqemu.

Is it perhaps the case that when kqemu has been told to run code in the 
guest VM (KQEMU_EXEC ioctl) and this is happening without any 
exceptions (more likely when the guest is running user-mode code rather 
than OS code), Linux still sees this as running in the kernel and 
records it as "system" time? And that when kqemu returns to qemu 
(because of an exception), the Linux kernel sees this as user CPU time, 
when in all likelihood the guest OS is now executing somewhere in its 
kernel?
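
One way to check this hypothesis from the host is to sample the user and system tick counters of the qemu process and see which one grows while the guest is busy. The sketch below is only an illustration: the helper names parse_pid_stat and read_pid_times are hypothetical, and it assumes the usual /proc/<pid>/stat layout in which utime is field 14 and stime is field 15.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract utime (field 14) and stime (field 15), in clock ticks, from one
 * line in /proc/<pid>/stat format. Returns 0 on success, -1 on failure.
 * (Hypothetical helper, not part of qemu or kqemu.)
 */
static int parse_pid_stat(const char *line, unsigned long *utime,
                          unsigned long *stime)
{
    /* The command name (field 2) may contain spaces or ')', so resume
     * parsing after the last closing parenthesis. */
    const char *p = strrchr(line, ')');
    if (p == NULL)
        return -1;

    /* Skip fields 3..13 (state through cmajflt), then read 14 and 15. */
    int n = sscanf(p + 1,
                   " %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
                   utime, stime);
    return n == 2 ? 0 : -1;
}

/* Read and parse /proc/<pid>/stat for the given pid; 0 on success. */
static int read_pid_times(int pid, unsigned long *utime, unsigned long *stime)
{
    char path[64], buf[1024];

    snprintf(path, sizeof path, "/proc/%d/stat", pid);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *ok = fgets(buf, sizeof buf, f);
    fclose(f);
    return ok ? parse_pid_stat(buf, utime, stime) : -1;
}
```

Calling read_pid_times() twice on the qemu pid, a few seconds apart, and comparing the two deltas would show whether guest execution is being charged mostly to stime, as suspected here.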

Regards,
Steve.


* Re: [Qemu-devel] Linux CPU accounting with kqemu
  2008-06-06 17:57 [Qemu-devel] Linux CPU accounting with kqemu Steve Fosdick
@ 2008-06-06 18:59 ` Laurent Vivier
  2008-06-11  8:41   ` [Qemu-devel] [PATCH] kqemu: Linux CPU accounting Laurent Vivier
  0 siblings, 1 reply; 4+ messages in thread
From: Laurent Vivier @ 2008-06-06 18:59 UTC
  To: qemu-devel; +Cc: Steve Fosdick

On 6 June 2008 at 19:57, Steve Fosdick wrote:

> I have noticed that the Linux system CPU usage seems remarkably high
> when running Windows XP as a guest under qemu/kqemu, but not when
> running under qemu without kqemu.
>
> Just today I was reading the kqemu API documentation and it occurred
> to me that, from the Linux perspective, the user/system CPU times may
> be backward with kqemu.
>
> Is it perhaps the case that when kqemu has been told to run code in
> the guest VM (KQEMU_EXEC ioctl) and this is happening without any
> exceptions (more likely when the guest is running user-mode code
> rather than OS code), Linux still sees this as running in the kernel
> and records it as "system" time? And that when kqemu returns to qemu
> (because of an exception), the Linux kernel sees this as user CPU
> time, when in all likelihood the guest OS is now executing somewhere
> in its kernel?


If kqemu works like KVM, that is the case.

I've made some modifications for KVM in the Linux kernel to account
this time as user time and in a new field called guest time.
To use this, kqemu should be modified in the same way:

add "current->flags |= PF_VCPU;"
when entering the guest code and
"current->flags &= ~PF_VCPU;" when exiting.
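
To make the effect of the flag concrete, here is a small userspace model of the idea. It is a simplified sketch, not kernel code: the mock_ types and functions are hypothetical, MOCK_PF_VCPU merely stands in for the kernel's PF_VCPU, and the tick handler only mimics the rule described above (a tick taken while the flag is set is charged to user and guest time instead of system time).

```c
/* Stand-in for the kernel's PF_VCPU task flag (assumed value, model only). */
#define MOCK_PF_VCPU 0x00000010u

struct mock_task {
    unsigned int flags;
};

struct mock_cpustat {
    unsigned long user;   /* ticks charged as user time   */
    unsigned long system; /* ticks charged as system time */
    unsigned long guest;  /* ticks charged as guest time  */
};

/*
 * Model of a timer tick that arrives while the task is in kernel mode
 * (for example inside the KQEMU_EXEC ioctl).
 */
static void mock_account_kernel_tick(const struct mock_task *t,
                                     struct mock_cpustat *st)
{
    if (t->flags & MOCK_PF_VCPU) {
        /* Guest code is running: count it as user time and guest time. */
        st->user++;
        st->guest++;
    } else {
        st->system++;
    }
}

/*
 * Hypothetical stand-in for the guest-execution call: "runs the guest"
 * for a number of ticks, optionally bracketing it with the flag.
 */
static void mock_run_guest(struct mock_task *t, struct mock_cpustat *st,
                           int ticks, int mark_vcpu)
{
    if (mark_vcpu)
        t->flags |= MOCK_PF_VCPU;   /* entering guest code */

    for (int i = 0; i < ticks; i++)
        mock_account_kernel_tick(t, st);

    if (mark_vcpu)
        t->flags &= ~MOCK_PF_VCPU;  /* back in host kernel code */
}
```

In this model, running five ticks without the flag yields five system ticks, while the same run with the flag set yields five user ticks and five guest ticks, which is the shift from system to user/guest time that the change is meant to produce.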

The patch for kvm is here:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d172fcd3ae1ca7ac27ec8904242fd61e0e11d332

Details are in the patch submission thread:

http://kerneltrap.org/Linux/Virtual_Machine_Time_Accounting


Regards,
Laurent
----------------------- Laurent Vivier ----------------------
"The best way to predict the future is to invent it."
- Alan Kay







* [Qemu-devel] [PATCH] kqemu: Linux CPU accounting
  2008-06-06 18:59 ` Laurent Vivier
@ 2008-06-11  8:41   ` Laurent Vivier
  2008-06-11  9:27     ` [Qemu-devel] " Fabrice Bellard
  0 siblings, 1 reply; 4+ messages in thread
From: Laurent Vivier @ 2008-06-11  8:41 UTC
  To: qemu-devel; +Cc: Steve Fosdick

Hi,

Here is a patch that accounts guest time in the guest field
of /proc/stat and /proc/<pid>/stat (it also moves this part of the CPU
time from system time to user time). This field has been available
since Linux 2.6.24.
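
For readers who want to look at the value without patching procps, the following sketch parses the aggregate "cpu" line of /proc/stat. The struct and function names are hypothetical; the sketch assumes the guest value is the ninth number on the line and that user time already includes guest time, which matches what the procps patch in this message does when it subtracts g from u.

```c
#include <stdio.h>

/* Tick counters from the aggregate "cpu" line of /proc/stat. */
struct cpu_ticks {
    unsigned long long user, nice, system, idle;
    unsigned long long iowait, irq, softirq, steal, guest;
};

/*
 * Parse one "cpu ..." line. Returns the number of counters read; on
 * kernels that do not report a guest field, fewer than 9 values are
 * present and the missing ones stay at 0.
 */
static int parse_cpu_line(const char *line, struct cpu_ticks *t)
{
    *t = (struct cpu_ticks){0};
    return sscanf(line, "cpu %llu %llu %llu %llu %llu %llu %llu %llu %llu",
                  &t->user, &t->nice, &t->system, &t->idle,
                  &t->iowait, &t->irq, &t->softirq, &t->steal, &t->guest);
}

/* User time with the guest share removed, clamped at zero. */
static unsigned long long user_without_guest(const struct cpu_ticks *t)
{
    return t->user >= t->guest ? t->user - t->guest : 0;
}
```

Feeding it the first line of /proc/stat (read with fopen and fgets) gives the raw counters; as in top, percentages come from the difference between two successive samples.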

To try it out, I am also providing a second patch, for
procps-3.2.7, which makes top display this value correctly (the per-CPU
"g" field).

For more details, see:

http://kerneltrap.org/Linux/Virtual_Machine_Time_Accounting

Regards,
Laurent
-- 
------------- Laurent.Vivier@bull.net ---------------
"The best way to predict the future is to invent it."
- Alan Kay

[-- Attachment #2: kqemu-guest-time.patch --]
[-- Type: text/x-patch, Size: 647 bytes --]

Index: kqemu-1.4.0pre1/kqemu-linux.c
===================================================================
--- kqemu-1.4.0pre1.orig/kqemu-linux.c	2008-06-11 10:29:37.000000000 +0200
+++ kqemu-1.4.0pre1/kqemu-linux.c	2008-06-11 10:31:30.000000000 +0200
@@ -318,7 +318,13 @@ static int kqemu_ioctl(struct inode *ino
 #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,10)
             unlock_kernel();
 #endif
+#ifdef PF_VCPU
+            current->flags |= PF_VCPU;
+#endif
             ret = kqemu_exec(s);
+#ifdef PF_VCPU
+            current->flags &= ~PF_VCPU;
+#endif
 #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,10)
             lock_kernel();
 #endif

[-- Attachment #3: procps-3.2.7-guest-time.patch --]
[-- Type: text/x-patch, Size: 5049 bytes --]

Index: procps-3.2.7/top.c
===================================================================
--- procps-3.2.7.orig/top.c	2007-08-10 17:34:46.000000000 +0200
+++ procps-3.2.7/top.c	2007-08-13 13:14:43.000000000 +0200
@@ -935,7 +935,8 @@ static CPU_t *cpus_refresh (CPU_t *cpus)
    cpus[Cpu_tot].x = 0;  // FIXME: can't tell by kernel version number
    cpus[Cpu_tot].y = 0;  // FIXME: can't tell by kernel version number
    cpus[Cpu_tot].z = 0;  // FIXME: can't tell by kernel version number
-   num = sscanf(buf, "cpu %Lu %Lu %Lu %Lu %Lu %Lu %Lu %Lu",
+   cpus[Cpu_tot].g = 0;  // FIXME: can't tell by kernel version number
+   num = sscanf(buf, "cpu %Lu %Lu %Lu %Lu %Lu %Lu %Lu %Lu %Lu",
       &cpus[Cpu_tot].u,
       &cpus[Cpu_tot].n,
       &cpus[Cpu_tot].s,
@@ -943,7 +944,8 @@ static CPU_t *cpus_refresh (CPU_t *cpus)
       &cpus[Cpu_tot].w,
       &cpus[Cpu_tot].x,
       &cpus[Cpu_tot].y,
-      &cpus[Cpu_tot].z
+      &cpus[Cpu_tot].z,
+      &cpus[Cpu_tot].g
    );
    if (num < 4)
          std_err("failed /proc/stat read");
@@ -960,9 +962,10 @@ static CPU_t *cpus_refresh (CPU_t *cpus)
       cpus[i].x = 0;  // FIXME: can't tell by kernel version number
       cpus[i].y = 0;  // FIXME: can't tell by kernel version number
       cpus[i].z = 0;  // FIXME: can't tell by kernel version number
-      num = sscanf(buf, "cpu%u %Lu %Lu %Lu %Lu %Lu %Lu %Lu %Lu",
+      cpus[i].g = 0;  // FIXME: can't tell by kernel version number
+      num = sscanf(buf, "cpu%u %Lu %Lu %Lu %Lu %Lu %Lu %Lu %Lu %Lu",
          &cpus[i].id,
-         &cpus[i].u, &cpus[i].n, &cpus[i].s, &cpus[i].i, &cpus[i].w, &cpus[i].x, &cpus[i].y, &cpus[i].z
+         &cpus[i].u, &cpus[i].n, &cpus[i].s, &cpus[i].i, &cpus[i].w, &cpus[i].x, &cpus[i].y, &cpus[i].z, &cpus[i].g
       );
       if (num < 4)
             std_err("failed /proc/stat read");
@@ -2879,10 +2882,11 @@ static void summaryhlp (CPU_t *cpu, cons
    // we'll trim to zero if we get negative time ticks,
    // which has happened with some SMP kernels (pre-2.4?)
 #define TRIMz(x)  ((tz = (SIC_t)(x)) < 0 ? 0 : tz)
-   SIC_t u_frme, s_frme, n_frme, i_frme, w_frme, x_frme, y_frme, z_frme, tot_frme, tz;
+   SIC_t u_frme, s_frme, n_frme, i_frme, w_frme, x_frme, y_frme, z_frme, g_frme, tot_frme, tz, u_tmp;
    float scale;
 
-   u_frme = cpu->u - cpu->u_sav;
+   u_tmp = cpu->u - cpu->g;
+   u_frme = TRIMz(u_tmp - cpu->u_sav);
    s_frme = cpu->s - cpu->s_sav;
    n_frme = cpu->n - cpu->n_sav;
    i_frme = TRIMz(cpu->i - cpu->i_sav);
@@ -2890,7 +2894,8 @@ static void summaryhlp (CPU_t *cpu, cons
    x_frme = cpu->x - cpu->x_sav;
    y_frme = cpu->y - cpu->y_sav;
    z_frme = cpu->z - cpu->z_sav;
-   tot_frme = u_frme + s_frme + n_frme + i_frme + w_frme + x_frme + y_frme + z_frme;
+   g_frme = cpu->g - cpu->g_sav;
+   tot_frme = u_frme + s_frme + n_frme + i_frme + w_frme + x_frme + y_frme + z_frme + g_frme;
    if (tot_frme < 1) tot_frme = 1;
    scale = 100.0 / (float)tot_frme;
 
@@ -2908,13 +2913,14 @@ static void summaryhlp (CPU_t *cpu, cons
          (float)w_frme * scale,
          (float)x_frme * scale,
          (float)y_frme * scale,
-         (float)z_frme * scale
+         (float)z_frme * scale,
+         (float)g_frme * scale
       )
    );
    Msg_row += 1;
 
    // remember for next time around
-   cpu->u_sav = cpu->u;
+   cpu->u_sav = u_tmp;
    cpu->s_sav = cpu->s;
    cpu->n_sav = cpu->n;
    cpu->i_sav = cpu->i;
@@ -2922,6 +2928,7 @@ static void summaryhlp (CPU_t *cpu, cons
    cpu->x_sav = cpu->x;
    cpu->y_sav = cpu->y;
    cpu->z_sav = cpu->z;
+   cpu->g_sav = cpu->g;
 
 #undef TRIMz
 }
Index: procps-3.2.7/top.h
===================================================================
--- procps-3.2.7.orig/top.h	2007-08-10 17:34:46.000000000 +0200
+++ procps-3.2.7/top.h	2007-08-13 13:14:43.000000000 +0200
@@ -211,8 +211,8 @@ typedef struct HST_t {
 // calculations.  It exists primarily for SMP support but serves
 // all environments.
 typedef struct CPU_t {
-   TIC_t u, n, s, i, w, x, y, z; // as represented in /proc/stat
-   TIC_t u_sav, s_sav, n_sav, i_sav, w_sav, x_sav, y_sav, z_sav; // in the order of our display
+   TIC_t u, n, s, i, w, x, y, z, g; // as represented in /proc/stat
+   TIC_t u_sav, s_sav, n_sav, i_sav, w_sav, x_sav, y_sav, z_sav, g_sav; // in the order of our display
    unsigned id;  // the CPU ID number
 } CPU_t;
 
@@ -390,7 +390,7 @@ typedef struct WIN_t {
 #define STATES_line2x6  "%s\03" \
    " %#4.1f%% \02us,\03 %#4.1f%% \02sy,\03 %#4.1f%% \02ni,\03 %#4.1f%% \02id,\03 %#4.1f%% \02wa,\03 %#4.1f%% \02hi,\03 %#4.1f%% \02si\03\n"
 #define STATES_line2x7  "%s\03" \
-   "%#5.1f%%\02us,\03%#5.1f%%\02sy,\03%#5.1f%%\02ni,\03%#5.1f%%\02id,\03%#5.1f%%\02wa,\03%#5.1f%%\02hi,\03%#5.1f%%\02si,\03%#5.1f%%\02st\03\n"
+  "%#4.1f%%\02us,\03%#4.1f%%\02sy,\03%#4.1f%%\02ni,\03%#5.1f%%\02id,\03%#4.1f%%\02wa,\03%#4.1f%%\02hi,\03%#4.1f%%\02si,\03%#4.1f%%\02st\03,\02%#4.1f%%\02g\n"
 #ifdef CASEUP_SUMMK
 #define MEMORY_line1  "Mem: \03" \
    " %8luK \02total,\03 %8luK \02used,\03 %8luK \02free,\03 %8luK \02buffers\03\n"


* [Qemu-devel] Re: [PATCH] kqemu: Linux CPU accounting
  2008-06-11  8:41   ` [Qemu-devel] [PATCH] kqemu: Linux CPU accounting Laurent Vivier
@ 2008-06-11  9:27     ` Fabrice Bellard
  0 siblings, 0 replies; 4+ messages in thread
From: Fabrice Bellard @ 2008-06-11  9:27 UTC
  To: qemu-devel; +Cc: Laurent Vivier

Laurent Vivier wrote:
> Hi,
> 
> Here is a patch that accounts guest time in the guest field
> of /proc/stat and /proc/<pid>/stat (it also moves this part of the CPU
> time from system time to user time). This field has been available
> since Linux 2.6.24.

Good idea. It will be included in the next kqemu release.

Fabrice.
