* [Qemu-devel] [PATCH v2 13/24] kvm: Set up signal mask also for !CONFIG_IOTHREAD
From: Jan Kiszka @ 2011-02-01 21:15 UTC
  To: Avi Kivity, Marcelo Tosatti; +Cc: qemu-devel, kvm
In-Reply-To: <cover.1296594961.git.jan.kiszka@web.de>

From: Jan Kiszka <jan.kiszka@siemens.com>

Block SIG_IPI, unblock it during KVM_RUN, just like in io-thread mode.
It's unused so far, but this infrastructure will be required for
self-IPIs and to process SIGBUS plus, in KVM mode, SIGIO and SIGALRM. As
Windows doesn't support signal services, we need to provide a stub for
the init function.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 cpus.c |   29 +++++++++++++++++++++++++++--
 1 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/cpus.c b/cpus.c
index 42717ba..a33e470 100644
--- a/cpus.c
+++ b/cpus.c
@@ -231,11 +231,9 @@ fail:
     return err;
 }
 
-#ifdef CONFIG_IOTHREAD
 static void dummy_signal(int sig)
 {
 }
-#endif
 
 #else /* _WIN32 */
 
@@ -267,6 +265,32 @@ static void qemu_event_increment(void)
 #endif /* _WIN32 */
 
 #ifndef CONFIG_IOTHREAD
+static void qemu_kvm_init_cpu_signals(CPUState *env)
+{
+#ifndef _WIN32
+    int r;
+    sigset_t set;
+    struct sigaction sigact;
+
+    memset(&sigact, 0, sizeof(sigact));
+    sigact.sa_handler = dummy_signal;
+    sigaction(SIG_IPI, &sigact, NULL);
+
+    sigemptyset(&set);
+    sigaddset(&set, SIG_IPI);
+    pthread_sigmask(SIG_BLOCK, &set, NULL);
+
+    pthread_sigmask(SIG_BLOCK, NULL, &set);
+    sigdelset(&set, SIG_IPI);
+    sigdelset(&set, SIGBUS);
+    r = kvm_set_signal_mask(env, &set);
+    if (r) {
+        fprintf(stderr, "kvm_set_signal_mask: %s\n", strerror(-r));
+        exit(1);
+    }
+#endif
+}
+
 int qemu_init_main_loop(void)
 {
     cpu_set_debug_excp_handler(cpu_debug_handler);
@@ -292,6 +316,7 @@ void qemu_init_vcpu(void *_env)
             fprintf(stderr, "kvm_init_vcpu failed: %s\n", strerror(-r));
             exit(1);
         }
+        qemu_kvm_init_cpu_signals(env);
     }
 }
 
-- 
1.7.1
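
The mask handling described in the commit message above can be tried
outside QEMU. What follows is only a minimal sketch under stated
assumptions: DEMO_SIG_IPI and the demo_* functions are hypothetical names
that do not exist in QEMU, sigprocmask stands in for the pthread_sigmask
calls of the patch because the demo is single-threaded, and the resulting
set is merely returned where the patch hands it to kvm_set_signal_mask.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for SIG_IPI (SIGRTMIN+4 or SIGUSR1 in cpus.c). */
#define DEMO_SIG_IPI SIGUSR1

/*
 * Block DEMO_SIG_IPI in the caller, then build the mask that would be
 * active while the guest runs: the current mask minus DEMO_SIG_IPI and
 * SIGBUS. Returns 0 on success, -1 on failure.
 */
int demo_build_run_mask(sigset_t *run_mask)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, DEMO_SIG_IPI);
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0) {
        return -1;
    }

    /* A NULL "set" argument only queries the current mask. */
    if (sigprocmask(SIG_BLOCK, NULL, run_mask) != 0) {
        return -1;
    }
    sigdelset(run_mask, DEMO_SIG_IPI);
    sigdelset(run_mask, SIGBUS);
    return 0;
}

/* Return 1 if sig is currently blocked in the caller, 0 if it is not. */
int demo_is_blocked(int sig)
{
    sigset_t cur;

    sigprocmask(SIG_BLOCK, NULL, &cur);
    return sigismember(&cur, sig);
}
```

After demo_build_run_mask() the signal is blocked for normal execution,
while the returned set no longer contains it. Applying that set only for
the duration of KVM_RUN is what lets the signal interrupt the guest and
nothing else.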

* [Qemu-devel] [PATCH v2 07/24] Flatten the main loop
From: Jan Kiszka @ 2011-02-01 21:15 UTC
  To: Avi Kivity, Marcelo Tosatti; +Cc: qemu-devel, kvm
In-Reply-To: <cover.1296594961.git.jan.kiszka@web.de>

From: Jan Kiszka <jan.kiszka@siemens.com>

First of all, vm_can_run is a misnomer: it actually means "no request
pending". Moreover, there is no need to check all pending requests
twice, first via the inner loop check and then again when actually
processing the requests. We can simply remove the inner loop and do the
checks directly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 vl.c |   30 +++++++++++++++---------------
 1 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/vl.c b/vl.c
index 2ebc55b..f5dec09 100644
--- a/vl.c
+++ b/vl.c
@@ -1371,14 +1371,16 @@ void main_loop_wait(int nonblocking)
 
 }
 
-static int vm_can_run(void)
+#ifndef CONFIG_IOTHREAD
+static int vm_request_pending(void)
 {
-    return !(powerdown_requested ||
-             reset_requested ||
-             shutdown_requested ||
-             debug_requested ||
-             vmstop_requested);
+    return powerdown_requested ||
+           reset_requested ||
+           shutdown_requested ||
+           debug_requested ||
+           vmstop_requested;
 }
+#endif
 
 qemu_irq qemu_system_powerdown;
 
@@ -1393,21 +1395,19 @@ static void main_loop(void)
     qemu_main_loop_start();
 
     for (;;) {
-        do {
 #ifndef CONFIG_IOTHREAD
-            nonblocking = cpu_exec_all();
-            if (!vm_can_run()) {
-                nonblocking = true;
-            }
+        nonblocking = cpu_exec_all();
+        if (vm_request_pending()) {
+            nonblocking = true;
+        }
 #endif
 #ifdef CONFIG_PROFILER
-            ti = profile_getclock();
+        ti = profile_getclock();
 #endif
-            main_loop_wait(nonblocking);
+        main_loop_wait(nonblocking);
 #ifdef CONFIG_PROFILER
-            dev_time += profile_getclock() - ti;
+        dev_time += profile_getclock() - ti;
 #endif
-        } while (vm_can_run());
 
         if ((r = qemu_debug_requested())) {
             vm_stop(r);
-- 
1.7.1

* [Qemu-devel] [PATCH v2 22/24] kvm: Leave kvm_cpu_exec directly after KVM_EXIT_SHUTDOWN
From: Jan Kiszka @ 2011-02-01 21:16 UTC
  To: Avi Kivity, Marcelo Tosatti; +Cc: qemu-devel, kvm
In-Reply-To: <cover.1296594961.git.jan.kiszka@web.de>

From: Jan Kiszka <jan.kiszka@siemens.com>

The reset we issue on KVM_EXIT_SHUTDOWN implies that we should also
leave the VCPU loop. As we now check for exit_request, which is set by
qemu_system_reset_request, this bug is no longer critical. Still, it's
an unneeded extra turn.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 kvm-all.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index cf54256..35860df 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -959,7 +959,6 @@ int kvm_cpu_exec(CPUState *env)
         case KVM_EXIT_SHUTDOWN:
             DPRINTF("shutdown\n");
             qemu_system_reset_request();
-            ret = 1;
             break;
         case KVM_EXIT_UNKNOWN:
             fprintf(stderr, "KVM: unknown exit, hardware reason %" PRIx64 "\n",
-- 
1.7.1

* [Qemu-devel] [PATCH v2 21/24] kvm: Remove static return code of kvm_handle_io
From: Jan Kiszka @ 2011-02-01 21:16 UTC
  To: Avi Kivity, Marcelo Tosatti; +Cc: qemu-devel, kvm
In-Reply-To: <cover.1296594961.git.jan.kiszka@web.de>

From: Jan Kiszka <jan.kiszka@siemens.com>

Improve the readability of the exit dispatcher by moving the static
return value of kvm_handle_io to its caller.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 kvm-all.c |   17 ++++++++---------
 1 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index d961697..cf54256 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -770,8 +770,8 @@ err:
     return ret;
 }
 
-static int kvm_handle_io(uint16_t port, void *data, int direction, int size,
-                         uint32_t count)
+static void kvm_handle_io(uint16_t port, void *data, int direction, int size,
+                          uint32_t count)
 {
     int i;
     uint8_t *ptr = data;
@@ -805,8 +805,6 @@ static int kvm_handle_io(uint16_t port, void *data, int direction, int size,
 
         ptr += size;
     }
-
-    return 1;
 }
 
 #ifdef KVM_CAP_INTERNAL_ERROR_DATA
@@ -940,11 +938,12 @@ int kvm_cpu_exec(CPUState *env)
         switch (run->exit_reason) {
         case KVM_EXIT_IO:
             DPRINTF("handle_io\n");
-            ret = kvm_handle_io(run->io.port,
-                                (uint8_t *)run + run->io.data_offset,
-                                run->io.direction,
-                                run->io.size,
-                                run->io.count);
+            kvm_handle_io(run->io.port,
+                          (uint8_t *)run + run->io.data_offset,
+                          run->io.direction,
+                          run->io.size,
+                          run->io.count);
+            ret = 1;
             break;
         case KVM_EXIT_MMIO:
             DPRINTF("handle_mmio\n");
-- 
1.7.1

* [Qemu-devel] [PATCH v2 01/24] kvm: x86: Fix build in absence of KVM_CAP_ASYNC_PF
From: Jan Kiszka @ 2011-02-01 21:15 UTC
  To: Avi Kivity, Marcelo Tosatti; +Cc: qemu-devel, kvm
In-Reply-To: <cover.1296594961.git.jan.kiszka@web.de>

From: Jan Kiszka <jan.kiszka@siemens.com>

Reported by Stefan Hajnoczi.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 target-i386/kvm.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 8e8880a..05010bb 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -167,7 +167,9 @@ static int get_para_features(CPUState *env)
             features |= (1 << para_features[i].feature);
         }
     }
+#ifdef KVM_CAP_ASYNC_PF
     has_msr_async_pf_en = features & (1 << KVM_FEATURE_ASYNC_PF);
+#endif
     return features;
 }
 #endif
-- 
1.7.1

* [Qemu-devel] [PATCH v2 18/24] kvm: Add MCE signal support for !CONFIG_IOTHREAD
From: Jan Kiszka @ 2011-02-01 21:15 UTC
  To: Avi Kivity, Marcelo Tosatti
  Cc: Hidetoshi Seto, Jin Dongming, qemu-devel, kvm, Huang Ying
In-Reply-To: <cover.1296594961.git.jan.kiszka@web.de>

From: Jan Kiszka <jan.kiszka@siemens.com>

Currently, we only configure and process MCE-related SIGBUS events if
CONFIG_IOTHREAD is enabled. The groundwork is laid; we just need to
factor out the required handler registration and system configuration.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
---
 cpus.c |  107 +++++++++++++++++++++++++++++++++++++++-------------------------
 1 files changed, 65 insertions(+), 42 deletions(-)

diff --git a/cpus.c b/cpus.c
index 18caf47..c4c5914 100644
--- a/cpus.c
+++ b/cpus.c
@@ -34,9 +34,6 @@
 
 #include "cpus.h"
 #include "compatfd.h"
-#ifdef CONFIG_LINUX
-#include <sys/prctl.h>
-#endif
 
 #ifdef SIGRTMIN
 #define SIG_IPI (SIGRTMIN+4)
@@ -44,10 +41,24 @@
 #define SIG_IPI SIGUSR1
 #endif
 
+#ifdef CONFIG_LINUX
+
+#include <sys/prctl.h>
+
 #ifndef PR_MCE_KILL
 #define PR_MCE_KILL 33
 #endif
 
+#ifndef PR_MCE_KILL_SET
+#define PR_MCE_KILL_SET 1
+#endif
+
+#ifndef PR_MCE_KILL_EARLY
+#define PR_MCE_KILL_EARLY 1
+#endif
+
+#endif /* CONFIG_LINUX */
+
 static CPUState *next_cpu;
 
 /***********************************************************/
@@ -166,6 +177,52 @@ static void cpu_debug_handler(CPUState *env)
     vm_stop(EXCP_DEBUG);
 }
 
+#ifdef CONFIG_LINUX
+static void sigbus_reraise(void)
+{
+    sigset_t set;
+    struct sigaction action;
+
+    memset(&action, 0, sizeof(action));
+    action.sa_handler = SIG_DFL;
+    if (!sigaction(SIGBUS, &action, NULL)) {
+        raise(SIGBUS);
+        sigemptyset(&set);
+        sigaddset(&set, SIGBUS);
+        sigprocmask(SIG_UNBLOCK, &set, NULL);
+    }
+    perror("Failed to re-raise SIGBUS!\n");
+    abort();
+}
+
+static void sigbus_handler(int n, struct qemu_signalfd_siginfo *siginfo,
+                           void *ctx)
+{
+    if (kvm_on_sigbus(siginfo->ssi_code,
+                      (void *)(intptr_t)siginfo->ssi_addr)) {
+        sigbus_reraise();
+    }
+}
+
+static void qemu_init_sigbus(void)
+{
+    struct sigaction action;
+
+    memset(&action, 0, sizeof(action));
+    action.sa_flags = SA_SIGINFO;
+    action.sa_sigaction = (void (*)(int, siginfo_t*, void*))sigbus_handler;
+    sigaction(SIGBUS, &action, NULL);
+
+    prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0);
+}
+
+#else /* !CONFIG_LINUX */
+
+static void qemu_init_sigbus(void)
+{
+}
+#endif /* !CONFIG_LINUX */
+
 #ifndef _WIN32
 static int io_thread_fd = -1;
 
@@ -288,8 +345,6 @@ static int qemu_signalfd_init(sigset_t mask)
     return 0;
 }
 
-static void sigbus_reraise(void);
-
 static void qemu_kvm_eat_signals(CPUState *env)
 {
     struct timespec ts = { 0, 0 };
@@ -310,13 +365,11 @@ static void qemu_kvm_eat_signals(CPUState *env)
         }
 
         switch (r) {
-#ifdef CONFIG_IOTHREAD
         case SIGBUS:
             if (kvm_on_sigbus_vcpu(env, siginfo.si_code, siginfo.si_addr)) {
                 sigbus_reraise();
             }
             break;
-#endif
         default:
             break;
         }
@@ -405,6 +458,7 @@ static sigset_t block_synchronous_signals(void)
     sigset_t set;
 
     sigemptyset(&set);
+    sigaddset(&set, SIGBUS);
     if (kvm_enabled()) {
         /*
          * We need to process timer signals synchronously to avoid a race
@@ -433,6 +487,8 @@ int qemu_init_main_loop(void)
 #endif
     cpu_set_debug_excp_handler(cpu_debug_handler);
 
+    qemu_init_sigbus();
+
     return qemu_event_init();
 }
 
@@ -565,13 +621,9 @@ static void qemu_tcg_init_cpu_signals(void)
     pthread_sigmask(SIG_UNBLOCK, &set, NULL);
 }
 
-static void sigbus_handler(int n, struct qemu_signalfd_siginfo *siginfo,
-                           void *ctx);
-
 static sigset_t block_io_signals(void)
 {
     sigset_t set;
-    struct sigaction action;
 
     /* SIGUSR2 used by posix-aio-compat.c */
     sigemptyset(&set);
@@ -585,12 +637,6 @@ static sigset_t block_io_signals(void)
     sigaddset(&set, SIGBUS);
     pthread_sigmask(SIG_BLOCK, &set, NULL);
 
-    memset(&action, 0, sizeof(action));
-    action.sa_flags = SA_SIGINFO;
-    action.sa_sigaction = (void (*)(int, siginfo_t*, void*))sigbus_handler;
-    sigaction(SIGBUS, &action, NULL);
-    prctl(PR_MCE_KILL, 1, 1, 0, 0);
-
     return set;
 }
 
@@ -601,6 +647,8 @@ int qemu_init_main_loop(void)
 
     cpu_set_debug_excp_handler(cpu_debug_handler);
 
+    qemu_init_sigbus();
+
     blocked_signals = block_io_signals();
 
     ret = qemu_signalfd_init(blocked_signals);
@@ -708,31 +756,6 @@ static void qemu_tcg_wait_io_event(void)
     }
 }
 
-static void sigbus_reraise(void)
-{
-    sigset_t set;
-    struct sigaction action;
-
-    memset(&action, 0, sizeof(action));
-    action.sa_handler = SIG_DFL;
-    if (!sigaction(SIGBUS, &action, NULL)) {
-        raise(SIGBUS);
-        sigemptyset(&set);
-        sigaddset(&set, SIGBUS);
-        sigprocmask(SIG_UNBLOCK, &set, NULL);
-    }
-    perror("Failed to re-raise SIGBUS!\n");
-    abort();
-}
-
-static void sigbus_handler(int n, struct qemu_signalfd_siginfo *siginfo,
-                           void *ctx)
-{
-    if (kvm_on_sigbus(siginfo->ssi_code, (void *)(intptr_t)siginfo->ssi_addr)) {
-        sigbus_reraise();
-    }
-}
-
 static void qemu_kvm_wait_io_event(CPUState *env)
 {
     while (!cpu_has_work(env))
-- 
1.7.1
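
The handler registration that this patch factors out into
qemu_init_sigbus can be illustrated in isolation. This is a minimal
sketch, not the QEMU code: the demo_* names are hypothetical, the handler
uses the plain siginfo_t signature where the patch casts a handler taking
struct qemu_signalfd_siginfo, it only records the event where the patch
calls kvm_on_sigbus, and the Linux-specific prctl(PR_MCE_KILL, ...) step
is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler; volatile sig_atomic_t is safe to write there. */
volatile sig_atomic_t demo_sigbus_seen;
volatile sig_atomic_t demo_sigbus_code;

/* SA_SIGINFO-style handler: receives si_code and si_addr via info. */
void demo_sigbus_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    demo_sigbus_seen = 1;
    demo_sigbus_code = info->si_code;
}

/* Register the handler for SIGBUS. Returns 0 on success, -1 on error. */
int demo_install_sigbus_handler(void)
{
    struct sigaction action;

    memset(&action, 0, sizeof(action));
    action.sa_flags = SA_SIGINFO;
    action.sa_sigaction = demo_sigbus_handler;
    return sigaction(SIGBUS, &action, NULL);
}
```

Calling raise(SIGBUS) after demo_install_sigbus_handler() runs the
handler before raise returns. In the patch this asynchronous path is the
fallback, since the main loop setup also blocks SIGBUS and routes it
through signalfd.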

* [Qemu-devel] [PATCH v2 17/24] kvm: Fix race between timer signals and vcpu entry under !IOTHREAD
From: Jan Kiszka @ 2011-02-01 21:15 UTC
  To: Avi Kivity, Marcelo Tosatti; +Cc: qemu-devel, kvm, Stefan Hajnoczi
In-Reply-To: <cover.1296594961.git.jan.kiszka@web.de>

From: Jan Kiszka <jan.kiszka@siemens.com>

Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between
checking for exit_request on vcpu entry and timer signals arriving
before KVM starts to catch them. Plug it by also blocking both
timer-related signals under !CONFIG_IOTHREAD and processing them via
signalfd.

As this fix depends on real signalfd support (otherwise the timer
signals only kick the compat helper thread, and the main thread hangs),
we need to detect the invalid configuration and abort configure.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
CC: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 configure |    6 ++++++
 cpus.c    |   31 ++++++++++++++++++++++++++++++-
 2 files changed, 36 insertions(+), 1 deletions(-)

diff --git a/configure b/configure
index 4673bf0..368ca8a 100755
--- a/configure
+++ b/configure
@@ -2056,6 +2056,12 @@ EOF
 
 if compile_prog "" "" ; then
   signalfd=yes
+elif test "$kvm" = "yes" -a "$io_thread" != "yes"; then
+  echo
+  echo "ERROR: Host kernel lacks signalfd() support,"
+  echo "but KVM depends on it when the IO thread is disabled."
+  echo
+  exit 1
 fi
 
 # check if eventfd is supported
diff --git a/cpus.c b/cpus.c
index 359361f..18caf47 100644
--- a/cpus.c
+++ b/cpus.c
@@ -327,6 +327,12 @@ static void qemu_kvm_eat_signals(CPUState *env)
             exit(1);
         }
     } while (sigismember(&chkset, SIG_IPI) || sigismember(&chkset, SIGBUS));
+
+#ifndef CONFIG_IOTHREAD
+    if (sigismember(&chkset, SIGIO) || sigismember(&chkset, SIGALRM)) {
+        qemu_notify_event();
+    }
+#endif
 }
 
 #else /* _WIN32 */
@@ -376,11 +382,15 @@ static void qemu_kvm_init_cpu_signals(CPUState *env)
 
     sigemptyset(&set);
     sigaddset(&set, SIG_IPI);
+    sigaddset(&set, SIGIO);
+    sigaddset(&set, SIGALRM);
     pthread_sigmask(SIG_BLOCK, &set, NULL);
 
     pthread_sigmask(SIG_BLOCK, NULL, &set);
     sigdelset(&set, SIG_IPI);
     sigdelset(&set, SIGBUS);
+    sigdelset(&set, SIGIO);
+    sigdelset(&set, SIGALRM);
     r = kvm_set_signal_mask(env, &set);
     if (r) {
         fprintf(stderr, "kvm_set_signal_mask: %s\n", strerror(-r));
@@ -389,13 +399,32 @@ static void qemu_kvm_init_cpu_signals(CPUState *env)
 #endif
 }
 
+#ifndef _WIN32
+static sigset_t block_synchronous_signals(void)
+{
+    sigset_t set;
+
+    sigemptyset(&set);
+    if (kvm_enabled()) {
+        /*
+         * We need to process timer signals synchronously to avoid a race
+         * between exit_request check and KVM vcpu entry.
+         */
+        sigaddset(&set, SIGIO);
+        sigaddset(&set, SIGALRM);
+    }
+
+    return set;
+}
+#endif
+
 int qemu_init_main_loop(void)
 {
 #ifndef _WIN32
     sigset_t blocked_signals;
     int ret;
 
-    sigemptyset(&blocked_signals);
+    blocked_signals = block_synchronous_signals();
 
     ret = qemu_signalfd_init(blocked_signals);
     if (ret) {
-- 
1.7.1
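
The fix relies on timer signals being handled synchronously: a blocked
signal stays pending until the code dequeues it explicitly, so it can no
longer slip in between the exit_request check and vcpu entry. Below is a
minimal sketch of that idea. The demo_* names are hypothetical, only
SIGALRM is covered, and sigtimedwait with a zero timeout is my
substitution for the signalfd path of the patch, chosen to keep the
example short.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <time.h>

/* Block SIGALRM so that it stays pending. Returns 0 on success. */
int demo_block_timer_signal(void)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* Return 1 if a SIGALRM is pending for the caller, 0 otherwise. */
int demo_timer_signal_pending(void)
{
    sigset_t chkset;

    if (sigpending(&chkset) != 0) {
        return 0;
    }
    return sigismember(&chkset, SIGALRM) == 1;
}

/* Dequeue pending SIGALRMs without blocking; return how many were taken. */
int demo_eat_timer_signals(void)
{
    struct timespec ts = { 0, 0 };  /* zero timeout: poll only */
    sigset_t waitset;
    siginfo_t info;
    int eaten = 0;

    sigemptyset(&waitset);
    sigaddset(&waitset, SIGALRM);
    while (sigtimedwait(&waitset, &info, &ts) == SIGALRM) {
        eaten++;
    }
    return eaten;
}
```

Since SIGALRM is a standard signal, several deliveries while it is
blocked collapse into one pending instance, so the loop normally runs at
most once per check.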

* [Qemu-devel] [PATCH v2 23/24] Refactor kvm&tcg function names in cpus.c
From: Jan Kiszka @ 2011-02-01 21:16 UTC
  To: Avi Kivity, Marcelo Tosatti; +Cc: qemu-devel, kvm
In-Reply-To: <cover.1296594961.git.jan.kiszka@web.de>

From: Jan Kiszka <jan.kiszka@siemens.com>

Pure interface cosmetics: Ensure that only kvm core services (as
declared in kvm.h) start with "kvm_". Prepend "qemu_" to those that
violate this rule in cpus.c. Also rename the corresponding tcg functions
for the sake of consistency.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 cpus.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/cpus.c b/cpus.c
index 9c50a34..0d11a20 100644
--- a/cpus.c
+++ b/cpus.c
@@ -778,7 +778,7 @@ static void qemu_kvm_wait_io_event(CPUState *env)
 
 static int qemu_cpu_exec(CPUState *env);
 
-static void *kvm_cpu_thread_fn(void *arg)
+static void *qemu_kvm_cpu_thread_fn(void *arg)
 {
     CPUState *env = arg;
     int r;
@@ -811,7 +811,7 @@ static void *kvm_cpu_thread_fn(void *arg)
     return NULL;
 }
 
-static void *tcg_cpu_thread_fn(void *arg)
+static void *qemu_tcg_cpu_thread_fn(void *arg)
 {
     CPUState *env = arg;
 
@@ -930,7 +930,7 @@ void resume_all_vcpus(void)
     }
 }
 
-static void tcg_init_vcpu(void *_env)
+static void qemu_tcg_init_vcpu(void *_env)
 {
     CPUState *env = _env;
     /* share a single thread for all cpus with TCG */
@@ -938,7 +938,7 @@ static void tcg_init_vcpu(void *_env)
         env->thread = qemu_mallocz(sizeof(QemuThread));
         env->halt_cond = qemu_mallocz(sizeof(QemuCond));
         qemu_cond_init(env->halt_cond);
-        qemu_thread_create(env->thread, tcg_cpu_thread_fn, env);
+        qemu_thread_create(env->thread, qemu_tcg_cpu_thread_fn, env);
         while (env->created == 0)
             qemu_cond_timedwait(&qemu_cpu_cond, &qemu_global_mutex, 100);
         tcg_cpu_thread = env->thread;
@@ -949,12 +949,12 @@ static void tcg_init_vcpu(void *_env)
     }
 }
 
-static void kvm_start_vcpu(CPUState *env)
+static void qemu_kvm_start_vcpu(CPUState *env)
 {
     env->thread = qemu_mallocz(sizeof(QemuThread));
     env->halt_cond = qemu_mallocz(sizeof(QemuCond));
     qemu_cond_init(env->halt_cond);
-    qemu_thread_create(env->thread, kvm_cpu_thread_fn, env);
+    qemu_thread_create(env->thread, qemu_kvm_cpu_thread_fn, env);
     while (env->created == 0)
         qemu_cond_timedwait(&qemu_cpu_cond, &qemu_global_mutex, 100);
 }
@@ -966,9 +966,9 @@ void qemu_init_vcpu(void *_env)
     env->nr_cores = smp_cores;
     env->nr_threads = smp_threads;
     if (kvm_enabled())
-        kvm_start_vcpu(env);
+        qemu_kvm_start_vcpu(env);
     else
-        tcg_init_vcpu(env);
+        qemu_tcg_init_vcpu(env);
 }
 
 void qemu_notify_event(void)
-- 
1.7.1

* [Qemu-devel] [PATCH v2 04/24] Process vmstop requests in IO thread
From: Jan Kiszka @ 2011-02-01 21:15 UTC
  To: Avi Kivity, Marcelo Tosatti; +Cc: qemu-devel, kvm
In-Reply-To: <cover.1296594961.git.jan.kiszka@web.de>

From: Jan Kiszka <jan.kiszka@siemens.com>

A pending vmstop request is also a reason to leave the inner main loop.
So far we did not check for it, so pending stop requests issued from
VCPU threads were simply ignored.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 vl.c |   14 +++++---------
 1 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/vl.c b/vl.c
index db24a05..5fad700 100644
--- a/vl.c
+++ b/vl.c
@@ -1373,15 +1373,11 @@ void main_loop_wait(int nonblocking)
 
 static int vm_can_run(void)
 {
-    if (powerdown_requested)
-        return 0;
-    if (reset_requested)
-        return 0;
-    if (shutdown_requested)
-        return 0;
-    if (debug_requested)
-        return 0;
-    return 1;
+    return !(powerdown_requested ||
+             reset_requested ||
+             shutdown_requested ||
+             debug_requested ||
+             vmstop_requested);
 }
 
 qemu_irq qemu_system_powerdown;
-- 
1.7.1

* [Qemu-devel] [PATCH v2 00/24] [uq/master] Patch queue, part II
From: Jan Kiszka @ 2011-02-01 21:15 UTC
  To: Avi Kivity, Marcelo Tosatti
  Cc: Hidetoshi Seto, kvm, Gleb Natapov, qemu-devel, Alexander Graf,
	Huang Ying, Paolo Bonzini, Stefan Hajnoczi, Jin Dongming

Version 2 of part II. Changes:
 - Fixed "Unconditionally reenter kernel after IO exits" to take
   self-INIT into account
 - Fixed misplaced hunk in "Fix race between timer signals and vcpu
   entry under !IOTHREAD" (rebase artifact)
 - Factor out block_synchronous_signals (analogous to block_io_signals)
 - Additional fix to break out of SMP VCPU loop on pending IO event
 - Fork qemu_kvm_init_cpu_signals over CONFIG_IOTHREAD
 - Additional cleanup, flattening the main loop

Hope I addressed all review comments (except for passing env to
qemu_cpu_kick_self which I think is better as it is).

Thanks,
Jan

CC: Alexander Graf <agraf@suse.de>
CC: Gleb Natapov <gleb@redhat.com>
CC: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
CC: Huang Ying <ying.huang@intel.com>
CC: Jin Dongming <jin.dongming@np.css.fujitsu.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

Jan Kiszka (24):
  kvm: x86: Fix build in absence of KVM_CAP_ASYNC_PF
  Prevent abortion on multiple VCPU kicks
  Stop current VCPU on synchronous reset requests
  Process vmstop requests in IO thread
  Trigger exit from cpu_exec_all on pending IO events
  Leave inner main_loop faster on pending requests
  Flatten the main loop
  kvm: Report proper error on GET_VCPU_MMAP_SIZE failures
  kvm: Drop redundant kvm_enabled from kvm_cpu_thread_fn
  kvm: Handle kvm_init_vcpu errors
  kvm: Provide sigbus services arch-independently
  Refactor signal setup functions in cpus.c
  kvm: Set up signal mask also for !CONFIG_IOTHREAD
  kvm: Refactor qemu_kvm_eat_signals
  kvm: Call qemu_kvm_eat_signals also under !CONFIG_IOTHREAD
  Set up signalfd under !CONFIG_IOTHREAD
  kvm: Fix race between timer signals and vcpu entry under !IOTHREAD
  kvm: Add MCE signal support for !CONFIG_IOTHREAD
  Introduce VCPU self-signaling service
  kvm: Unconditionally reenter kernel after IO exits
  kvm: Remove static return code of kvm_handle_io
  kvm: Leave kvm_cpu_exec directly after KVM_EXIT_SHUTDOWN
  Refactor kvm&tcg function names in cpus.c
  Fix a few coding style violations in cpus.c

 Makefile.objs      |    2 +-
 configure          |    6 +
 cpu-defs.h         |    1 +
 cpus.c             |  662 ++++++++++++++++++++++++++++++++--------------------
 cpus.h             |    1 +
 kvm-all.c          |   60 +++--
 kvm-stub.c         |    5 +
 kvm.h              |    7 +-
 qemu-common.h      |    1 +
 target-i386/kvm.c  |   11 +-
 target-ppc/kvm.c   |   10 +
 target-s390x/kvm.c |   10 +
 vl.c               |   40 ++--
 13 files changed, 514 insertions(+), 302 deletions(-)

* [Qemu-devel] [PATCH v2 11/24] kvm: Provide sigbus services arch-independently
From: Jan Kiszka @ 2011-02-01 21:15 UTC
  To: Avi Kivity, Marcelo Tosatti; +Cc: qemu-devel, kvm
In-Reply-To: <cover.1296594961.git.jan.kiszka@web.de>

From: Jan Kiszka <jan.kiszka@siemens.com>

Provide arch-independent kvm_on_sigbus* stubs to remove the #ifdef'ery
from cpus.c. This patch also fixes --disable-kvm build by providing the
missing kvm_on_sigbus_vcpu kvm-stub.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Alexander Graf <agraf@suse.de>
---
 cpus.c             |   10 ++++------
 kvm-all.c          |   10 ++++++++++
 kvm-stub.c         |    5 +++++
 kvm.h              |    7 +++++--
 target-i386/kvm.c  |    4 ++--
 target-ppc/kvm.c   |   10 ++++++++++
 target-s390x/kvm.c |   10 ++++++++++
 7 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/cpus.c b/cpus.c
index 8475757..3a32828 100644
--- a/cpus.c
+++ b/cpus.c
@@ -543,10 +543,9 @@ static void sigbus_reraise(void)
 static void sigbus_handler(int n, struct qemu_signalfd_siginfo *siginfo,
                            void *ctx)
 {
-#if defined(TARGET_I386)
-    if (kvm_on_sigbus(siginfo->ssi_code, (void *)(intptr_t)siginfo->ssi_addr))
-#endif
+    if (kvm_on_sigbus(siginfo->ssi_code, (void *)(intptr_t)siginfo->ssi_addr)) {
         sigbus_reraise();
+    }
 }
 
 static void qemu_kvm_eat_signal(CPUState *env, int timeout)
@@ -579,10 +578,9 @@ static void qemu_kvm_eat_signal(CPUState *env, int timeout)
 
         switch (r) {
         case SIGBUS:
-#ifdef TARGET_I386
-            if (kvm_on_sigbus_vcpu(env, siginfo.si_code, siginfo.si_addr))
-#endif
+            if (kvm_on_sigbus_vcpu(env, siginfo.si_code, siginfo.si_addr)) {
                 sigbus_reraise();
+            }
             break;
         default:
             break;
diff --git a/kvm-all.c b/kvm-all.c
index 1a55a10..5bfa8c0 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1362,3 +1362,13 @@ int kvm_set_ioeventfd_pio_word(int fd, uint16_t addr, uint16_t val, bool assign)
     return -ENOSYS;
 #endif
 }
+
+int kvm_on_sigbus_vcpu(CPUState *env, int code, void *addr)
+{
+    return kvm_arch_on_sigbus_vcpu(env, code, addr);
+}
+
+int kvm_on_sigbus(int code, void *addr)
+{
+    return kvm_arch_on_sigbus(code, addr);
+}
diff --git a/kvm-stub.c b/kvm-stub.c
index 88682f2..d6b6c8e 100644
--- a/kvm-stub.c
+++ b/kvm-stub.c
@@ -147,6 +147,11 @@ int kvm_set_ioeventfd_mmio_long(int fd, uint32_t adr, uint32_t val, bool assign)
     return -ENOSYS;
 }
 
+int kvm_on_sigbus_vcpu(CPUState *env, int code, void *addr)
+{
+    return 1;
+}
+
 int kvm_on_sigbus(int code, void *addr)
 {
     return 1;
diff --git a/kvm.h b/kvm.h
index ca57517..b2fb5c6 100644
--- a/kvm.h
+++ b/kvm.h
@@ -81,6 +81,9 @@ int kvm_set_signal_mask(CPUState *env, const sigset_t *sigset);
 int kvm_pit_in_kernel(void);
 int kvm_irqchip_in_kernel(void);
 
+int kvm_on_sigbus_vcpu(CPUState *env, int code, void *addr);
+int kvm_on_sigbus(int code, void *addr);
+
 /* internal API */
 
 struct KVMState;
@@ -121,8 +124,8 @@ int kvm_arch_init_vcpu(CPUState *env);
 
 void kvm_arch_reset_vcpu(CPUState *env);
 
-int kvm_on_sigbus_vcpu(CPUState *env, int code, void *addr);
-int kvm_on_sigbus(int code, void *addr);
+int kvm_arch_on_sigbus_vcpu(CPUState *env, int code, void *addr);
+int kvm_arch_on_sigbus(int code, void *addr);
 
 struct kvm_guest_debug;
 struct kvm_debug_exit_arch;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 05010bb..9df8ff8 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -1839,7 +1839,7 @@ static void kvm_mce_inj_srao_memscrub2(CPUState *env, target_phys_addr_t paddr)
 
 #endif
 
-int kvm_on_sigbus_vcpu(CPUState *env, int code, void *addr)
+int kvm_arch_on_sigbus_vcpu(CPUState *env, int code, void *addr)
 {
 #if defined(KVM_CAP_MCE)
     void *vaddr;
@@ -1889,7 +1889,7 @@ int kvm_on_sigbus_vcpu(CPUState *env, int code, void *addr)
     return 0;
 }
 
-int kvm_on_sigbus(int code, void *addr)
+int kvm_arch_on_sigbus(int code, void *addr)
 {
 #if defined(KVM_CAP_MCE)
     if ((first_cpu->mcg_cap & MCG_SER_P) && addr && code == BUS_MCEERR_AO) {
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 710eca1..93ecc57 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -404,3 +404,13 @@ bool kvm_arch_stop_on_emulation_error(CPUState *env)
 {
     return true;
 }
+
+int kvm_arch_on_sigbus_vcpu(CPUState *env, int code, void *addr)
+{
+    return 1;
+}
+
+int kvm_arch_on_sigbus(int code, void *addr)
+{
+    return 1;
+}
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index 38823f5..1702c46 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -505,3 +505,13 @@ bool kvm_arch_stop_on_emulation_error(CPUState *env)
 {
     return true;
 }
+
+int kvm_arch_on_sigbus_vcpu(CPUState *env, int code, void *addr)
+{
+    return 1;
+}
+
+int kvm_arch_on_sigbus(int code, void *addr)
+{
+    return 1;
+}
-- 
1.7.1
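
The layering introduced by this patch (a generic kvm_on_sigbus entry
point that forwards to a per-architecture hook) can be sketched as
follows. All demo_* names and DEMO_CODE_RECOVERABLE are hypothetical. The
return convention (nonzero means "not handled, the caller re-raises") is
inferred from the callers in cpus.c, not from any documentation.

```c
#include <stddef.h>

/* Hypothetical signal code that the demo architecture can recover from. */
#define DEMO_CODE_RECOVERABLE 1

/*
 * Architecture hook. Returns 0 if the event was handled and nonzero if
 * the caller has to re-raise the signal. An architecture without any
 * support would simply return 1, like the ppc and s390x stubs.
 */
int demo_arch_on_sigbus(int code, void *addr)
{
    if (code == DEMO_CODE_RECOVERABLE && addr != NULL) {
        return 0;
    }
    return 1;
}

/* Generic entry point: no #ifdef needed, it just forwards to the hook. */
int demo_on_sigbus(int code, void *addr)
{
    return demo_arch_on_sigbus(code, addr);
}

/* Caller side. Returns 1 where the real code would call sigbus_reraise(). */
int demo_handle_sigbus(int code, void *addr)
{
    if (demo_on_sigbus(code, addr)) {
        return 1;
    }
    return 0;
}
```

With this split the caller compiles unchanged for every target, and only
the hook differs per architecture.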

* Re: [patch 17/28] posix-timers: Convert timer_gettime() to clockid_to_kclock()
From: john stultz @ 2011-02-01 21:23 UTC
  To: Thomas Gleixner; +Cc: LKML, Richard Cochran, Ingo Molnar, Peter Zijlstra
In-Reply-To: <20110201134419.101243181@linutronix.de>

On Tue, 2011-02-01 at 13:52 +0000, Thomas Gleixner wrote:
> plain text document attachment (posix-timers-convert-timer-get.patch)
> Set the common function for CLOCK_MONOTONIC and CLOCK_REALTIME kclocks
> and use the new decoding function. No need to check for the return
> value of it. If we have data corruption in the timer, we explode
> somewhere else anyway. Also all kclocks which implement timer_create()
> need to provide timer_gettime() as well.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Richard Cochran <richard.cochran@omicron.at>
[snip]
> -	CLOCK_DISPATCH(timr->it_clock, timer_get, (timr, &cur_setting));
> +	kc = clockid_to_kclock(timr->it_clock);
> +	kc->timer_get(timr, &cur_setting);

Null check again.

thanks
-john


* [Bug 33824] New: [r600g, tiling] mipmap rendering errors / block artifacts
From: bugzilla-daemon @ 2011-02-01 21:22 UTC
  To: dri-devel

https://bugs.freedesktop.org/show_bug.cgi?id=33824

           Summary: [r600g, tiling] mipmap rendering errors / block
                    artifacts
           Product: DRI
           Version: DRI CVS
          Platform: x86-64 (AMD64)
        OS/Version: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/Radeon
        AssignedTo: dri-devel@lists.freedesktop.org
        ReportedBy: liquid.acid@gmx.net


Hi there,

this is an update of bug #31046, where the main issue was GPU resets caused by
enabling tiling (R600_FORCE_TILING=1) with the r600g driver.

The situation has changed, however, and now R600_FORCE_TILING only causes mipmap
rendering errors in a few applications. Last time I tested ioquake3, doom3
(demo) and both ut2003 and ut2004.

ut03 and ut04 are still affected by the blocky rendering errors (see the other
bug for screenshots illustrating the problem), but both ioquake3 and doom3 now
look perfectly fine.

My hardware:
ATI Technologies Inc Radeon HD 4770 [RV740]

libdrm, mesa and xf86-video-ati are all git master tip.
ColorTiling is enabled in xorg.conf; I verified this by checking the Xorg
log file.
The kernel is d-r-t.

Greets,
Tobias

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

* Re: [patch 16/28] posix-timers: Convert timer_settime() to clockid_to_kclock()
From: John Stultz @ 2011-02-01 21:22 UTC
  To: Thomas Gleixner; +Cc: LKML, Richard Cochran, Ingo Molnar, Peter Zijlstra
In-Reply-To: <20110201134419.001863714@linutronix.de>

On Tue, 2011-02-01 at 13:52 +0000, Thomas Gleixner wrote:
> plain text document attachment (posix-timers-convert-timer-set.patch)
> Set the common function for CLOCK_MONOTONIC and CLOCK_REALTIME kclocks
> and use the new decoding function. No need to check for the return
> value of it. If we have data corruption in the timer, we explode
> somewhere else anyway. Also all kclocks which implement timer_create()
> need to provide timer_settime() as well.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Richard Cochran <richard.cochran@omicron.at>
[snip]
> @@ -818,8 +821,8 @@ retry:
>  	if (!timr)
>  		return -EINVAL;
> 
> -	error = CLOCK_DISPATCH(timr->it_clock, timer_set,
> -			       (timr, flags, &new_spec, rtn));
> +	kc = clockid_to_kclock(timr->it_clock);
> +	error = kc->timer_set(timr, flags, &new_spec, rtn);

Again, being paranoid, I would probably want a null check on kc here.
Also, as you suggested on irc, a WARN_ON_ONCE(), since it means kernel
data has been munged.
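
For illustration, the check suggested here could look roughly like the
following userspace sketch. The struct definitions, the two-entry clock
table and the warn_on_once() helper are simplified stand-ins invented for
this example, not the kernel's definitions; only the names
clockid_to_kclock, timer_set and it_clock come from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct k_itimer {
	int it_clock;
};

struct k_clock {
	int (*timer_set)(struct k_itimer *timr, int flags);
};

#define MAX_CLOCKS 2

static int common_timer_set(struct k_itimer *timr, int flags)
{
	(void)timr;
	(void)flags;
	return 0;
}

static struct k_clock posix_clocks[MAX_CLOCKS] = {
	{ .timer_set = common_timer_set },	/* id 0: fully set up */
	{ .timer_set = NULL },			/* id 1: no timer_set */
};

/* Model of the decoding helper: NULL for an id outside the table. */
static struct k_clock *clockid_to_kclock(int id)
{
	if (id < 0 || id >= MAX_CLOCKS)
		return NULL;
	return &posix_clocks[id];
}

/*
 * Userspace stand-in for WARN_ON_ONCE(): prints on the first hit only
 * and returns the condition. Unlike the kernel macro, this shares one
 * flag across all callers.
 */
static int warn_on_once(int cond, const char *what)
{
	static int warned;

	if (cond && !warned) {
		warned = 1;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

static int do_timer_settime(struct k_itimer *timr, int flags)
{
	struct k_clock *kc = clockid_to_kclock(timr->it_clock);

	/* A bad it_clock means the timer data was corrupted: warn and bail. */
	if (warn_on_once(!kc || !kc->timer_set, "bad kclock in timer"))
		return -EINVAL;

	return kc->timer_set(timr, flags);
}
```

With this shape a corrupted it_clock yields -EINVAL and a single warning
instead of a NULL dereference.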


thanks
-john


^ permalink raw reply

* Re: [PATCH 02/13] IP set core support
From: Jozsef Kadlecsik @ 2011-02-01 21:22 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netfilter-devel, Pablo Neira Ayuso
In-Reply-To: <alpine.DEB.2.00.1102012002430.24267@blackhole.kfki.hu>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3194 bytes --]

On Tue, 1 Feb 2011, Jozsef Kadlecsik wrote:

> On Tue, 1 Feb 2011, Patrick McHardy wrote:
> 
> > Am 31.01.2011 23:52, schrieb Jozsef Kadlecsik:
> > > +static int
> > > +call_ad(struct sk_buff *skb, struct ip_set *set,
> > > +	struct nlattr *tb[], enum ipset_adt adt,
> > > +	u32 flags, bool use_lineno)
> > > +{
> > > +	int ret, retried = 0;
> > > +	u32 lineno = 0;
> > > +	bool eexist = flags & IPSET_FLAG_EXIST;
> > > +
> > > +	do {
> > > +		write_lock_bh(&set->lock);
> > > +		ret = set->variant->uadt(set, tb, adt, &lineno, flags);
> > > +		write_unlock_bh(&set->lock);
> > > +	} while (ret == -EAGAIN &&
> > > +		 set->variant->resize &&
> > > +		 (ret = set->variant->resize(set, retried++)) == 0);
> > > +
> > > +	if (!ret || (ret == -IPSET_ERR_EXIST && eexist))
> > > +		return 0;
> > > +	if (lineno && use_lineno) {
> > > +		/* Error in restore/batch mode: send back lineno */
> > > +		struct nlmsghdr *nlh = nlmsg_hdr(skb);
> > > +		int min_len = NLMSG_SPACE(sizeof(struct nfgenmsg));
> > > +		struct nlattr *cda[IPSET_ATTR_CMD_MAX+1];
> > > +		struct nlattr *cmdattr = (void *)nlh + min_len;
> > > +		u32 *errline;
> > > +
> > > +		nla_parse(cda, IPSET_ATTR_CMD_MAX,
> > > +			  cmdattr, nlh->nlmsg_len - min_len,
> > > +			  ip_set_adt_policy);
> > > +
> > > +		errline = nla_data(cda[IPSET_ATTR_LINENO]);
> > > +
> > > +		*errline = lineno;
> > 
> > This is still not correct. I didn't mean to remove the const attributes
> > (the message is still considered const by the higher layers, the netlink
> > functions just cast this away). You're modifying the received message,
> > I don't see how this can be useful to userspace.
> 
> I can't find where the message is considered const in netlink/nfnetlink.
> It seems to be freely writable via skb.
>  
> > I guess you're relying on that the original message is appended to a
> > nlmsgerr message. That doesn't seem right though, if you want to return
> > something to userspace, you should construct a new message.
> 
> The message we are processing here carried multiple commands (each having 
> an attribute with the line number of the given command) and one failed 
> for some reason. We have to notify userspace which command, at what 
> line failed. For this reason the multi-command messages have got an 
> attribute, which can be filled out with the line number - that happens 
> here. The attribute is already there, the message is not enlarged, just
> the empty value is overwritten with the proper value.
> 
> The line number reporting works this way, tested in the testsuite too.
> 
> If I had to construct a completely new message and sent it, that'd be more 
> or less the duplication of netlink_ack. Additionally I had to suppress 
> netlink from sending an errmsg/ack too.

Hm, if I return a fake -EINTR to netlink, then I can construct and send the
error message manually and keep NLM_F_ACK at the same time. What do you think?
Please have a look at the attached patch.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@mail.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: errline.patch --]
[-- Type: TEXT/x-diff; name=errline.patch, Size: 3188 bytes --]

diff --git a/kernel/ip_set_core.c b/kernel/ip_set_core.c
index 19158bf..cc67d92 100644
--- a/kernel/ip_set_core.c
+++ b/kernel/ip_set_core.c
@@ -1103,7 +1103,7 @@ static const struct nla_policy ip_set_adt_policy[IPSET_ATTR_CMD_MAX + 1] = {
 };
 
 static int
-call_ad(struct sk_buff *skb, struct ip_set *set,
+call_ad(struct sock *ctnl, struct sk_buff *skb, struct ip_set *set,
 	struct nlattr *tb[], enum ipset_adt adt,
 	u32 flags, bool use_lineno)
 {
@@ -1123,12 +1123,25 @@ call_ad(struct sk_buff *skb, struct ip_set *set,
 		return 0;
 	if (lineno && use_lineno) {
 		/* Error in restore/batch mode: send back lineno */
-		struct nlmsghdr *nlh = nlmsg_hdr(skb);
+		struct nlmsghdr *rep, *nlh = nlmsg_hdr(skb);
+		struct sk_buff *skb2;
+		struct nlmsgerr *errmsg;
+		size_t payload = sizeof(*errmsg) + nlmsg_len(nlh);
 		int min_len = NLMSG_SPACE(sizeof(struct nfgenmsg));
 		struct nlattr *cda[IPSET_ATTR_CMD_MAX+1];
-		struct nlattr *cmdattr = (void *)nlh + min_len;
+		struct nlattr *cmdattr;
 		u32 *errline;
 
+		skb2 = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+		if (skb2 == NULL)
+			return -ENOMEM;
+		rep = __nlmsg_put(skb2, NETLINK_CB(skb).pid,
+				  nlh->nlmsg_seq, NLMSG_ERROR, payload, 0);
+		errmsg = nlmsg_data(rep);
+		errmsg->error = ret;
+		memcpy(&errmsg->msg, nlh, nlh->nlmsg_len);
+		cmdattr = (void *)&errmsg->msg + min_len;
+
 		nla_parse(cda, IPSET_ATTR_CMD_MAX,
 			  cmdattr, nlh->nlmsg_len - min_len,
 			  ip_set_adt_policy);
@@ -1136,6 +1149,9 @@ call_ad(struct sk_buff *skb, struct ip_set *set,
 		errline = nla_data(cda[IPSET_ATTR_LINENO]);
 
 		*errline = lineno;
+
+		netlink_unicast(ctnl, skb2, NETLINK_CB(skb).pid, MSG_DONTWAIT);
+		return -EINTR;
 	}
 
 	return ret;
@@ -1174,7 +1190,8 @@ ip_set_uadd(struct sock *ctnl, struct sk_buff *skb,
 				     attr[IPSET_ATTR_DATA],
 				     set->type->adt_policy))
 			return -IPSET_ERR_PROTOCOL;
-		ret = call_ad(skb, set, tb, IPSET_ADD, flags, use_lineno);
+		ret = call_ad(ctnl, skb, set, tb, IPSET_ADD, flags,
+			      use_lineno);
 	} else {
 		int nla_rem;
 
@@ -1185,7 +1202,7 @@ ip_set_uadd(struct sock *ctnl, struct sk_buff *skb,
 			    nla_parse_nested(tb, IPSET_ATTR_ADT_MAX, nla,
 					     set->type->adt_policy))
 				return -IPSET_ERR_PROTOCOL;
-			ret = call_ad(skb, set, tb, IPSET_ADD,
+			ret = call_ad(ctnl, skb, set, tb, IPSET_ADD,
 				      flags, use_lineno);
 			if (ret < 0)
 				return ret;
@@ -1227,7 +1244,8 @@ ip_set_udel(struct sock *ctnl, struct sk_buff *skb,
 				     attr[IPSET_ATTR_DATA],
 				     set->type->adt_policy))
 			return -IPSET_ERR_PROTOCOL;
-		ret = call_ad(skb, set, tb, IPSET_DEL, flags, use_lineno);
+		ret = call_ad(ctnl, skb, set, tb, IPSET_DEL, flags,
+			      use_lineno);
 	} else {
 		int nla_rem;
 
@@ -1238,7 +1256,7 @@ ip_set_udel(struct sock *ctnl, struct sk_buff *skb,
 			    nla_parse_nested(tb, IPSET_ATTR_ADT_MAX, nla,
 					     set->type->adt_policy))
 				return -IPSET_ERR_PROTOCOL;
-			ret = call_ad(skb, set, tb, IPSET_DEL,
+			ret = call_ad(ctnl, skb, set, tb, IPSET_DEL,
 				      flags, use_lineno);
 			if (ret < 0)
 				return ret;

^ permalink raw reply related

* Re: b44 driver causes panic when using swiotlb
From: Chuck Ebbert @ 2011-02-01 21:18 UTC (permalink / raw)
  To: FUJITA Tomonori; +Cc: hancockrwd, ak, linux-kernel, dwmw2
In-Reply-To: <20110201102707C.fujita.tomonori@lab.ntt.co.jp>

On Tue, 1 Feb 2011 10:28:00 +0900
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:

> 
> swiotlb allocates the bounce buffer when a system boots up. We can't
> allocate much in GFP_DMA. swiotlb uses somewhere under 4GB. So it
> can't help devices that have an odd dma_mask (that is, anything but 4GB).
> 
> Unfortunately, such devices need to do their own custom bouncing or need
> their subsystem to do that.

I think we're chasing the wrong problem here.

swiotlb uses alloc_bootmem_low_pages() to try to get buffers as low
in memory as possible. I asked someone who is hitting this bug to
try 2.6.36 and he reports the buffers really are low there:

2.6.36:  5c00000
2.6.37: db600000

So something happened very early in the 2.6.37-rc cycle that changed
this behavior. I tried looking at the bootmem code but could not see
the problem. The only related option I could find in .config was this:

# CONFIG_NO_BOOTMEM is not set

It was set this way in both .36 and .37.

^ permalink raw reply

* Re: Network performance with small packets
From: Michael S. Tsirkin @ 2011-02-01 21:21 UTC (permalink / raw)
  To: Shirley Ma; +Cc: David Miller, steved, netdev, kvm
In-Reply-To: <1296591908.26937.809.camel@localhost.localdomain>

On Tue, Feb 01, 2011 at 12:25:08PM -0800, Shirley Ma wrote:
> On Tue, 2011-02-01 at 22:17 +0200, Michael S. Tsirkin wrote:
> > On Tue, Feb 01, 2011 at 12:09:03PM -0800, Shirley Ma wrote:
> > > On Tue, 2011-02-01 at 19:23 +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Jan 27, 2011 at 01:30:38PM -0800, Shirley Ma wrote:
> > > > > On Thu, 2011-01-27 at 13:02 -0800, David Miller wrote:
> > > > > > > > Interesting. Could this be a variant of the now famous
> > > > > > > > bufferbloat then?
> > > > > > 
> > > > > > Sigh, bufferbloat is the new global warming... :-/ 
> > > > > 
> > > > > > Yep, some places become colder, some other places become warmer; same as
> > > > > > BW results, sometimes faster, sometimes slower. :)
> > > > > 
> > > > > Shirley
> > > > 
> > > > Sent a tuning patch (v2) that might help.
> > > > Could you try it and play with the module parameters please? 
> > > 
> > > Hello Michael,
> > > 
> > > Sure I will play with this patch to see how it could help. 
> > > 
> > > I am looking at guest side as well, I found a couple issues on guest
> > > side:
> > > 
> > > > 1. free_old_xmit_skbs() should return the number of skbs instead of the
> > > > total of sgs since we are using ring size to stop/start netif queue.
> > > static unsigned int free_old_xmit_skbs(struct virtnet_info *vi)
> > > {
> > >         struct sk_buff *skb;
> > >         unsigned int len, tot_sgs = 0;
> > > 
> > >         while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
> > >                 pr_debug("Sent skb %p\n", skb);
> > >                 vi->dev->stats.tx_bytes += skb->len;
> > >                 vi->dev->stats.tx_packets++;
> > >                 tot_sgs += skb_vnet_hdr(skb)->num_sg;
> > >                 dev_kfree_skb_any(skb);
> > >         }
> > >         return tot_sgs; <---- should return numbers of skbs to track
> > > ring usage here, I think;
> > > }
> > > 
> > > Did the old guest use number of buffers to track ring usage before?
> > > 
> > > > 2. In start_xmit, I think we should move capacity += free_old_xmit_skbs
> > > > before netif_stop_queue(); so we avoid unnecessary netif queue
> > > > stop/start. This condition is heavily hit for small message size.
> > > > 
> > > > Also the capacity checking condition should change to something like half
> > > > of the vring.num size, instead of comparing 2+MAX_SKB_FRAGS?
> > > 
> > >        if (capacity < 2+MAX_SKB_FRAGS) {
> > >                 netif_stop_queue(dev);
> > >                 if (unlikely(!virtqueue_enable_cb(vi->svq))) {
> > > >                         /* More just got used, free them then recheck. */
> > >                         capacity += free_old_xmit_skbs(vi);
> > >                         if (capacity >= 2+MAX_SKB_FRAGS) {
> > >                                 netif_start_queue(dev);
> > >                                 virtqueue_disable_cb(vi->svq);
> > >                         }
> > >                 }
> > >         }
> > > 
> > > > 3. Looks like the xmit callback is only used to wake the queue when the
> > > > queue has stopped, right? Should we put a condition check here?
> > > static void skb_xmit_done(struct virtqueue *svq)
> > > {
> > >         struct virtnet_info *vi = svq->vdev->priv;
> > > 
> > >         /* Suppress further interrupts. */
> > >         virtqueue_disable_cb(svq);
> > > 
> > >         /* We were probably waiting for more output buffers. */
> > > --->   if (netif_queue_stopped(vi->dev))
> > >         netif_wake_queue(vi->dev);
> > > }
> > > 
> > > 
> > > Shirley
> > 
> > Well the return value is used to calculate capacity and that counts
> > the # of s/g. No?
> 
> Nope, the current guest kernel uses descriptors not number of sgs.

Confused. We compare capacity to skb frags, no?
That's sg I think ...
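
To make the accounting argument concrete, here is a small userspace model.
It assumes every sg entry consumes one ring slot (no indirect descriptors),
uses an illustrative value for MAX_SKB_FRAGS, and the tx_ring struct and
its helpers are hypothetical, not the virtio_net code. Under that
assumption, adding the returned sg total back to capacity restores the
ring exactly, in the same units as the 2+MAX_SKB_FRAGS threshold.

```c
#define MAX_SKB_FRAGS	18	/* assumed value, for illustration only */
#define MAX_PENDING	8	/* hypothetical limit on in-flight skbs */

struct tx_ring {
	unsigned int capacity;			/* free slots, in sg entries */
	unsigned int pending_sgs[MAX_PENDING];	/* num_sg of each queued skb */
	unsigned int pending;			/* number of queued skbs */
};

/* Queue one skb needing num_sg slots; 0 on success, -1 if it does not fit. */
static int xmit(struct tx_ring *r, unsigned int num_sg)
{
	if (num_sg > r->capacity || r->pending == MAX_PENDING)
		return -1;
	r->capacity -= num_sg;
	r->pending_sgs[r->pending++] = num_sg;
	return 0;
}

/* Model of free_old_xmit_skbs(): reclaim everything, return the sg total. */
static unsigned int free_old_xmit_skbs(struct tx_ring *r)
{
	unsigned int tot_sgs = 0;

	while (r->pending > 0)
		tot_sgs += r->pending_sgs[--r->pending];
	return tot_sgs;
}

/* Same comparison as in start_xmit: stop when a worst-case skb may not fit. */
static int must_stop_queue(const struct tx_ring *r)
{
	return r->capacity < 2 + MAX_SKB_FRAGS;
}
```

If the helper returned the number of skbs instead, capacity would be
undercounted in this model whenever an skb used more than one sg entry.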

> not sure the old guest.
> 
> > From cache utilization POV it might be better to read from the skb and
> > not peek at virtio header though...
> > Pls Cc the lists on any discussions in the future.
> > 
> > -- 
> > MST
> 
> Sorry I missed reply all. :(
> 
> Shirley

^ permalink raw reply

* Re: [patch 15/28] posix-timers: Convert timer_create() to clockid_to_kclock()
From: john stultz @ 2011-02-01 21:20 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Richard Cochran, Ingo Molnar, Peter Zijlstra
In-Reply-To: <20110201134418.903604289@linutronix.de>

On Tue, 2011-02-01 at 13:51 +0000, Thomas Gleixner wrote:
> plain text document attachment
> (posix-timers-convert-clock-timer-create.patch)
> Setup timer_create for CLOCK_MONOTONIC and CLOCK_REALTIME kclocks and
> remove the no_timer_create() implementation.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Richard Cochran <richard.cochran@omicron.at>

Acked-by: John Stultz <johnstul@us.ibm.com>



^ permalink raw reply

* [Buildroot] [PATCH] Makefile.package.in: fix upper case $(PKG)_SITE_METHOD
From: Bjørn Forsman @ 2011-02-01 21:19 UTC (permalink / raw)
  To: buildroot
In-Reply-To: <20110201220016.5a34b56d@surf>

On 1 February 2011 22:00, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> On Tue, 1 Feb 2011 21:51:38 +0100
> Daniel Nyström <daniel.nystrom@timeterminal.se> wrote:
>
>> Maybe, after all, this is a special case where both upper and lower
>> case should work?
>
> Hum, why would it be necessary to accept upper case spelling ?

The BR documentation uses upper case spelling. So currently
implementation != documentation.

^ permalink raw reply

* Re: [patch 14/28] posix-timers: Remove useless res field from k_clock
From: john stultz @ 2011-02-01 21:18 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Richard Cochran, Ingo Molnar, Peter Zijlstra
In-Reply-To: <20110201134418.808714587@linutronix.de>

On Tue, 2011-02-01 at 13:51 +0000, Thomas Gleixner wrote:
> plain text document attachment
> (posix-timers-remove-unused-res-field.patch)
> The res member of kclock is only used by mmtimer.c, but even there it
> contains redundant information. Remove the field and fixup mmtimer.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Richard Cochran <richard.cochran@omicron.at>

Acked-by: John Stultz <johnstul@us.ibm.com>



^ permalink raw reply

* Re: [PATCH] depca: Fix warnings
From: David Miller @ 2011-02-01 21:19 UTC (permalink / raw)
  To: alan; +Cc: netdev
In-Reply-To: <20110201113214.24608.34659.stgit@bob.linux.org.uk>

From: Alan Cox <alan@lxorguk.ukuu.org.uk>
Date: Tue, 01 Feb 2011 11:32:29 +0000

> From: Alan Cox <alan@linux.intel.com>
> 
> Replace the rather weird use of ++ with + 1 as the value is being assigned
> 
> Signed-off-by: Alan Cox <alan@linux.intel.com>

Applied, thanks Alan.

^ permalink raw reply

* Re: [PATCH 02/18] wl1251: fix 4-byte TX buffer alignment
From: Kalle Valo @ 2011-02-01 21:18 UTC (permalink / raw)
  To: David Gnedt
  Cc: John W. Linville, linux-wireless, Grazvydas Ignotas,
	Denis 'GNUtoo' Carikli
In-Reply-To: <4D45B7B8.1060607@davizone.at>

David Gnedt <david.gnedt@davizone.at> writes:

> This implements TX buffer alignment for cloned or too small skb by
> copying and replacing the original skb.
> Recent changes in wireless-testing seems to make this really necessary.

I have been hit by this as well, thanks for fixing it.

> Signed-off-by: David Gnedt <david.gnedt@davizone.at>

Acked-by: Kalle Valo <kvalo@adurom.com>

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH 1/2] compat-wireless: remove compat for threaded_irq in rt2x00
From: Helmut Schaa @ 2011-02-01 21:17 UTC (permalink / raw)
  To: Hauke Mehrtens; +Cc: mcgrof, linux-wireless
In-Reply-To: <1296594827-30742-1-git-send-email-hauke@hauke-m.de>

Am Dienstag, 1. Februar 2011 schrieb Hauke Mehrtens:
> rt2x00 does not use threaded_irq any more.

Hehe, sorry for that but the interrupt threading caused performance
issues on embedded devices and also introduced some nasty "bugs" on slow
platforms. Hence, we moved everything to per IRQ tasklets.

Thanks a lot Hauke!

Helmut

> Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
> ---
>  patches/09-threaded-irq.patch |   56 -----------------------------------------
>  1 files changed, 0 insertions(+), 56 deletions(-)
> 
> diff --git a/patches/09-threaded-irq.patch b/patches/09-threaded-irq.patch
> index 059e58e..df164e9 100644
> --- a/patches/09-threaded-irq.patch
> +++ b/patches/09-threaded-irq.patch
> @@ -61,59 +61,3 @@ thread in process context as well.
>   };
>   
>   /* Data structure for the WLAN parts (802.11 cores) of the b43 chip. */
> ---- a/drivers/net/wireless/rt2x00/rt2x00.h
> -+++ b/drivers/net/wireless/rt2x00/rt2x00.h
> -@@ -901,6 +901,10 @@ struct rt2x00_dev {
> - 	 * Tasklet for processing tx status reports (rt2800pci).
> - 	 */
> - 	struct tasklet_struct txstatus_tasklet;
> -+
> -+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,31)
> -+	struct compat_threaded_irq irq_compat;
> -+#endif
> - };
> - 
> - /*
> ---- a/drivers/net/wireless/rt2x00/rt2x00pci.c
> -+++ b/drivers/net/wireless/rt2x00/rt2x00pci.c
> -@@ -160,10 +160,18 @@ int rt2x00pci_initialize(struct rt2x00_d
> - 	/*
> - 	 * Register interrupt handler.
> - 	 */
> -+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,31)
> -+	status = compat_request_threaded_irq(&rt2x00dev->irq_compat,
> -+					  rt2x00dev->irq,
> -+					  rt2x00dev->ops->lib->irq_handler,
> -+					  rt2x00dev->ops->lib->irq_handler_thread,
> -+					  IRQF_SHARED, rt2x00dev->name, rt2x00dev);
> -+#else
> - 	status = request_threaded_irq(rt2x00dev->irq,
> - 				      rt2x00dev->ops->lib->irq_handler,
> - 				      rt2x00dev->ops->lib->irq_handler_thread,
> - 				      IRQF_SHARED, rt2x00dev->name, rt2x00dev);
> -+#endif
> - 	if (status) {
> - 		ERROR(rt2x00dev, "IRQ %d allocation failed (error %d).\n",
> - 		      rt2x00dev->irq, status);
> -@@ -187,7 +195,11 @@ void rt2x00pci_uninitialize(struct rt2x0
> - 	/*
> - 	 * Free irq line.
> - 	 */
> -+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,31)
> -+	compat_free_threaded_irq(&rt2x00dev->irq_compat);
> -+#else
> - 	free_irq(rt2x00dev->irq, rt2x00dev);
> -+#endif
> - 
> - 	/*
> - 	 * Free DMA
> -@@ -202,6 +214,9 @@ EXPORT_SYMBOL_GPL(rt2x00pci_uninitialize
> -  */
> - static void rt2x00pci_free_reg(struct rt2x00_dev *rt2x00dev)
> - {
> -+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,31)
> -+	compat_destroy_threaded_irq(&rt2x00dev->irq_compat);
> -+#endif
> - 	kfree(rt2x00dev->rf);
> - 	rt2x00dev->rf = NULL;
> - 
> 


^ permalink raw reply

* Re: What's the typical RAID10 setup?
From: David Brown @ 2011-02-01 21:18 UTC (permalink / raw)
  To: linux-raid
In-Reply-To: <20110201160245.GA25659@www2.open-std.org>

On 01/02/11 17:02, Keld Jørn Simonsen wrote:
> On Tue, Feb 01, 2011 at 11:01:33AM +0100, David Brown wrote:
>> On 31/01/2011 23:52, Keld Jørn Simonsen wrote:
>>> raid1+0 and Linux MD raid10 are similar, but significantly different
>>> in a number of ways. Linux MD raid10 can run on only 2 drives.
>>> Linux raid10,f2 has almost RAID0 striping performance in sequential read.
>>> You can have an odd number of drives in raid10.
>>> And you can have as many copies as you like in raid10,
>>>
>>
>> You can make raid10,f2 functionality from raid1+0 by using partitions.
>> For example, to get a raid10,f2 equivalent on two drives, partition them
>> into equal halves.  Then make md0 a raid1 mirror of sda1 and sdb2, and
>> md1 a raid1 mirror of sdb1 and sda2.  Finally, make md2 a raid0 stripe
>> set of md0 and md1.
>
> I don't think you get the striping performance of raid10,f2 with this
> layout. And that is one of the main advantages of raid10,f2 layout.
> Have you tried it out?

No, I haven't tried it yet.  I've got four disks in this PC with an 
empty partition on each specifically for testing such things, but I 
haven't taken the time to try it properly.

But I believe you will get the striping performance - the two raid1 
parts are striped together as raid0, and they can both be accessed in 
parallel.

>
> As far as I can see the layout of blocks are not alternating between the
> disks. You have one raid1 of sda1 and sdb2, there a file is allocated on
> blocks sequentially on sda1 and then mirrored on sdb2, where it is also
> sequentially allocated. That gives no striping.
>

Suppose your data blocks are 0, 1, 2, 3, ... where each block is half a 
raid0 stripe.  Then the arrangement of this data on raid10,f2 is:

sda: 0 2 4 6 .... 1 3 5 7 ....
sdb: 1 3 5 7 .... 0 2 4 6 ....

The arrangement inside my md2 is (striped but not mirrored) :

md0: 0 2 4 6 ....
md1: 1 3 5 7 ....

Inside md0 (mirrored) is then:
sda1: 0 2 4 6 ....
sdb2: 0 2 4 6 ....

Inside md1 (mirrored) it is:
sdb1: 1 3 5 7 ....
sda2: 1 3 5 7 ....

Thus inside the disks themselves you have
sda: 0 2 4 6 .... 1 3 5 7 ....
sdb: 1 3 5 7 .... 0 2 4 6 ....
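
The equivalence of the two placements can be checked mechanically. The
sketch below is only a model of the tables above: it assumes two drives, a
chunk size of one block, and drives split into two equal halves; the
struct and function names are made up for the example. It says nothing
about how md schedules reads, which is what the striping performance
question depends on.

```c
#include <stdbool.h>

/* Where one copy of a block lives. */
struct location {
	int disk;	/* 0 = sda, 1 = sdb */
	int half;	/* 0 = first half of the disk, 1 = second half */
	long offset;	/* block offset within that half */
};

/* raid10,f2 on two drives, following the layout table above. */
static void raid10_f2(long block, struct location copy[2])
{
	copy[0] = (struct location){ (int)(block % 2), 0, block / 2 };
	copy[1] = (struct location){ (int)((block + 1) % 2), 1, block / 2 };
}

/* md2 = raid0 over md0 = raid1(sda1, sdb2) and md1 = raid1(sdb1, sda2). */
static void nested_raid1_0(long block, struct location copy[2])
{
	int md = (int)(block % 2);	/* raid0 alternates md0, md1 */
	long offset = block / 2;

	/*
	 * md0 mirrors sda1/sdb2 and md1 mirrors sdb1/sda2, so the
	 * first-half copy is on disk `md` and the other copy is in the
	 * second half of the remaining disk.
	 */
	copy[0] = (struct location){ md, 0, offset };
	copy[1] = (struct location){ 1 - md, 1, offset };
}

static bool same_location(struct location a, struct location b)
{
	return a.disk == b.disk && a.half == b.half && a.offset == b.offset;
}

/* True if both schemes place the first n blocks identically. */
static bool layouts_match(long n)
{
	for (long b = 0; b < n; b++) {
		struct location f2[2], nested[2];

		raid10_f2(b, f2);
		nested_raid1_0(b, nested);
		if (!same_location(f2[0], nested[0]) ||
		    !same_location(f2[1], nested[1]))
			return false;
	}
	return true;
}
```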


>> I don't think there is any way you can get the equivalent of raid10,o2
>> in this way.  But then, I am not sure how much use raid10,o2 actually is
>> - are there any usage patterns for which it is faster than raid10,n2 or
>> raid10,f2?
>
> In theory raid10,o2 should have better performance on SSD's because of
> the low latency, and raid10,o2 doing multireading from each drive, which
> raid10,n2 does not.
>

I think it should beat raid10,n2 for some things - because of 
multireading.  But I don't see it being faster than raid10,f2, which 
multi-reads even better.  In particular with SSD's, the disadvantage of 
raid10,f2 - the large head movements on writes - disappears.

> We lack some evidence from benchmarks, tho.
>

Indeed.

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [patch 13/28] posix-timers: Convert clock_getres() to clockid_to_kclock()
From: john stultz @ 2011-02-01 21:17 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Richard Cochran, Ingo Molnar, Peter Zijlstra
In-Reply-To: <20110201134418.709802797@linutronix.de>

On Tue, 2011-02-01 at 13:51 +0000, Thomas Gleixner wrote:
> plain text document attachment
> (posix-timers-convert-clock-getres.patch)
> Use the new kclock decoding. Fixup the fallout in mmtimer.c
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Richard Cochran <richard.cochran@omicron.at>

Acked-by: John Stultz <johnstul@us.ibm.com>



^ permalink raw reply

