[PATCH -qemu] nvme: support Google vendor extension

All of lore.kernel.org
 help / color / mirror / Atom feed

From: pbonzini@redhat.com (Paolo Bonzini)
Subject: [PATCH -qemu] nvme: support Google vendor extension
Date: Sat, 21 Nov 2015 13:56:16 +0100	[thread overview]
Message-ID: <565069F0.5000805@redhat.com> (raw)
In-Reply-To: <1448060745.6565.1.camel@ssi>



On 21/11/2015 00:05, Ming Lin wrote:
> [    1.752129] Freeing unused kernel memory: 420K (ffff880001b97000 - ffff880001c00000)
> [    1.986573] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x30e5c9bbf83, max_idle_ns: 440795378954 ns
> [    1.988187] clocksource: Switched to clocksource tsc
> [    3.235423] clocksource: timekeeping watchdog: Marking clocksource 'tsc' as unstable because the skew is too large:
> [    3.358713] clocksource:                       'refined-jiffies' wd_now: fffeddf3 wd_last: fffedd76 mask: ffffffff
> [    3.410013] clocksource:                       'tsc' cs_now: 3c121d4ec cs_last: 340888eb7 mask: ffffffffffffffff
> [    3.450026] clocksource: Switched to clocksource refined-jiffies
> [    7.696769] Adding 392188k swap on /dev/vda5.  Priority:-1 extents:1 across:392188k 
> [    7.902174] EXT4-fs (vda1): re-mounted. Opts: (null)
> [    8.734178] EXT4-fs (vda1): re-mounted. Opts: errors=remount-ro
> 
> Then it doesn't response input for almost 1 minute.
> Without this patch, kernel loads quickly.

Interesting.  I guess there's time to debug it, since QEMU 2.6 is still 
a few months away.  In the meanwhile we can apply your patch as is, 
apart from disabling the "if (new_head >= cq->size)" and the similar 
one for "if (new_ tail >= sq->size".

But, I have a possible culprit.  In your nvme_cq_notifier you are not doing the 
equivalent of:

	start_sqs = nvme_cq_full(cq) ? 1 : 0;
        cq->head = new_head;
        if (start_sqs) {
            NvmeSQueue *sq;
            QTAILQ_FOREACH(sq, &cq->sq_list, entry) {
                timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
            }
            timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
        }

Instead, you are just calling nvme_post_cqes, which is the equivalent of

	timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);

Adding a loop to nvme_cq_notifier, and having it call nvme_process_sq, might
fix the weird 1-minute delay.

Paolo

> void memory_region_add_eventfd(MemoryRegion *mr,
>                                hwaddr addr,
>                                unsigned size,
>                                bool match_data,
>                                uint64_t data,
>                                EventNotifier *e)
> 
> Could you help to explain what "match_data" and "data" mean?

If match_data is true, the eventfd is only signalled if "data" is being written to memory.

Paolo

WARNING: multiple messages have this Message-ID (diff)

From: Paolo Bonzini <pbonzini@redhat.com>
To: Ming Lin <mlin@kernel.org>
Cc: fes@google.com, axboe@fb.com, tytso@mit.edu,
	qemu-devel@nongnu.org, linux-nvme@lists.infradead.org,
	virtualization@lists.linux-foundation.org, keith.busch@intel.com,
	Rob Nelson <rlnelson@google.com>, Christoph Hellwig <hch@lst.de>,
	Mihai Rusu <dizzy@google.com>
Subject: Re: [PATCH -qemu] nvme: support Google vendor extension
Date: Sat, 21 Nov 2015 13:56:16 +0100	[thread overview]
Message-ID: <565069F0.5000805@redhat.com> (raw)
In-Reply-To: <1448060745.6565.1.camel@ssi>



On 21/11/2015 00:05, Ming Lin wrote:
> [    1.752129] Freeing unused kernel memory: 420K (ffff880001b97000 - ffff880001c00000)
> [    1.986573] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x30e5c9bbf83, max_idle_ns: 440795378954 ns
> [    1.988187] clocksource: Switched to clocksource tsc
> [    3.235423] clocksource: timekeeping watchdog: Marking clocksource 'tsc' as unstable because the skew is too large:
> [    3.358713] clocksource:                       'refined-jiffies' wd_now: fffeddf3 wd_last: fffedd76 mask: ffffffff
> [    3.410013] clocksource:                       'tsc' cs_now: 3c121d4ec cs_last: 340888eb7 mask: ffffffffffffffff
> [    3.450026] clocksource: Switched to clocksource refined-jiffies
> [    7.696769] Adding 392188k swap on /dev/vda5.  Priority:-1 extents:1 across:392188k 
> [    7.902174] EXT4-fs (vda1): re-mounted. Opts: (null)
> [    8.734178] EXT4-fs (vda1): re-mounted. Opts: errors=remount-ro
> 
> Then it doesn't response input for almost 1 minute.
> Without this patch, kernel loads quickly.

Interesting.  I guess there's time to debug it, since QEMU 2.6 is still 
a few months away.  In the meanwhile we can apply your patch as is, 
apart from disabling the "if (new_head >= cq->size)" and the similar 
one for "if (new_ tail >= sq->size".

But, I have a possible culprit.  In your nvme_cq_notifier you are not doing the 
equivalent of:

	start_sqs = nvme_cq_full(cq) ? 1 : 0;
        cq->head = new_head;
        if (start_sqs) {
            NvmeSQueue *sq;
            QTAILQ_FOREACH(sq, &cq->sq_list, entry) {
                timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
            }
            timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
        }

Instead, you are just calling nvme_post_cqes, which is the equivalent of

	timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);

Adding a loop to nvme_cq_notifier, and having it call nvme_process_sq, might
fix the weird 1-minute delay.

Paolo

> void memory_region_add_eventfd(MemoryRegion *mr,
>                                hwaddr addr,
>                                unsigned size,
>                                bool match_data,
>                                uint64_t data,
>                                EventNotifier *e)
> 
> Could you help to explain what "match_data" and "data" mean?

If match_data is true, the eventfd is only signalled if "data" is being written to memory.

Paolo

WARNING: multiple messages have this Message-ID (diff)

From: Paolo Bonzini <pbonzini@redhat.com>
To: Ming Lin <mlin@kernel.org>
Cc: fes@google.com, axboe@fb.com, tytso@mit.edu,
	qemu-devel@nongnu.org, linux-nvme@lists.infradead.org,
	virtualization@lists.linux-foundation.org, keith.busch@intel.com,
	Rob Nelson <rlnelson@google.com>, Christoph Hellwig <hch@lst.de>,
	Mihai Rusu <dizzy@google.com>
Subject: Re: [Qemu-devel] [PATCH -qemu] nvme: support Google vendor extension
Date: Sat, 21 Nov 2015 13:56:16 +0100	[thread overview]
Message-ID: <565069F0.5000805@redhat.com> (raw)
In-Reply-To: <1448060745.6565.1.camel@ssi>



On 21/11/2015 00:05, Ming Lin wrote:
> [    1.752129] Freeing unused kernel memory: 420K (ffff880001b97000 - ffff880001c00000)
> [    1.986573] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x30e5c9bbf83, max_idle_ns: 440795378954 ns
> [    1.988187] clocksource: Switched to clocksource tsc
> [    3.235423] clocksource: timekeeping watchdog: Marking clocksource 'tsc' as unstable because the skew is too large:
> [    3.358713] clocksource:                       'refined-jiffies' wd_now: fffeddf3 wd_last: fffedd76 mask: ffffffff
> [    3.410013] clocksource:                       'tsc' cs_now: 3c121d4ec cs_last: 340888eb7 mask: ffffffffffffffff
> [    3.450026] clocksource: Switched to clocksource refined-jiffies
> [    7.696769] Adding 392188k swap on /dev/vda5.  Priority:-1 extents:1 across:392188k 
> [    7.902174] EXT4-fs (vda1): re-mounted. Opts: (null)
> [    8.734178] EXT4-fs (vda1): re-mounted. Opts: errors=remount-ro
> 
> Then it doesn't response input for almost 1 minute.
> Without this patch, kernel loads quickly.

Interesting.  I guess there's time to debug it, since QEMU 2.6 is still 
a few months away.  In the meanwhile we can apply your patch as is, 
apart from disabling the "if (new_head >= cq->size)" and the similar 
one for "if (new_ tail >= sq->size".

But, I have a possible culprit.  In your nvme_cq_notifier you are not doing the 
equivalent of:

	start_sqs = nvme_cq_full(cq) ? 1 : 0;
        cq->head = new_head;
        if (start_sqs) {
            NvmeSQueue *sq;
            QTAILQ_FOREACH(sq, &cq->sq_list, entry) {
                timer_mod(sq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
            }
            timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
        }

Instead, you are just calling nvme_post_cqes, which is the equivalent of

	timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);

Adding a loop to nvme_cq_notifier, and having it call nvme_process_sq, might
fix the weird 1-minute delay.

Paolo

> void memory_region_add_eventfd(MemoryRegion *mr,
>                                hwaddr addr,
>                                unsigned size,
>                                bool match_data,
>                                uint64_t data,
>                                EventNotifier *e)
> 
> Could you help to explain what "match_data" and "data" mean?

If match_data is true, the eventfd is only signalled if "data" is being written to memory.

Paolo

next prev parent reply	other threads:[~2015-11-21 12:56 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-18  5:47 [RFC PATCH 0/2] Google extension to improve qemu-nvme performance Ming Lin
2015-11-18  5:47 ` [Qemu-devel] " Ming Lin
2015-11-18  5:47 ` Ming Lin
2015-11-18  5:47 ` [PATCH -kernel] nvme: improve performance for virtual NVMe devices Ming Lin
2015-11-18  5:47   ` [Qemu-devel] " Ming Lin
2015-11-18  5:47   ` Ming Lin
2015-11-18  5:47 ` [PATCH -qemu] nvme: support Google vendor extension Ming Lin
2015-11-18  5:47   ` [Qemu-devel] " Ming Lin
2015-11-18  5:47   ` Ming Lin
2015-11-19 10:37   ` Paolo Bonzini
2015-11-19 10:37     ` [Qemu-devel] " Paolo Bonzini
2015-11-19 10:37     ` Paolo Bonzini
2015-11-20  8:11     ` Ming Lin
2015-11-20  8:11       ` [Qemu-devel] " Ming Lin
2015-11-20  8:11       ` Ming Lin
2015-11-20  8:58       ` Paolo Bonzini
2015-11-20  8:58         ` [Qemu-devel] " Paolo Bonzini
2015-11-20  8:58         ` Paolo Bonzini
2015-11-20 23:05         ` Ming Lin
2015-11-20 23:05           ` [Qemu-devel] " Ming Lin
2015-11-20 23:05           ` Ming Lin
2015-11-21 12:56           ` Paolo Bonzini [this message]
2015-11-21 12:56             ` [Qemu-devel] " Paolo Bonzini
2015-11-21 12:56             ` Paolo Bonzini
2015-11-22  7:45             ` Ming Lin
2015-11-22  7:45               ` [Qemu-devel] " Ming Lin
2015-11-22  7:45               ` Ming Lin
2015-11-24  6:29               ` Ming Lin
2015-11-24  6:29                 ` [Qemu-devel] " Ming Lin
2015-11-24  6:29                 ` Ming Lin
2015-11-24 11:01                 ` Paolo Bonzini
2015-11-24 11:01                   ` [Qemu-devel] " Paolo Bonzini
2015-11-24 11:01                   ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=565069F0.5000805@redhat.com \
    --to=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.