qemu-devel.nongnu.org archive mirror
From: "Benoît Canet" <benoit.canet@irqsave.net>
To: Kevin Wolf <kwolf@redhat.com>
Cc: pl@kamp.de, qemu-devel@nongnu.org, mreitz@redhat.com,
	stefanha@redhat.com, pbonzini@redhat.com,
	xiawenc@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCH v3 17/29] block: Generalise and optimise COR serialisation
Date: Wed, 22 Jan 2014 21:00:26 +0100
Message-ID: <20140122200026.GC3053@irqsave.net>
In-Reply-To: <1389968119-24771-18-git-send-email-kwolf@redhat.com>

On Friday 17 Jan 2014 at 15:15:07 (+0100), Kevin Wolf wrote:
> Change the API so that specific requests can be marked serialising. Only

I find the spelling "serialising" rather than "serializing" really odd, since
QEMU code is full of "z" spellings such as "virtualization".

Reviewed-by: Benoit Canet <benoit@irqsave.net>

> these requests are checked for overlaps then.
> 
> This means that during a Copy on Read operation, not all requests
> overlapping other requests are serialised any more, but only those that
> actually overlap with the specific COR request.
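
To make the rule described in the paragraph above concrete, here is a minimal
standalone sketch of the wait decision as I read it from the patch. It is not
QEMU code: SketchRequest and the sketch_* helpers are hypothetical names, the
cluster rounding done by round_bytes_to_clusters() is left out, and the real
function blocks on a wait queue instead of returning a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for BdrvTrackedRequest (hypothetical, not QEMU's). */
typedef struct SketchRequest {
    int64_t offset;
    unsigned int bytes;
    bool serialising;
} SketchRequest;

/* Two byte ranges overlap unless one ends before the other starts. */
static bool sketch_overlaps(const SketchRequest *a, const SketchRequest *b)
{
    int64_t a_end = a->offset + a->bytes;
    int64_t b_end = b->offset + b->bytes;

    return a->offset < b_end && b->offset < a_end;
}

/*
 * Mirrors the skip condition from the patch: a pair of requests is only
 * considered when at least one of the two is marked serialising.
 */
static bool sketch_must_wait(const SketchRequest *self,
                             const SketchRequest *req)
{
    if (req == self || (!req->serialising && !self->serialising)) {
        return false;
    }
    return sketch_overlaps(self, req);
}

/* Count the in-flight requests that self would have to wait for. */
static size_t sketch_count_waits(const SketchRequest *self,
                                 const SketchRequest *reqs, size_t n)
{
    size_t waits = 0;

    for (size_t i = 0; i < n; i++) {
        if (sketch_must_wait(self, &reqs[i])) {
            waits++;
        }
    }
    return waits;
}
```

With one serialising request covering bytes 0..4095, a plain request inside
that range has to wait, while two plain requests that overlap only each other
do not.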
> 
> Also remove COR from function and variable names because this
> functionality can be useful in other contexts.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> ---
>  block.c                   | 48 ++++++++++++++++++++++++++++-------------------
>  include/block/block_int.h |  5 +++--
>  2 files changed, 32 insertions(+), 21 deletions(-)
> 
> diff --git a/block.c b/block.c
> index d7156ce..efa8979 100644
> --- a/block.c
> +++ b/block.c
> @@ -2028,6 +2028,10 @@ int bdrv_commit_all(void)
>   */
>  static void tracked_request_end(BdrvTrackedRequest *req)
>  {
> +    if (req->serialising) {
> +        req->bs->serialising_in_flight--;
> +    }
> +
>      QLIST_REMOVE(req, list);
>      qemu_co_queue_restart_all(&req->wait_queue);
>  }
> @@ -2042,10 +2046,11 @@ static void tracked_request_begin(BdrvTrackedRequest *req,
>  {
>      *req = (BdrvTrackedRequest){
>          .bs = bs,
> -        .offset = offset,
> -        .bytes = bytes,
> -        .is_write = is_write,
> -        .co = qemu_coroutine_self(),
> +        .offset         = offset,
> +        .bytes          = bytes,
> +        .is_write       = is_write,
> +        .co             = qemu_coroutine_self(),
> +        .serialising    = false,
>      };
>  
>      qemu_co_queue_init(&req->wait_queue);
> @@ -2053,6 +2058,14 @@ static void tracked_request_begin(BdrvTrackedRequest *req,
>      QLIST_INSERT_HEAD(&bs->tracked_requests, req, list);
>  }
>  
> +static void mark_request_serialising(BdrvTrackedRequest *req)
> +{
> +    if (!req->serialising) {
> +        req->bs->serialising_in_flight++;
> +        req->serialising = true;
> +    }
> +}
> +
>  /**
>   * Round a region to cluster boundaries
>   */
> @@ -2105,26 +2118,31 @@ static bool tracked_request_overlaps(BdrvTrackedRequest *req,
>      return true;
>  }
>  
> -static void coroutine_fn wait_for_overlapping_requests(BlockDriverState *bs,
> -        BdrvTrackedRequest *self, int64_t offset, unsigned int bytes)
> +static void coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
>  {
> +    BlockDriverState *bs = self->bs;
>      BdrvTrackedRequest *req;
>      int64_t cluster_offset;
>      unsigned int cluster_bytes;
>      bool retry;
>  
> +    if (!bs->serialising_in_flight) {
> +        return;
> +    }
> +
>      /* If we touch the same cluster it counts as an overlap.  This guarantees
>       * that allocating writes will be serialized and not race with each other
>       * for the same cluster.  For example, in copy-on-read it ensures that the
>       * CoR read and write operations are atomic and guest writes cannot
>       * interleave between them.
>       */
> -    round_bytes_to_clusters(bs, offset, bytes, &cluster_offset, &cluster_bytes);
> +    round_bytes_to_clusters(bs, self->offset, self->bytes,
> +                            &cluster_offset, &cluster_bytes);
>  
>      do {
>          retry = false;
>          QLIST_FOREACH(req, &bs->tracked_requests, list) {
> -            if (req == self) {
> +            if (req == self || (!req->serialising && !self->serialising)) {
>                  continue;
>              }
>              if (tracked_request_overlaps(req, cluster_offset, cluster_bytes)) {
> @@ -2743,12 +2761,10 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
>  
>      /* Handle Copy on Read and associated serialisation */
>      if (flags & BDRV_REQ_COPY_ON_READ) {
> -        bs->copy_on_read_in_flight++;
> +        mark_request_serialising(req);
>      }
>  
> -    if (bs->copy_on_read_in_flight) {
> -        wait_for_overlapping_requests(bs, req, offset, bytes);
> -    }
> +    wait_serialising_requests(req);
>  
>      if (flags & BDRV_REQ_COPY_ON_READ) {
>          int pnum;
> @@ -2797,10 +2813,6 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
>      }
>  
>  out:
> -    if (flags & BDRV_REQ_COPY_ON_READ) {
> -        bs->copy_on_read_in_flight--;
> -    }
> -
>      return ret;
>  }
>  
> @@ -2999,9 +3011,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BlockDriverState *bs,
>      assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
>      assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
>  
> -    if (bs->copy_on_read_in_flight) {
> -        wait_for_overlapping_requests(bs, req, offset, bytes);
> -    }
> +    wait_serialising_requests(req);
>  
>      ret = notifier_with_return_list_notify(&bs->before_write_notifiers, req);
>  
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 97a4d23..d8443df 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -60,6 +60,7 @@ typedef struct BdrvTrackedRequest {
>      int64_t offset;
>      unsigned int bytes;
>      bool is_write;
> +    bool serialising;
>      QLIST_ENTRY(BdrvTrackedRequest) list;
>      Coroutine *co; /* owner, used for deadlock detection */
>      CoQueue wait_queue; /* coroutines blocked on this request */
> @@ -296,8 +297,8 @@ struct BlockDriverState {
>      /* Callback before write request is processed */
>      NotifierWithReturnList before_write_notifiers;
>  
> -    /* number of in-flight copy-on-read requests */
> -    unsigned int copy_on_read_in_flight;
> +    /* number of in-flight serialising requests */
> +    unsigned int serialising_in_flight;
>  
>      /* I/O throttling */
>      ThrottleState throttle_state;
> -- 
> 1.8.1.4
> 
> 
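
As a reading aid for the counter handling in the patch, a minimal standalone
sketch of the serialising_in_flight accounting follows. The SketchState and
SketchTracked types and the sketch_* functions are hypothetical stand-ins for
BlockDriverState, BdrvTrackedRequest and the three helpers touched by the
patch; list handling and coroutine wake-ups are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the one BlockDriverState field the patch renames. */
typedef struct SketchState {
    unsigned int serialising_in_flight;
} SketchState;

/* Stand-in for the parts of BdrvTrackedRequest used by the accounting. */
typedef struct SketchTracked {
    SketchState *bs;
    bool serialising;
} SketchTracked;

/* Like tracked_request_begin(): every request starts out non-serialising. */
static void sketch_begin(SketchTracked *req, SketchState *bs)
{
    req->bs = bs;
    req->serialising = false;
}

/* Like mark_request_serialising(): idempotent, counts a request once. */
static void sketch_mark_serialising(SketchTracked *req)
{
    if (!req->serialising) {
        req->bs->serialising_in_flight++;
        req->serialising = true;
    }
}

/* Like tracked_request_end(): only serialising requests drop the counter. */
static void sketch_end(SketchTracked *req)
{
    if (req->serialising) {
        req->bs->serialising_in_flight--;
    }
}

/* The fast path in wait_serialising_requests(): nothing to scan for. */
static bool sketch_can_skip_scan(const SketchState *bs)
{
    return bs->serialising_in_flight == 0;
}
```

Because marking is idempotent and the decrement happens when the request ends,
the counter stays balanced even if a request is marked twice, and the early
return keeps the common case (no serialising request in flight) cheap.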
