qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Cc: "Fam Zheng" <fam@euphon.net>,
	"Eduardo Habkost" <eduardo@habkost.net>,
	"Vladimir Sementsov-Ogievskiy" <vsementsov@virtuozzo.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	qemu-block@nongnu.org, "Juan Quintela" <quintela@redhat.com>,
	qemu-devel@nongnu.org, "John Snow" <jsnow@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Hanna Reitz" <hreitz@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Eric Blake" <eblake@redhat.com>
Subject: Re: [PATCH v7 00/31] block layer: split block APIs in global state and I/O
Date: Tue, 1 Mar 2022 17:08:10 +0100	[thread overview]
Message-ID: <Yh5E6odY1pUw2LVf@redhat.com> (raw)
In-Reply-To: <20220211145153.2861415-1-eesposit@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 7780 bytes --]

Am 11.02.2022 um 15:51 hat Emanuele Giuseppe Esposito geschrieben:
> Currently, block layer APIs like block.h contain a mix of
> functions that are either running in the main loop and under the
> BQL, or are thread-safe functions and run in iothreads performing I/O.
> The functions running under BQL also take care of modifying the
> block graph, by using drain and/or aio_context_acquire/release.
> This makes it very confusing to understand where each function
> runs, and what assumptions it provided with regards to thread
> safety.
> 
> We call the functions running under BQL "global state (GS) API", and
> distinguish them from the thread-safe "I/O API".
> 
> The aim of this series is to split the relevant block headers in
> global state and I/O sub-headers. The division will be done in
> this way:
> header.h will be split in header-global-state.h, header-io.h and
> header-common.h. The latter will just contain the data structures
> needed by header-global-state and header-io, and common helpers
> that are neither in GS nor in I/O. header.h will remain for
> legacy and to avoid changing all includes in all QEMU c files,
> but will only include the two new headers. No function shall be
> added in header.c .
> Once we split all relevant headers, it will be much easier to see what
> uses the AioContext lock and remove it, which is the overall main
> goal of this and other series that I posted/will post.
> 
> In addition to splitting the relevant headers shown in this series,
> it is also very helpful splitting the function pointers in some
> block structures, to understand what runs under AioContext lock and
> what doesn't. This is what patches 21-27 do.
> 
> Each function in the GS API will have an assertion, checking
> that it is always running under BQL.
> I/O functions are instead thread safe (or so should be), meaning
> that they *can* run under BQL, but also in an iothread in another
> AioContext. Therefore they do not provide any assertion, and
> need to be audited manually to verify the correctness.
> 
> Adding assetions has helped finding 2 bugs already, as shown in
> my series "Migration: fix missing iothread locking".
> 
> Tested this series by running unit tests, qemu-iotests and qtests
> (x86_64).
> Some functions in the GS API are used everywhere but not
> properly tested. Therefore their assertion is never actually run in
> the tests, so despite my very careful auditing, it is not impossible
> to exclude that some will trigger while actually using QEMU.
> 
> Patch 1 introduces qemu_in_main_thread(), the function used in
> all assertions. This had to be introduced otherwise all unit tests
> would fail, since they run in the main loop but use the code in
> stubs/iothread.c
> Patches 2-27 (with the exception of patch 9-10, that are an additional
> assert) are all structured in the same way: first we split the header
> and in the next (or same, if small) patch we add assertions.
> Patch 28-31 take care instead of the block layer permission API,
> fixing some bugs where they are used in I/O functions.
> 
> This serie depends on my previous serie "block layer: permission API
> refactoring in preparation to the API split"
> 
> Based-on: <20220209105452.1694545-1-eesposit@redhat.com>
> 
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>

Verifying the correctness of all the categorisations looked hard, so
instead of that I'll give you something to review now. :-)

The best way I could think of is to hack up a small script that checks
the consistency of all the macro annotations, i.e. no direct caller of
IO_CODE() may indirectly call GLOBAL_STATE_CODE() etc.

I'll attach that script so you can verify if the approach makes sense to
you and if my code is correct.

The good news is that the resulting list of errors is relatively small,
but it's not entirely empty. A good number of them look like false
positives (probably everything going through bdrv_co_drain_bh_cb()), but
there seem to be a few real ones, too:

Error: bdrv_get_full_backing_filename() is IO_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_get_full_backing_filename() -> bdrv_make_absolute_filename() -> bdrv_dirname() -> GLOBAL_STATE_CODE()

Error: blk_error_action() is IO_CODE(), but calls GLOBAL_STATE_CODE() code
       blk_error_action() -> send_qmp_error_event() -> blk_iostatus_is_enabled() -> GLOBAL_STATE_CODE()

Error: bdrv_drained_end_no_poll() is IO_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_drained_end_no_poll() -> bdrv_do_drained_end() -> bdrv_do_drained_end() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: nbd_co_do_establish_connection() is IO_CODE(), but calls GLOBAL_STATE_CODE() code
       nbd_co_do_establish_connection() -> nbd_handle_updated_info() -> bdrv_apply_auto_read_only() -> GLOBAL_STATE_CODE()

Error: bdrv_drained_end_no_poll() is IO_CODE(), but calls IO_OR_GS_CODE() code
       bdrv_drained_end_no_poll() -> bdrv_do_drained_end() -> bdrv_do_drained_end() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin_quiesce() -> IO_OR_GS_CODE()

Error: blk_drain() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       blk_drain() -> bdrv_ref() -> GLOBAL_STATE_CODE()

Error: bdrv_drained_begin() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_drained_begin() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_subtree_drained_begin() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_subtree_drained_begin() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_drained_end() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_drained_end() -> bdrv_do_drained_end() -> bdrv_do_drained_end() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_subtree_drained_end() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_subtree_drained_end() -> bdrv_do_drained_end() -> bdrv_do_drained_end() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_apply_subtree_drain() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_apply_subtree_drain() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_unapply_subtree_drain() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_unapply_subtree_drain() -> bdrv_do_drained_end() -> bdrv_do_drained_end() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_co_drain() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_co_drain() -> bdrv_drained_begin() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Error: bdrv_drain() is IO_OR_GS_CODE(), but calls GLOBAL_STATE_CODE() code
       bdrv_drain() -> bdrv_drained_begin() -> bdrv_do_drained_begin() -> bdrv_do_drained_begin() -> bdrv_co_yield_to_drain() -> bdrv_co_drain_bh_cb() -> bdrv_drain_all_begin() -> GLOBAL_STATE_CODE()

Do you want to address some of these cases before we merge the series? I
expect that usually we just miscategorised the caller, so the assertion
wouldn't actually fail at runtime, but I haven't checked in detail yet.

Kevin

[-- Attachment #2: check_calls.py --]
[-- Type: text/plain, Size: 2866 bytes --]

#!/usr/bin/env python3

from typing import Dict, List, Sequence, Set, Tuple
import subprocess
import sys

def cflow(filenames: Sequence[str]) -> str:
    args = [
        'cflow',
        '-AArd2',
        *filenames
    ]
    print(' '.join(f'"{x}"' for x in args))
    p = subprocess.Popen(args,
                         stdout=subprocess.PIPE,
                         stderr=None,
                         universal_newlines=True)
    return p.communicate()[0]

CallGraph = Dict[str, List[str]]

def build_call_graph(cflow_output: str) -> CallGraph:
    call_graph = {}
    cur = None

    for line in cflow_output.split('\n'):
        if not line:
            continue
        indented = line[0] == ' '
        line = line.strip().split(None, 1)[0]
        if line.endswith(':'):
            line = line[:-1]

        if not indented:
            if line in call_graph:
                print(f"Warning: {line} redefined")
            else:
                call_graph[line] = []
            cur = line
        else:
            call_graph[cur].append(line)

    return call_graph

def recursive_callers(call_graph: CallGraph, start: str
                      ) -> Tuple[Set[str],Dict[str, str]]:
    result = set()
    to_process = [start]
    paths = {start: start}

    while to_process:
        caller = to_process.pop()
        if caller in result:
            continue
        result.add(caller)
        if caller in call_graph:
            to_process += call_graph[caller]
            for c in call_graph[caller]:
                paths[c] = f'{c} -> {paths[caller]}'
    return result, paths

def check_conflict(call_graph: CallGraph,
                   checked_class: str,
                   disallowed: Set[str],
                   disallowed_class: str,
                   disallowed_paths: Dict[str, str]) -> None:
    if checked_class not in call_graph:
        return

    for fn in call_graph[checked_class]:
        if fn in disallowed:
            print(f'Error: {fn} is {checked_class}, but calls {disallowed_class} code')
            print(f'       {disallowed_paths[fn]}')
            print()

def main() -> None:
    filenames = sys.argv[1:]
    cflow_output = cflow(filenames)
    call_graph = build_call_graph(cflow_output)

    calling_gs, paths_gs = recursive_callers(call_graph, 'GLOBAL_STATE_CODE()')
    calling_io_or_gs, paths_io_or_gs = recursive_callers(call_graph, 'IO_OR_GS_CODE()')
    calling_io, paths_io = recursive_callers(call_graph, 'IO_CODE()')

    check_conflict(call_graph, 'IO_CODE()',
                   calling_gs, 'GLOBAL_STATE_CODE()', paths_gs)
    check_conflict(call_graph, 'IO_CODE()',
                   calling_io_or_gs, 'IO_OR_GS_CODE()', paths_io_or_gs)
    check_conflict(call_graph, 'IO_OR_GS_CODE()',
                   calling_gs, 'GLOBAL_STATE_CODE()', paths_gs)

if __name__ == '__main__':
    main()

  parent reply	other threads:[~2022-03-01 16:10 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-11 14:51 [PATCH v7 00/31] block layer: split block APIs in global state and I/O Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 01/31] main-loop.h: introduce qemu_in_main_thread() Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 02/31] main loop: macros to mark GS and I/O functions Emanuele Giuseppe Esposito
2022-03-01 11:29   ` Kevin Wolf
2022-02-11 14:51 ` [PATCH v7 03/31] include/block/block: split header into I/O and global state API Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 04/31] assertions for block " Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 05/31] IO_CODE and IO_OR_GS_CODE for block I/O API Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 06/31] block/export/fuse.c: allow writable exports to take RESIZE permission Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 07/31] include/sysemu/block-backend: split header into I/O and global state (GS) API Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 08/31] block/block-backend.c: assertions for block-backend Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 09/31] IO_CODE and IO_OR_GS_CODE for block-backend I/O API Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 10/31] block.c: assertions to the block layer permissions API Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 11/31] include/block/block_int: split header into I/O and global state API Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 12/31] assertions for block_int " Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 13/31] IO_CODE and IO_OR_GS_CODE for block_int I/O API Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 14/31] block: introduce assert_bdrv_graph_writable Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 15/31] include/block/blockjob_int.h: split header into I/O and GS API Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 16/31] GS and IO CODE macros for blockjob_int.h Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 17/31] block.c: add assertions to static functions Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 18/31] include/block/blockjob.h: global state API Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 19/31] assertions for blockjob.h " Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 20/31] include/sysemu/blockdev.h: " Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 21/31] assertions for blockdev.h " Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 22/31] include/block/snapshot: global state API + assertions Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 23/31] block/copy-before-write.h: " Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 24/31] block/coroutines: I/O and "I/O or GS" API Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 25/31] block_int-common.h: split function pointers in BlockDriver Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 26/31] block_int-common.h: assertions in the callers of BlockDriver function pointers Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 27/31] block_int-common.h: split function pointers in BdrvChildClass Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 28/31] block_int-common.h: assertions in the callers of BdrvChildClass function pointers Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 29/31] block-backend-common.h: split function pointers in BlockDevOps Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 30/31] job.h: split function pointers in JobDriver Emanuele Giuseppe Esposito
2022-02-11 14:51 ` [PATCH v7 31/31] job.h: assertions in the callers of JobDriver function pointers Emanuele Giuseppe Esposito
2022-03-01 16:08 ` Kevin Wolf [this message]
2022-03-03 12:06   ` [PATCH v7 00/31] block layer: split block APIs in global state and I/O Emanuele Giuseppe Esposito

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yh5E6odY1pUw2LVf@redhat.com \
    --to=kwolf@redhat.com \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eblake@redhat.com \
    --cc=eduardo@habkost.net \
    --cc=eesposit@redhat.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=richard.henderson@linaro.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).