qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Richard Henderson <richard.henderson@linaro.org>
To: "Alex Bennée" <alex.bennee@linaro.org>
Cc: cota@braap.org, qemu-devel@nongnu.org
Subject: Re: [RFC PATCH 0/5] Experimenting with tb-lookup tweaks
Date: Wed, 24 Feb 2021 16:28:54 -0800	[thread overview]
Message-ID: <7d23665f-fa20-028f-d48a-2ea79ab35b2f@linaro.org> (raw)
In-Reply-To: <20210224165811.11567-1-alex.bennee@linaro.org>

On 2/24/21 8:58 AM, Alex Bennée wrote:
> Hi Richard,
> 
> Well I spun up some of the ideas we talked about to see if there was
> anything to be squeezed out of the function. In the end the results
> seem to be a washout with my pigz benchmark:
> 
>  qemu-system-aarch64 -cpu cortex-a57 \
>    -machine type=virt,virtualization=on,gic-version=3 \
>    -serial mon:stdio \
>    -netdev user,id=unet,hostfwd=tcp::2222-:22 \
>    -device virtio-net-pci,netdev=unet,id=virt-net,disable-legacy=on \
>    -device virtio-scsi-pci,id=virt-scsi,disable-legacy=on \
>    -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-buster-arm64 \
>    -device scsi-hd,drive=hd,id=virt-scsi-hd \
>    -smp 4 -m 4096 \
>    -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image \
>    -append "root=/dev/sda2 systemd.unit=benchmark-pigz.service" \
>    -display none -snapshot
> 
> | Command | Mean [s]       | Min [s] | Max [s] | Relative |
> |---------+----------------+---------+---------+----------|
> | Before  | 46.597 ± 2.482 |  45.208 |  53.618 |     1.00 |
> | After   | 46.867 ± 2.242 |  45.871 |  53.180 |     1.00 |

Well that's disappointing.

> Maybe the code cleanup itself makes it worthwhile. WDYT?

I think there's little doubt that the first 3 patches are a good code cleanup.

Patch 4 I think is still beneficial, simply so that we can add that "Above
fields" comment.

Patch 5 would only be worthwhile if we could measure any positive difference,
which it seems we cannot.

I have a follow-up patch to remove the parallel_cpus global variable which I
will post in a moment.  While it removes a handful of insns from this
fast-path, I doubt it helps.  But getting rid of a global is probably always
positive, no?

I was glancing through the lookup function for alpha, instead of aarch64 and saw:

 21e:   33 43 18                xor    0x18(%rbx),%eax
 221:   4c 31 e1                xor    %r12,%rcx
 224:   44 31 ea                xor    %r13d,%edx
 227:   09 c2                   or     %eax,%edx
 229:   48 0b 4b 08             or     0x8(%rbx),%rcx

and thought -- hang on, how come we're just ORing nor XORing here?  Of course
it's the cs_base field, which alpha has set to zero.  The compiler has
simplified bits |= 0 ^ tb->cs_base.

Which got me thinking: what if we had a per-cpu

typedef struct {
    target_ulong pc;
    ...
} TranslationBlockID;

static inline bool arch_tbid_cmp(TranslationBlockID x,
                                 TranslationBlockID y)
{
    return x.pc == y.pc && ...;
}

We could potentially reduce this to memcmp(&x, &y).

First, this would allow cs_base to be eliminated where it is not used.  Second,
this would allow cs_base to be renamed for the non-x86 targets for which it is
being abused.  Third, it would allow tb->flags to be either (a) elided or (b)
extended by the target as needed.

This final is directed at ARM, of course, where we've overflowed the uint32_t
that is tb->flags.  We could now extend that to 64-bits.

Obviously, some tweaks to tb_hash_func would be required as well, but that's
manageable.

What do you think about this last?


r~


  parent reply	other threads:[~2021-02-25  0:30 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-24 16:58 [RFC PATCH 0/5] Experimenting with tb-lookup tweaks Alex Bennée
2021-02-24 16:58 ` [RFC PATCH 1/5] accel/tcg: rename tb_lookup__cpu_state and hoist state extraction Alex Bennée
2021-02-24 16:58 ` [RFC PATCH 2/5] accel/tcg: move CF_CLUSTER calculation to curr_cflags Alex Bennée
2021-02-24 16:58 ` [RFC PATCH 3/5] accel/tcg: drop the use of CF_HASH_MASK and rename params Alex Bennée
2021-02-24 16:58 ` [RFC PATCH 4/5] include/exec: lightly re-arrange TranslationBlock Alex Bennée
2021-02-24 16:58 ` [RFC PATCH 5/5] include/exec/tb-lookup: try and reduce branch prediction issues Alex Bennée
2021-02-25  0:28 ` Richard Henderson [this message]
2021-02-25 10:15   ` [RFC PATCH 0/5] Experimenting with tb-lookup tweaks Alex Bennée
2021-02-25 15:45 ` no-reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7d23665f-fa20-028f-d48a-2ea79ab35b2f@linaro.org \
    --to=richard.henderson@linaro.org \
    --cc=alex.bennee@linaro.org \
    --cc=cota@braap.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).