From: "Cédric Le Goater" <clg@kaod.org>
To: Peter Delevoryas <pdel@fb.com>
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
	"Andrew Jeffery" <andrew@aj.id.au>,
	"Joel Stanley" <joel@jms.id.au>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"berrange@redhat.com" <berrange@redhat.com>,
	"eduardo@habkost.net" <eduardo@habkost.net>,
	"marcel.apfelbaum@gmail.com" <marcel.apfelbaum@gmail.com>,
	"richard.henderson@linaro.org" <richard.henderson@linaro.org>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
	"ani@anisinha.ca" <ani@anisinha.ca>,
	"Cameron Esfahani via" <qemu-devel@nongnu.org>,
	qemu-arm <qemu-arm@nongnu.org>,
	"Alex Bennée" <alex.bennee@linaro.org>
Subject: Re: [PATCH 12/14] aspeed: Make aspeed_board_init_flashes public
Date: Wed, 29 Jun 2022 11:11:41 +0200
Message-ID: <07128acf-329a-f372-c48c-0c3cb498d3d0@kaod.org>
In-Reply-To: <e07ec4fe-6968-b19a-e649-298a9aaccba5@kaod.org>

On 6/24/22 18:50, Cédric Le Goater wrote:
> On 6/23/22 20:43, Peter Delevoryas wrote:
>>
>>
>>> On Jun 23, 2022, at 8:09 AM, Cédric Le Goater <clg@kaod.org> wrote:
>>>
>>> On 6/23/22 12:26, Peter Delevoryas wrote:
>>>> Signed-off-by: Peter Delevoryas <pdel@fb.com>
>>>
>>> Let's start simple without flash support. We should be able to
>>> load FW blobs in each CPU address space using loader devices.
>>
>> Actually, I was unable to do this, perhaps because the fb OpenBMC
>> boot sequence is a little weird. I specifically _needed_ to have
>> a flash device which maps the firmware in at 0x2000_0000, because
>> the fb OpenBMC U-Boot SPL jumps to that address to start executing
>> from flash? I think this is also why fb OpenBMC machines can be so slow.
>>
>> $ ./build/qemu-system-arm -machine fby35 \
>>      -device loader,file=fby35.mtd,addr=0,cpu-num=0 -nographic \
>>      -d int -drive file=fby35.mtd,format=raw,if=mtd
> 
> 
> 
> Ideally we should be booting from the flash device directly using
> the machine option '-M ast2600-evb,execute-in-place=true' like HW
> does. Instructions are fetched using SPI transfers. But the amount
> of code generated is tremendous. See some profiling below for a
> run which barely reaches DRAM training in U-Boot.

Some more profiling on both the ast2500 and ast2600 machines shows:


* ast2600-evb,execute-in-place=true:

Type               Object  Call site                Wait Time (s)         Count  Average (us)
---------------------------------------------------------------------------------------------
BQL mutex  0x564dc03922e0  accel/tcg/cputlb.c:1365       14.21443      32909927          0.43
condvar    0x564dc0f02988  util/thread-pool.c:90         10.02312            56     178984.32
condvar    [           2]  softmmu/cpus.c:423             0.10051             6      16752.04
BQL mutex  0x564dc03922e0  util/rcu.c:269                 0.04372             4      10930.60
BQL mutex  0x564dc03922e0  cpus-common.c:341              0.00151             8        189.16
condvar    0x564dc0390360  cpus-common.c:176              0.00092             8        115.04
condvar    0x564dc0392280  softmmu/cpus.c:642             0.00013             2         65.04
condvar    0x564dc0392240  softmmu/cpus.c:571             0.00010             2         49.54
BQL mutex  0x564dc03922e0  accel/tcg/cputlb.c:1426        0.00006           467          0.14
condvar    0x564dc03903a0  cpus-common.c:206              0.00004             8          5.28
---------------------------------------------------------------------------------------------


* ast2500-evb,execute-in-place=true:

Type               Object  Call site                Wait Time (s)         Count  Average (us)
---------------------------------------------------------------------------------------------
condvar    0x55a581137f88  util/thread-pool.c:90         10.01158            28     357556.50
BQL mutex  0x55a57f0e02e0  accel/tcg/cputlb.c:1365        0.29886      14394475          0.02
condvar    0x55a5814cb5a0  softmmu/cpus.c:423             0.02182             2      10912.44
BQL mutex  0x55a57f0e02e0  util/rcu.c:269                 0.01420             4       3549.56
mutex      0x55a5813381c0  tcg/region.c:204               0.00007          3052          0.02
condvar    0x55a57f0e0280  softmmu/cpus.c:642             0.00006             1         59.79
mutex      [           2]  chardev/char.c:118             0.00003          1492          0.02
BQL mutex  0x55a57f0e02e0  util/main-loop.c:318           0.00002            34          0.72
BQL mutex  0x55a57f0e02e0  accel/tcg/cputlb.c:1426        0.00002           973          0.02
condvar    0x55a57f0e0240  softmmu/cpus.c:571             0.00002             1         15.16
---------------------------------------------------------------------------------------------
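
As a rough sanity check on the two tables above, here is a minimal
Python sketch that recomputes the "Average (us)" column for the
accel/tcg/cputlb.c:1365 rows from their wait time and count, and
compares the two machines. The numbers are copied from the tables;
the names `rows` and `average_us` are only illustrative, and the
sketch assumes the average is simply wait time divided by count.

```python
# Figures copied from the accel/tcg/cputlb.c:1365 rows of the two tables.
rows = {
    "ast2600-evb": {"wait_s": 14.21443, "count": 32909927},
    "ast2500-evb": {"wait_s": 0.29886, "count": 14394475},
}


def average_us(wait_s, count):
    """Average wait per lock acquisition, in microseconds (assumed wait/count)."""
    return wait_s / count * 1e6


for machine, row in rows.items():
    avg = average_us(row["wait_s"], row["count"])
    print(f"{machine}: {avg:.2f} us per BQL acquisition")

# How much more often, and for how much longer in total, the ast2600
# run waits on the BQL at this call site compared with the ast2500 run.
count_ratio = rows["ast2600-evb"]["count"] / rows["ast2500-evb"]["count"]
wait_ratio = rows["ast2600-evb"]["wait_s"] / rows["ast2500-evb"]["wait_s"]
print(f"count ratio: {count_ratio:.2f}, total wait ratio: {wait_ratio:.2f}")
```

The recomputed averages should agree with the last column of each
table to two decimals. The ratios suggest that the ast2600 run takes
the lock roughly twice as often at this call site but waits far longer
per acquisition, which points to contention rather than call count
alone.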

C.



> 
> * execute-in-place=true
> 
> Each sample counts as 0.01 seconds.
>    %   cumulative   self              self     total
>   time   seconds   seconds    calls  ns/call  ns/call  name
> 100.00      0.02     0.02   164276   121.75   121.75  memory_region_init_rom_device
>    0.00      0.02     0.00 1610346008     0.00     0.00  tcg_code_capacity
>    0.00      0.02     0.00 567612621     0.00     0.00  type_register_static_array
>    0.00      0.02     0.00 328886191     0.00     0.00  do_common_semihosting
>    0.00      0.02     0.00 297215811     0.00     0.00  container_get
>    0.00      0.02     0.00 292670030     0.00     0.00  arm_cpu_tlb_fill
>    0.00      0.02     0.00 195416119     0.00     0.00  arm_cpu_register_gdb_regs_for_features
>    0.00      0.02     0.00 193326677     0.00     0.00  object_type_get_instance_size
>    0.00      0.02     0.00 182365829     0.00     0.00  tcg_op_insert_after
>    0.00      0.02     0.00 150668458     0.00     0.00  plugin_gen_tb_end
>    0.00      0.02     0.00 142171940     0.00     0.00  gen_new_label
>    0.00      0.02     0.00 133200628     0.00     0.00  smbios_build_type_38_table
>    0.00      0.02     0.00 130540338     0.00     0.00  object_dynamic_cast_assert
>    0.00      0.02     0.00 129223195     0.00     0.00  cpu_loop_exit_atomic
>    0.00      0.02     0.00 121759298     0.00     0.00  tcg_remove_ops_after
>    0.00      0.02     0.00 116887887     0.00     0.00  in_code_gen_buffer
>    0.00      0.02     0.00 111803833     0.00     0.00  tcg_emit_op
>    0.00      0.02     0.00 106052221     0.00     0.00  object_class_dynamic_cast_assert
>    0.00      0.02     0.00 99704054     0.00     0.00  __jit_debug_register_code
>    0.00      0.02     0.00 97812458     0.00     0.00  object_get_class
>    0.00      0.02     0.00 88952594     0.00     0.00  tcg_splitwx_to_rx
>    0.00      0.02     0.00 85790920     0.00     0.00  object_class_dynamic_cast
>    0.00      0.02     0.00 73780673     0.00     0.00  helper_exit_atomic
>    0.00      0.02     0.00 65337482     0.00     0.00  tcg_op_supported
>    0.00      0.02     0.00 61213619     0.00     0.00  tcg_func_start
>    0.00      0.02     0.00 54477684     0.00     0.00  tcg_flush_softmmu_tlb
>    0.00      0.02     0.00 53968980     0.00     0.00  tcg_temp_new_internal
>    0.00      0.02     0.00 51526008     0.00     0.00  qemu_in_vcpu_thread
>    0.00      0.02     0.00 40750952     0.00     0.00  pflash_cfi02_register
>    0.00      0.02     0.00 38039442     0.00     0.00  tcg_gen_op2
>    0.00      0.02     0.00 37068039     0.00     0.00  tcg_gen_op1
>    0.00      0.02     0.00 36473276     0.00     0.00  tcg_gen_op3
>    0.00      0.02     0.00 36310225     0.00     0.00  gen_gvec_uaba
>    0.00      0.02     0.00 30985436     0.00     0.00  tb_set_jmp_target
>    0.00      0.02     0.00 30291796     0.00     0.00  tcg_constant_internal
>    0.00      0.02     0.00 29857950     0.00     0.00  ssi_transfer
> 
> * execute-in-place=false
> 
> Each sample counts as 0.01 seconds.
>    %   cumulative   self              self     total
>   time   seconds   seconds    calls  ns/call  ns/call  name
>   40.00      0.02     0.02   551149    36.29    36.29  aspeed_board_init_flashes
>   20.00      0.03     0.01  3937238     2.54     2.54  register_cp_regs_for_features
>   20.00      0.04     0.01   674096    14.83    14.83  gen_gvec_uaba
>   20.00      0.05     0.01   457461    21.86    21.86  finalize_target_page_bits
>    0.00      0.05     0.00  5364258     0.00     0.00  arm_gt_hvtimer_cb
>    0.00      0.05     0.00  2467532     0.00     0.00  helper_neon_narrow_sat_s8
>    0.00      0.05     0.00  2431860     0.00     0.00  opb_opb2fsi_address
>    0.00      0.05     0.00  1828453     0.00     0.00  cpsr_read
>    0.00      0.05     0.00  1820659     0.00     0.00  cpu_get_tb_cpu_state
>    0.00      0.05     0.00  1441344     0.00     0.00  arm_cpu_tlb_fill
>    0.00      0.05     0.00  1427177     0.00     0.00  cxl_usp_to_cstate
>    0.00      0.05     0.00  1161059     0.00     5.85  aarch64_sync_64_to_32
>    0.00      0.05     0.00   886523     0.00     0.00  helper_iwmmxt_maxsb
>    0.00      0.05     0.00   831393     0.00     0.00  arm_log_exception
>    0.00      0.05     0.00   746940     0.00     0.00  helper_v7m_preserve_fp_state
>    0.00      0.05     0.00   728354     0.00     0.00  hmp_calc_dirty_rate
>    0.00      0.05     0.00   681634     0.00     0.00  helper_sadd8
>    0.00      0.05     0.00   487743     0.00     7.14  qmp_query_cpu_definitions
>    0.00      0.05     0.00   420528     0.00     0.00  arm_v7m_cpu_do_interrupt
>    0.00      0.05     0.00   382245     0.00     0.00  helper_ssub8
>    0.00      0.05     0.00   374192     0.00     0.00  helper_usub8
>    0.00      0.05     0.00   347199     0.00     0.00  usb_msd_load_request
>    0.00      0.05     0.00   325862     0.00     0.00  target_disas
>    0.00      0.05     0.00   322375     0.00     0.00  arm_hcrx_el2_eff
>    0.00      0.05     0.00   317835     0.00     0.00  virtio_bus_device_iommu_enabled
>    0.00      0.05     0.00   309559     0.00     0.00  mig_throttle_counter_reset
>    0.00      0.05     0.00   301557     0.00     0.00  ram_bytes_remaining
>    0.00      0.05     0.00   292888     0.00     0.00  helper_v7m_blxns
>    0.00      0.05     0.00   289093     0.00     0.00  tpm_util_show_buffer
>    0.00      0.05     0.00   274156     0.00     0.00  helper_sxtb16
>    0.00      0.05     0.00   273588     0.00     0.00  write_v7m_exception
>    0.00      0.05     0.00   271619     0.00     0.00  page_size_init
>    0.00      0.05     0.00   270247     0.00     0.00  qemu_fdt_setprop_sized_cells_from_array
>    0.00      0.05     0.00   229643     0.00    14.69  helper_neon_addl_u32



