Execution speed of FVP models

Hi,

From what I understand, FVP Fast Models are supposed to run at speeds comparable to real hardware. But when I try to evaluate the example specified here → Image Classification Code Sample, the inference speed (on FVP_Corstone_SSE-300_Ethos-U55) is orders of magnitude slower than on real hardware. Are there any custom settings that can be used to improve the execution timing of the FVP platform and bring it closer to real time? It would be really helpful to get some idea of the limitations here. Thank you!

I’ve already tried disabling timing adapters via ETHOS_U_NPU_TIMING_ADAPTER_ENABLED.

Hi @karandewan,

Thanks for getting in contact with your question! The FVP may not always be as fast as execution on actual hardware, but at least for the FVP_Corstone_SSE-300 you can supply the following argument when launching the FVP to help: `-C ethosu.extra_args="--fast"`.

So for example your full command would be something like:

~/FVP_Corstone_SSE-300/models/Linux64_GCC-9.3/FVP_Corstone_SSE-300_Ethos-U55 ./bin/ethos-u-img_class.axf -C ethosu.num_macs=128 -C ethosu.extra_args="--fast"

This will make execution much faster when running anything on the Ethos-U. Note that your output will still be bit-accurate, but the cycle counts produced can no longer be trusted.
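
If you want to switch the option on and off from a script (for example, to keep a cycle-accurate run alongside a fast one), a small wrapper along these lines might be handy. This is only a rough sketch: `build_fvp_command` is a hypothetical helper name, not part of any Arm tooling, and the only FVP parameters it uses are the two from the command above.

```python
import shlex


def build_fvp_command(fvp_path, app_path, num_macs=128, fast=True):
    """Assemble the FVP argument list (hypothetical helper, not an Arm API).

    fast=True appends the Ethos-U --fast option mentioned above, which
    trades trustworthy cycle counts for quicker NPU simulation.
    """
    cmd = [fvp_path, app_path, "-C", f"ethosu.num_macs={num_macs}"]
    if fast:
        # Same effect as typing: -C ethosu.extra_args="--fast" in a shell
        cmd += ["-C", "ethosu.extra_args=--fast"]
    return cmd


if __name__ == "__main__":
    # Paths are placeholders; substitute your own FVP and application locations.
    cmd = build_fvp_command(
        "FVP_Corstone_SSE-300_Ethos-U55",
        "./bin/ethos-u-img_class.axf",
    )
    print(shlex.join(cmd))
```

The resulting list can be passed straight to `subprocess.run(cmd)`. Use `fast=False` for any run where you need the cycle numbers.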

Hope this helps,
Richard


Hi @Burton2000,

This is extremely helpful. Thank you so much !!

Using this option speeds up execution of the code on the NPU significantly.

Are there additional options to improve the overall speed of the platform too? For example, on starting the simulation I see that the CPU clock rate is set to 32000000Hz (32MHz). Is there an option to increase that as well?

telnetterminal0: Listening for serial connection on port 5000
telnetterminal1: Listening for serial connection on port 5001
telnetterminal2: Listening for serial connection on port 5002
telnetterminal5: Listening for serial connection on port 5003

    Ethos-U rev 136b7d75 --- Apr 12 2023 13:44:01
    (C) COPYRIGHT 2019-2023 Arm Limited
    ALL RIGHTS RESERVED

sh: line 1: xterm: command not found
INFO - WARN - MPS3_SCC->CFG_ACLK reads 0. Assuming default clock of 32000000
Processor internal clock: 32000000Hz
INFO - V2M-MPS3 revision A
INFO - Application Note AN228, Revision C
INFO - MPS3 build 3
INFO - MPS3 core clock has been set to: 32000000Hz
INFO - CPU ID: 0x411fd220
INFO - CPU: Cortex-M55 r1p0
...
...
...
...

Regards,
Karan Dewan

Hi @Burton2000,

Thanks for your suggestion earlier regarding speeding up NPU execution on FVP.

Could you also provide some insight into improving the overall CPU clock rate that is being set, please?

Regards,
Karan Dewan

Hi @karandewan,

Unfortunately, I do not believe there is any way to improve the speed of the platform other than running it on a faster host system.

Best regards,
Richard