I tried many of the images that come built in with the Ethos-U65 examples. Then I tried it with custom inputs and benchmarked it. Is there a way to give video as input for trying out algorithms and benchmarking?
Hi @eby,
Are you trying this using the FVP or FPGA?
If you are using the FVP, benchmarking anything that is not mapped onto the NPU won't work because the CPU model is functional only - it cannot give you good estimates for cycles or time. If you want a functional test only, you can look into the AVH FVP models' Virtual Streaming Interface (VSI). We have experimental support for it on another branch (for the object detection use case only): experimental/vsi. Alternatively, the option mentioned below for the FPGA will also work.
If you are using the MPS3 FPGA, you should be able to benchmark the CPU load as well. The easiest way to achieve this would still be baking the images in. If you have a video stream, you can dump its individual frames into a folder, build the application with these custom inputs, and make one more minor code change to loop over the images endlessly (a sketch of that change follows below). This would give you a mock video feed coming into the application. The other options for the FPGA are more involved and will likely require you to write more code - either to get a live camera working on the shield interface or to send image frames in using another mechanism.
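For illustration, a minimal sketch of that looping change is below. NUMBER_OF_FILES and get_img_array() are assumptions modelled on the accessors the build generates for baked-in images; RunInference() and MockVideoFeed() are hypothetical placeholders for your use-case handler's entry points, so adjust the names to match your build:

```cpp
/* Hypothetical sketch only: NUMBER_OF_FILES, get_img_array() and
 * RunInference() stand in for whatever the generated headers and your
 * use-case handler actually expose. */
#include <cstdint>

extern const uint32_t NUMBER_OF_FILES;              /* count of baked-in images */
extern const uint8_t* get_img_array(uint32_t idx);  /* accessor for image idx */

void RunInference(const uint8_t* imgData);          /* existing inference entry point */

void MockVideoFeed()
{
    /* Treat the baked-in images as consecutive video frames and wrap
     * around forever to mock a live stream coming into the application. */
    for (uint32_t idx = 0;; idx = (idx + 1) % NUMBER_OF_FILES) {
        RunInference(get_img_array(idx));
    }
}
```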
Hope this helps.
Thanks,
KS
Thanks for the detailed answer.
What differences might there be in the numerical results between video and image inputs when using the FVP?
Hi @eby,
In the suggestion above, the conversion from video to a set of images is done offline on your host machine. So, if done right, there should be no difference between the video frames you might otherwise have processed on the device and the images you generate on your host machine.
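As an example of doing the conversion "right" on the host, here is a minimal sketch using OpenCV to dump every frame as a lossless PNG. The output naming scheme is just a placeholder, and any resizing to the model's input dimensions is assumed to happen in your existing image-generation step:

```cpp
/* A minimal host-side sketch using OpenCV to split a video into frames.
 * The output naming scheme (frame_00000.png, ...) is just a placeholder. */
#include <opencv2/opencv.hpp>
#include <cstdio>

int main(int argc, char** argv)
{
    if (argc < 3) {
        std::fprintf(stderr, "usage: %s <video-file> <output-dir>\n", argv[0]);
        return 1;
    }

    cv::VideoCapture cap(argv[1]);
    if (!cap.isOpened()) {
        std::fprintf(stderr, "failed to open %s\n", argv[1]);
        return 1;
    }

    cv::Mat frame;
    for (int i = 0; cap.read(frame); ++i) {
        /* Write lossless PNGs so each dumped image is a bit-exact copy
         * of the decoded frame - this is the "if done right" part. */
        char name[512];
        std::snprintf(name, sizeof(name), "%s/frame_%05d.png", argv[2], i);
        cv::imwrite(name, frame);
    }
    return 0;
}
```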
Thanks,
KS
Since we do not have an MPS3 FPGA, do you have any experimental results for this hardware that you could share?
Thanks,
Enes
Hi Enes,
We can provide numbers for the existing use cases that we run on the FPGA - these will be for inference only. Which use case are you interested in? For all use cases, the numbers you get from the FVP for the NPU will closely match those from the FPGA (within a 10% tolerance).
Thanks,
KS