Quick answer: the repo has no built-in way to do this quickly without changes.
Long version: You could create an application with multiple neural networks baked in and run them one after the other. If you run out of memory, you will need to modify the linker script (the scatter file if you are using the Arm compiler). This approach would work for a few models. For a more scalable solution, however, our recommendation would be to automate building and deploying the application on MPS3 using a Python script.
If you are using the FVP to benchmark the NPU only, then it would be easier to build the inference runner application with dynamic model load support (so you only need to build it once and pass the neural network model path as a command line argument - which can point to a different model each time). See Building with dynamic model load capability. This would be fairly straightforward to automate with a script that also logs the performance numbers.
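As a rough illustration of that kind of script, here is a minimal Python sketch that runs one command per model and saves each run's output to its own log file. The function names and the `{model}` placeholder are hypothetical, and the actual FVP command line is deliberately left out: take it from the dynamic model load documentation and pass it in as the template.

```python
import subprocess
from pathlib import Path


def build_command(template, model_path):
    """Substitute the model path into every '{model}' placeholder (hypothetical convention)."""
    return [part.replace("{model}", str(model_path)) for part in template]


def run_models(template, model_paths, log_dir, timeout_s=600):
    """Run the templated command once per model and store each run's output in a log file.

    `template` is a list of command-line parts. The real FVP invocation is not
    reproduced here; it is an input you supply.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for model in map(Path, model_paths):
        cmd = build_command(template, model)
        # Capture stdout and stderr so the performance numbers end up in the log.
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
        log_file = log_dir / f"{model.stem}.log"
        log_file.write_text(proc.stdout + proc.stderr)
        results[model.name] = (proc.returncode, log_file)
    return results
```

You would call `run_models(your_fvp_command_parts, list_of_model_files, "logs")` and then pull the performance numbers out of the per-model log files. Note that models sharing the same file stem would overwrite each other's logs in this sketch.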
If you’re using the FPGA (to be able to profile the CPU and if your networks have operators falling back onto the CPU), an automation script would still work, but there is no escaping building the application each time with a different model. The process will therefore be slower, as each build and deploy cycle involves compiling and then copying files onto the SD card of the FPGA, but it will still give you the desired outcome. The important thing in the deployment part is to make sure you copy over the files and safely unmount the FPGA’s SD card (enumerated as a USB mass storage device). You will also need to wait for certain markers in the UART serial output to know when the application has finished executing.
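The "wait for a marker" step could look something like the sketch below. It works on any iterable of text lines, so how the UART is opened (for example with a serial library) is left out. The function name and the marker string in the usage note are hypothetical; use whatever your application actually prints when it finishes.

```python
import time


def wait_for_marker(lines, marker, timeout_s=300.0, clock=time.monotonic):
    """Collect lines until one contains `marker` and return everything read so far.

    Raises TimeoutError if the deadline passes first, or EOFError if the stream
    ends without the marker. The deadline is only checked when a line arrives,
    so the underlying stream needs its own read timeout to avoid blocking forever.
    """
    deadline = clock() + timeout_s
    captured = []
    for line in lines:
        captured.append(line.rstrip("\r\n"))
        if marker in line:
            return captured
        if clock() > deadline:
            raise TimeoutError(f"marker {marker!r} not seen within {timeout_s} s")
    raise EOFError(f"stream ended before marker {marker!r} was seen")
```

For example, `wait_for_marker(uart_lines, "INFERENCE DONE")` would return the captured log once a line containing that (made-up) marker shows up, at which point the script can move on to building and deploying the next model.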
Hope this helps.