Hello
I’ve been experimenting with the Arm ML Embedded Evaluation Kit for running inference on a deep learning model, but I’m running into memory constraints when deploying models that exceed a certain size.
What strategies or optimizations are recommended for managing memory usage effectively while maintaining performance?
Are there specific model compression techniques or memory management best practices suited to this platform? I have gone through the kit's documentation but still need help.
Also, has anyone had success running larger models by using quantization or pruning?
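For context, this is the kind of full-integer post-training quantization flow I have in mind (a minimal sketch using the standard TensorFlow Lite converter; the model path, input shape, and representative dataset here are placeholders, not my actual setup):

```python
import numpy as np
import tensorflow as tf

# Load the trained Keras model (path is a placeholder).
model = tf.keras.models.load_model("my_model.h5")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# The representative dataset drives int8 calibration; replace the
# random tensors with a few hundred real input samples.
def representative_dataset():
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter.representative_dataset = representative_dataset
# Force full int8, since integer-only models are what the
# embedded targets in the evaluation kit expect.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

My understanding is that the resulting int8 .tflite file would then be compiled with Vela before deployment, and should be roughly a quarter the size of the float32 model. Is this the right approach, and does pruning on top of it buy much in practice?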
Thank you!