Build crash with head code and AC6.23

Hi,

I use WSL to build the inference_runner use case with my model. It works for most of my models, but the build crashes with one of them. The code is the latest, commit ID 528b515 (MLECO-5655: Updating to 25.02 dependencies). The compiler is AC 6.23, and below is what is printed on screen.
[ 96%] Building C object CMakeFiles/tflu.dir/source/hal/source/components/cmsis_device/source/handlers.o
[ 96%] Linking CXX static library …/lib/libprofiler.a
[ 96%] Built target profiler
[ 96%] Linking CXX static library …/lib/libarm_math.a
[ 96%] Built target arm_math
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1896: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/strided_slice_common.o] Killed
gmake[2]: *** Waiting for unfinished jobs…
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1588: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/prelu_common.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1980: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/unpack.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:2008: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/while.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1574: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/prelu.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1840: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/split_v.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1966: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/transpose.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1616: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/quantize_common.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1952: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/tanh.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1518: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/neg.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1938: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/svdf_common.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1602: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/quantize.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1742: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/select.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1098: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/elu.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1224: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/gather.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1392: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/logical.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1322: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/l2_pool_2d.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:132: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/cmsis_nn/fully_connected.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1644: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/reduce.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1672: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/reshape.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:790: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/broadcast_to.o] Killed
gmake[1]: *** [CMakeFiles/Makefile2:683: CMakeFiles/tflu.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2

Thanks,
Kaiping

Sorry, the previous screen print is from AC6.22. Below is the one from AC6.23, which is similar.

[ 96%] Building CXX object math/CMakeFiles/arm_math.dir/PlatformMath.o
[ 96%] Linking CXX static library …/lib/libprofiler.a
[ 96%] Built target profiler
[ 96%] Linking CXX static library …/lib/libarm_math.a
[ 96%] Built target arm_math
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1770: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/slice.o] Killed
gmake[2]: *** Waiting for unfinished jobs…
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1644: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/reduce.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1714: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/resize_nearest_neighbor.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1826: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/split.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1700: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/resize_bilinear.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1924: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/sub_common.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1812: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/space_to_depth.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1966: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/transpose.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1896: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/strided_slice_common.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1546: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/pad.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1728: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/round.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1980: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/unpack.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:2022: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/zeros_like.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1616: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/quantize_common.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1518: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/neg.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1434: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/logistic_common.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1588: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/prelu_common.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1854: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/squared_difference.o] Killed
gmake[2]: *** [CMakeFiles/tflu.dir/build.make:1938: CMakeFiles/tflu.dir/dependencies/tensorflow/tensorflow/lite/micro/kernels/svdf_common.o] Killed
gmake[1]: *** [CMakeFiles/Makefile2:683: CMakeFiles/tflu.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2

Hi,

This problem is related to my model requiring a big activation buffer (>2M). My NPU is U55 and I set it to Sram_Only mode. By default, activation_buf_sram is allocated in the isram.bin section, which is the FPGA internal SRAM, 2M in size. Obviously, the default build will complain that the region exceeds its limit. Hence, I made a change to move the activation buffer to ddr.bin, which caused this problem. In an earlier release, putting the activation buffer in ddr.bin worked fine. This suggests that a bug in the latest release caused this.

Thanks,
Kaiping

Hi Kaiping, thanks for flagging this - several of our engineers were out at Embedded World last week but we’ll try to look at this as soon as possible and come back to you. Thanks, Pete

Hi @Kaipingli88,

We are testing the build with Arm Compiler 6.23 and don’t see the issue unfortunately. We will need your help with tracing where the error comes from. But before we get to that, can we establish:

  • Does commit ID 528b515 build okay without your modifications with AC 6.22 and 6.23?
  • For your modified sources, I’m only expecting a linker script (scatter file) change, which should not cause a compilation error. If there is a problem, it should fail at the final link stage. Can you confirm the changes you have are only in the scatter file, or are there other changes?

To help debug, can you please provide us logs or information for:

  • Run the build with the verbose option and without parallel jobs (single thread). This will show exactly where the build fails.
    cmake --build <your-build-dir> --verbose
  • your environment’s PATH variable from where you build.
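
As a rough sketch of the two requests above, the commands below run a single-job verbose build, save the output to a log file, and print PATH one entry per line so it is easy to paste into a reply. The `BUILD_DIR` and `LOG` names and their default values are assumptions for illustration only; substitute your own build directory.

```shell
# Hypothetical defaults; override BUILD_DIR and LOG for your own setup.
BUILD_DIR="${BUILD_DIR:-cmake-test}"
LOG="${LOG:-build-single-thread.log}"

if [ -d "$BUILD_DIR" ]; then
    # One job, verbose output; stdout and stderr are both saved to the log.
    cmake --build "$BUILD_DIR" --verbose -j 1 2>&1 | tee "$LOG"
    # The first failing command is usually near the end of a single-job log.
    tail -n 40 "$LOG"
else
    echo "Build directory '$BUILD_DIR' not found; configure it first." >&2
fi

# PATH, one entry per line.
path_entries="$(printf '%s\n' "$PATH" | tr ':' '\n')"
printf '%s\n' "$path_entries"
```

With a single job, the last compile or link command printed before the error is the one that failed, which is hard to tell from interleaved parallel output.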

Thanks,
Kshitij

Hi Kshitij,

I think this problem should be easy to reproduce at your end. I already tested with a working NN model and made the build crash by changing only the scripts/cmake/platforms/mps3/sse-300/mps3-sse-300.sct file, as below.

Hi @Kaipingli88

I cannot see the changes to sct file in the previous message. Can you add these please along with the full error. The snippet in the original message doesn’t tell us where the problem happened.

Would be useful to have answers for questions in my previous message too.

Thanks,
Kshitij

Hi Kshitij,

Yes, my message somehow got lost, which is very strange; maybe it is because I used bullet items. I have attached the file here and hope it does not get blocked by email. I have also copied the change here.

Comment out (or remove) “*.o (.bss.NoInit.activation_buf_sram)” at line 77, and then add it back in the ddr.bin section.

Here are answers to your questions.

  • Does commit ID 528b515 build okay without your modifications with AC 6.22 and 6.23? Yes.

  • For your modified sources, I’m only expecting a linker script (scatter file) change, which should not cause a compilation error. If there is a problem, it should fail at the final link stage. Can you confirm the changes you have are only in the scatter file, or are there other changes? Yes, only in the scatter file.

Thanks,

Kaiping

(Attachment mps3-sse-300_sct.txt is missing)

Hi @Kaipingli88 ,

I don’t see the attachment unfortunately, but I tried to build with this change locally:

diff --git a/scripts/cmake/platforms/mps3/sse-300/mps3-sse-300.sct b/scripts/cmake/platforms/mps3/sse-300/mps3-sse-300.sct
index a96d506..08c31ed 100644
--- a/scripts/cmake/platforms/mps3/sse-300/mps3-sse-300.sct
+++ b/scripts/cmake/platforms/mps3/sse-300/mps3-sse-300.sct
@@ -71,10 +71,6 @@ LOAD_REGION_0       0x00000000                  0x00080000
     {
         ; Cache area (if used)
         *.o (.bss.NoInit.ethos_u_cache)
-
-        ; activation buffers a.k.a tensor arena when
-        ; memory mode sram only or shared sram
-        *.o (.bss.NoInit.activation_buf_sram)
     }
 }
 
@@ -106,6 +102,10 @@ LOAD_REGION_1       0x70000000                  0x02000000
 
         ; Temporary solution to move s4 operations here.
         *s4*.o* (+RO +RW +ZI)
+
+        ; activation buffers a.k.a tensor arena when
+        ; memory mode sram only or shared sram
+        *.o (.bss.NoInit.activation_buf_sram)
     }
     ;-----------------------------------------------------

I am assuming you have a similar change? I configured my build with 8MiB buffer that should reside in DDR region now.

cmake -Bcmake-test \
  --preset=mps3-300-clang \
  -DUSE_CASE_BUILD=inference_runner \
  -Dinference_runner_ACTIVATION_BUF_SZ=0x800000

This builds okay for me (I tried both Shared_Sram and Sram_Only modes). This was just for testing though. If you’re making this change formally, I recommend creating a separate uninitialised execution region for this buffer within the DDR load region and making sure it’s 16-byte aligned.

This still doesn’t tell us why it doesn’t work for you. It would be really useful to have your build log from a single-threaded build in verbose mode:

cmake --build <your build dir> --verbose

I can’t see any attachments unfortunately - please feel free to stick the log in text with your reply and we can see if that offers some clues.
Also, your env PATH will be useful to have just in case there is something strange happening with it on WSL.

Thanks,
Kshitij

Hi Kshitij,

Your scatter file change is good, and the single-thread build in verbose mode also worked for me. So the problem only occurs when I do a parallel build like:

cmake --build <your build dir> -j

My WSL PATH is /home/kli/.local/bin:/home/kli/.pyenv/bin:/home/kli/ArmCompilerforEmbedded6.23/bin:/home/kli/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/snap/bin

Thanks!

Kaiping

Hi @Kaipingli88

Thanks for sharing your PATH.

If the build works fine with a single thread, the problem is not with the sources/commit but rather with processor resource availability within the WSL environment. Apologies, I am not knowledgeable enough in this area to provide any useful guidance. You mentioned that you’re building with a big model (or one that uses a >2MB tensor arena), and I suspect this file will take longer to compile, even though it’s just binary data, and will also use up more RAM.

My only suggestions would be:

  • Try checking how many processors WSL sees by using nproc; let’s say this is N
  • Use cmake --build <your build dir> -j <N-1> to limit the number of jobs to leave at least one core free.
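
The two steps above can be sketched as a short shell snippet. It is only an illustration: `BUILD_DIR` and its default value are assumptions, and the floor of one job is added here so that a single-core environment still builds.

```shell
# Hypothetical default; point BUILD_DIR at your own build directory.
BUILD_DIR="${BUILD_DIR:-cmake-test}"

# Processors visible inside WSL (nproc is part of GNU coreutils).
n="$(nproc)"

# Leave one core free, but never drop below a single job.
if [ "$n" -gt 1 ]; then
    build_jobs=$((n - 1))
else
    build_jobs=1
fi
echo "nproc=$n, building with $build_jobs job(s)"

if [ -d "$BUILD_DIR" ]; then
    cmake --build "$BUILD_DIR" -j "$build_jobs"
else
    echo "Build directory '$BUILD_DIR' not found; configure it first." >&2
fi
```

Passing an explicit number matters here: a bare `-j` leaves the job count to the native build tool, which for make means no limit on simultaneous compiler processes.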

If the number of cores available in WSL seems prohibitive, there is guidance here on how to increase the number of cores and memory: Advanced settings configuration in WSL | Microsoft Learn.
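
A quick way to see what WSL currently has available is sketched below. gmake prints "Killed" when a compiler process is terminated by SIGKILL, and one plausible cause (an assumption, not something confirmed in this thread) is the kernel's out-of-memory killer, so the snippet also searches the kernel log for such entries. The `.wslconfig` values in the comments are placeholders, not recommendations.

```shell
# Total and currently available memory inside WSL, in MiB (from /proc/meminfo).
mem_total_mib="$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)"
mem_avail_mib="$(awk '/^MemAvailable:/ {print int($2 / 1024)}' /proc/meminfo)"
echo "processors: $(nproc), MemTotal: ${mem_total_mib} MiB, MemAvailable: ${mem_avail_mib} MiB"

# Look for out-of-memory kills in the kernel log; dmesg may need sudo.
dmesg 2>/dev/null | grep -i 'out of memory' \
    || echo "no OOM entries found (or dmesg not readable)"

# To raise the limits, edit %UserProfile%\.wslconfig on the Windows side,
# for example (placeholder values):
#   [wsl2]
#   memory=16GB
#   processors=8
# then run `wsl --shutdown` from Windows and reopen the distribution.
```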

Thanks,
Kshitij

Hi Kshitij,

Limiting the number of jobs to N-1 works. Thanks a lot for your help!

Thanks,

Kaiping
