Improved LLM unit tests for UNFLEE (trying to prevent failures for brave archer)

4 jobs from improving-unit-tests in 1 minute 50 seconds (queued for 17 minutes 27 seconds)
Status   Job ID   Name            Tags                                 Duration   Coverage

Build
passed   #42007   build_mod       minecraft                            01:50

Test
manual   #42008   gpt-3.5-turbo   minecraft, allowed to fail, manual
manual   #42009   gpt-4o          minecraft, allowed to fail, manual
manual   #42010   llama3-8b       minecraft, allowed to fail, manual