Pipeline · Public / CreatureChat

New experimental prompt strategy that shows only valid and invalid combinations of behavior emojis instead of example phrases. Works well with gpt-4o-mini and llama3-70b. Does not work well with gpt-3.5-turbo. Also added negative checks to the LLM unit tests for behaviors (FOLLOW and not ATTACK).
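
A negative behavior check of the kind described above could look roughly like the following. This is a minimal sketch, not the project's actual test code: the `<NAME>` tag format, the `extract_behaviors` and `check_behaviors` helpers, and the sample reply are all assumptions made for illustration.

```python
import re
from typing import Iterable, Set

# ASSUMPTION: behaviors appear in the model reply as tags such as <FOLLOW>
# or <FRIENDSHIP 2>. The real output format is not shown in this pipeline page.
BEHAVIOR_TAG = re.compile(r"<([A-Z_]+)(?:\s+-?\d+)?>")


def extract_behaviors(reply: str) -> Set[str]:
    """Return the set of behavior names found in a model reply."""
    return set(BEHAVIOR_TAG.findall(reply))


def check_behaviors(reply: str, expected: Iterable[str], forbidden: Iterable[str]) -> bool:
    """True only if every expected behavior is present and no forbidden one is."""
    found = extract_behaviors(reply)
    missing = set(expected) - found      # positive check: must appear
    unwanted = set(forbidden) & found    # negative check: must not appear
    return not missing and not unwanted


# Hypothetical reply to a "please follow me" prompt.
reply = "Sure, I'll come along! <FOLLOW>"
print(check_behaviors(reply, expected={"FOLLOW"}, forbidden={"ATTACK"}))
```

The negative half matters because a model that emits every behavior at once would pass a purely positive check.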

4 jobs from emoji-behaviors in 1 minute 49 seconds (queued for 7 minutes 58 seconds)

97b6bb2f ...

Stage	Status	Job ID	Tags	Name	Duration	Date
Build	passed	#41987	minecraft	build_mod	01:49	May 05, 2025
Test	manual (allowed to fail)	#41988	minecraft	gpt-3.5-turbo
Test	manual (allowed to fail)	#41989	minecraft	gpt-4o
Test	manual (allowed to fail)	#41990	minecraft	llama3-8b