Thinking Machines has identified batch size as the cause of non-determinism in LLMs and developed a method to ensure the same prompt always yields the same response.
Takeaways
• Non-determinism in LLMs results in varying responses to the same prompt.
• Varying batch sizes in LLM processing cause slight calculation changes, affecting next-word predictions.
• Maintaining consistent batch processing speeds and order ensures deterministic and reliable LLM outputs.
Mira Murati's Thinking Machines is working to solve the issue of non-determinism in large language models, where the same prompt can yield different responses. They've identified that varying batch sizes during prompt processing cause slight changes in calculations, leading to different next-word predictions. By maintaining a constant batch processing speed and order, Thinking Machines has achieved deterministic results, which are critical for trust, debugging, and auditing of AI models.
Non-Determinism in LLMs
• 00:00:33 Non-determinism in LLMs means the same prompt given multiple times yields different responses, due to the way language models sample from potential results. Even when randomness is reduced using temperature settings, non-determinism persists. Thinking Machines is trying to solve this problem so that the exact same prompt returns the exact same response every single time.
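A minimal sketch (not the video's code) of how temperature scaling works: it divides the logits before sampling, so a low temperature concentrates probability on the top token. This shows why lowering the temperature reduces, but does not by itself eliminate, non-determinism: the logits themselves can still vary.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, sharpened by a low temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                             # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
print(softmax(logits, temperature=1.0))   # probability spread across tokens
print(softmax(logits, temperature=0.1))   # nearly all mass on the top token
```

As temperature approaches zero, sampling approaches greedy argmax; but if the underlying logits shift even slightly between runs, the argmax itself can change.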
The Cause of Non-Determinism
• 00:03:52 Thinking Machines argues the real cause of non-determinism is batch size: prompts are processed in "carpools" (batches) that vary in size depending on system load. The batch size changes the order of tiny floating-point calculations, leading to variances in the next-word prediction. Small changes in the ordering of operations can change everything downstream.
The Fix for Non-Determinism
• 00:05:59 The solution involves keeping the "carpool" processing rules constant regardless of batch size to ensure consistency. This means weighing each "bowl" the same and mixing the ingredients in the exact same way by choosing one stable setup, and having the model look back at what it has already written in the same fixed order every time.
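A toy sketch of the idea, under the assumption that "one stable setup" means a fixed reduction order per request (`fixed_order_sum` and `process_batch` are hypothetical names, not Thinking Machines' implementation): each request is reduced in the same left-to-right order no matter how many other requests share its batch, so its result cannot depend on batch size.

```python
def fixed_order_sum(values):
    """Reduce strictly left-to-right, the same order on every run."""
    total = 0.0
    for v in values:
        total += v
    return total

def process_batch(requests):
    # Each request is reduced independently in a fixed order, so its
    # result does not change when other requests join the batch.
    return [fixed_order_sum(r) for r in requests]

req = [1e16, 1.0, -1e16, 1.0]
alone    = process_batch([req])[0]
in_batch = process_batch([req, [2.0, 3.0], [4.0]])[0]
print(alone == in_batch)  # True: same output whatever the batch size
```

Real inference kernels achieve this with far more engineering (batch-invariant matrix multiplies and attention), but the invariant is the same: the arithmetic order seen by one request must not depend on its neighbors.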
Benefits of Determinism
• 00:08:03 Achieving determinism in LLMs makes trust, debugging, auditing, and verification easier, since the same input will consistently produce the same output. Stable inputs and outputs also make benchmarks more reliable, and auditing why a model produced a given answer becomes much more tractable.