Run the same arms with GPT, Gemini, Llama and Qwen generating the ideas, not just judging them. If forced bisociation and best-of-N selection win across every model family, the result is a law of language-model cognition, not a quirk of one.
Each model has its own default cloud; modulation is measured relative to each. The far-and-good frontier is computed per family, then compared.
A result that holds on one model is an observation. A result that holds on all of them is a law, and laws are what found a field.