[2024-06-13] synthetic data (we live in a society)
imagine training a language model on its own outputs: a feedback loop where each generation's errors feed the next, performance degrading until the model becomes useless (what the literature calls model collapse)
except oops, human society has produced essentially all of the training data for other humans
in human civilizations we describe phenomena caused by this feedback loop as "culture" or "societal norms" or "the related works section of my paper"
i mean sure we live in a universe so stuff's somewhat grounded in the actual laws of nature and whatnot
and for the vast majority of human history we weren't so great at this whole science thing
training on your own outputs is a fundamentally unstable process
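the instability is easy to demo with a toy: treat the "model" as an empirical token distribution and refit it each generation on samples drawn from itself. rare tokens that miss a sample get probability zero and can never come back, so the support only ever shrinks. (everything here, the vocab, sample size, and generation count, is a made-up toy setup, not any real training run)

```python
# toy sketch of self-training collapse: a "model" is just an empirical
# token distribution, refit each generation on samples from itself.
# a token that draws zero samples gets probability 0 and is gone forever,
# so the distribution's support can only shrink over generations.
import random
from collections import Counter

random.seed(0)
vocab = list("abcdefghij")
probs = {t: 1 / len(vocab) for t in vocab}  # gen 0: uniform "real" data
n = 30                                      # small sample size speeds up collapse

support_sizes = []
for gen in range(300):
    tokens, weights = zip(*probs.items())
    sample = random.choices(tokens, weights=weights, k=n)
    counts = Counter(sample)
    # refit the "model" on its own outputs; unseen tokens drop out entirely
    probs = {t: counts[t] / n for t in tokens if counts[t] > 0}
    support_sizes.append(len(probs))

print("support size every 50 generations:", support_sizes[::50])
```

with a fixed seed the trajectory is deterministic, and the support sizes never increase; shrink n and the collapse gets faster.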
from a certain perspective (i've realized recently that i tend to use 'from a certain perspective' to preface incorrect opinions, but anyway) all human conflicts can be chalked up to failures of this feedback loop
but all scientific progress is also a result of this feedback loop
this feedback loop is not an obstacle to superintelligence, but rather a necessary component of it
imo, no real reason to study alignment before models that use these feedback loops as their main or sole data source become mainstream though
since paradigms not based on this feedback loop, like internet pretraining, are fundamentally stable
today's agi report: probably >2 years away