[2024-06-09] Build software for AI
i'm surprised that more people aren't building tools / apps designed to be used by AI
like, humans have very little startup cost when it comes to doing complicated tasks because society has already spent so long building human centric operating systems and hardware and IDEs and browsers and everything else
i can imagine the possibility that language models are already at a human level of intelligence, but we simply can't tell because they're getting hardcapped by their terrible working environments
like, imagine writing code without being able to run / test / debug it, use a linter, consult online sources, or even go back and edit code you've already written
that's the environment GPT is working in when you ask it to write a function for you
the fact that it has any degree of success at all is practically superhuman
which is to say, people could probably print money if they just founded startups that build software for ai to use
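to make that a bit more concrete, here's a minimal sketch of the smallest possible version of such a tool: a loop that actually runs the model's code against a test and hands the error output back for another attempt. everything here is made up for illustration: `run_candidate`, `refine`, and the `generate` callable (a stand-in for whatever model call you'd really use) are hypothetical names, not anyone's real api, and a real version would need proper sandboxing.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_candidate(code: str, test: str, timeout: float = 10.0) -> tuple[bool, str]:
    """Run candidate code followed by test code in a fresh interpreter.

    Returns (passed, feedback), where feedback is the captured stderr.
    NOTE: this is not a sandbox; it executes the code directly.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(code + "\n\n" + test + "\n")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False, f"timed out after {timeout} seconds"
    return proc.returncode == 0, proc.stderr


def refine(
    generate: Callable[[Optional[str]], str],
    test: str,
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask `generate` for code, run it, and feed failures back in.

    `generate` is a hypothetical stand-in for a model call: it receives
    the previous round's error output (None on the first round) and
    returns a new attempt as source code.
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        code = generate(feedback)
        passed, feedback = run_candidate(code, test)
        if passed:
            return code
    return None  # no attempt passed within max_rounds
```

with something like this the model at least gets to see its own traceback before answering, which is the bare minimum a human programmer takes for granted.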
i guess a similar idea exists in terms of education
imagine dropping an llm into an actual modern education system for humans
they would watch lectures, take notes, do homework, take exams, receive grades, interact with their peers, all the while updating with ppo from the feedback they receive or something
the issue with stuff like this is probably economic: the value of building tools or providing education is a bit less clear when the recipient is a fungible and not necessarily persistent entity like a neural network instance
GPT can't exactly pay you tuition, since GPT doesn't own things like money
and a GPT instance doesn't make money either, unless you count whoever owns the server hosting their model as part of "them"
taking a step back,
i think it's obvious that this whole internet pretraining paradigm does not lead to superintelligence on its own
even rlhf / preference tuning is a stone age version of what the infrastructure for training superintelligences will look like
honestly i doubt internet pretraining even leads to phd-level intelligence, considering that the average thing a phd student thinks about generally shows up no more than a couple of times on the internet, and what is published sure as heck isn't reliable
honestly even the kind of stuff i do research in is probably useless in the grand scheme of things
superintelligence is not unlocked by some better loss function or model architecture or consuming internet data in a slightly more efficient way
i mean sure, openai will fiddle with some hyperparameters and filter their datasets a bit better and add moar model and moar data and get an appreciably smarter gpt in the coming few years
but true superintelligence is unlocked by building systems for ai.
by placing a bunch of ais in a classroom, college, or research-lab-like setting
by building tools that let agents experiment and think and interact and receive feedback, mostly from each other rather than from humans
it's a software engineering problem, and not the cool ml scaling kind
literally like, build khan academy or github or openreview or reddit but designed purely for ai consumption
something like that \end{4am rant}