[2024-06-09] Build software for AI

i'm surprised that more people aren't building tools / apps designed to be used by AI

like, humans have very little startup cost when it comes to doing complicated tasks because society has already spent so long building human centric operating systems and hardware and IDEs and browsers and everything else

i can imagine the possibility that language models are already at a human level of intelligence, but we simply can't tell because they're getting hardcapped by their terrible working environments

like, imagine writing code but without being able to run / test / debug it, use a linter, consult online sources, or even go back and edit code you've already written

that's the environment GPT is working in when you ask it to write a function for you

the fact that it has any degree of success at all is practically superhuman

which is to say, people could probably print money if they just started startups that build software for ai to use
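
for a sense of how small the first step could be, here's a rough sketch of a "run this and tell me what happened" tool that a model could call instead of writing code blind. everything here is made up for illustration (`run_snippet`, `RunResult`, the 5 second timeout), and it's not a sandbox, so don't point it at code you don't trust

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class RunResult:
    """Feedback handed back to the model after one execution attempt."""
    exit_code: int
    stdout: str
    stderr: str


def run_snippet(code: str, timeout_s: float = 5.0) -> RunResult:
    """Run a Python snippet in a separate interpreter and capture the outcome.

    Hypothetical helper: this only gives process isolation, not security.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],  # same interpreter, fresh process
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # Report the hang as feedback instead of crashing the caller.
        return RunResult(-1, "", f"timed out after {timeout_s} seconds")
    return RunResult(proc.returncode, proc.stdout, proc.stderr)


if __name__ == "__main__":
    # A model's draft with a bug: the traceback is exactly the feedback it lacks today.
    result = run_snippet("print(undefined_name)")
    print(result.exit_code)
    print(result.stderr)
```

the idea would be to feed `stderr` back into the next prompt so the model gets the same edit / run / fix loop a human takes for granted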

i guess a similar idea exists in terms of education

imagine dropping an llm into an actual modern education system for humans

they would watch lectures, take notes, do homework, take exams, receive grades, interact with their peers, all the while updating with ppo from the feedback they receive or something

the issue with stuff like this is probably that, economically speaking, the value of building tools or providing education is a bit less clear when the recipient is a fungible and not necessarily persistent entity like a neural network instance

GPT can't exactly pay you tuition, since GPT doesn't own things like money

and a GPT instance doesn't make money either, unless you count whoever owns the server hosting their model as part of "them"

taking a step back,

i think it's obvious that this whole internet pretraining paradigm does not lead to superintelligence on its own

even rlhf / preference tuning is a stone age version of what the infrastructure for training superintelligences will look like

honestly i doubt internet pretraining even leads to phd-level intelligence, considering that the average thing a phd student thinks about is generally not published more than a couple times on the internet anyway, and what is published sure as heck isn't reliable

honestly even the kind of stuff i do research in is probably useless in the grand scheme of things

superintelligence is not unlocked by some better loss function or model architecture or consuming internet data in a slightly more efficient way

i mean sure, openai will fiddle with some hyperparameters and filter their datasets a bit better and add moar model and moar data and get an appreciably smarter gpt in the coming few years

but true superintelligence is unlocked by building systems for ai.

by placing a bunch of ais in a classroom or college or research lab like setting

by building tools that let agents experiment and think and interact and receive feedback, mostly from each other rather than from humans

it's a software engineering problem, and not the cool ml scaling kind

literally like, build khan academy or github or openreview or reddit but designed purely for ai consumption

something like that \end{4am rant}