I think the difference is that AI can search and process external data sources rather than being given a data set.
The rocket engine above was made by searching the 'internet of everything' and designing from scratch rather than being given a pre-existing blueprint.
The distinguishing AI feature is learning rather than simply executing.
The "AI" that exists does not do any of that.
Existing "AI" is a vast database of pre-processed disjointed "phrases" that it recombines in response to a prompt. If you ask about "the Sermon on the Mount" current "AI" systems doesn't read respond to your prompt by reading all the versions of the bible and then providing a reasoned analysis. "AI" searches it's database for the phrase "Sermon on the Mount" and then looks at all the other extended phrases in its database that includes the term and creates a coherent amalgam of those stored phrases.
Here is an excellent discussion on what AI is and what it isn't:
acoup.blog
Here is an excerpt:
ChatGPT is a chatbot (a program designed to mimic human conversation) that uses a large language model (a giant model of probabilities of what words will appear and in what order). That large language model was produced through a giant text base (some 570GB, reportedly), though I can’t find that OpenAI has been transparent about what was and was not in that training base (though no part of that training data is post-2021, apparently).
The program was then trained by human trainers who both gave the model a prompt and an appropriate output to that prompt (supervised fine tuning) or else had the model generate several responses to a prompt and then humans sorted those responses best to worst (the reward model). At each stage the model is refined (CGP Grey has a very accessible description of how this works) to produce results more in keeping with what the human trainers expect or desire. This last step is really important whenever anyone suggests that it would be trivial to train ChatGPT on a large new dataset; a lot of human intervention was in fact required to get these results.
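(To make the "reward model" step a bit more concrete, here is a rough toy sketch of my own in Python, not anything from the blog post and not how OpenAI actually implements it: humans rank a handful of candidate responses, and a simple Bradley-Terry-style scorer is fit to those pairwise preferences so it can rate new responses.)

```python
# Toy sketch of the "reward model" idea: fit a scoring function to human rankings.
# Illustrative only: a real reward model is a neural network over text,
# not the two hand-picked numeric features used here.
import math
from itertools import combinations

# Hypothetical candidate responses to one prompt, each with two made-up
# features and a human-assigned rank (1 = best).
candidates = [
    {"text": "Response A", "features": [0.9, 0.8], "rank": 1},
    {"text": "Response B", "features": [0.6, 0.4], "rank": 2},
    {"text": "Response C", "features": [0.2, 0.5], "rank": 3},
    {"text": "Response D", "features": [0.1, 0.1], "rank": 4},
]

def score(weights, features):
    """Linear 'reward' for a response: higher should mean humans like it more."""
    return sum(w * f for w, f in zip(weights, features))

# Turn the ranking into (preferred, dispreferred) pairs and fit weights with
# a Bradley-Terry / logistic objective via plain gradient ascent.
pairs = [(a, b) if a["rank"] < b["rank"] else (b, a)
         for a, b in combinations(candidates, 2)]
weights = [0.0, 0.0]
lr = 0.5
for _ in range(200):
    for better, worse in pairs:
        margin = score(weights, better["features"]) - score(weights, worse["features"])
        p = 1.0 / (1.0 + math.exp(-margin))   # model's P(human prefers `better`)
        for i in range(len(weights)):
            weights[i] += lr * (1.0 - p) * (better["features"][i] - worse["features"][i])

# The fitted scorer can now rate unseen responses; "refining the model" then
# means nudging the chatbot toward responses this scorer rates highly.
print("reward for a new response:", round(score(weights, [0.7, 0.9]), 3))
```

The point of the sketch is only the shape of the process: humans rank a few outputs, a scorer is fit to those rankings, and that scorer then stands in for the humans at scale.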
It is crucial to note, however, what the data is that is being collected and refined in the training system here: it is purely information about how words appear in relation to each other. That is, how often words occur together, how closely, in what relative positions and so on. It is not, as we do, storing definitions or associations between those words and their real world referents, nor is it storing a perfect copy of the training material for future reference. ChatGPT does not sit atop a great library it can peer through at will; it has read every book in the library once, distilled the statistical relationships between the words in that library, and then burned the library.
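(A minimal way to picture "read the library once, distill the statistics, burn the library": again a toy Python sketch of my own, not anything from the post. Count which word follows which in a tiny made-up corpus, delete the corpus, and generate from the counts alone.)

```python
# Toy "language model": next-word counts distilled from a tiny corpus.
import random
from collections import defaultdict

corpus = "water makes you wet . the rain makes you wet . water is wet".split()

# Distill the statistics: for each word, how often each following word appears.
next_word_counts = defaultdict(lambda: defaultdict(int))
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

del corpus  # "burn the library": only the statistics remain

def generate(start, length=6):
    """Emit words by sampling each next word from the stored counts."""
    word, output = start, [start]
    for _ in range(length):
        followers = next_word_counts.get(word)
        if not followers:
            break
        choices, weights = zip(*followers.items())
        word = random.choices(choices, weights=weights)[0]
        output.append(word)
    return " ".join(output)

print(generate("water"))  # e.g. "water makes you wet . the rain"
```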
ChatGPT does not understand the logical correlations of these words or the actual things that the words (as symbols) signify (their ‘referents’). It does not know that water makes you wet, only that ‘water’ and ‘wet’ tend to appear together and humans sometimes say ‘water makes you wet’ (in that order) for reasons it does not and cannot understand.
In that sense, ChatGPT’s greatest limitation is that it doesn’t know anything about anything; it isn’t storing definitions of words or a sense of their meanings or connections to real world objects or facts to reference about them. ChatGPT is, in fact, incapable of knowing anything at all. The assumption so many people make is that when they ask ChatGPT a question, it ‘researches’ the answer the way we would, perhaps by checking Wikipedia for the relevant information. But ChatGPT doesn’t have ‘information’ in this sense; it has no discrete facts. To put it one way, ChatGPT does not and cannot know that “World War I started in 1914.” What it does know is that “World War I,” “1914” and “start” (and its synonyms) tend to appear together in its training material, so when you ask, “when did WWI start?” it can give that answer. But it can also give absolutely nonsensical or blatantly wrong answers with exactly the same kind of confidence, because the language model has no space for knowledge as we understand it; it merely has a model of the statistical relationships between how words appear in its training material.
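(One more toy sketch of my own, with invented numbers: the "answer" is just whichever year co-occurs most strongly with the question words, so skewing the statistics produces a wrong answer delivered with exactly the same confidence.)

```python
# Toy illustration: "answers" come from co-occurrence strength, not facts.
from collections import Counter

# Pretend these are co-occurrence counts distilled from training text:
# how often each year appeared near the words "World War I" and "start".
cooccurrence = {
    "1914": 950,   # frequent, because the true date dominates the text
    "1918": 400,   # also common (the war's end)
    "1939": 120,   # bleeds in from "World War II" passages
}

def answer_start_year(counts):
    """Return the year most strongly associated with the question words."""
    year, _ = Counter(counts).most_common(1)[0]
    return f"World War I started in {year}."

print(answer_start_year(cooccurrence))   # "World War I started in 1914."

# Skew the statistics (say, a training set stuffed with WWII articles) and the
# same mechanism gives a wrong answer with exactly the same confidence.
skewed = dict(cooccurrence, **{"1939": 2000})
print(answer_start_year(skewed))         # "World War I started in 1939."
```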
In artificial intelligence studies, this habit of manufacturing false information gets called an “artificial hallucination,” but I’ll be frank, I think this sort of terminology begs the question.