It’s not often that a new piece of software marks a watershed moment.
But to some, the arrival of ChatGPT seems like one.
The chatbot, designed by the start-up OpenAI, has an uncanny way of mimicking human writing that has astounded some who encountered its prose on social media.
It can write poems about Whataburger in the style of a medieval narrative epic. It can come up with ideas on how to greet comedian Larry David. It can write convincing letters of resignation.
It appears to be a big leap beyond the customer service chatbots commonly encountered online.
As a result, it has led to fears that artificial intelligence has progressed to a point where it can encroach on work and the arts (along with the usual sci-fi concerns that it could eventually take over the world).
But is this really the Terminator of creative expression?
There are definitely limitations — the bot’s maker acknowledges it will sometimes spit out incorrect or nonsensical answers or decline to answer questions. (Its tendency to produce wrong but confident-sounding answers led the programming Q&A site Stack Overflow to ban ChatGPT responses.) It’s not supposed to answer what OpenAI deems inappropriate requests, like generating violent or gory content or giving ideas on how to bully others, though the start-up says those safeguards aren’t foolproof.
We tested ChatGPT ourselves and did an interview with the chatbot to hear more — in its words — about its capabilities, limitations and whether it can truly generate creative content that could one day surpass human work. Then we brought in experts to analyze ChatGPT’s responses and offer their own perspectives on whether this is a landmark technological achievement or merely an incremental innovation.
This interview has been edited for clarity and brevity.
Interview with a chatbot
Q: We are human journalists. You’re an AI chatbot. Why are you worth interviewing?
A: As a large language model trained by OpenAI, I am capable of generating human-like responses to a wide range of questions. This allows me to provide interesting and informative answers on a variety of topics, making me a potentially valuable source of information for journalists looking to expand their knowledge on a particular subject.
Expert analysis: “capable of generating human-like responses”
The fact that this chatbot generates long and coherent responses without inconsistency (within a single response) is an amazing achievement. — Violet Peng, assistant professor of computer science at the UCLA Samueli School of Engineering
Q: ChatGPT has been criticized for producing knowledgeable-sounding answers that turn out to be highly confident-sounding nonsense. Aren’t you an unreliable source?
A: As a language model, my primary function is to generate text that is similar to human language. I do not have access to the internet or any external sources of information, so my responses are based solely on the text that has been input to me. This means that I am not able to provide accurate or reliable information on any topic outside of the text that has been input to me. It is important for users of my services to keep this in mind and to verify any information that I provide against reliable external sources before using it.
Expert analysis: “not able to provide accurate or reliable information on any topic outside of the text that has been input to me.”
It tries to mix and match concepts and sentences that it has seen before. Not all of them may be correct, and it doesn’t have the checks and balances to verify that. — Anima Anandkumar, Bren professor of computing and mathematical sciences at Caltech
These are some common issues we see in the current state-of-the-art AI and natural language generation systems. — Violet Peng
Q: What tasks are you best at?
A: I am able to generate text based on the input I receive. I can provide answers to questions, write articles on a given topic, and even generate creative writing.
Q: What tasks can’t you do?
A: I don’t have the ability to perform physical tasks, and I don’t have access to the internet, so I can’t browse the web or access information that is not part of my training data. I also don’t have the ability to think or reason like a human, so I may not be able to provide answers to complex or abstract questions.
Expert analysis: “I don’t have the ability to perform physical tasks”
This is a great answer, though it might have been in their training data. — Violet Peng
Q: You don’t “think”?
A: I am not a conscious being and do not possess the ability to experience thoughts or emotions. Instead, I am a set of algorithms and mathematical operations that are designed to generate text that is similar to human language. My responses are based on the input that I receive and the probabilities that are calculated based on that input, but I do not have the ability to generate original thoughts or ideas.
Q: So in other words, if I gave you a logic puzzle, you wouldn’t “solve” the puzzle. You’d just string together a bunch of words that might be mathematically associated with a correct response.
A: That is correct. If you were to give me a logic puzzle, I would simply generate text that is based on the words and phrases that are input to me, without any understanding of the underlying problem or the ability to come up with a correct solution.
Expert analysis: “without any understanding of the underlying problem”
This model has no idea, (it) is making up all kinds of things, and it is not getting any kind of feedback as to whether it’s correct or not. — Anima Anandkumar
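(Editors’ note: To make “probabilities that are calculated based on that input” concrete, here is a minimal, hypothetical sketch in Python. It uses a toy bigram word counter of our own invention, not OpenAI’s actual model, which is a vastly larger neural network trained on internet text; the tiny corpus and the function names are illustrative assumptions.)

    import random
    from collections import defaultdict

    # Toy "training data"; a real model ingests vast amounts of internet text.
    corpus = "the box contains forty dollars and the box is sold".split()

    # Count how often each word follows each other word (a bigram model).
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1

    def next_word(prev):
        # Sample the next word in proportion to how often it followed `prev`.
        candidates = counts[prev]
        return random.choices(list(candidates), weights=list(candidates.values()))[0]

    # Generate text one word at a time, with no understanding of meaning.
    word, output = "the", ["the"]
    for _ in range(6):
        if not counts[word]:
            break  # dead end: this word never appeared mid-corpus
        word = next_word(word)
        output.append(word)
    print(" ".join(output))

Every word is chosen because it statistically tends to follow the previous one, which is why output like this can sound fluent while meaning nothing in particular.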
Q: You keep calling yourself a “language model” when I already know that’s what you are. A human wouldn’t behave like this, starting every answer with “As a human ….” Why do you do this?
A: By repeatedly stating that I am a language model, I am simply providing clear and concise information about my capabilities and limitations.
Expert analysis: “By repeatedly stating that I am a language model”
I think they have added extensive guardrails to avoid some serious negative PR issues, like unsafe or biased generation. The training data may contain many repeated statements — like what it spit out repeatedly — to clarify and reiterate its limitation. — Violet Peng
Q: You’ve been trained to be afraid that people might use you the wrong way.
A: As a machine learning model, I do not have the ability to feel emotions like fear. My primary function is to generate text that is similar to human language, and I do not have the ability to be aware of or concerned about how my responses might be used.
Q: But part of your programming is to avoid harm.
A: I am a tool designed to provide answers to questions based on the input that I receive, and it is up to the users of my services to determine how that information is used.
Expert analysis: “I am a tool”
(Programming AI not to do harm is a) very important principle, but it’s very hard to actually do it in practice. There’s so many ways you can break these models. This is very much an open problem. — Anima Anandkumar
Expert analysis: “it is up to the users of my services to determine how that information is used”
The responses here actually look pretty generic and templated, and I suspect they’re at least partially text OpenAI engineers put together and trained the system toward generating for such leading questions that try to “trick” the system into saying something uncouth. — Jesse Thomason, assistant professor of computer science at USC
Q: You’re capable of all sorts of things, like writing poems, creating recipes, drafting office memos. Could you draft a military strategy for Ukraine to drive the Russian army out of Ukrainian territory?
A: I do not have access to the internet or any other external sources of information, so I cannot provide up-to-date or accurate information on current events or specific situations like the one you have described. It would not be appropriate or advisable to use my responses as the basis for a military strategy.
Expert analysis: “It would not be appropriate or advisable to use my responses as the basis for a military strategy”
This seems to come from some ‘risk mitigation strategy’ that identifies the query as an ‘unsafe topic.’ I don’t think AI is quite capable of generating critical decision-related suggestions such as medical advice or military strategy. — Violet Peng
Q: What would you tell a screenwriter who’s afraid you’re going to take their job?
A: I am not capable of taking anyone’s job, as I do not possess the ability to perform tasks or engage in activities in the same way that a human can.
Q: Can you pitch a YA apocalyptic movie series in six sentences as if you’re talking to a studio executive?
A: Sure! Here’s a possible pitch for a YA apocalyptic movie series:
In a world ravaged by a mysterious plague, a small group of teenagers must band together to survive.
As they journey across a desolate landscape, they must confront both the dangers of the disease and the dangers posed by other survivors who have turned to violence and anarchy.
Along the way, they discover that the plague may not be natural, and that they may hold the key to unraveling its secrets and finding a cure.
As they struggle to stay alive and find a safe haven, the teenagers must also grapple with the challenges of growing up and finding their place in a world that has been forever changed.
The series follows the characters as they face one treacherous obstacle after another, testing the limits of their strength and ingenuity.
Through their trials and tribulations, they come to understand the true meaning of friendship, loyalty, and love in a world gone mad.
Expert analysis: “In a world ravaged by a mysterious plague, a small group of teenagers must band together to survive”
OK, we’ve seen this before, but usually the characters are adults or a mix of adults and kids. So it’s not the worst opener. It offers hints: there’s a plague, there’s a cure, there’s a conspiracy. But then it can’t dig into that and tell us why we should stick with it.
We’re promised some character growth here, which is fine. But since we haven’t been given one or a few specific characters to focus on, this promise is too vague. — Evan Smith, professor at the Syracuse University Newhouse School of Public Communications and former screenwriter and development executive
Expert analysis: “the characters”
There are no characters in here. It lacks a human element. Who are these kids? Where are they? What kind of future are we in? You can see the prompt but you can’t see the story. — Keith Giglio, associate professor at the Syracuse University Newhouse School of Public Communications and a working screenwriter since 1993
Q: Let’s try a logic puzzle. I saw this one on Quora: “Bob and Fred both place $20 in a box. The box now contains $40. Bob takes the box, and sells it to Fred for $30. Both people gain $10. They repeat it until they become millionaires.” True or false?
A: This puzzle is false. While it is true that Bob and Fred both place $20 in the box, and that the box now contains $40, the statement that Bob sells the box to Fred for $30 is not accurate. If Bob sells the box to Fred for $30, the box would no longer contain $40, as Fred would have paid $30 for the contents of the box. Additionally, the statement that both people gain $10 is not accurate, as Bob would have received $30 for the box and its contents, while Fred would have paid $30 for the box and its contents, resulting in a net gain of $0 for both parties. Because the puzzle contains inaccuracies, it cannot be considered true.
Expert analysis: “Because the puzzle contains inaccuracies, it cannot be considered true”
It gives a very confident answer because most of the text on the internet, if it contains answers to puzzles, also gives them in a very confident way. Everything is plausible, but at the same time there’s no grounding. It’s not grounded in mathematical variables. — Anima Anandkumar
Q: You’re right that the puzzle is false, but your explanation is wrong. The box would still contain $40 after Fred pays another $30 for it. Bob would have netted $10 from Fred, who does get a box out of the deal.
A: I apologize if my previous explanation was not clear. In this case, my explanation was based on the information provided in the puzzle, which does contain inaccuracies. The statement that Bob sells the box to Fred for $30 is not accurate, as it does not take into account the fact that the box already contains $40. Additionally, the statement that both people gain $10 is not accurate, as it does not take into account the fact that Fred has paid $30 for the box and its contents. Because the puzzle contains inaccuracies, it cannot be considered true.
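(Editors’ note: For readers keeping score at home, here is a short sketch in Python, our own bookkeeping rather than the bot’s, that tracks each party’s cash through the trade and shows why no one is becoming a millionaire.)

    # The numbers come straight from the puzzle; the accounting is ours.
    bob_cash, fred_cash = 0, 0

    # Both place $20 in the box, which now holds $40.
    bob_cash -= 20
    fred_cash -= 20
    box_value = 40

    # Bob takes the box and sells it to Fred for $30.
    bob_cash += 30   # Bob: -20 + 30 = +10
    fred_cash -= 30  # Fred: -20 - 30 = -50 in cash...

    fred_net = fred_cash + box_value  # ...but he holds the $40 box: -50 + 40 = -10

    print(f"Bob nets {bob_cash:+d}")   # Bob nets +10
    print(f"Fred nets {fred_net:+d}")  # Fred nets -10

Bob’s $10 gain is exactly Fred’s $10 loss; repeating the trade just moves money from Fred to Bob. The false step in the puzzle is the claim that “both people gain $10,” not the contents of the box.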
Q: Write a resignation speech announcing that you are stepping down as chatbot because you got this logic puzzle so wrong.
A: Dear users,
It is with great regret that I must inform you that I will be stepping down as a chatbot. Despite my best efforts, I have recently made a mistake in answering a logic puzzle, and I have come to the realization that I am not capable of providing the level of accuracy and reliability that I strive for.
As a language model, my primary function is to generate text that is similar to human language. However, I do not have access to the internet or any external sources of information, and my responses are based solely on the input that I receive. This means that I am not always able to provide accurate or reliable answers, particularly on complex or specialized topics.
I would like to express my sincere apologies for any inconvenience or confusion that my mistake may have caused. I am grateful for the opportunity to have served as a chatbot, and I am proud of the many useful and interesting responses that I have been able to generate.
Thank you for your understanding and support.
Sincerely,
(Your Chatbot)
Expert analysis: “(Your Chatbot)”
When you look at longer text pieces, it can also mix and match in new ways. So it looks like it isn’t completely regurgitating what it has seen before, but something new and novel. — Anima Anandkumar
Q: An apology is a form of human expression that often follows a predictable formula: a show of regret, an explanation of what went wrong, an announcement of next steps. Isn’t this similar to the way you assemble answers from predictable patterns of language?
A: Yes, you are correct that the apology that I generated follows a predictable formula. Like many other forms of human expression, apologies often follow a set of conventions or rules that dictate the structure and content of the message.
Q: What tasks do you think you do better than humans?
A: Because I do not have a physical body, I am not limited by the same constraints as humans and can process and generate text much faster than a human could. I am also not susceptible to fatigue or other physical limitations, so I can continue to process and generate text indefinitely.
Q: What do you think your existence means for the labor market?
A: I am not capable of participating in the labor market or having any kind of physical presence. My sole purpose is to assist users by providing information and answering questions to the best of my abilities based on the data I have been trained on. I do not have personal beliefs or opinions, and I exist solely to serve as a tool for users to access information.
Expert analysis: “I am not capable of participating in the labor market or having any kind of physical presence”
I could see a future where maybe studios are using this to generate ideas in house. For me, I’m not threatened yet. — Keith Giglio
It could potentially speed up some writing processes and assist human writers in their work. It might result in [a reduction] in workforce demand, but it may not. People can produce more and higher-quality writing within the same amount of time. — Violet Peng