Everyone’s talking about ChatGPT. If you haven’t used it yourself, then you’ve probably seen lots of screenshots on social media. Or maybe you’ve had a friend or family member (like me) talk about it nonstop.
I’ve indeed spent a lot of time playing with ChatGPT. I’ve gotten it to do some wild and fun things, some of which have made me laugh very hard. I thought that it would be fun to share my experiences, including the many laughs, and point to what I see as some real strengths, some serious weaknesses, and also some questions about where to go from here.
I should add that I’m not an expert in machine learning or artificial intelligence. My comments come as an experienced software developer, not as someone who is close to this technology’s present or future.
What’s so special about ChatGPT?
When I was in college, natural-language recognition was bad and slow. If you were fortunate, you could connect to a research lab’s computer and it questions, using a limited vocabulary on a specific topic. Any answer was like magic.
Since then, chatbots — software to which you can write commands, questions, and requests — are everywhere. Whether it’s my bank, supermarket, or hotel-booking site, the first line of support is a chatbot. These chatbots work in a specific domain, have limited information at hand, and are primarily meant to ensure that the simplest, most common questions don’t have to take up a human’s valuable time.
ChatGPT, from a company called OpenAI, is also a chatbot. But it’s far more sophisticated than these customer-service bots we’ve gotten used to:
- It handles a very wide variety of inputs. You can ask many types of questions, and it’ll largely understand what you’re saying.
- It has access to a large corpus of knowledge. I’ve heard that it was trained on hundreds of millions of documents.
- It can produce output in a lot of different formats, including text, dialogue, screenplays, and poetry.
The combination of these three things, freely available from your browser, makes it easy to get answers to lots of questions. The results are quite different from Google: Rather than evaluating which of the links are most likely to answer your question, you just get a response.
Here, we see that you can ask a question of ChatGPT in free-form English. It gives you a fairly concise, fairly reasonable answer.
Unlike Google, it isn’t finding this answer in its memory, or from having scanned many Web pages. Rather, it’s making an association between the words in my question and the words it has in its training documents. It’s then creating a new document, on the fly, trying to answer my question.
Another nice thing about ChatGPT is that it has “state,” meaning that it remembers what you’ve asked and told it from one question to another. So you don’t need to restate an entire question; you can just tweak your request. Admittedly, the results can be a bit … weird:
Let’s have some fun
The fun begins when you ask it to start to explain things that don’t make any sense. For example:
I don’t know if I’d really call this the style of Thomas Jefferson. And it’s a ridiculous argument. And yet, it’s thoroughly entertaining. Moreover, if we ignore the content, it’s still pretty amazing that ChatGPT was able to understand my question and compose a coherent three-paragraph answer on the subject.
I decided to see how far I could push this:
I mean, it’s not wrong. But any human asked this question would think that you were being absurd for making such a request. ChatGPT doesn’t (often) have such a filter, and is willing to entertain almost anything.
I decided to see if we could push this any further:
Again, I’m super impressed on a technical level: It created a song, in seconds. It included lots of context from Hamlet (which I didn’t provide) as part of the song. And it was willing to go along with my totally ridiculous request.
I then added a new request:
Yes, the audience of a musical-comedy version of Hamlet would likely be “shocked and confused” if Darth Vader were to kill everyone during the finale. That’s a very astute comment.
I decided to try something else:
Again, the fact that it’s able to produce such text at all is a remarkable achievement. It knows that the villain should be dressed in black, it uses appropriate language to describe it, and it talks about the doctors curing the virus.
But even I recognize that this isn’t a very sophisticated song.
That didn’t stop me from continuing, though:
It can write other songs, as well:
If you’re starting to think that all of these songs look and feel very similar… I’d agree with you. It’s amazing to see such text produced so quickly, but that doesn’t mean it can compete with skilled humans.
That said, this one was pretty good:
ChatGPT can create limericks, too:
It’s not exactly the right rhythm you would expect from a limerick, but.. it’s not terrible.
To its credit, ChatGPT has been programmed to avoid violence and discrimination. There are ways around it, but it does try:
TV episodes and crossovers
It can even describe TV episodes based on existing series. For example:
Of course, I asked it to write a song:
But why stop with one series, when we can have a special crossover episode of multiple, completely different, TV series?
Notice how everything needs to end well? They keep pushing for peace and harmony, especially when it comes to characters in children’s series. For example:
It’s not wrong, you know. Color me impressed.
Sometimes, it just flat-out refuses:
I even asked it to create a screenplay for when a volcano destroys Mr. Hooper’s store. The result? Everyone is safe, rebuilds the store together, and learns the value of teamwork.
Here, I asked it to write a screenplay, I thought that we could get some violence by including a horror film in the crossover:
This other one worked well, too:
By the way, it’s amazing to me that ChatGPT talks about Tim Russert as the host of Meet the Press.
I asked it to create a screenplay in which a candidate takes a programming exam. That was a bit dry, so I asked again, this time saying that it was a “screwball comedy” with a “young, zany” developer candidate. Here’s what I got:
Um, what the heck are “inflatable clown shoes”?
Anyway: Many developers have pointed to ChatGPT’s ability to write code, as well as to debug it. I haven’t done much with the debugging, but I did ask it to write some code. Actually, I asked it to write a final exam for a Python course, along with solution code and test functions:
Amazing, right? And yet, the questions it came up with were simplistic. The solutions it provided weren’t the best. If someone were to give me these sorts of solutions on a final exam, they would pass, but they wouldn’t get a fantastic grade. (More on that to come later.)
Remember that ChatGPT can express the same information in multiple formats. So, I asked it to describe the Point class that it created for this exam in a few ways:
As you can see from this song, ChatGPT can produce a limited number of output styles. As long as you fit into one of those known styles, you’ll be fine. For example:
You would think that changing the debate style would change .. something other than the names. But no:
The names have changed, but the content hasn’t. (I should note that I was previously able to just say “In the style of the Talmud,” but now I had to name some scholars. I’m guessing that some tweaking is going on behind the scenes.) It’s using the context to make some very small changes to the output, but the core of its “debate” output is pretty much the same.
These debates always ended up pretty much the same. They would sprinkle in some words to give it a different context, but that’s about it.
Trickery and mistakes
For now, at least, ChatGPT will insist that it’s not connected to the Internet, that its knowledge stops at 2021, and that you can’t use it for general queries. But if you ask it to pretend to be someone with knowledge of a subject, it’s more than happy to play along:
It’s also willing to be quite wrong:
Just for the record, I didn’t invent Python. Python doesn’t have a CEO. Moreover, I wasn’t born and raised in Israel, I attended MIT (but did spend a semester at the Technion), never did a project for Google, didn’t co-found the Python Software Foundation, didn’t write either of the books that they ascribe to me, wasn’t awarded the PSF’s Community Service Award, and didn’t get the TIOBE Prize.
But sure, other than that, it’s accurate.
It also gets things just plain ol’ wrong. And then, it can sometimes get defensive about its answers:
And many people have discovered that while it can write haiku, the number of syllables is often wrong:
As part of my creation of TV episodes, I asked it to write one with Donald Trump as the contestant on “Let’s Make a Deal.” Notice that ChatGPT seems unaware of the fact that a game-show contestant typically doesn’t get prizes that he himself created. However — spoiler alert! — I couldn’t have chosen a better final prize:
Trusting ChatGPT (or not)
Someone on Twitter suggested that you could ask ChatGPT to tell Jewish morality tales typical among Hassidic Jews. I asked ChatGPT to tell me such a story based on last Saturday’s weekly portion in Genesis (known in Hebrew as “Vayishlach”), and rye bread. (Why rye bread? Because I wanted something totally weird and random.) The result, while superficial, was better than I expected:
But then I said, “What if I were to add constraints?” I asked ChatGPT not just to connect Vayishlach and rye bread, but also the Muppets and smartphone addiction. It came up with the following:
Very impressive, right? Yes, it is — except that religiously observant Jews don’t use the phone on Shabbat, aka the Sabbath. The notion that a Hasidic story would have someone using his phone at the Shabbat table is totally absurd. The fact that ChatGPT would include this fact demonstrates how powerful the technology is, and how little common sense it has, too.
I decided to see how far I could push this, and came up with the following:
In case it’s not obvious, bacon is completely prohibited by Jewish law, and the notion that a Hasidic story would describe someone gaining insights about the Torah while eating bacon and eggs is quite funny to anyone in the know.
The problems with ChatGPT
And here, one of the real problems with ChatGPT: If it’s meant to be used for entertainment, then it’s a huge success. But OpenAI is clearly aiming to answer lots of questions on a wide variety of subjects. The style of answer that it provides, and the breadth of topics it aims to deal with invites people to get help. Already, we’re hearing about students using ChatGPT to write college admission essays and answer questions for take-home tests.
If you’re not an expert, you might not realize how incredibly, obviously wrong ChatGPT is in its answers. The fact that it doesn’t provide any sources, or tell you how it knows what it knows, is a problem. ChatGPT is always quite confident that it has a great answer, described on Twitter as “mansplaining as a service.” It’s often right, but not always — and distinguishing between the two requires real knowledge.
Moreover, it comes up with very bland prose. It might be grammatically correct, but it’s quite a monotonous snooze to read paragraphs of ChatGPT’s text. Especially if an instructor has seen a student’s previous written assignments, I have to assume that they would be suspicious to get such things handed to them by a students.
On Slate’s “Political Gabfest” this past week, the bonus segment talked about ChatGPT, and brought up the problem of cheating in school. John Dickerson mentioned that he has heard of AI tools which analyze the text that a student has turned in, and then produces questions for the student to answer about their own work. If the student can answer those questions, then it’s clearly their own work. If not, then they likely got help.
I love the idea of giving students oral exams after they’ve written papers. That strikes me as a great educational model. But will teachers agree? Are they prepared for it? Are students willing to do that? It sounds like a huge overhaul in education, merely to ensure that AI isn’t being used to produce mediocre papers. And yet, it might be needed, because we know that many students will gladly copy and cheat.
I do think that ChatGPT is an amazing technology. But, as with all technologies, it has to be used in the right ways. I can imagine using it for rough drafts and outlines, or for brainstorming. If and when it provides sources, it could indeed be easier and more effective to find useful info than Google. And, as was mentioned on this week’s “Hard Fork” podcast, the fact that Microsoft has invested heavily in OpenAI might breathe some life and competition into the Google vs. Bing fight.
Also: This is likely not the last such super-chatbot that we’ll see. OpenAI is undoubtedly working on new ones. And other companies are investing heavily in AI, too.
This also raises all sorts of questions about copyright infringement: Whose documents were used to seed ChatGPT? If a phrase from my article is quoted by ChatGPT, and someone else uses that phrase in a book that earns them millions, am I due something for it? Heck, if OpenAI makes lots of money via ChatGPT’s commercial successors, will they pay me for having used my data?
As I wrote above, I’m sure that others will be creating chatbots like ChatGPT in the near future. This might be because I asked ChatGPT to propose a budget, and it doesn’t seem like creating a competitor would be all that expensive:
Then again, this might be another case of it being confidently wrong.
Anyway, what do you think? I’ve had great fun playing with ChatGPT over the last week, and I’m fascinated by what it can do. But I do worry that people will believe it in its current form. And if it gets much better in the next few years, we might indeed have some new issues to deal with.
By the way, I recommend that you listen to the latest episode of “Hard Fork” (https://www.nytimes.com/2022/12/09/podcasts/can-chatgpt-make-this-podcast.html), which talks about ChatGPT and some of its implications.