Analyzing ChatGPT’s use of cohesive devices to help international LLM students improve cohesion in their writing

Post by Stephen Horowitz, Professor of Legal English, with special thanks to Prof. Julie Lake and Prof. Heather Weger for their time and linguistics expertise in analyzing and discussing the texts and editing this post, which is far more cohesive because of them.

Hot on the heels of my recent experiment to try and better understand ChatGPT’s view of improving language and grammar (See “Analyzing ChatGPT’s ability as a grammar fixer,” 2/23/23), I was grading my students’ timed midterm exams and noticed a paragraph in one student’s answer that had all the right pieces but decidedly lacked cohesion.

So I mentioned this in a comment and gave some suggestions as to how to improve the cohesion in the paragraph. And then I had a thought:

Maybe ChatGPT can help!

The student’s paragraph, by the way, was describing the rules related to whether custody exists for the purposes of determining whether a Miranda warning was required prior to questioning. The student included and described all the key parts of the rule. But it felt like a list or a recipe rather than an essay.

In other words, it was missing cohesion, or cohesive devices, i.e., words that would help the reader understand the connections and relationships between each piece of the larger rule.

One of the hard things about teaching cohesion is that there’s no one right way to accomplish it. You can tell a student they need to add “Additionally,….” at the beginning of a sentence. But that’s just one of a wide variety of cohesive devices–some lexical, some grammatical, and each with its own nuances. So a great way to help students improve cohesion is to look at a text and to try and notice or identify the different kinds of cohesive devices used.

That’s when I thought: “This sounds like a job for ChatGPT!” Because I could ask it to create one or more improved versions of the text and then help the student notice what kinds of cohesive devices ChatGPT is using that the student’s writing is lacking.

Part I: What do the student and ChatGPT have in common in their writing?

My next decision was what instruction to give ChatGPT. “Improve the language issues in this text:……”? Or “Improve the cohesion in this text….”? The easy solution, of course, is to do both. And that brings me to Takeaway #1, which is that it didn’t make too much of a difference. Or if it did, I can’t really tell, because even when I hit the “Regenerate response” button for the first option, the output came out a little different each time.

Takeaway #2 is that, notably, under both instructions, ChatGPT did not eradicate all language errors. For example, the student wrote, “An accused is in custody if….” Typically, we might say accused person, but we don’t generally use “accused” as a noun preceded by the article “an.” (Though interestingly we do talk about “the accused” in other contexts.) However, ChatGPT apparently decided to defer to the student’s phrasing in both versions produced.

Takeaway #3 is that both the student and ChatGPT frequently made use of the cohesive devices of keyword repetition and referents (i.e., pronouns). For example, one keyword repetition centered around the idea of custody. The student, after using the adjective form to discuss “custodial interrogation” at the end of the first sentence, used the noun form: “A person in custody must, prior to interrogation,….” And ChatGPT wrote, “Prior to interrogation, a person in custody must….” in one version and, in the second version, wrote, “This means that before interrogation can take place, a person in custody must….” Hence, the repetition of custody and interrogation creates a sense of connection between the end of the first sentence and the beginning of the second sentence.

Regarding the use of referents, the student also used the word “he” in the sentence: “A person in custody must, prior to interrogation, be clearly informed that he….”

In addition to repetition and use of referents, these examples also help demonstrate the cohesive devices of fronting the “old” information (i.e., taking information the reader already knows from a previous sentence and putting it at the front of a sentence) as well as proximity (i.e., creating cohesion by repeating words that are physically close to each other). Both of these tend to overlap with the cohesive device of repetition. Although repetition is generally lexical (i.e., word choice), fronting old information and proximity can involve grammatical knowledge, which is needed to position information in a desired way.
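
For instructors who like to tinker, here is a minimal Python sketch of how keyword repetition between neighboring sentences might be spotted automatically. It is only an illustration, not a method used in this experiment: the stopword list, the six-letter "crude stem" (which is what lets "custodial" and "custody" count as the same keyword), and the function names are all assumptions made for this example. The two sample sentences are ChatGPT fragments quoted in this post, closed off with periods.

```python
import re

# Hypothetical, deliberately tiny stopword list; a real analysis would need a fuller one.
STOPWORDS = {"must", "order", "made", "during", "prior", "their", "that", "this"}


def crude_stem(word, length=6):
    """Very rough stand-in for stemming: lowercase and keep the first few letters."""
    return word.lower()[:length]


def content_stems(sentence):
    """Return the crude stems of the longer, non-stopword words in a sentence."""
    words = re.findall(r"[A-Za-z]+", sentence)
    return {
        crude_stem(w)
        for w in words
        if len(w) > 3 and w.lower() not in STOPWORDS
    }


def repeated_keywords(sentences):
    """For each pair of adjacent sentences, list the stems they share."""
    links = []
    for prev, curr in zip(sentences, sentences[1:]):
        links.append(sorted(content_stems(prev) & content_stems(curr)))
    return links


sample = [
    "In order for a statement made by an accused during custodial "
    "interrogation to be admissible, two prerequisites must be met.",
    "Prior to interrogation, a person in custody must be clearly informed "
    "of their rights, including the right to remain silent.",
]

print(repeated_keywords(sample))  # [['custod', 'interr']]
```

The shared stems correspond to the custody/interrogation repetition discussed above. A sketch like this only sees lexical repetition; it knows nothing about fronting, proximity, or referents.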

Part II: Where did the student and ChatGPT diverge in use of cohesive devices?

Takeaway #4 is that ChatGPT, unlike the student, used fronted prepositional phrases as a cohesive device. For example, whereas the student started out writing, “The Miranda warnings and a valid waiver are prerequisites…..,” ChatGPT wrote, “In order for a statement made by an accused during custodial interrogation to be admissible, two prerequisites must be met:….” and also, “In the United States, the admissibility of any statement made by an accused during custodial interrogation is dependent on the fulfillment of two prerequisites:…..” While there was nothing wrong with the student’s sentence, the ChatGPT versions using fronted prepositional phrases help the reader feel a bit more situated.

Takeaway #5(a): Another cohesive device ChatGPT seemed to like is the reduced relative (or adjective) clause with an -ing word. Where the student wrote, “A person in custody must, prior to interrogation, be clearly informed that he has the right to remain silent,……” ChatGPT wrote, “Prior to interrogation, a person in custody must be clearly informed of their rights, including the right to remain silent,….” and also, “This means that before interrogation can take place, a person in custody must be clearly informed of their rights, including the right to remain silent,….” The use of “including” as a reduced clause (i.e., “which includes”) is a slightly more advanced grammar structure that the student was perhaps less comfortable or familiar with, though I imagine that they would understand it when encountered in reading.

Takeaway #5(b): Even though ChatGPT’s choice to use “including” as part of a reduced relative clause is a more sophisticated grammar structure, it’s notable that it’s not an entirely accurate word choice if we assume that the list of rights in the Miranda warning is a finite list, because “including” suggests that there may be some rights not mentioned in the list that follows. So one point to ChatGPT for cohesion, but minus one point for accurate paraphrasing.

Additionally, the use/non-use of the reduced clause created another slight grammatical difference that can be associated with cohesion: nominalization. (We’ll call this Takeaway #6.) The student’s phrasing, “….informed that….” leads to a series of clauses, i.e., “…he has the right to remain silent, anything he says can be used against him in court, he has the right to the presence of an attorney,….” In contrast, the ChatGPT versions’ use of “including” leads to a series of noun phrases: “….including the right to remain silent, the fact that anything they say…., the right to an attorney,….” The use of nominalization creates packets of information that are slightly easier to keep track of in a list, which is part of the reason they tend to be preferred in academic/legal writing. Use of clauses in a list, on the other hand, feels a bit clunkier as the brain needs to keep track of parallelism in the wording, which can reduce the feeling of cohesion a bit. Again, grammatically speaking, either way is fine. But in terms of style, nominalization is one characteristic that may subtly provide contours for a text that make it feel a bit more cohesive.

For Takeaway #7, let’s look at another notable grammatical cohesive device used by ChatGPT that the student did not use: prepositional phrases that convey degree or extent and connect to an additional clause that provides additional information.

ChatGPT Ex 1: “An accused is considered to be in custody if they have been formally arrested by law enforcement, or if their freedom of movement is restrained to the point where they cannot terminate the interrogation and leave.”

ChatGPT Ex 2: “To determine if an accused is in custody, two circumstances are considered: formal arrest by law enforcement, or restraint of freedom to the extent that the individual cannot terminate the interrogation and leave.”

In contrast, the student used two sentences to connect the definition of custody with the notion of the degree or extent. The student’s use of the word “that” suggests that the student was looking for a way to connect these two ideas but lacked knowledge of the appropriate grammar structure, writing:

“An accused is in custody if he is formally arrested by law enforcement, or his freedom of movement is restrained that he cannot terminate interrogation and leave. The degree of restraint of freedom should be equivalent to the degree of formal arrest.”

So this is a great example of a grammar structure and phrasing that the student should become more familiar with. Like with the reduced clause (“including”), the student is most likely familiar with the meaning when encountered as input (i.e., reading or listening.) But the student may not have developed comfort or confidence with it, or awareness of when to use it, in connection with output (i.e., writing or speaking.)

Interestingly, ChatGPT also picked up on another type of cohesion error by the student. The student paired the concepts of arrest and restraint on freedom of movement and put the notion of degree in a separate sentence. But the degree of restraint is really just a corollary to the concept of restraint, describing or adding additional information about what it means. So it would make more sense, and feel more cohesive, to keep those two ideas together in the same sentence.

And for Takeaway #8, here is one final example of a cohesive device used by ChatGPT but not by the student: fronting information with a prepositional phrase and a gerund (i.e., -ing word.) In this case, after describing factors that might be used to determine whether custody exists, the student wrote: “The age of child/juvenile is relevant to the custody analysis because….”

In contrast, one of the ChatGPT versions changed the same sentence to start: “In analyzing custody for a child or juvenile, the age of the individual is a relevant factor because….” In addition to use of a more complex prepositional phrase as an adverbial at the beginning of a sentence, the combination of preposition plus gerund (i.e., -ing word) is a grammar structure that is often not in the toolbox for many students I’ve had. As with the above examples, the student likely understands it in reading, but is not using it in writing either because it’s just not on their list of go-to grammar structures or because they’re not confident in how to use it.

This leads to Takeaway #9, the biggest takeaway of all for this experiment, which is that ChatGPT can help instructors identify the kinds of cohesive devices that a student is not using and then support the student in learning to use and become more comfortable and familiar with those cohesive devices.

Without ChatGPT, an instructor would need to have a strong working knowledge of which cohesive devices aren’t being used, and then make an educated guess at the kinds that the student should use. Or, the instructor could replicate this experiment by re-writing the student’s text multiple times and seeing what kinds of cohesive devices the instructor uses in the re-written text. But with ChatGPT, it’s easy to generate multiple versions of the same text which can then be analyzed to see what kinds of cohesive devices are popping up.
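
For the technically inclined, the "generate several versions, then compare" workflow can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a tool used in this experiment: the regular expressions are hypothetical surface patterns that only catch the specific phrasings quoted in this post, and the function and variable names are invented for the example. The three sample texts are the student and ChatGPT passages quoted in Takeaway #7.

```python
import re

# Hypothetical surface patterns for devices named in this post. They only catch
# the specific phrasings quoted here and will miss most real-world variants.
DEVICE_PATTERNS = {
    "reduced clause with 'including'": r",\s*including\b",
    "degree/extent phrase": r"\bto the (?:point where|extent that)\b",
    "preposition + gerund opener": r"^(?:In|By|After|Before)\s+\w+ing\b",
}


def devices_by_version(versions):
    """Map each version name to the set of devices whose pattern appears in it."""
    report = {}
    for name, text in versions.items():
        found = set()
        # Naive sentence split on end punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            for device, pattern in DEVICE_PATTERNS.items():
                if re.search(pattern, sentence):
                    found.add(device)
        report[name] = found
    return report


def missing_from(report, baseline):
    """List devices found in any other version but not in the baseline text."""
    others = set().union(
        *(devs for name, devs in report.items() if name != baseline)
    )
    return sorted(others - report[baseline])


versions = {
    "student": (
        "An accused is in custody if he is formally arrested by law "
        "enforcement, or his freedom of movement is restrained that he "
        "cannot terminate interrogation and leave. The degree of restraint "
        "of freedom should be equivalent to the degree of formal arrest."
    ),
    "chatgpt_1": (
        "An accused is considered to be in custody if they have been "
        "formally arrested by law enforcement, or if their freedom of "
        "movement is restrained to the point where they cannot terminate "
        "the interrogation and leave."
    ),
    "chatgpt_2": (
        "To determine if an accused is in custody, two circumstances are "
        "considered: formal arrest by law enforcement, or restraint of "
        "freedom to the extent that the individual cannot terminate the "
        "interrogation and leave."
    ),
}

report = devices_by_version(versions)
print(missing_from(report, "student"))  # ['degree/extent phrase']
```

Pattern matching like this is no substitute for a careful read by someone with a linguistics background. At best it could help flag candidate devices across many regenerated versions, which an instructor would then confirm by hand.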

Instructionally, the next step can be to draw the student’s attention to the different cohesive devices that have been identified, maybe ask them what they know about them or if they’re comfortable using them, and then maybe have them look for examples of the target cohesive devices in other texts to develop a stronger feel for how and when to use them and to raise any questions about their use.

********BONUS********

By the way, in case you think I overlooked a key follow-up question, I actually asked ChatGPT to identify the cohesive devices in each text. Takeaway #10 is that ChatGPT seems to only view cohesion through a lexical lens, and Takeaway #11 is that it is fairly inconsistent in listing those cohesive devices.

Here are the lists of cohesive devices that ChatGPT came up with for each text:

Student Text
1. Definition
2. Enumeration
3. Citation
4. Comparison

ChatGPT Version #1
1. Repetition
2. Enumeration
3. Lexical cohesion [for which it just lists a series of legal terms, which I think really just means repetition.]
4. Reference (i.e., citations)
5. Conjunction
6. Transitional phrases

ChatGPT Version #2
1. Repetition
2. Synonyms [a variation on repetition]
3. Pronouns [referents, another variation on repetition]
4. Transition words
5. Citations

The only items on any of the above lists that could even be considered related to grammar are comparison, which requires certain grammatical knowledge to pull off, and transitional phrases, which could include more complex prepositional phrases or adverbial clauses. However, the only examples cited are “Prior to interrogation,” “Therefore,….” “Moreover,….” and “In addition,….”

In other words, while ChatGPT uses more complex cohesive devices, it doesn’t seem inclined to recognize or identify them when prompted. (Though props at least for being generally familiar with the concept of cohesive devices.)

Regarding the inconsistency, repetition does not appear in the list for the student’s text even though they clearly made use of it. And items like conjunctions, transition words, and pronouns are present in all three texts but only listed for some of them.

So Takeaway #12 boils down to: Don’t rely on ChatGPT for a comprehensive list of the cohesive devices in a text!

************

By the way, for more on cohesion in US legal writing, and a great primer on cohesion in general, I strongly encourage reading Elizabeth R. Baldwin’s Beyond Contrastive Rhetoric: Helping International Lawyers Use Cohesive Devices in U.S. Legal Writing, 26 Fla. J. Int’l L. 399 (2014), https://digitalcommons.law.uw.edu/faculty-articles/225

************

Do you have reactions or comments about this post? Or other thoughts about cohesive devices? Or have you tried any other experiments with (or unrelated to) ChatGPT? Please share in the comments or contact me directly.
