Analysis using clouds
Discussion from 2008 Working Group report
This section discusses using word clouds for analysis, drawing on the 2008 Working Group report.
To explore the tool Wordle as a way of representing the key ideas from an interview as a word cloud, the Working Group generated word clouds for three interviews: Winifred Asprey, Tracy Camp, and Susan Gerhart. Wordle initially appealed to the WG members because of its aesthetics and the simplicity of representing a multi-page interview as an artistic arrangement of words.
The three sample clouds suggest three sources of “noise” in this type of analysis.
- First, without pre-editing, multi-word phrases display in the word cloud as individual words, for example “computer” and “science”, rather than “computer science.”
- Second, some words common in spoken language can introduce results that are potentially irrelevant to the study; for example, occurrences of the words “one” and “two”, or prominent appearances of the adverb “really” and the interjection “well”.
- Third, because the interview questions also appear in the text, the words spoken by the interviewer affect the overall frequencies.
The Working Group explored several methods for minimizing the effects of “noise” on the resulting word cloud.
- The transcript can be edited to force a phrase to be evaluated as a unit by replacing the space between its words with a tilde character, for example “computer~science.”
- To reduce the impact of “noise” words, the researcher can successively refine the transcript to prevent these words from appearing in the word cloud results. On each pass, the researcher generates a word cloud, notes the words that are irrelevant to the study, and creates an edited version of the transcript in which each “noise” word is replaced with one that the generation tool ignores (such as “a”).
- To eliminate words uttered by the interviewer, rather than the interviewee, the researcher can create a version of the transcript where the questions have been eliminated.

These steps can help ensure that the word cloud for that interview highlights the main ideas from the interview.
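The three editing steps described above could also be scripted rather than done by hand. The following is a minimal sketch in Python, not the Working Group's own procedure. It assumes a transcript in which each turn sits on a single line prefixed with "Q:" (interviewer) or "A:" (interviewee). Those labels, the phrase list, the noise-word list, and all function names are hypothetical choices made for illustration.

```python
import re

# Hypothetical settings: the labels and word lists below are illustrative
# assumptions, not values taken from the Working Group report.
QUESTION_LABEL = "Q:"
ANSWER_LABEL = "A:"
PHRASES = ["computer science"]
NOISE_WORDS = ["really", "well", "one", "two"]


def drop_interviewer_turns(transcript):
    """Remove interviewer lines and strip the label from interviewee lines."""
    kept = []
    for line in transcript.splitlines():
        stripped = line.strip()
        if stripped.startswith(QUESTION_LABEL):
            continue  # skip the interviewer's question
        if stripped.startswith(ANSWER_LABEL):
            stripped = stripped[len(ANSWER_LABEL):].strip()
        kept.append(stripped)
    return "\n".join(kept)


def join_phrases(text):
    """Replace the spaces inside each listed phrase with a tilde."""
    for phrase in PHRASES:
        words = [re.escape(word) for word in phrase.split()]
        pattern = re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
        text = pattern.sub(lambda m: re.sub(r"\s+", "~", m.group(0)), text)
    return text


def mask_noise_words(text):
    """Replace each listed noise word with "a", which the tool ignores."""
    alternatives = "|".join(re.escape(word) for word in NOISE_WORDS)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub("a", text)


def clean_transcript(transcript):
    """Apply the three steps in order: questions, phrases, noise words."""
    text = drop_interviewer_turns(transcript)
    text = join_phrases(text)
    return mask_noise_words(text)


sample = (
    "Q: What did you really study?\n"
    "A: Well, I studied computer science, really. One course led to two more."
)
print(clean_transcript(sample))
```

The noise-word list would still be refined over successive passes, as described above: generate a cloud from the cleaned text, note any remaining irrelevant words, add them to `NOISE_WORDS`, and rerun.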