DGAH Midterm

Introduction

For my midterm, I did a text analysis of Elizabeth Robins’ play Votes for Women. I wanted to find the most commonly used words in the play and compare their frequencies across 10 equal segments of the text. I used Voyant Tools to identify the most common words and their frequencies. Excluding characters’ names and titles, the five most commonly used words were “women,” “oh,” “men,” “know,” and “it’s.” One interesting detail was that “women” and “men” followed very similar trends: “women” was more frequent than “men” in segments 1-8, the two were tied in segment 9, and “men” was more frequent than “women” in segment 10.

Sources

I used the text file of Votes for Women, which is available from Project Gutenberg, a website with over 60,000 free eBooks.

Processes

I started by cleaning the text file in the Notepad app: I removed everything before and after the play itself, along with some special characters such as underscores and asterisks. Then I used Voyant Tools to analyze the text, using the “Trends” tab to see the most commonly used words and their relative frequencies over 10 equal segments of the text.
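For readers who want to reproduce these steps outside Voyant Tools, here is a minimal Python sketch of the same idea: strip the underscores and asterisks, split the words into 10 nearly equal segments, and compute each word's share of every segment. The function names and the sample sentence are my own assumptions for illustration, and Voyant's tokenization and stop-word handling may differ, so the numbers it produces will not necessarily match the graph.

```python
import re
from collections import Counter


def clean_text(raw):
    """Remove underscores and asterisks, the special characters mentioned above."""
    return re.sub(r"[_*]", "", raw)


def tokenize(text):
    """Lowercase the text and split it into words, keeping apostrophes (e.g. it's)."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def split_segments(tokens, n=10):
    """Split a token list into n consecutive segments of nearly equal size."""
    size, extra = divmod(len(tokens), n)
    segments, start = [], 0
    for i in range(n):
        # The first `extra` segments get one additional token each.
        end = start + size + (1 if i < extra else 0)
        segments.append(tokens[start:end])
        start = end
    return segments


def relative_frequencies(tokens, word, n=10):
    """Return, for each segment, the fraction of its tokens equal to `word`."""
    return [
        seg.count(word) / len(seg) if seg else 0.0
        for seg in split_segments(tokens, n)
    ]


# Tiny made-up sample for illustration, not text from the play.
sample = "Women vote. *Men* vote. _Women_ know it's time. Oh, men know."
tokens = tokenize(clean_text(sample))
print(Counter(tokens).most_common(3))
print(relative_frequencies(tokens, "women", n=2))
```

To run this on the play itself, the cleaned file would be read with `open(...).read()` and passed through the same functions. Character names and titles would still need to be filtered out by hand, as they were for the results above.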

Presentation

For my presentation of the data, I exported an HTML snippet of the relative frequencies graph for the five most common words and embedded it in this page (see below). This seemed a reasonable way to present the data because it is the same graph I used to examine the relative frequencies, and Voyant Tools had already generated it.

Data Visualization

Significance

This text analysis of Votes for Women showed how frequently certain words were used throughout the play in comparison to other words, which was interesting to see visualized. The most interesting result was the trends of “women” and “men”: they were very similar, yet “women” was used more frequently than “men” in 8 of the 10 equal segments.

One downside of Voyant Tools that is apparent in the data visualization is that it is very difficult to visually analyze the trends of more than two words at a time, because the overlapping lines make the graph very crowded. That is why, in the end, I stuck to comparing the frequencies of just “women” and “men,” even though I wish I could have compared more words at once.
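One possible workaround for the crowded graph is to lay the per-segment frequencies out as a plain table, so that several words can be read side by side without overlapping lines. The sketch below assumes the relative frequencies have already been computed. The `format_table` helper is hypothetical, and the numbers are placeholders for illustration, not results from the play.

```python
def format_table(series, per=1000):
    """Render {word: [relative frequency per segment]} as an aligned text table.

    Values are scaled to occurrences per `per` words to keep the cells readable.
    """
    n_segments = len(next(iter(series.values())))
    header = "segment".ljust(8) + "".join(word.rjust(8) for word in series)
    rows = [header]
    for i in range(n_segments):
        cells = "".join(f"{values[i] * per:8.1f}" for values in series.values())
        rows.append(str(i + 1).ljust(8) + cells)
    return "\n".join(rows)


# Placeholder values for illustration only, NOT measurements from the play.
demo = {
    "women": [0.010, 0.004],
    "men": [0.006, 0.004],
    "know": [0.002, 0.008],
}
print(format_table(demo))
```

A table trades the visual shape of the trend for legibility, so it works best alongside the graph rather than as a replacement for it.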