Tool Sharing

How to approach DH?

-Text analysis
-Social network analysis
-Geo-spatial mapping
-Distant reading / content analysis
-Visual/sound analysis
-Visualization

Resources

DiRT Directory (dirtdirectory.org)
-comprehensive website/registry listing resources to help you conduct research
-resources are categorized by approach (text analysis, numeric data, etc.)

TAGS (for Twitter data collection)
-allows you to collect any tweets you want, by the minute
-only need Twitter and Gmail accounts
-uses Twitter’s API, so it captures vast amounts of data, including location

Voyant (voyant-tools.org) for text analysis
-load your own dataset
-enables you to quantify the humanities into datasets just as scientists and social scientists do
-shows (from left to right) a word cloud, an automatic summary (including words per sentence, frequent words, distinctive words, vocabulary density, etc.), the top five words, and words preceding and following specific words
-includes a tool to exclude words or phrases you do not want counted
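
The summary figures listed above can be approximated by hand to see what they measure. A minimal Python sketch, assuming a tiny illustrative stop-word list and a made-up sample sentence; the function name `summarize` is hypothetical, and Voyant's own tokenizing and stop-word lists will differ:

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption); Voyant ships its own lists.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}

def summarize(text, stop_words=STOP_WORDS, top_n=5):
    """Return total words, vocabulary density, and the top_n frequent words."""
    # Lowercase, then pull out runs of letters (apostrophes allowed inside words).
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    total = len(words)
    # Vocabulary density: unique word forms divided by total words.
    density = len(set(words)) / total if total else 0.0
    # Frequency counts skip the excluded words.
    counts = Counter(w for w in words if w not in stop_words)
    return {
        "total_words": total,
        "vocabulary_density": density,
        "top_words": counts.most_common(top_n),
    }

# Made-up sample text for illustration only.
sample = "The whale and the sea: the whale is in the sea, and the sea is the whale's home."
result = summarize(sample)
print(result)
```

Adding a word to `STOP_WORDS` removes it from the frequency list, which is roughly what the exclusion tool does.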

Programming Historian (programminghistorian.org)
-valuable especially for isolated regions where resources may be more limited
-always looking for contributors
-tutorials are well-written
-example tutorial: using regular expressions to clean OCR text
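
To give a sense of what cleaning OCR text with regular expressions can look like, here is a minimal Python sketch. The four rules and the sample text are illustrative assumptions, not the patterns used in the Programming Historian tutorial, and the function name `clean_ocr` is hypothetical:

```python
import re

def clean_ocr(text):
    """Apply a few common, illustrative OCR clean-up rules."""
    # Rejoin words hyphenated across a line break: "manu-\nscript" -> "manuscript".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn remaining single line breaks into spaces, keeping blank-line paragraph breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)
    # Remove the stray space OCR often inserts before punctuation.
    text = re.sub(r" ([,.;:!?])", r"\1", text)
    return text.strip()

# Made-up sample of messy OCR output.
raw = "The manu-\nscript was   found in\nthe archive .\n\nIt dates to 1850 ."
print(clean_ocr(raw))
```

Note that the first rule also merges genuine compounds that happen to break at a line end, so real projects usually check the results against the page images.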

OpenRefine (openrefine.org)

TextGridLab – downloadable application for text analysis
-upload photos of manuscripts
-can embed links, etc.

Gephi (gephi.org) for visualization

Palladio (hdlab.stanford.edu/palladio) for visualizing historical data
-well suited to exploring data and designed to be user-friendly
-partially funded by NEH

Google Ngram Viewer

Social network analysis
-lots of statistics
-all you need is two columns pairing related persons (one relationship per row)
-difference from Palladio – shows nodes (persons beyond the first degree of separation)
-analysis includes:
-diameter – the maximum geodesic distance (“hops”, or degrees of separation, from one side of the graph to the other)
-betweenness centrality (how often people have to go through you to reach another person)
-exemplifies the “power law curve”
-eigenvector centrality – “proximity to power” (how close you are to people with high centrality scores)
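
The measures above can be worked out on a toy network to see what each one captures. A minimal Python sketch using only the standard library, assuming an undirected network and five hypothetical names; the session's tool will have its own implementations. Betweenness uses Brandes' algorithm, and the eigenvector scores come from a simple power iteration:

```python
from collections import deque

# Two columns of related persons, one relationship per row (hypothetical names).
edges = [("Ann", "Bob"), ("Bob", "Cat"), ("Cat", "Dan"), ("Bob", "Eve")]

# Build an undirected adjacency list.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def hops_from(source):
    """Breadth-first search: hops from source to every reachable person."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

# Diameter: the largest shortest-path distance between any two persons.
diameter = max(max(hops_from(n).values()) for n in graph)

def betweenness():
    """Brandes' algorithm: how many shortest paths pass through each person."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        order = []                          # persons in order of distance from s
        preds = {n: [] for n in graph}      # predecessors on shortest paths
        paths = dict.fromkeys(graph, 0)     # number of shortest paths from s
        paths[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest persons, accumulating path dependencies.
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted from both ends, so halve the totals.
    return {n: c / 2 for n, c in score.items()}

def eigenvector(iterations=100):
    """Power iteration: a person scores high when tied to high scorers."""
    score = dict.fromkeys(graph, 1.0)
    for _ in range(iterations):
        # Including the person's own score keeps the iteration from
        # oscillating on networks like this one; the ranking is unchanged.
        new = {n: score[n] + sum(score[nb] for nb in graph[n]) for n in graph}
        norm = sum(v * v for v in new.values()) ** 0.5
        score = {n: v / norm for n, v in new.items()}
    return score

between = betweenness()
eigen = eigenvector()
print("diameter:", diameter)
print("betweenness:", between)
print("eigenvector:", eigen)
```

In this toy network Ann and Dan each have a single tie, but Ann scores higher on eigenvector centrality because her one tie is to Bob, the best-connected person, which is the “proximity to power” idea.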

oXygen XML Editor

Omeka
-omeka.org and omeka.net
-free, easy, nice to use
-really good at presenting all the metadata, making it very accessible
-comprehensive source for manuscripts, images, audio, video

Zotero
-good for articles, books, embedding
-create items in Zotero and embed them in Omeka using a connecting tool

About Amanda Lam

I am a senior at the George Washington University in Washington, DC.