THATCamp DC 2017: Making History
Tue, 04 Apr 2017 16:57:54 +0000

Intro to APIs session Tue, 04 Apr 2017 16:56:37 +0000

Scribed by Sage P.

“Room A Section 1: Topic: Intro to APIs

DPLA: Digital Public Library of America

API: Application Programming Interface

Behind the scenes, to get data


Scrape data from website

Data in a structured way



Postman in Chrome (add app)

(Don’t have to sign up)

hit “send”

It will email a key

API request


Switch to GET. Enter domain: api/dpla/v2/items

Params (parameter)

Key & Value

Api_key… # from email

Key: q (query)

Value: (search)

Click send

0 is first, not 1

Count: # of matched items

Start, Limit, Docs

Different ways to search are defined on the DPLA website

Specific info back

(ID & titles) put in fields

Pagination

Saving is by send & download

More ways to access

Web console

Web application: developed by GWU library

Command line: twarc (Twitter app)

Code- can always write code

(depend on API and what you want)

Read documentation (to understand)

You may not understand data if you don’t

Fairly stable

Tweeting slides”
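
The Postman steps in the notes above (a GET request with `api_key` and `q` parameters, an optional `fields` list, and pagination) can also be sketched in code. This is only a minimal sketch, not the presenter's method. It assumes the endpoint is `https://api.dp.la/v2/items` and that `fields`, `page`, and `page_size` are the parameter names, so check the DPLA documentation before relying on it. The key, the field name `sourceResource.title`, and the sample response are placeholders.

```python
from urllib.parse import urlencode

# Assumed endpoint; the notes record it only as "api/dpla/v2/items".
BASE_URL = "https://api.dp.la/v2/items"


def build_items_url(api_key, query, fields=None, page=1, page_size=10):
    """Build the GET URL that Postman assembles from the Params table."""
    params = {"api_key": api_key, "q": query, "page": page, "page_size": page_size}
    if fields:
        # Ask for specific fields only, e.g. IDs and titles.
        params["fields"] = ",".join(fields)
    return BASE_URL + "?" + urlencode(params)


def summarize(response):
    """Pull out the top-level keys mentioned in the notes: count, start, limit, docs."""
    return {
        "matched": response["count"],
        "start": response["start"],
        "limit": response["limit"],
        "returned": len(response["docs"]),
    }


# Placeholder key; the real one arrives by email.
url = build_items_url("YOUR_API_KEY", "civil war", fields=["id", "sourceResource.title"])
print(url)

# Hypothetical response, shaped like the keys listed in the notes.
sample = {"count": 2, "start": 0, "limit": 10, "docs": [{"id": "a"}, {"id": "b"}]}
print(summarize(sample))
```

Building the URL separately from sending it makes it easy to paste the result into Postman or a browser and compare.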

Tool sharing with Rebecca (text analysis) Sat, 01 Apr 2017 18:08:17 +0000

Rebecca Benefiel presented tools for data analysis that would be useful for businesses, social media, and email communications. She said that when you analyze text, the approach you use depends on the focus of your research. In social research in particular, you are often confronted with a lot of language material, and these tools let you carry out a detailed analysis of that text. Some of the uses of text analysis tools that were mentioned and briefly explained are:

  • Social network analysis
  • Geo-spatial/mapping
  • Distance reading/content analysis
  • Visual/sound analysis
  • Visualization

Rebecca also mentioned Gephi, another tool platform, which is used by data analysts and scientists keen to explore and understand graphs. It is a tool in which “the user interacts with the representation, manipulate the structures, shapes and colors to reveal hidden patterns”.

Some of the tools for text analysis mentioned are:

AND notes from Neil on the same workshop: Tool Sharing w/ Rebecca (Text Analysis)

  • Easier to learn by example
  • Types of digital tools
    • Text analysis
    • Social network analysis
    • Geo-spatial/mapping
    • Distance reading/content analysis
    • Visual/sound analysis
    • Visualization
  • Voyant tools demonstration
    • Allows people to replicate your analysis about the data set
    • Open source
    • Many tool options
    • Helpful community
    • Ability to filter out words in data, while maintaining full original text
    • Can increase tool window size (term #)
    • How to get Twitter text?
      • “tags” – allows you to collect data every minute from Twitter
    • The Programming Historian
      • Used to ask and answer humanities questions
      • Well-written tutorials
    • OpenRefine
      • Helpful explanation video on YouTube
    • TextGrid
      • Software for download (open source)
      • Search image (create hyperlink)
    • org
      • Visualizing networks
      • Viewing group clusters
    • Palladio
      • Limited in showing statistical importance of people in a social network
    • NodeXL
    • Omeka
      • Good for images, video, sound

Wikipedia Mon, 27 Mar 2017 20:07:03 +0000

THAT Camp Conference Summary

Transcribed by: Manuel Fiallos Garcia

The session I attended was about Wikipedia. It focused on the fact that there is a lot of information on Wikipedia that you might come to believe is fake. That is because people could edit information easily and nothing stopped them, even if what they added was fake. People started to doubt Wikipedia as a credible source and tended to reject it as a primary source of information. In the session, the presenters explained that there are actually records of who each author is and what they added, which brings more credibility to the source. You also now need to become an approved user to be able to edit Wikipedia. They told us that if you add too much fake information, people can flag you, and you can reach a point where you are blocked from editing Wikipedia.

One participant had edited a Wikipedia page and showed us where her username appeared and what she had edited. Another participant pointed out that even when content is backed up by an author’s biography website or some other website containing the material he added, there is still a chance that it is fake and that he made up the information on that website just to add content to Wikipedia. We watched a video on how to edit Wikipedia and on how people use this tool to spread what they know and contribute to the world of digital information. This is the audio file of the session; the audio is not great, but it can provide further insight into what was discussed. Thanks.


Building History Databases: What’s Overkill? Mon, 27 Mar 2017 19:21:01 +0000

Hosted by: David M., PhD at George Mason

-Helpful for social media analysis

-Information overload

Google Sheets, Google Docs, etc.
-Easy to share with a broader community
-Able to receive feedback in real-time
-Can access data from anywhere with a wifi connection
-Free, no costly fees of database subscriptions, software expenses etc.
-User friendly

-Runs risk of glitches
-Fewer shortcuts than Excel
-Not as advanced as Excel when it comes to visual analysis, e.g. pivot tables

Alternative methods:
-OpenRefine program

-All comes down to personal preference
-It is possible to use various programs simultaneously for different needs
-Tailor your user experience to match project needs
-Research alternative methods to expedite process
-Share tools + tips with the digital community for enhanced user experience

What’s new in institutional repositories? Mon, 27 Mar 2017 16:40:56 +0000

What are people using?

  • Shared Shelf
    • better for cataloging and art history
    • great for images
    • those inside the institution can see content uploaded
  • Shared Shelf Commons
    • Uploaded items can be published externally
  • DSpace
    • Updates regularly, making it difficult to customize when it comes time to upgrade to new version
    • Difficult with API because it keeps changing
  • Hydra in a Box
  • Fedora
    • Difficult with API because it keeps changing
    • Humanities Commons is possibly a Fedora database
  • Digital Commons
    • Allows publishing
    • Better for smaller schools
  • Greenstone
  • Islandora
    • Maybe moving to Digital Commons
  • DSpace and Symplectic Elements
    • Amanda uses at Virginia Tech
    • Allows professors to show what they’ve done and then it is uploaded directly into DSpace
  • Institutional vs. Subject Repositories
    • MIT has 44% of faculty publications which is a high number
    • Researchers are more aligned with their field than the institution they work for, so they’re more likely to use the subject repositories
    • Faculty is more likely to use for-profit and subject based repositories
    • SSRN is now for-profit
  • Patrick does Omeka S Beta demonstration
    • What’s different?
      • Stops people from putting HTML in descriptions, which helps with metadata
      • Media tab will now allow all types of media
      • Arbitrary HTML will get its own spot, which will make it easier for developers to post from across the web
      • Huge list of properties is cleaned up
      • Ability to add sites, each with their own modules and themes
    • When will it be out of beta?
      • Hopefully by fall semester

Session Notes: Invisible Labor of Digital History Collaboration & I vs. We in DH Sun, 26 Mar 2017 15:42:15 +0000

  • “Gnomes and elves problem” – invisible labor in special collections and other fields
  • What is invisible labor?
    • Thread on Twitter a few weeks ago in which a person said they had found a hidden gem; no, a librarian found it, catalogued it, and shouted it out, you didn’t discover it
    • Invisibility – things difficult to assess (like course surveys); direct vs. indirect measures, skills that don’t manifest until later on, hard to measure and communicate
      • Big issue for funding – erasure of labor has financial implications
    • Technology may have exacerbated this issue of erasing labor
      • Physical card catalog bank vs. online – mass quantity better indicates labor behind it
    • Older model: librarian seen as support for researchers, not as co-researcher
    • DH asking for more and different support – role as librarian changing based on tasks of researchers, which plays into idea of visibility
      • Visibility obvious as providing a service, but less so in other ways
    • Be more conscious about delivering products and benefits to get away from being seen as a call center
      • Ex. Mukurtu Project – active collaboration with indigenous peoples; normally invisible labor but support highlighted because of direct engagement with this community
      • Linked the product with the process
      • Engage with invisible voices that want to interpret themselves
    • Public transcription can help as well
      • Ex. transcription center at Smithsonian
    • Use the word “product” – makes things visible, but also still invisible
      • Everything in libraries being “projectized”
      • Hard to make something like knowledge a commodity
      • Creating end products like transcribed Diller jokes, which is important to show what labor is doing, but then worried that we are going to be judged off of that
        • Turns things into assets, money-makers (look what we can do in 5 months, how much $ we can make from it)
        • McDonald’s vs. working with a chef
      • Product view also erases maintenance issue
    • Library as servant
      • I’m going to go to the library and the librarian is going to find what I want vs. librarians teaching me to find something I need
      • View librarians as partners
    • Don’t necessarily want to be more of a partner by making invisible labor visible
      • Show the demands, need for funding, etc.
      • DH is helping uncover hidden labor
    • Flexibility and prioritization
      • If we have different levels of, say, describing millions of collection objects
      • Triage and categorize objects as demonstration of value
      • And different levels of support
      • Counter: some don’t want to value different things differently
      • Doesn’t mean priorities won’t change, just see immediate demand in current environment
    • Important to define our goals – is it # of papers scanned into internet, or more indirect measures that may take longer to manifest down the road
      • Quantitative AND qualitative – qualitative harder to put in grant report, for example
      • Subjective harder to quantify
    • Need a big, broad picture narrative
    • Librarians as experts vs. community collaboration/ownership
      • Use librarians as starting point to get academic project off the ground – like a guide for best practices and best tools
      • Write them into your budgets
    • MFA becoming the new MBA – we are critical
    • Transcribing – undergraduate students present learned how much effort it takes
    • Crowdfunding – often seen as solution to all labor issues (unpaid labor)
      • But also sense of community and getting people involved
    • Want to bring digital documents into physical space – show and tell model
      • See what digital looks like in person – seeing that would give one a different awareness

Intro to WordPress with Amanda Sun, 26 Mar 2017 02:29:24 +0000

Session 1 – WordPress – Ruth & Stephen B.

Free & Open.

WordPress runs about 27% of websites on the internet

Started with blogs

Difference btw (hosted solution) and

WordPress is openly built software

Matt Mullenweg created WordPress – founded company called Automattic in mid-2000s

Unlike Facebook, can get the WordPress software (open source) vs.
The way WordPress is built is to be “extensible”
Totally true that if you get the wordpress software, you can do whatever you want with it on your own website. Part of the reason why it took off so much. People like to customize.
Extensible nature comes into play with themes and plugins.
Themes control how site looks: can customize themes and plugins on your own site
Plug-ins are a simple idea: instead of customizing the look and feel of your website, they allow you to customize what your website can do. There are plugins for everything. Lists of plugins can be found on
You can embed everything from videos on YouTube to maps on Google and even documents.
Software is basically the same on either
People will develop for own site and then give it away for other people to use
Others use it as a portfolio for job searches
Is there a difference between a plug-in (function) and embedding (content)?
is a great tool to teach people about, and a great way to learn web design, code, etc.
Iframe could be helpful for people who are looking to embed/plug in things that are more complex
Can control pixel width and height — control size of images — in iframe code
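
The point about setting pixel width and height in iframe code can be illustrated with a small helper that writes the embed markup. This is a sketch for illustration only: `iframe_embed` is a hypothetical helper, not part of WordPress, and the URL is a placeholder.

```python
from html import escape


def iframe_embed(src, width=560, height=315):
    """Return iframe markup with an explicit pixel width and height."""
    return '<iframe src="{}" width="{}" height="{}"></iframe>'.format(
        escape(src, quote=True), int(width), int(height)
    )


# Hypothetical URL, for illustration only.
print(iframe_embed("https://example.com/map", width=600, height=400))
```

The resulting string is what you would paste into a post's HTML view; changing the two numbers changes the size of the embedded frame.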

The blog part is also valuable, albeit “old fashioned.”
Can be used in a classroom setting to make the discussion more interactive
How to get a high profile?
-don’t use a hosted site

Reclaim Hosting — place to buy web hosting
What are the best uses for WordPress for undergrad students/people preparing to go into a career after graduation?
The kinds of things you can put up on your own website that you can’t do on LinkedIn. Can post coursework, papers that you’ve done, actually documenting what you did in courses.
Really good for a portfolio
Benefits of having your own website
It’s very customizable
Can add your Twitter feed, LibraryThing (what you’re currently reading)

There are plugins that allow you to integrate your social media for “sharing” purposes. It can also be used for long-form writing

How to gain/improve readership
Be lucky (lol)
There are so many systems of engagement, work to connect them all.
ORCID: researcher identifier
Be authentic and water your presence like a garden
Content and FREQUENT content are very important.
Keep things up to date

Tool Sharing Sat, 25 Mar 2017 19:44:10 +0000

How to approach DH?

-Text analysis
-Social network analysis
-Geo-spatial mapping
-Distance reading / content analysis
-Visual/sound analysis


DiRT Directory
-comprehensive website/registry listing resources to help you conduct research
-can be categorized by your approach (text analysis, numeric data, etc.)

TAGS (for Twitter data collection)
-allows you to collect any tweet you want by the minute
-only need Twitter and Gmail accounts
-uses Twitter’s API, including location; vast amounts of data

Voyant for text analysis
-load your own dataset
-enables you to quantify the humanities into datasets just as scientists and social scientists do
-shows (from left to right) a word cloud, an automatic summary (including words per sentence, frequent words, distinctive words, vocabulary density, etc.), the top five words, and words preceding and following specific words
-tool to exclude phrases you do not want to count as words
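
The word counts and the exclusion list described for Voyant can be approximated in a few lines. This is a rough sketch of the general idea, not how Voyant itself works: `top_words` is a hypothetical helper and the sample sentence is made up.

```python
import re
from collections import Counter


def top_words(text, stop_words=(), n=5):
    """Count words, skipping stop words; the original text is left untouched."""
    words = re.findall(r"[a-z']+", text.lower())
    skip = {w.lower() for w in stop_words}
    counts = Counter(w for w in words if w not in skip)
    return counts.most_common(n)


sample = "The library and the archive share the data; the data is open."
print(top_words(sample, stop_words=["the", "and", "is"], n=2))
```

Because the filter is applied only while counting, the full original text stays available for reading in context.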

Programming Historian
-valuable especially for isolated regions where resources may be more limited
-always looking for contributors
-tutorials are well-written
-using regular expressions to clean OCR text
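
As an illustration of cleaning OCR text with regular expressions, here is a minimal sketch. It is not taken from the Programming Historian tutorial; the three rules and the sample string are assumptions chosen to show the technique.

```python
import re


def clean_ocr(text):
    """Apply a few common regex fixes to OCR output."""
    # Rejoin words hyphenated across a line break: "histor-\nical" -> "historical".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Turn remaining single line breaks into spaces, keeping blank-line paragraph breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


raw = "The histor-\nical  record\nwas scanned.\n\nNext page."
print(clean_ocr(raw))
```

Note that the first rule also merges genuinely hyphenated compounds that happen to break at a line end, so results should be spot-checked against the scans.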

OpenRefine

TextGrid Lab – downloadable application for text analysis
-upload photos of manuscript
-can embed links, etc.

Gephi for visualization

Palladio for visualizing historical data
-perfect for exploring and designed to be user-friendly
-partially funded by NEH

Google Ngram

Social network analysis
-lots of statistics
-all you need is two columns listing pairs of related persons
-difference from Palladio – shows nodes (persons beyond the first degree of separation)
-analysis includes:
-maximum geodesic distance – diameter (“hops” of degrees of separation from one side of the chart to the other side)
-centrality (how many times people have to go through you to get to another relation)
-exemplifies “power law curve”
-eigenvector centrality – “proximity to power” (how close you are to people with high scores of centrality)
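
Two of the measures above can be sketched directly from a two-column list of related persons. This is a minimal pure-Python sketch under stated assumptions, not the tool shown in the session: the names are made up, the function names are hypothetical, the graph is treated as undirected, and the "go through you" centrality (betweenness) is left out to keep the example short.

```python
from collections import deque


def build_graph(pairs):
    """Turn two columns of related persons into an undirected adjacency map."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def hops_from(graph, start):
    """Breadth-first search: number of hops from start to every reachable person."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def diameter(graph):
    """Maximum geodesic distance between any two connected persons."""
    return max(max(hops_from(graph, n).values()) for n in graph)


def eigenvector_centrality(graph, rounds=100):
    """Power iteration: a person scores high when their neighbors score high."""
    score = {n: 1.0 for n in graph}
    for _ in range(rounds):
        # Adding the node's own score (a shift by the identity) keeps the
        # iteration from oscillating on tree-like graphs.
        new = {n: score[n] + sum(score[m] for m in graph[n]) for n in graph}
        norm = sum(v * v for v in new.values()) ** 0.5
        score = {n: v / norm for n, v in new.items()}
    return score


# Hypothetical edge list, for illustration only.
pairs = [("Ann", "Bob"), ("Bob", "Cat"), ("Cat", "Dan"), ("Bob", "Eve")]
graph = build_graph(pairs)
print(diameter(graph))
scores = eigenvector_centrality(graph)
print(max(scores, key=scores.get))
```

For real datasets a network library or a dedicated tool will be faster and offer more measures; the sketch only shows what the numbers mean.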


Omeka and
-free, easy, nice to use
-really good at presenting all the metadata, making it very accessible
-comprehensive source for manuscript, images, audio, video

-good for articles, books, embedding
-create things in Zotero and you can embed on Omeka using a connecting tool

Wikis and Wikithons Sat, 25 Mar 2017 19:43:09 +0000

Wikis and Wikithons

Transcribed by: Hope Gillespie

  • There is a widespread perception that Wikipedia isn’t a “trustworthy” source
    • But crowdsourcing is helping
  • Initially, anyone could edit
    • They took on a group of editors
    • NOT primary source driven; must be able to cite source of information
  • Wiki ambassadors
    • Can train students to edit pages
  • Editing
    • If the format exists already, it’s easy to follow
  • VisualEditor
    • Gives you the ability to edit, but you must have an account and you can only work on one at a time
  • Wikipedia Tutorial
  • Adam Lewis (DC Wiki Ambassador)
  • Sandboxes
    • Not meant to be permanent, low risk

The Value of Federal Support for DH with Diane Cline Sat, 25 Mar 2017 19:41:46 +0000

Transcribed by: Rachel Cousins

  • Happy hours for writing letters “write to your happy hour”
  • website resources
    • Breakdown of federal funding by state
    • Information of the history of the endowment
  • The NEH was at its peak under Nixon’s administration
  • The biggest recipients of federal funding are not states you would expect (Vermont, Alaska)
  • When lobbying for federal funding it is important to go into a breakdown of programs that would benefit
  • National Humanities Alliance
  • We need more innovative curriculums that teach problem solving to students of the humanities
  • How do you highlight what goes away if you take away national humanities funding?
    • The impact on society
  • Using NEH grants to match donors to incentivize them to donate
  • Documentary film is having difficulty competing with other digital sources for funding
  • How do you know whether to trust the information presented in a documentary?
    • NEH approval on documentary films provides reliability because of the peer review process
    • Also applies to websites
  • The NEH has to be open with the public about its practices because it is funded by taxpayer dollars, in a way private foundations don’t have to be
  • The National Humanities Alliance
    • They bring people from organizations all around the country to speak to their representatives on the Hill to talk about the importance of their organizations