(last revised 2013-02-28 19:49)
MALS 75500/ASCP 81500 (3 credits)
Digital Humanities Methods and Practices
Mondays 11:45am-1:45pm, CUNY Graduate Center Room 3307
Instructor: Dr. Arienne M Dwyer, Visiting Professor, CUNY GC / Professor, KU Anthropology
Co-Director, University of Kansas Institute for Digital Research in the Humanities
Email: adwyer AT gc.cuny.edu
Office hours: Mondays 2-4pm, Tuesdays by appointment, Room 4104 (alternate room: 4103)
Course blog: https://dhmethods13.commons.gc.cuny.edu/
This hands-on course is the second semester of a two-course core sequence in Digital Humanities. Digital humanities practice generally entails four main stages: data capture, annotation, exploration and analysis, and dissemination. Workshops surveying the tools, methods, and standards used in these stages are a regular part of the weekly class. In addition to training in these skills, the course will explore the profound cultural changes that the practice of digital humanities instantiates and reflects. First, the tools and data we choose shape the research questions we ask. Second, the digital humanities generally aim to create community resources, going beyond the work of the individual. Students will be assessed on how they apply this new approach to scholarship and information to a project of their choice. The course has no prerequisites and welcomes students from all humanities, social-science, and allied disciplines.
MALS 75400/ASCP 81500/ENGL 89020 Debates in the Digital Humanities is recommended but not required. No programming knowledge is required. Laptops will be needed in many of the classes to get the most out of the hands-on workshops (CUNY GC has loaner laptops available).
At the end of this course, students will:
- Have an overview of the range of tools and methods for doing Digital Humanities work, including those in data capture, annotation, exploration and analysis, and dissemination;
- Have an understanding of the cultural practices underlying digital scholarship;
- Be able to assess the pros and cons of particular methods and tools;
- Have used a subset of these tools and methods appropriate to their interests, applying them to one or more projects throughout the term;
- Be able to manipulate data into forms suitable for analysis and presentation.
The ultimate aim of this course is for students to gain the critical and technical skills to deepen the work in their own disciplines (philosophy, history, English, anthropology, etc.) as they complete their graduate degrees.
—————-Course Topics & Calendar—————
Class meets Mondays, 11:45am-1:45pm, unless otherwise noted.
Readings and Software installations are to be completed by the day listed;
Assignments are to be completed during the week listed.
N.B.: Contributions to your individual blog and to the class blog are expected weekly.
“Major” methods/tools are marked with an asterisk (*).
I: Theoretical and cultural issues: The practice of digital humanities
Introductions & Course Practicalities (course blog and group; presentations signup)
Creating resources and interpreting them
Critically evaluating methods: The tools we use shape the questions we ask
Are the practices of DH threatening literary and cultural studies?
Mixed method approaches are the rule
What this course doesn’t cover
Assignment: (1) Create a blog on CUNY Commons; (2) briefly describe your goals for the course (e.g. “I’m particularly interested in exploring methods/tools A, B, C in order to do research in topics X and Y” or “I have no concrete research goals at the moment, but my field is X and I’m looking forward to trying Y….” Antipathies also welcome: “method X seems irrelevant to my work”, “tool Y is intimidating/better optimized for Z…”, “I’d suggest Tool Z instead of the Tool Y planned, it’s more versatile because…”).
What are ‘data’, exactly?
Identification – what counts as data
Habitat – where to find data
Care and feeding – what digital humanists do with data
Lifecycle – the origin, lifespan, and possible death of data
Environment – Why humanists may be allergic to the term ‘data’
What sorts of questions do Digital Humanists ask?
Meanings – speaker/authorial intentionality, social contexts
Do digital tools change these questions?
Preview of next week: Introduction to Regular Expressions (RegEx)
Readings: Regular expressions: http://dh.obdurodon.org/regex.html
Assignment (for this and every subsequent week): Make at least one posting this week to your individual blog, and at least one additional posting to the course blog.
II. Key methods: Capture, exploration & analysis, mapping, visualization
[In that order, except where guest speaker schedules dictate otherwise]
Methods: Data capture
Issues: Born digital vs. analog; formats & conversion
Some data types:
Text (plain & other)
Other data, e.g. social media
- *Data cleaning (incl. *RegEx)
Metadata: keeping track of data
Readings: Digitization and metadata: http://toolingup.stanford.edu/?page_id=123
Deegan and Tanner, Conversion of primary sources (Ch 32 of http://www.digitalhumanities.org/companion/ )
Software: Make sure your laptop has a text editor (e.g. Notepad, TextEdit), and get familiar with it. Simple data cleaning: dragging MS Word text through TextEdit or Notepad: http://www2.le.ac.uk/webcentre/plone/build/other/cleaning
More powerful data cleaning: Data Wrangler http://vis.stanford.edu/wrangler/
(Data cleaning can also be done with e.g. Perl, Python, or R – great but beyond the scope of this class.)
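For the curious, here is a rough idea of what scripted data cleaning with regular expressions can look like. This is a minimal sketch in Python, offered as illustration only and not as part of the course toolset: the function name clean_text and the sample string are invented for the example, and the particular cleanup rules are assumptions about what a typical word-processor export might need.

```python
import re

def clean_text(raw: str) -> str:
    """Normalize some common debris left over from word-processor exports."""
    text = raw.replace("\u00a0", " ")                   # non-breaking spaces -> plain spaces
    text = re.sub(r"[\u201c\u201d\u201e]", '"', text)   # curly and low double quotes -> straight
    text = re.sub(r"[\u2018\u2019]", "'", text)         # curly single quotes -> straight
    # Rejoin words split across a line break. Caution: this also joins genuine
    # hyphenated compounds that happen to break at the hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    text = re.sub(r"[ \t]+", " ", text)                 # collapse runs of spaces and tabs
    text = re.sub(r" ?\n ?", "\n", text)                # trim spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)              # at most one blank line in a row
    return text.strip()

# Invented sample: curly quotes, a non-breaking space, a hyphenated line break
sample = "\u201cDigital\u00a0 humani-\nties\u201d   is   fun.\n\n\n\nNext  paragraph. "
print(clean_text(sample))
```

Each re.sub line corresponds to one search-and-replace pattern that could just as well be typed into the find/replace dialog of a regex-aware text editor; the script simply applies them in a fixed order.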
Assignment: In your individual blog, discuss the type(s) of data you have and data capture issues you face with these data. If you do not yet have data, get a small set that interests you (from any source).
(18 February President’s Day holiday – class moved to Wednesday)
20 February (Wednesday) 11:45am-1:45pm (due to President’s Day holiday)
Methods: Exploration and analysis (part 1)
- Parallel texts (similarity comparisons, e.g. *Juxta, http://juxtacommons.org/ , *The Versioning Machine, http://v-machine.org/ )
- Diachronic analysis (e.g. *NGrams): http://books.google.com/ngrams
- Social network analysis, e.g. *Voyant http://taporware.ualberta.ca
Scatter plot, e.g. Voyant (direct link: http://voyant-tools.org/)
[NB – we will come back to the exploration and analysis of images in April]
Readings: Overview of Text Analysis (Stanford U, Tooling Up for the Digital Humanities)
plus one of the following (your choice):
- A straight-up survey in historical context (in more detail than Stanford’s Overview) by John Burrows, Textual Analysis (Ch 23 of http://www.digitalhumanities.org/companion)
- For a more critical-reflexive and more advanced take, see Ted Underwood on text mining: http://tedunderwood.com/2012/08/14/where-to-start-with-text-mining/
[brief jump ahead to 6. Methods: Spatial Analysis]
25 February – Introduction to GIS (Geographic Information Systems)
Guest instructor: Frank Donnelly, Geospatial Librarian, Baruch College CUNY
(Want more? Sign up for a full-day GIS workshop on March 8th & April 26th, only $30)
Readings: USGS GIS intro poster: http://egsc.usgs.gov/isb/pubs/gis_poster/
Assignment: (1) Browse the website for open-source GIS software *QGIS and blog about the tool and today’s tutorial. (2) Obtain an open-source text (or one you hold copyright to) in .txt format and bring it on your laptop to the 4 March class.
4 March (class runs 11:45am-1:15pm on this day; office hours Monday 9-10:30am only this week)
Document analysis: literary, linguistic, and historical text analysis
The specifics of Annotation (=Markup)
Presentational vs. Descriptive markup
*TEI and non-TEI based text annotation
Readings: Text Encoding, Ch. 17 of http://www.digitalhumanities.org/companion/ , by Allen Renear;
An even gentler introduction to XML: http://dh.obdurodon.org/what-is-xml.xhtml , by David Birnbaum; and the following as reference only:
TEI P5 guidelines: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index-toc.html
Software: <oXygen/> (free 30-day trial): www.oxygenxml.com/download.html
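To make the presentational vs. descriptive distinction concrete: presentational markup would simply put a book title in italics, whereas descriptive markup says that the string is a title and leaves the display for later. The sketch below, in Python, shows one payoff of the descriptive approach, namely that the text can be queried by what things are. It is illustrative only; the fragment is an invented TEI-style example (element names follow TEI usage, but the TEI namespace and header are omitted for brevity), and Python is not a tool used in this course.

```python
import xml.etree.ElementTree as ET

# Invented TEI-style fragment; a real TEI file would declare the TEI namespace.
fragment = """<p>
  <persName>Agatha Christie</persName> published
  <title>The Mysterious Affair at Styles</title> in
  <date when="1920">1920</date>.
</p>"""

root = ET.fromstring(fragment)

# Because the markup describes what each string is, we can ask for it by kind.
people = [el.text for el in root.iter("persName")]
titles = [el.text for el in root.iter("title")]
years = [el.get("when") for el in root.iter("date")]

# The plain reading text is still recoverable by dropping the tags
# and normalizing whitespace.
plain = " ".join("".join(root.itertext()).split())

print(people, titles, years)
print(plain)
```

The same annotated file could feed an index of persons, a bibliography, or a timeline without being re-edited, which is the main argument for descriptive markup.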
Methods: Getting the most out of your texts: XSLT
Guest instructor: David J. Birnbaum, Digital Humanist, University of Pittsburgh
*XSLT – XML stylesheets to allow different views of your texts
Readings: XSLT: http://dh.obdurodon.org/xslt_basics.html
Methods: Exploration and analysis (part 2)
Life-tracking (Lifestream, Lifelogging etc) – demo only
Concordancing – e.g. *TextStat
Trends tracking of various sorts (*Voyant, Taporware or *Textalyser)
Keyword density (vocab. richness e.g. Agatha Christie)
Term frequency (tf-idf)
Readings: Lancashire & Hirst 2009, Vocabulary Changes in Agatha Christie’s Mysteries as an Indication of Dementia: A Case Study. Online: http://ftp.cs.toronto.edu/pub/gh/Lancashire+Hirst-extabs-2009.pdf
Software: TextStat http://neon.niederlandistik.fu-berlin.de/en/textstat/
Keyword density via Voyant (above) or Textalyser http://textalyser.net/ or
Assignment: If we were to do Lancashire and Hirst’s study today, what other tools might we use? What are the advantages and disadvantages of such tools?
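As a starting point for thinking about this week's measures, here is a minimal sketch in Python of two of them: a simple vocabulary-richness score (type-token ratio, i.e. distinct words divided by total words) and tf-idf (a term's relative frequency in one text, multiplied by the log of the number of texts divided by the number of texts containing the term). This is an illustration under stated assumptions, not the method of the assigned reading: the function names, the crude tokenizer, and the two sample "texts" are all invented for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(tokens):
    """Distinct words / total words. Sensitive to text length, so compare
    samples of equal size."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def tf_idf(docs):
    """docs maps a name to a token list; returns name -> {term: score}."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))      # count each term once per document
    scores = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[name] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores

# Invented sample sentences, not quotations from any novel
texts = {
    "early": "the detective examined the curious letter and the torn envelope",
    "late": "the thing was the thing and the thing was there",
}
docs = {name: tokenize(text) for name, text in texts.items()}
scores = tf_idf(docs)

for name, tokens in docs.items():
    top_term = max(scores[name], key=scores[name].get)
    print(name, round(type_token_ratio(tokens), 2), top_term)
```

Words that occur in every text (here "the" and "and") get a tf-idf score of zero, which is why the measure is useful for finding what is distinctive about one text within a collection.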
23 March (Sat.)-2 April (Tues.) CUNY Spring Break
Methods: Spatial Analysis (part 2)
8 April Accessible mapping tools (both GIS and non-GIS)
Guest instructor: Steve Romalewski (CUNY GC Mapping Service Director)
Browsings (not readings this week):
Online mapping tools:
Some interesting visualizations:
- ORBIS: The Stanford Geospatial Network Model of the Roman World (click the “Mapping ORBIS” tab)
- Our GC Interactive map of Census race/ethnicity data: (also featured in the latest GC Folio)
- National Center for Educational Statistics demographics map
- NYC in the 1940s (using maps as an entry point for data, photos, etc.): www.1940snewyork.com
Steve’s GIS Resources page for his class at Pratt Institute
- Simple visualization tools like *Many Eyes http://www-958.ibm.com/software/data/cognos/manyeyes/
- Timeline, e.g. *TimelineJS: http://timeline.verite.co
Sign up for Blog/Presentation consultations on 22 April
Readings: Data visualization overview: http://toolingup.stanford.edu/?page_id=1247
John Theibault, Visualizations and Historical Arguments (Spring 2012). Online: http://writinghistory.trincoll.edu/evidence/theibault-2012-spring/
Assignment: Prepare questions for Blog/Presentation consultations
22 April (class runs 11:45am-1:15pm on this day; office hours Monday 9-10:30am only this week)
- Blog and Presentation consultations (15-20 min appointments with the instructor)
- Peer advising and review of each other’s blogs
29 April Visualization redux: Images and Video
Guest instructor (not yet confirmed): Lev Manovich, CUNY GC, Software Studies
Readings: Data Visualization (all sub-entries): http://toolingup.stanford.edu/?page_id=1247
- Methods: Data management: How do I organize my data?
- Revisiting any method or tool that deserves more time
13 May Project presentations
10 minutes each.