A gentle introduction to Python for linguists

V4Py Summer School

David Lukeš

June 24th–28th, 2019

Introductions

About me

Python gives you wings!

[Figure: XKCD comic no. 353, “Python”. Credit: Randall Munroe, XKCD, https://xkcd.com/353/]

About you

  • Who has programmed before? In what language(s)? Python?
  • What’s your academic field? Linguistics, history, digital humanities…?
  • Who is reasonably familiar with working with language data on a computer (e.g. corpora)?
  • Who knows what regular expressions are? Who uses them?
  • What are you hoping to learn this week?

About the course

Python: https://www.python.org/

Python

  • a simple, fun and approachable programming language
  • FLOSS (Free/Libre and Open-Source Software), as opposed to proprietary software such as Microsoft Word
  • created in 1991 by Guido van Rossum
  • why is it named Python?

Using Python

NLTK Book: http://www.nltk.org/book/

[Figure: NLTK Book]

The NLP pipeline

[Figure: the NLP pipeline]

What we’ll cover

  • Python basics (functions, control flow, collections)
  • The NLTK package & book as a good starting point for people interested in language data
  • How text is represented inside computers
  • Regular expressions in Python
  • Accessing web services (“REST APIs”) from Python and automatic annotation of language data (tagging, parsing), both courtesy of Rudolf Rosa
  • Getting data into Python (raw text & tabular data)
  • Some visualizations (dispersion plots, wordclouds)
  • Case studies: collocation strength, keyword analysis
  • Hackathon on Friday!
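
To give a first taste of how some of these topics fit together, here is a minimal sketch that combines a function, a collection and a regular expression to count word frequencies. It uses only the standard library (not NLTK). The function name count_words and the sample sentence are made up for illustration, and the \w+ pattern is a deliberately crude stand-in for proper tokenization.

```python
import re
from collections import Counter


def count_words(text):
    """Return a Counter mapping lowercased word tokens to their frequency."""
    # \w+ matches runs of letters, digits and underscores; a rough tokenizer.
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


# Hypothetical sample text, invented for this example.
sample = "The cat sat on the mat. The mat was flat."
freqs = count_words(sample)

# Print the three most frequent word forms with their counts.
for word, count in freqs.most_common(3):
    print(word, count)
```

Real language data usually needs more careful tokenization than this, which is one of the places where a package like NLTK helps.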