Advanced Social Data Science II

Course content

The wealth of new data in the digital society is characterized by high-frequency observations at high granularity, allowing for both comprehensive and detailed analysis of social and individual behaviour. Digital messages, comments and conversations on social media have the potential to provide thick descriptions of social interactions and individual values in large-scale, sometimes population-level, settings. Likewise, the digitalization of large corpora of legal, administrative and political texts allows for dynamic analysis of evolving social ideas and issues. However, most digital data do not arrive in simple, accessible, quantifiable and comparable forms, but as text, sound and pictures. Advanced Social Data Science II focuses on unstructured data and on methods for processing, transforming and dealing with complex and high-dimensional data.

The course presents classic unsupervised learning methods for characterizing and developing typologies and categories of individual and social behaviour, networks and ideas. Furthermore, it introduces state-of-the-art methods of self-supervision and transfer learning for classifying complex unstructured data such as text and images, and relates such data-driven methods to existing theoretical methods and models, as well as quantitative and qualitative methods, in the social sciences.

Education

Mandatory course on the MSc programme in Social Data Science at the University of Copenhagen. The course is only open to students enrolled in the MSc programme in Social Data Science.

From Spring 2023, the course is also offered as an elective to Master's programme students at:

Department of Economics

Department of Sociology

Department of Political Science

Learning outcome

At the end of the course, students are able to:

 

Knowledge

  • Explain the differences between and capabilities of neural network architectures such as CNN, RNN, LSTM and attention-based models.
  • Account for various learning strategies, algorithms and approaches: clustering and unsupervised learning, supervised learning, semi-supervised learning, transfer learning and multi-task learning.
  • Account for the potential of different representations, encodings and transformations of text, both structured and unstructured.

 

Skills

  • Extract reliable information from text data using supervised learning and techniques from natural language processing.
  • Handle advanced matrix and tensor manipulation using a major deep learning framework (e.g. PyTorch, TensorFlow).
  • Apply state-of-the-art deep transfer learning to classify unstructured data.
  • Master computer vision methods to extract features from image data.

 

Competencies

  • Integrate theoretical and applied knowledge within the field of Social Data Science and formulate compelling research questions given an unstructured dataset.
  • Construct data sets for social science research from unstructured text and media data that are validated and well documented.
  • Independently carry out an end-to-end analysis given an unstructured dataset of text or images, including exploratory analysis and discovery using unsupervised methods and supervised learning for measurement, and assessment of model-based biases.
  • Critically evaluate the implications of results, considering model limitations and biases, and systematic noise introduced by data collection and sampling methods.
  • Communicate results to specialists within the academic field using comprehensive statistics and modern visualization methods, in particular for plotting new data types.

This class will be taught using a combination of lectures and hands-on lab exercises working with problem sets.

Examples of course readings:

 

- Bishop, Christopher: *Pattern Recognition and Machine Learning*. Springer, 2006.

- Cantu, Francisco & Michelle Torres: "Learning to See: Visual Analysis for Social Science Data".

- Gentzkow, M., Kelly, B. T., & Taddy, M. Text as Data. *Journal of Economic Literature*.

- Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. *Political Analysis*, 21(3), 267-297.

- Hastie, T., Tibshirani, R., & Friedman, J. (2008). *The Elements of Statistical Learning: Data Mining, Inference, and Prediction*.

- Jurafsky, Dan, and James H. Martin. *Speech and language processing*. Vol. 3. London: Pearson, 2014.

- Krippendorff, Klaus. *Content analysis: An introduction to its methodology*. Sage Publications, 2018.

Students are expected to be familiar with the basics of Python and Jupyter Notebooks.

Students at Social Data Science can only register for the exam for Advanced Social Data Science II if they have passed all compulsory courses in the first semester of the master's programme in Social Data Science.

Oral
Peer feedback (Students give each other feedback)
ECTS
7.5 ECTS
Type of assessment
Written assignment
Type of assessment details
Individual 72-hour take-home assignment.
Aid
All aids allowed
Marking scale
7-point grading scale
Censorship form
No external censorship
Criteria for exam assessment

The exam will be assessed on the basis of the learning outcome (knowledge, skills and competencies) for the course.

  • Category: Hours
  • Lectures: 28
  • Preparation: 112
  • Exercises: 42
  • Exam: 24
  • Total: 206

Course information

Language
English
Course number
ASDK20006U
ECTS
7.5 ECTS
Programme level
Full Degree Master
Full Degree Master choice
Duration

1 block

Placement
Block 4
Capacity
100 students.
Studyboard
Social Data Science
Contracting department
  • Social Data Science
  • Department of Political Science
  • Department of Sociology
  • Department of Economics
Contracting faculty
  • Faculty of Social Sciences
Course Coordinator
  • Frederik Georg Hjorth
Saved on the 07-11-2022
