Summerschool 2017: Social Data Science

Course content

The objective of this course is to learn how to gather, analyze and work with modern quantitative social science data. Increasingly, social data that capture how people behave and interact with each other are available online in new, challenging forms and formats. This opens up the possibility of gathering large amounts of interesting data to investigate existing theories and new phenomena, provided that the analyst has sufficient computer literacy and is aware of the promises and pitfalls of working with various types of data.


BSc programme in Economics - summer school after the 2nd year

MSc programme in Economics – elective course

Learning outcome

We will introduce students to the state-of-the-art social science literature using computational methods and social data.

We will present students with an overview of key benefits and challenges of working with different kinds of social data. We will show how various kinds of data (survey, web-based, experimental, administrative, etc.) can be used to answer different questions within the social sciences. Furthermore, we will discuss ethical challenges related to the use of different types of data.

We will introduce students to statistical techniques for prediction and classification, known as statistical learning, and we will discuss how these methods relate to existing empirical tools within economics such as causal inference and regression.
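As a minimal sketch of how a prediction exercise differs from an in-sample regression fit, the example below estimates a simple linear regression on a training sample and then evaluates it on held-out observations. The simulated data, the helper names `fit_ols` and `mse`, and the 150/50 split are assumptions made for illustration only; they are not taken from the course material.

```python
import random


def fit_ols(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mse(xs, ys, intercept, slope):
    """Mean squared prediction error of the fitted line on (xs, ys)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Simulated (hypothetical) data: y = 1 + 2x + noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

# Hold out the last 50 observations as a test set.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

intercept, slope = fit_ols(train_x, train_y)
train_mse = mse(train_x, train_y, intercept, slope)
test_mse = mse(test_x, test_y, intercept, slope)

print(f"intercept={intercept:.2f} slope={slope:.2f}")
print(f"in-sample MSE={train_mse:.2f} out-of-sample MSE={test_mse:.2f}")
```

In a regression setting the focus is typically on the estimated coefficients, whereas in a prediction setting the out-of-sample error is the quantity of interest.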

We will present modern data science methods needed for working with computational social science and social data in practice. Being an effective economist and data scientist means spending a large fraction of our time writing and debugging code. In this section you will learn how to write code to clean, transform, scrape, merge, visualize and analyze social data. In addition to core computational concepts, the class exercises will focus on the following topics:

1. Generating new data: We will learn how to collect and scrape data from websites as well as how to work with APIs.

2. Data manipulation tools: Participants will learn how to go from unstructured data to a dataset ready for analysis. This includes importing, preprocessing, transforming and merging data from various sources.

3. Visualization tools: We will learn best practices for visualizing data in different steps of a data analysis. Participants will learn how to visualize raw data as well as effective tools for communicating results from statistical models for broader audiences.

4. Reproducibility tools: We will cover key implementations of statistical learning algorithms and participants will learn how to apply and interpret these models in practice.
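Topics 1 and 2 above can be illustrated with a minimal sketch that uses only the Python standard library: it parses an HTML table, cleans the values, and merges the result with a second data source. The HTML snippet, the city names, the `SURVEY` records and all helper names are hypothetical and exist only for this illustration; in practice the page would be downloaded from a website and a library such as pandas would likely be used for the merge.

```python
from html.parser import HTMLParser

# Hypothetical HTML standing in for a downloaded web page.
SAMPLE_HTML = """
<table>
  <tr><th>city</th><th>population</th></tr>
  <tr><td> Alpha </td><td>1,200</td></tr>
  <tr><td>Beta</td><td>850</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current cell.
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Turn the first row into column names and the rest into records."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    header = [name.strip() for name in header]
    return [dict(zip(header, row)) for row in body]


def clean(records):
    """Strip whitespace and convert '1,200'-style strings to integers."""
    return [
        {"city": r["city"].strip(), "population": int(r["population"].replace(",", ""))}
        for r in records
    ]


def merge(left, right, key):
    """Inner join of two lists of dicts on a shared key."""
    lookup = {r[key]: r for r in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


# Hypothetical second source, e.g. records returned by an API.
SURVEY = [
    {"city": "Alpha", "respondents": 120},
    {"city": "Gamma", "respondents": 45},
]

scraped = clean(scrape_table(SAMPLE_HTML))
merged = merge(scraped, SURVEY, key="city")
print(merged)
```

Only cities present in both sources survive the inner join; choosing between inner and outer joins is one of the decisions that shapes the final dataset.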


After the course the student should:

- Have strong knowledge of the state-of-the-art social science literature using computational methods and social data.

- Have strong knowledge of the advantages and challenges of using different kinds of data to answer various questions in the social sciences.

- Have strong practical data science skills, such as the ability to scrape web pages and to import and export data from numerous sources, basic knowledge of functional programming, and effective data visualization skills.

- Have knowledge of widely used statistical prediction algorithms as well as the ability to estimate these models in practice.


The course will consist of lectures, exercises and problem solving. The lectures will focus on the broad topics covered in the course (parts 1-3 listed above). In the exercise classes we will get our hands dirty and present data science methods needed for collecting and analyzing real-world data. The exercise classes do not set aside much time for learning how to code. We will use some of this time like development meetings: going over assignments, having detailed code reviews of various forms, and discussing blocking issues and potential solutions.

A comprehensive reading list as well as detailed information about the course will be available on the course website soon. Reading list see:

The course builds on a wide range of techniques, so students are expected to have an interest in social data science and in at least one of the following: statistics, econometrics, linear algebra, or a scripting language (in the course it will be Python).

3 hours of lectures per day for 2 weeks (9:00 to 12:00), followed by guidance in the week where the students do project work.
3 hours of exercises in the afternoon (13:00 to 16:00).

Timetable and venue:
To see the time and location of the lectures please press the link:
-Select Department: “2200-Økonomisk Institut” (and wait for a response)
-Select Module: “2200-B5-5F17; [Name of course]”
-Select Report Type: “List – Week Days”
-Select Period: “Efterår/Autumn – Weeks 31-5”
-Press: “View Timetable”

7.5 ECTS
Type of assessment
Written assignment, 7 days
The exam is a 7-day project assignment written in groups of 3 to 4 participants.
All aids allowed



Marking scale
7-point grading scale
Co-examination form
External co-examiner
100% external co-examination
Criteria for exam assessment

Students are assessed on the extent to which they master the learning outcome for the course.

Single subject courses (day)

  • Category: Hours
  • Lectures: 30
  • Preparation: 106
  • Project work: 40
  • Class Instruction: 30
  • Total: 206