Summer School 2020: Introduction to Social Data Science

Course content

The objective of this course is to learn how to gather, analyze and work with modern quantitative social science data. Increasingly, social data that capture how people behave and interact with each other are available online in new, challenging forms and formats. This opens up the possibility of gathering large amounts of interesting data to investigate existing theories and new phenomena, provided that the analyst has sufficient computer literacy and is aware of the promises and pitfalls of working with various types of data.

 

In addition to core computational concepts, the class exercises will focus on the following topics:

 

1. Gathering data: Learning how to collect and scrape data from websites as well as working with APIs.

2. Data manipulation tools: Learning how to go from unstructured data to a dataset ready for analysis. This includes importing, preprocessing, transforming and merging data from various sources.

3. Visualization tools: Learning best practices for visualizing data in different steps of a data analysis. Participants will learn how to visualize raw data as well as effective tools for communicating results from statistical models for broader audiences.

4. Prediction tools: Covering key implementations of statistical learning algorithms. Participants will learn how to apply and interpret these models in practice.
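As a rough illustration of topic 2, the sketch below goes from two small raw sources to one analysis-ready dataset using only the Python standard library. The data, the column names and the helpers `clean_income`, `load_rows` and `build_dataset` are hypothetical and are not taken from the course material. The course textbook Python for Data Analysis covers the pandas library, which offers more convenient tools for the same steps.

```python
import csv
import io

# Hypothetical raw inputs: a survey export and a region lookup table.
SURVEY_CSV = """id,region_code,income
1,A, 52000
2,B,n/a
3,A,61 500
4,C,48000
"""

REGIONS_CSV = """region_code,region_name
A,North
B,South
"""


def clean_income(raw):
    """Preprocess: return income as an int, or None when the value is missing."""
    text = raw.replace(" ", "")
    return int(text) if text.isdigit() else None


def load_rows(text):
    """Import: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def build_dataset(survey_text, regions_text):
    """Clean, transform and merge the two sources (left join on region_code)."""
    region_names = {
        row["region_code"]: row["region_name"] for row in load_rows(regions_text)
    }
    dataset = []
    for row in load_rows(survey_text):
        income = clean_income(row["income"])
        if income is None:
            continue  # drop rows with a missing income
        dataset.append({
            "id": int(row["id"]),
            # Merge: look up the region name, keeping unmatched rows.
            "region": region_names.get(row["region_code"], "Unknown"),
            # Transform: express income in thousands.
            "income_thousands": income / 1000,
        })
    return dataset


dataset = build_dataset(SURVEY_CSV, REGIONS_CSV)
for record in dataset:
    print(record)
```

Dropping rows with missing income is only one possible choice; whether to drop, impute or flag missing values depends on the research question.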

Education

MSc programme in Economics – elective course

The Danish BSc programme in Economics - elective course after the 2nd year

 

The course has changed name from "Social Data Science" to "Introduction to Social Data Science". The content, syllabus and type of exam are the same.

Learning outcome

After completing the course the student is expected to be able to:

 

Knowledge:

  • Understand use cases for different kinds of data (survey, web-based, experimental, administrative, etc.) to answer various questions in the social sciences.
  • Account for benefits and challenges of working with different kinds of social data.
  • Identify and account for strengths and weaknesses of linear statistical prediction algorithms and estimate these models in practice.
  • Discuss ethical challenges related to the use of different types of data.
  • Discuss how prediction tools relate to existing empirical tools within social sciences such as linear regression for inference.
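The contrast between prediction and inference in the last point can be sketched with a simple linear regression: for inference the estimated coefficients are of interest, while for prediction the model is judged by its error on data it was not fitted on. The example below is a minimal sketch on simulated data; the data-generating process, the 150/50 split and the helper names `fit_ols` and `mean_squared_error` are assumptions made for illustration and are not part of the course material.

```python
import random


def fit_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_error(xs, ys, intercept, slope):
    """Average squared prediction error of the fitted line on (xs, ys)."""
    errors = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Simulated data (an assumption): y = 2 + 0.5 * x + standard normal noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]

# Fit on the first 150 observations, evaluate on the remaining 50.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

intercept, slope = fit_ols(train_x, train_y)
test_mse = mean_squared_error(test_x, test_y, intercept, slope)

print(f"Inference view: estimated slope = {slope:.3f}")
print(f"Prediction view: out-of-sample MSE = {test_mse:.3f}")
```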

 

Skills:

  • Program in basic Python, and write and debug code.
  • Use data manipulation and data visualization to clean, transform, scrape, merge, visualize and analyze social data.
  • Generate new data by collecting and scraping web pages (import and export data from numerous sources) and work with data APIs.
  • Apply and interpret machine learning algorithms and models in practice.

 

Competences:

  • Independently master and apply computational methods to social data, in line with the state-of-the-art social science literature.
  • Present modern data science methods needed for working with computational social science and social data in practice.

In the first two weeks, the course will consist of lectures and exercises with problem solving. The lectures will focus on the broad topics covered in the course. In the exercise classes we will get our hands dirty and present data science methods needed for collecting and analyzing real-world data. Students should be aware that the exercise classes leave little time for learning how to code.

The third week of the summer school will consist of peer feedback, guidance and project writing.

The main textbooks are:

  • Python for Data Analysis, 2nd ed. (2017) by Wes McKinney
  • Python Machine Learning, 2nd ed. (2017) by Sebastian Raschka & Vahid Mirjalili
  • Bit by Bit: Social Research in the Digital Age by Matthew J. Salganik

 

A comprehensive reading list as well as detailed information about the course will be available on the course website soon. For last year’s reading list see:

https://abjer.github.io/sds/readings/

This course is available to students and practitioners who are interested in social data science.

Because the course builds on a wide range of techniques, we do not have any hard requirements, but students are expected to have an interest in at least one of the following: statistics, econometrics, linear algebra or a scripting language (we will focus on Python in this course).

Schedule:
3 hours of lectures, 9 AM to 12 noon, in weeks 33 and 34.
3 hours of exercises in the afternoon, 1 PM to 4 PM, in weeks 33 and 34.
Week 35: The students participate in peer feedback.
Week 35: From Monday 24 August to Wednesday 26 August (tbc), students can participate in group-wise meetings with the TAs for guidance on the project.

Timetable and venue:
The exact timetable and venue will be available from April 1st, 2020.

Press the link:
https://skema.ku.dk/ku2021/uk/module.htm
-Select Department: “2200-Økonomisk Institut” (and wait for a response)
-Select Module: “2200-B5-5F20; [Name of course]”
-Select Report Type: "List - Week Days"
-Select Period: “Efterår/Autumn – Week 31-5”
Press: “View Timetable”

Written
Oral
Individual

 

The students receive: 

  • Written feedback from assignments (correction and solution).
  • Written feedback from responses to quizzes.
  • Oral feedback and supervision sessions by TAs.
  • Feedback by their peers on the project assignment.
ECTS
7.5 ECTS
Type of assessment
Written assignment, 7 days
The exam is a project paper. The project can be written individually or in groups of 3 to 4 participants. Students can give peer feedback on each other's project assignments.
Please be aware of the rules for co-writing assignments in groups as stated in the curriculum. The plagiarism rules must also be complied with. The project paper must be written in English.

The groups are randomly assigned at the beginning of the course.
____
Aid
All aids allowed

 

 

Marking scale
7-point grading scale
Censorship form
No external censorship for the written exam. The exam may be selected for external censorship by random check.
____
Criteria for exam assessment

Students are assessed on the extent to which they master the learning outcome for the course.

 

To receive the top grade, the student must demonstrate an excellent performance, with no or only a few minor weaknesses, displaying a high level of command of all aspects of the relevant material, and must be able to make use of the knowledge, skills and competencies listed in the learning outcomes.

Single subject courses (day)

  • Category: Hours
  • Preparation: 106
  • Lectures: 30
  • Class Instruction: 30
  • Project work: 40
  • Total: 206