Intro to R for Social Data Science

Course content

R is powerful and free for everyone to use. It has become one of the most widely used programming languages for statistical analysis in the social sciences and is therefore a highly sought-after skill among employers. The same is true of the emerging field of “Data Science”, which extends well beyond the social sciences.

This course will teach you how to do (social) data science with R: you will learn how to get your data into shape, transform and manipulate it, visualise it, and model it statistically. The course will also briefly introduce logistic regression and multilevel modelling. In addition to these skills, which are necessary for conducting classical statistics, you will learn some basic programming in R and how to do reproducible research and report your results using R Markdown. Note that this class presumes a solid background in basic statistics (i.e., descriptive statistics and multiple OLS regression).

Education

MA Research Methodology and Practice (MSc Curriculum 2015)

Course package (MSc 2015):

Welfare, inequality and mobility
Knowledge, organisation and politics
Culture, lifestyle and everyday life
 

Credit students must be at master's level.

Undergraduate (BA) exchange students from foreign countries can sign up for this course.

Learning outcome

Knowledge:

· R programming language

· RStudio

· Dynamic documents with R Markdown
 

Skills:

· Students will be able to conduct statistical analyses with R.

· Students will be able to manage and transform complex data with R.

· Students will be able to prepare presentations and reports with R Markdown.
 

Competences:

· Students will strengthen their analytical and logical reasoning.

· Students should be able to transform and manipulate data to prepare it for statistical analyses. They will be able to think about data in a less narrow way, because R is more flexible than other statistical programming languages.

· Students should be able to conduct their own research based on analyses for which they use R.

· Students should be able to prepare reproducible research reports and presentations with R Markdown.

 

Lectures, class assignments, student presentations, and a final paper consisting of an empirical analysis reported using R Markdown. Students are expected to contribute actively.

The course is largely based on: Wickham, H. & Grolemund, G. (2017): R for Data Science. O’Reilly. This book is freely available at: http://r4ds.had.co.nz/

 

Other useful books are:

Matloff, N. (2011): The Art of R Programming. No Starch Press

Teetor, P. (2011): R Cookbook. O’Reilly.

This course is not an introduction to statistics! I expect students to have a solid background in basic statistics. They should have a thorough understanding of linear regression (OLS) with dummy variable predictors and interaction terms. This is a prerequisite. If you lack this background, I suggest first taking my “Applied Multilevel Modeling” course in the fall semester.

Students will need to bring their own laptop.

Continuous feedback during the course of the semester

I give structured feedback on student presentations and the final paper. Solutions to the class assignments will be presented as well.

ECTS
7.5 ECTS
Type of assessment
Written assignment
Individual/group.
A written take-home essay is defined as an assignment that addresses one or more questions. The exam is based on the course syllabus, i.e. the literature set by the teacher.

The written take-home essay must be no longer than 10 pages. For group assignments, an extra 5 pages are added per additional student. Further details on this exam form can be found in the Curriculum and in the General Guide to Examinations on KUnet.
Marking scale
7-point grading scale
Form of external examination
No external examiner
Criteria for exam assessment

See learning outcome.

Workload (category: hours)
  • Lectures: 28
  • Preparation: 118
  • Exam: 60
  • Total: 206