Using R for Practical Research and Data Visualisation

This course is intended for people who analyse data in applied settings. It examines questions that arise in public policy, politics and industry using real data, including voter surveys; housing, unemployment and prison data; and surveys on water use in Bangladesh. The unit will help build participants’ ability to undertake basic statistical analysis (including means, confidence intervals and linear regression) in R and to create publication-standard graphs of the results. The end result will be analyses of data that are more professional and easier to understand.

 
Level 2 - runs over 5 days
Instructor: 

Dr Shaun Ratcliff is a quantitative political scientist working at the United States Studies Centre at the University of Sydney. His research focuses on using traditional and novel data sources and methods to study public opinion and party behaviour in the US, Australia and comparative democracies. He is particularly interested in the policy preferences and behaviour of political actors, the role of parties as interest aggregators, and how these influence public policy outcomes.

He teaches voter behaviour and public opinion, and the use of quantitative research methods to solve problems in the social sciences.

Prior to working at the University of Sydney, Shaun taught politics, political psychology and methodology in the social sciences at Monash University and the University of Melbourne.

He is an advocate for the use of quantitative research methods to better understand politics and society, and is a member of the executive committee of the Australian Society for Quantitative Political Science.

Shaun received a PhD in political science from Monash University. He has a background in politics and government relations, and has consulted for political campaigns.

Course dates: Monday 6 July 2015 - Friday 10 July 2015
Course status: Course completed (no new applicants)
Week: 
Week 2
About this course: 

R is free and open source, so no licences are required. It is flexible, powerful and intuitive, and it is excellent for data visualisation. Because it is open source, R has thousands of developers in leading universities, corporate research labs and other institutions across the world. This means its capabilities tend to exceed those of competing software, with new packages added all the time. Because there is no licence, you can take it with you wherever you go and do not have to change software when you change employers. As a result, R has become increasingly popular for academic research, economic analysis and public policy development. This trend is likely to continue. Becoming conversant in R will help build your personal capabilities and employment opportunities by making you a more flexible worker, capable of undertaking analysis that many other researchers and analysts cannot.

 

No prior experience with R, regression or any sophisticated quantitative methods is required for this course. Participants should use data in their occupations (or in their studies, if they are students) and understand some of the basics (for instance, what the mean, the median and the standard deviation are).

 

This is a course for subject matter experts who want to use more quantitative analysis in their work. By the end of the week you will be able to better conduct basic descriptive analysis and simple linear regression in R, and will be able to graph that analysis in a way that looks professional and is easy to understand.

Course syllabus: 

Day 1: Getting started

R is excellent for conducting simple yet effective analyses of data. In particular, it is useful in plotting descriptive data such as trends in unemployment, oil prices and crime rates. We will look at plotting data to provide you with methods you can use in your work or your research.

The course starts with instructions on how to access data in R, calculate descriptive statistics and plot the results. We will then cover graphing means and variance so you can better understand the structure of your data. The first day of the unit will also cover how to plot the trends of multiple indicators (for instance, several economic indicators such as unemployment, inflation, consumer confidence, oil prices, building starts, job vacancies and business investment) in a way that looks professional and sophisticated, with just a few lines of code.
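As a rough illustration of the kind of workflow described for Day 1, the sketch below calculates descriptive statistics for several indicators and plots their trends on one set of axes. It is not taken from the course materials: it uses EuStockMarkets, a dataset bundled with R, as a stand-in for the economic indicators mentioned above, and base R graphics only.

```r
# Illustrative only: EuStockMarkets ships with R and is NOT the course data.
data(EuStockMarkets)
indices <- as.data.frame(EuStockMarkets)

# Descriptive statistics for each indicator (one row per series)
desc <- data.frame(
  mean   = sapply(indices, mean),
  median = sapply(indices, median),
  sd     = sapply(indices, sd)
)
print(round(desc, 1))

# Plot the trend of every series on a shared set of axes
years <- as.numeric(time(EuStockMarkets))
matplot(years, as.matrix(indices), type = "l", lty = 1, col = 1:4,
        xlab = "Year", ylab = "Closing price",
        main = "Daily closing prices of four European stock indices")
legend("topleft", legend = colnames(indices), col = 1:4, lty = 1, bty = "n")
```

The same pattern should carry over to any data frame with one column per indicator.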

Day 2: Graphing probabilities

Starting with the descriptive statistics above, how can we add to this analysis? It would help to understand how much certainty we have in these estimates. This can be done by calculating confidence intervals. On the second day we will estimate confidence intervals and probabilities for real-life problems and graph the results. You will learn how to create plots with point estimates and 95% confidence intervals that communicate your findings, making it easier to publish your work in a way your audience will clearly understand.
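A minimal sketch of a point-estimate plot with 95% confidence intervals is given below. It assumes a t-based interval for each group mean, which may differ from the approach taught in the course. The helper mean_ci is a hypothetical name introduced for this example, and ToothGrowth is a dataset bundled with R rather than course data.

```r
# Illustrative only: ToothGrowth ships with R and is NOT the course data.
data(ToothGrowth)

# Hypothetical helper: mean and t-based confidence interval for a numeric vector
mean_ci <- function(x, level = 0.95) {
  n    <- length(x)
  se   <- sd(x) / sqrt(n)                      # standard error of the mean
  crit <- qt(1 - (1 - level) / 2, df = n - 1)  # critical value of the t distribution
  c(estimate = mean(x), lower = mean(x) - crit * se, upper = mean(x) + crit * se)
}

# One row per dose group, with columns estimate, lower and upper
groups <- split(ToothGrowth$len, ToothGrowth$dose)
est    <- t(sapply(groups, mean_ci))

# Points for the estimates, vertical segments for the intervals
x_pos <- seq_len(nrow(est))
plot(x_pos, est[, "estimate"], pch = 19, xaxt = "n", ylim = range(est),
     xlab = "Dose (mg/day)", ylab = "Mean tooth length",
     main = "Group means with 95% confidence intervals")
axis(1, at = x_pos, labels = rownames(est))
segments(x_pos, est[, "lower"], x_pos, est[, "upper"])
```

For a single group, the interval computed here should match the one reported by t.test().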

Day 3: Understanding elections and well-switching in Bangladesh

Some problems we may want to explore in the course of our research or work have multiple dimensions. For instance, politics can be seen as having two dimensions (economic and social issues). To examine multiple dimensions we will create scatter plots that reveal the structure of our data. On the third day of the course, we will also explore some more complex issues, such as voting behaviour among different demographic groups and surveys of community behaviour relating to access to clean water in Bangladesh.

Day 4: Regression

Sometimes you need to do more than look at the descriptive data. There may be confounding factors, such as the effects of economic, political and demographic factors on policy outcomes. We can control for these and learn far more from our data using simple linear regression. On day four we will fit linear regression models to the data examined in the previous sessions, to provide a more thorough analysis of those problems. We will also graph the regression coefficients to make our results clearly understandable.
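The sketch below shows one way to fit a linear regression in R and graph its coefficients with 95% confidence intervals. It is an assumption-laden illustration rather than the course's own code: mtcars is a dataset bundled with R, and the choice of mpg, wt and hp as variables is arbitrary.

```r
# Illustrative only: mtcars ships with R and is NOT the course data.
data(mtcars)

# Fuel economy regressed on weight, controlling for horsepower
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient estimates alongside their 95% confidence intervals
coefs <- cbind(estimate = coef(fit), confint(fit, level = 0.95))
coefs <- coefs[-1, , drop = FALSE]  # drop the intercept for plotting

# Coefficient plot: points for estimates, horizontal segments for intervals
y_pos <- seq_len(nrow(coefs))
plot(coefs[, "estimate"], y_pos, pch = 19, yaxt = "n",
     xlim = range(c(coefs, 0)), ylim = c(0.5, nrow(coefs) + 0.5),
     xlab = "Coefficient estimate", ylab = "",
     main = "Regression coefficients with 95% confidence intervals")
axis(2, at = y_pos, labels = rownames(coefs), las = 1)
segments(coefs[, "2.5 %"], y_pos, coefs[, "97.5 %"], y_pos)
abline(v = 0, lty = 2)  # reference line at zero
```

Because the two predictors are measured on different scales, their coefficients are not directly comparable in size. Standardising the predictors before fitting is one common way to address this.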

Day 5: Bringing it all together

On the final day we will explore some slightly more complex regression models and look at graphing estimates and predictions from the model outputs. One-on-one consultations will be undertaken in the afternoon to go over specific parts of the course participants want further advice on.
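As a hedged example of graphing predictions from a model, the sketch below fits a regression with a squared term and plots the fitted curve with a 95% confidence band. The model form and the cars dataset (bundled with R) are assumptions chosen for illustration, not the models or data used in the course.

```r
# Illustrative only: cars ships with R and is NOT the course data.
data(cars)

# A slightly more flexible model: stopping distance as a quadratic in speed
fit <- lm(dist ~ speed + I(speed^2), data = cars)

# Predictions and 95% confidence bounds over a grid of speeds
grid <- data.frame(speed = seq(min(cars$speed), max(cars$speed), length.out = 50))
pred <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)

# Observed data, fitted curve and dashed confidence band
plot(cars$speed, cars$dist, pch = 19, col = "grey50",
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)",
     main = "Predicted stopping distance with 95% confidence band")
lines(grid$speed, pred[, "fit"], lwd = 2)
lines(grid$speed, pred[, "lwr"], lty = 2)
lines(grid$speed, pred[, "upr"], lty = 2)
```

Setting interval = "prediction" instead gives a wider band that reflects uncertainty about individual observations rather than about the mean.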

 

Course format: 

This course will be held in a classroom. Participants will need a laptop with R installed. ACSPRI staff and the course instructor will be able to help with this in the weeks leading up to the course.

Recommended Background: 

The ACSPRI course Fundamentals of Statistics, or an equivalent basic understanding of statistical analysis.

Course fees
Member: 
$1,870
Non Member: 
$3,485
Full-time student Member: 
$1,870
Program: 
Winter Program 2015
Notes: 

Data and course notes will be provided.