University of Technology Sydney

42046 Data Processing Using R

Warning: The information on this page is indicative. The subject outline for a particular session, location and mode of offering is the authoritative source of all information about the subject for that offering. Required texts, recommended texts and references in particular are likely to change. Students will be provided with a subject outline once they enrol in the subject.


UTS: Information Technology: Computer Science
Credit points: 3 cp

Subject level: Postgraduate

Result type: Grade and marks

There are course requisites for this subject. See access conditions.

Description

R is a popular high-level programming language for data analytics, statistics and interactive visualisation. It is an interpreted, interactive scripting language with support for object-oriented programming. R’s built-in high-level data structures make it suitable for rapid application development, and its functional programming features make it very expressive. It is widely used in statistics, scientific and numeric computing, education and software development. This subject covers the basics of using R, including programming constructs, packages and object-oriented programming. It also introduces useful R procedures for data analytics.

Subject learning objectives (SLOs)

Upon successful completion of this subject students should be able to:

1. Use the R language to write custom programs for analysing data
2. Manipulate data for analysis and visualisation using extensible and transferable approaches
3. Apply data visualisation and analytical techniques to summarise and analyse datasets

Course intended learning outcomes (CILOs)

This subject also contributes specifically to the development of the following Course Intended Learning Outcomes (CILOs):

  • Design Oriented: FEIT graduates apply problem solving, design thinking and decision-making methodologies in new contexts or to novel problems, to explore, test, analyse and synthesise complex ideas, theories or concepts. (C.1)
  • Technically Proficient: FEIT graduates apply theoretical, conceptual, software and physical tools and advanced discipline knowledge to research, evaluate and predict future performance of systems characterised by complexity. (D.1)

Teaching and learning strategies

This subject is delivered in four intensive collaborative sessions with a focus on hands-on tutorials, in which students both learn about and immediately practise programming techniques for data analysis. Within the interactive sessions, students will have several opportunities to test their ability to apply new skills to a problem chosen from a set of specified basic problems. These problems provide students with initial low-stakes feedback on their progress in the early weeks, and also form part of the assessment in later weeks.

The face-to-face sessions will be supported by several online collaborative sessions focused on solving remaining technical problems, discussing the design and efficiency of chosen programs, and maintaining peer-to-peer collaboration within the cohort.

Finally, students will reflect on the techniques they use and how these interrelate as they apply their skills to a more open project, in which they will both create a dataset and work with existing data to solve a business analytics problem that they define.

Content (topics)

  • Basic data import
  • Data types and basic array operations
  • Data frames and manipulating data
  • Factors and tabulation
  • Managing datasets and summarising data
  • Linear regression
  • Analysis of variance
  • Plotting data with ggplot2
  • Multivariate data visualisation
  • Writing functions and scripts
  • Reading data from online sources, and parsing data from untidy data files
  • Multivariate plotting for exploratory data analysis
  • Building interactive graphics and interactive dashboards
  • Reproducible research
  • Fitting statistical and machine learning models
  • Time-series statistical techniques
  • Object-oriented programming
  • R packages
  • Functional programming in R

Assessment

Assessment task 1: Interactive Learning Journal

Intent:

To ensure that the student has a firm understanding of R programming basics, which will facilitate the learning of advanced topics.

Objective(s):

This assessment task addresses the following subject learning objectives (SLOs):

1, 2 and 3

This assessment task contributes to the development of the following Course Intended Learning Outcomes (CILOs):

C.1 and D.1

Type: Journal
Groupwork: Individual
Weight: 60%
Length:

Approximately 4-6 pages per problem, for a total of 28-42 pages. The submission should be the HTML output of an R Markdown file, submitted with the code chunks included.

Assessment task 2: Assignment

Intent:

To ensure the student can design, implement and execute a data processing task independently using the R language.

Objective(s):

This assessment task addresses the following subject learning objectives (SLOs):

1, 2 and 3

This assessment task contributes to the development of the following Course Intended Learning Outcomes (CILOs):

C.1 and D.1

Type: Project
Groupwork: Individual
Weight: 40%
Length:

Online format of about 3,000 words, including at least six visualisations or appropriate analytics outputs.

Minimum requirements

To pass this subject, students must achieve an overall mark of 50% or greater.