University of Technology Sydney

32130 Fundamentals of Data Analytics

Warning: The information on this page is indicative. The subject outline for a particular session, location and mode of offering is the authoritative source of all information about the subject for that offering. Required texts, recommended texts and references in particular are likely to change. Students will be provided with a subject outline once they enrol in the subject.

Subject handbook information prior to 2024 is available in the Archives.

UTS: Information Technology: Computer Science
Credit points: 6 cp

Subject level: Postgraduate

Result type: Grade and marks

Anti-requisite(s): 31250 Introduction to Data Analytics

Description

Data analytics is the art and science of teasing meaningful information and patterns out of large quantities of data. It combines statistical methods for identifying patterns in data and making inferences with a number of IT technologies, including database technologies for handling massive volumes of data, intelligent and smart systems technologies, visualisation and other multimedia techniques that appeal to human pattern discovery capabilities. The subject offers a broad background to data analytics and data analytics methods and their application in practice. It brings together the state-of-the-art research and practice in related areas and provides students with the necessary knowledge and capacity to initiate and lead data analytics projects that can turn company data into commercially valuable information.

Subject learning objectives (SLOs)

Upon successful completion of this subject students should be able to:

1. Identify skills and attributes required for the understanding and application of data analytics in industry. (D.1)
2. Describe the methods involved in data analytics, their scope and limitations. (D.1)
3. Design a data analytics project in a business environment. (C.1)
4. Apply data analytics methods for descriptive and predictive analytics tasks in a business environment. (D.1)

Course intended learning outcomes (CILOs)

This subject also contributes specifically to the development of the following Course Intended Learning Outcomes (CILOs):

  • Design Oriented: FEIT graduates apply problem solving, design thinking and decision-making methodologies in new contexts or to novel problems, to explore, test, analyse and synthesise complex ideas, theories or concepts. (C.1)
  • Technically Proficient: FEIT graduates apply theoretical, conceptual, software and physical tools and advanced discipline knowledge to research, evaluate and predict future performance of systems characterised by complexity. (D.1)

Teaching and learning strategies

Subject presentation includes combined lecture and workshop sessions and practical data analytics tasks for the assignments. Students will need to undertake preparation using material on Canvas to make effective use of their workshop time. Lectures will present the theoretical aspects of data analytics, including guest lectures about case studies of real-world business applications of data mining techniques. The workshop sessions focus on hands-on experience in data analytics and data analytics tools, and understanding and interpretation of the results. Practical assignments can be performed anywhere.

Preparation will help students to participate in the in-class individual and group exercises. Regular zero-mark quizzes throughout the semester will allow students to gauge their progress.

Content (topics)

The subject will cover topics from the following:

  1. Introduction to data analytics: problems; data analytics concepts, types of data that we collect, the data mining and knowledge discovery process (CRISP-DM methodology), differences between data analytics and knowledge discovery, what can be discovered, overview of application areas, the data analytics professional.
  2. Data pre-processing and transformation: problems; small and large data sets; missing data and dealing with it; noisy data and sampling; techniques for data cleaning.
  3. Visual data exploration and analytics: data visualisation techniques and their applicability in data analytics, visual data analytics methods.
  4. Clustering: problems for cluster analysis; partitioning methods, hierarchical methods; k-means and related methods.
  5. Classification and prediction: problems for classification and prediction; classification by decision tree induction; classification by support vector machine; ensemble methods and random forest; classification accuracy; issues in prediction.

Assessment

Assessment task 1: Dream Jobs

Intent:

The intent is to disclose skill gaps.

Objective(s):

This assessment task addresses the following subject learning objectives (SLOs):

1

This assessment task contributes to the development of the following Course Intended Learning Outcomes (CILOs):

D.1

Type: Report
Groupwork: Individual
Weight: 15%
Length:

The task requires submission of a report of approx. 1000 words (2-3 pages in an 11 or 12 point font).

Assessment task 2: Data Exploration and Preparation

Intent:

The intent is to master basic data exploration and preparation skills for data analytics.

Objective(s):

This assessment task addresses the following subject learning objectives (SLOs):

2 and 4

This assessment task contributes to the development of the following Course Intended Learning Outcomes (CILOs):

D.1

Type: Report
Groupwork: Individual
Weight: 35%
Length:

A report of about 2500-3000 words (approx. 20 pages in an 11 or 12 point font).

Assessment task 3: Data Analytics in Action

Intent:

The intent is to utilise data analytics skills for problem-solving.

Objective(s):

This assessment task addresses the following subject learning objectives (SLOs):

3 and 4

This assessment task contributes to the development of the following Course Intended Learning Outcomes (CILOs):

C.1 and D.1

Type: Report
Groupwork: Individual
Weight: 50%
Length:

2000-3000 word (approx. 10-12 pages) report. Oral defence of around 5 minutes.

Minimum requirements

In order to pass the subject, a student must achieve an overall mark of 50% or more.

Recommended texts

  1. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, and Anuj Karpatne (2019). Introduction to Data Mining, Global Edition, 2nd edition, Pearson Higher Education. Available online in UTS library.
  2. https://www.knime.com/
  3. Wes McKinney (2022). Python for Data Analysis 3e - Data Wrangling with pandas, NumPy, and Jupyter, O'Reilly Media, Inc, USA. This is a practical textbook with comprehensive code projects in github: https://github.com/wesm/pydata-book.
  4. Graham Williams (2011). Data Mining with Rattle and R, Springer. This is a nice simple introduction to data mining using the R statistical language and Rattle, a package that sits on top of it.
  5. Margaret H. Dunham (2002). Data Mining: Introductory and Advanced Topics, Prentice Hall. The book offers the undergraduate Computing and IT student an introduction to the full spectrum of data mining concepts and algorithms in a comprehensive and consistent manner. The depth of coverage of each topic or method is exactly right and appropriate. Each algorithm is presented in pseudocode sufficient for any interested student to convert it into a working implementation.
  6. Han, J., Kamber, M., and Pei, J. (2012). Data Mining: Concepts and Techniques, third edition, Morgan Kaufmann. The book comes from an experienced database professional and also provides an introduction to the data mining concepts and techniques, but from a database perspective. The book provides details about data warehousing and OLAP techniques, examines algorithms, data structures, data types, and complexity of algorithms.
  7. Pyle, D. (1999). Data preparation for data mining, San Francisco, Calif.: Morgan Kaufmann Publishers. A key book on data pre-processing.
  8. Bramer, M. (2020). Principles of Data Mining, 4th edition.
  9. Witten, I. H., Frank, E., Hall, M., and Pal, C. (2017). Data Mining: Practical Machine Learning Tools and Techniques, fourth edition, Morgan Kaufmann. The book is a light, broad view of data mining. The book complements the WEKA toolkit used in the class.

References

The UTS Coursework Assessment Policy & Procedure Manual, at www.gsu.uts.edu.au/policies/coursewkassess.html

Other resources

Subject announcements, the topic discussion boards for the subject and other communication tools will be in UTS Canvas: https://canvas.uts.edu.au/.