University of Technology Sydney

36106 Machine Learning Algorithms and Applications

Warning: The information on this page is indicative. The subject outline for a particular session, location and mode of offering is the authoritative source of all information about the subject for that offering. Required texts, recommended texts and references in particular are likely to change. Students will be provided with a subject outline once they enrol in the subject.

Subject handbook information prior to 2025 is available in the Archives.

UTS: Analytics and Data Science: TD School
Credit points: 8 cp

Subject level: Postgraduate

Result type: Grade, no marks

Requisite(s): 48 credit points of completed study in spk(s): C04379 Master of Business Analytics (Extension)
These requisites may not apply to students in certain courses.
There are course requisites for this subject. See access conditions.
Anti-requisite(s): 36113 Applied Data Science for Innovation AND 36114 Advanced Data Science for Innovation

Requisite elaboration/waiver:

Any student wishing to enrol in first- and second-year subjects concurrently must apply for a waiver.

Description

This subject introduces students to key machine learning algorithms and their application in real-world settings. Participants are guided in developing an intuitive understanding of how the algorithms work, as well as their strengths and weaknesses. In addition to gaining practical experience with the algorithms, students develop an understanding of the basic principles of machine learning and the connections between different algorithms. Additionally, they are exposed to industry standard methodologies for data mining and analytics via readings and assessments. Since data science problems are infused with assumptions, often with ethical and legal implications, due attention is given to questioning the assumptions behind data and approaches used to analyse it.

Subject learning objectives (SLOs)

Upon successful completion of this subject students should be able to:

1. Apply an industry standard analytics life cycle methodology for data mining and pattern discovery
2. Interpret, synthesise and communicate insights extracted from machine learning algorithms in a context-appropriate manner
3. Articulate the strengths, weaknesses and assumptions of a selection of machine learning algorithms in relation to structured and unstructured data
4. Execute and interpret machine learning approaches available for extracting value from data
5. Demonstrate an appreciation, with examples, of a critical, ethical perspective on decisions made throughout the analytics lifecycle

Course intended learning outcomes (CILOs)

This subject also contributes specifically to the development of the following course outcomes:

  • Making predictions and informing data discovery
    Analyse the value of different models, established assumptions and generalisations, about the behaviour of particular systems, for making predictions and informing data discovery investigations (1.3)
  • Exploring, interpreting and visualising data
    Explore, analyse, manipulate, interpret and visualise data using data science techniques, software and technologies to make sense of data rich environments (2.2)
  • Designing and managing data investigations
    Apply and assess data science concepts, theories, practices and tools for designing and managing data discovery investigations in professional environments that draw upon diverse data sources, including efforts to shed light on underrepresented components (2.4)
  • Developing strategies for innovation
    Explore, interrogate, generate, apply, test and evaluate problem-solving strategies to extract economic, business, social, strategic or other value from data (3.1)
  • Examining and articulating data value
    Critically examine the perceived value of data analytics outcomes and clearly articulate implications for different stakeholders and organisations (3.2)
  • Working together
    Develop a collaborative and team-oriented mindset to harness value for stakeholders to produce innovative solutions to challenges (3.3)
  • Engaging audiences
    Explore and craft interpretative narratives that engage key audiences with data analytics and potential significance for action, at societal, industrial, organisational, group or individual levels (4.2)
  • Informing decision making
    Develop, test, justify and deliver data project propositions, methodologies, analytics outcomes and recommendations for informing decision-making, both to specialist and non-specialist audiences (4.3)
  • Becoming a reflective data practitioner
    Engage in active, reflective practice that supports flexible navigation of assumptions, alternatives and uncertainty in professional data science contexts (5.1)

Contribution to the development of graduate attributes

Your experiences as a student in this subject support you to develop the following graduate attributes (GA):

GA 1 Sociotechnical systems thinking
GA 2 Creative, analytical and rigorous sense making
GA 3 Create value in problem solving and inquiry
GA 4 Persuasive and robust communication
GA 5 Ethical citizenship

Teaching and learning strategies

Blend of online and face to face activities: This subject is offered through a series of block sessions and blends online with face-to-face learning. Students participate in interactive learning experiences in timetabled on-campus sessions, where they make use of the subject materials that they have already engaged with online. In between campus sessions, students will engage in individual and collaborative online activities designed to support the understanding of the machine learning algorithms and their application in real-world settings.

Collaborative work: A strong emphasis is placed on group activities and interaction, given that graduates of this course will need to approach professional projects and challenges from a collaborative and consensus position. Insights obtained and developed within the groups are then reworked by individual students to develop the final summative assessment activity. Group assessments and activities enable students to leverage peer-learning and demonstrate effective skills associated with the topics covered in this subject.

Transdisciplinary approaches: Starting from an elemental perspective on data and data science, students will approach learning from their specific professional and potential future contexts. As the subject progresses, the students will be able to combine their analytical and technical skills in developing and applying various machine-learning algorithms, as well as to consider standards and ethical implications of their work.

An aim of this subject is to help you develop academic and professional language and communication skills in order to succeed at university and in the workplace. To determine your current academic language proficiency, one of the assessment tasks in this subject will be used to assess your level of academic English language. If you receive an unsatisfactory level for English language, you must attend follow-up language development activities in order to pass the subject. These activities are designed to support you to develop your language and communication skills. Students who do not attend 80% of the language development activities will receive a Fail X grade for the subject.

Assessment

Assessment task 1: Regression models

Intent:

Gain hands-on experience of building regression models using realistic datasets.

Objective(s):

1, 2, 3, 4 and 5

Type: Report
Groupwork: Individual
Weight: 30%
Length:

Deliverables:

  1. All Python code used to generate the model.
  2. A report articulating an understanding of the problem, the identification and breakdown of tasks relating to the solution process (as per CRISP-DM) with appropriate visualisations, as well as the technical choices made and the reasons for them. In addition to a detailed discussion of the results, the report should also contain a listing of the key assumptions and their implications.
Criteria:

Both parts of the assignment will be assessed by the following criteria (see assessment brief for details)

1. Quality, relevance and cleanliness of code and visualisation

2. Quality of scientific experimentation and analytic investigation

3. Depth of discussion of ethics/privacy issues (including matters related to Indigenous people), value, benefits, risks and recommendations for business stakeholders and final users

4. Strength of justification and explanation for features selected and model used

5. Clarity and quality of written report, data visualisation and appropriateness of communication style to audience

Assessment task 2: Building and interpreting a classification model

Intent:

This assignment is focused on classification modelling in detail.

Objective(s):

1, 2, 3, 4 and 5

Type: Report
Groupwork: Individual
Weight: 30%
Length:

Deliverables:

  1. All Python code used to generate the model.
  2. A comprehensive report articulating an understanding of the problem, the identification and breakdown of tasks relating to the solution process (as per CRISP-DM) as well as the technical choices made and the reasons for them. In addition to a detailed discussion of the results, the report should also contain a listing of the key assumptions and their implications. The appendix should list individual contributions to the projects.
Criteria:

Assignments will be assessed by the following criteria (see assessment brief for details)

1. Quality, relevance and cleanliness of code and visualisation

2. Pertinence and quality of scientific experimentation and analytic investigation

3. Depth of discussion of ethics/privacy issues (including matters related to Indigenous people), value, benefits, risks and recommendations for business stakeholders and final users

4. Strength of justification and explanation of models selected, data transformation performed, hyperparameters selected and accuracy of results with evidence supporting claims

5. Clarity and quality of written report, data visualisation and appropriateness of communication style to audience

Assessment task 3: End-to-end Data Science Project

Intent:

The intent of this assessment is to help students gain hands-on experience of a real-world project within a team of data scientists. To this end, students will prepare, process and analyse a provided dataset, interpret results and present insights as a report. They will also reflect on the choices and decisions they made in selecting appropriate solutions to the given problem.

Objective(s):

1, 2, 3, 4 and 5

Type: Report
Groupwork: Group, group and individually assessed
Weight: 40%
Length:

Deliverables:

  1. All Python code used to generate the model.
  2. A comprehensive report articulating an understanding of the problem, the identification and breakdown of tasks relating to the solution process (as per CRISP-DM) as well as the technical choices made and the reasons for them. In addition to a detailed discussion of the results, the report should also contain a listing of the key assumptions and their implications. The appendix should list individual contributions to the projects.
Criteria:

Assignments will be assessed by the following criteria (see assessment brief for details)

1. Quality, relevance and cleanliness of code and visualisation

2. Breadth of evidence of collaborative work (e.g. meeting minutes, details of contributions etc)

3. Depth of discussion of ethics/privacy issues (including matters related to Indigenous people), value, benefits, risks and recommendations for business stakeholders and final users

4. Soundness of justification for selected techniques for a given business problem and objectives

5. Clarity and accuracy of results achieved and supporting evidence provided

6. Clarity and quality of written report, data visualisation and appropriateness of communication style to audience

Minimum requirements

Students must participate in all online and face to face requirements, as well as complete assessment tasks.

Recommended texts

Hastie, T., Tibshirani, R. and Friedman, J. (2010), The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Second Edition), New York, NY: Springer-Verlag (More detailed and mathematically oriented than the textbook)

Géron, A. (2019), Hands-on machine learning with Scikit-Learn, Keras and TensorFlow: concepts, tools, and techniques to build intelligent systems (Second Edition), O'Reilly

So, A. (2020), The Data Science Workshop (Second Edition), Packt Publishing

References

Boyd, D., & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, communication & society, 15(5), 662-679.

Breiman, L. (2001). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical science, 16(3), 199-231.


Wu, X., Kumar, V., Quinlan, J. R., Ghosh, J., Yang, Q., Motoda, H., ... & Zhou, Z. H. (2008). Top 10 algorithms in data mining. Knowledge and information systems, 14(1), 1-37.


Kirkpatrick, K. (2016). Battling algorithmic bias: how do we ensure algorithms treat us fairly?. Communications of the ACM, 59(10), 16-17.


Goldman, E. (2005). Search engine bias and the demise of search engine utopianism. Yale Journal of Law & Tech., 8, 188.


Zarsky, T. (2016). The trouble with algorithmic decisions: An analytic road map to examine efficiency and fairness in automated and opaque decision making. Science, Technology, & Human Values, 41(1), 118-132.


Burrell, J. (2016). How the machine ‘thinks’: Understanding opacity in machine learning algorithms. Big Data & Society, 3(1), 1-12.

Note: Many of the above papers are available to UTS students via https://drr.lib.uts.edu.au/search.html?q=36106

Other resources

Flinders Guide - Appropriate Terminology, Representations and Protocols of Acknowledgement for Aboriginal and Torres Strait Islander Peoples.

Walter, M., Lovett, R., Maher, B., Williamson, B., Prehn, J., Bodkin-Andrews, G., & Lee, V. (2021). Indigenous data sovereignty in the era of big data and open data. Australian Journal of Social Issues, 56(2), 143-156.

Global Indigenous Data Alliance. (2022). ‘Indigenous Data Sovereignty and Governance.’

Global Indigenous Data Alliance: gida-global.org

United Nations Information on data collection and disaggregation for Indigenous peoples: https://www.un.org/development/desa/indigenouspeoples/mandated-areas1/data-and-indicators.html

ANU publications on Indigenous Data sovereignty: https://press.anu.edu.au/publications/series/caepr/indigenous-data-sovereignty

Lowitja Institute flyer on Indigenous Data Governance and Sovereignty: https://www.lowitja.org.au/icms_docs/328550_data-governance-and-sovereignty.pdf

Australian Institute of Aboriginal and Torres Strait Islander Studies (AIATSIS) publication on Delivering Indigenous Data Sovereignty: https://aiatsis.gov.au/publication/116530

AIATSIS Code of Ethics for Aboriginal and Torres Strait Islander Research

Working With Aboriginal Peoples and Communities

Lovett, R., Jones, R., & Maher, B. (2020). The intersection of indigenous data sovereignty and closing the gap policy in Australia. In Indigenous Data Sovereignty and Policy (pp. 36-50). Routledge.