University of Technology Sydney

36106 Machine Learning Algorithms and Applications

Warning: The information on this page is indicative. The subject outline for a particular session, location and mode of offering is the authoritative source of all information about the subject for that offering. Required texts, recommended texts and references in particular are likely to change. Students will be provided with a subject outline once they enrol in the subject.

Subject handbook information prior to 2024 is available in the Archives.

UTS: Analytics and Data Science: TD School
Credit points: 8 cp

Subject level: Postgraduate

Result type: Grade, no marks

There are course requisites for this subject. See access conditions.
Anti-requisite(s): 36113 Applied Data Science for Innovation AND 36114 Advanced Data Science for Innovation

Requisite elaboration/waiver:

Any student wishing to enrol in first- and second-year subjects concurrently must apply for a waiver.

Description

This subject introduces students to key machine learning algorithms and their application in real-world settings. Participants are guided in developing an intuitive understanding of how the algorithms work, as well as their strengths and weaknesses. In addition to gaining practical experience with the algorithms, students develop an understanding of the basic principles of machine learning and the connections between different algorithms. Additionally, they are exposed to industry standard methodologies for data mining and analytics via readings and assessments. Since data science problems are infused with assumptions, often with ethical and legal implications, due attention is given to questioning the assumptions behind data and approaches used to analyse it.

Subject learning objectives (SLOs)

Upon successful completion of this subject students should be able to:

1. Apply an industry standard analytics life cycle methodology for data mining and pattern discovery
2. Interpret, synthesise and communicate insights extracted from machine learning algorithms in a context-appropriate manner
3. Articulate the strengths, weaknesses and assumptions of a selection of machine learning algorithms in relation to structured and unstructured data
4. Execute and interpret machine learning approaches available for extracting value from data
5. Demonstrate an appreciation, with examples, of a critical, ethical perspective on decisions made throughout the analytics lifecycle

Course intended learning outcomes (CILOs)

This subject also contributes specifically to the development of the following course outcomes:

  • Understanding relationships & processes within systems
    Identify and represent the human and technical elements and processes within complex systems and organise them within frameworks of relationships (1.1)
  • Exploring and testing models and describing behaviours of complex systems
    Explore and test models and generalisations for describing the behaviour of sociotechnical systems and selecting data sources, taking into account the needs and values of different contexts and stakeholders (1.2)
  • Exploring, interpreting and visualising data
    Explore, analyse, manipulate, interpret and visualise data using data science techniques, software and technologies to make sense of data rich environments (2.2)
  • Designing and managing data investigations
    Apply and assess data science concepts, theories, practices and tools for designing and managing data discovery investigations in professional environments that draw upon diverse data sources, including efforts to shed light on underrepresented components (2.4)
  • Examining and articulating data value
    Critically examine the perceived value of data analytics outcomes and clearly articulate implications for different stakeholders and organisations (3.2)
  • Working together
    Develop a collaborative and team-oriented mindset to harness value for stakeholders to produce innovative solutions to challenges (3.3)
  • Developing communication skills
    Collaborate to develop and refine multimodal communication skills needed to successfully work in data science teams (4.1)
  • Engaging audiences
    Explore and craft interpretative narratives that engage key audiences with data analytics and potential significance for action, at a societal, industrial, organisational, group or individual levels (4.2)
  • Becoming a reflective data practitioner
    Engage in active, reflective practice that supports flexible navigation of assumptions, alternatives and uncertainty in professional data science contexts (5.1)
  • Embracing ethical responsibilities
    Interrogate and justify ethical responsibilities related to data selection, access, analysis and governance to create a framework for practice (5.2)

Contribution to the development of graduate attributes

Your experiences as a student in this subject support you to develop the following graduate attributes (GA):

GA 1 Sociotechnical systems thinking
GA 2 Creative, analytical and rigorous sense making
GA 3 Create value in problem solving and inquiry
GA 4 Persuasive and robust communication
GA 5 Ethical citizenship

Teaching and learning strategies

Blend of online and face-to-face activities: This subject is offered through a series of block sessions and blends online with face-to-face learning. Students participate in interactive learning experiences in timetabled on-campus sessions, where they make use of the subject materials that they have already engaged with online. Between on-campus sessions, students engage in individual and collaborative online activities designed to support their understanding of the machine learning algorithms and their application in real-world settings.

Collaborative work: A strong emphasis is placed on group activities and interaction, given that graduates of this course will need to approach professional projects and challenges from a collaborative and consensus position. Insights obtained and developed within the groups are then reworked by individual students to develop the final summative assessment activity. Group assessments and activities enable students to leverage peer learning and demonstrate effective skills associated with the topics covered in this subject.

Transdisciplinary approaches: Starting from an elemental perspective on data and data science, students will approach learning from their specific professional and potential future contexts. As the subject progresses, the students will be able to combine their analytical and technical skills in developing and applying various machine-learning algorithms, as well as to consider standards and ethical implications of their work.

Assessment

Assessment task 1: Regression models

Intent:

Gain hands-on experience of building regression models using realistic datasets.

Objective(s):

1, 2, 3, 4 and 5

Type: Report
Groupwork: Individual
Weight: 30%
Length:

Deliverables:

  1. All Python code used to generate the model.
  2. A report articulating an understanding of the problem, the identification and breakdown of tasks relating to the solution process (as per CRISP-DM) with appropriate visualisations, as well as the technical choices made and the reasons for them. In addition to a detailed discussion of the results, the report should also contain a listing of the key assumptions and their implications.
Criteria:

Both parts of the assignment will be assessed by the following criteria (see assessment brief for details)

  1. Quality of data exploration (visual + summary stats)
  2. Strength of justification for features selected and model used
  3. Quality of code
  4. Accuracy of results and evidence supporting claims
  5. Depth of discussion of ethics/privacy issues (including Indigenous people), value, benefits and recommendation for business

Assessment task 2: Building and Interpreting a classification model

Intent:

This assignment focuses in detail on classification modelling.

Objective(s):

1, 2, 3, 4 and 5

Type: Report
Groupwork: Individual
Weight: 30%
Length:

Deliverables:

  1. All Python code used to generate the model.
  2. A comprehensive report articulating an understanding of the problem, the identification and breakdown of tasks relating to the solution process (as per CRISP-DM) as well as the technical choices made and the reasons for them. In addition to a detailed discussion of the results, the report should also contain a listing of the key assumptions and their implications. The appendix should list individual contributions to the projects.
Criteria:

Assignments will be assessed by the following criteria (see assessment brief for details)

  1. Justification of models selected, data transformation performed, hyperparameters selected and accuracy of results with evidence supporting claims
  2. Quality of findings and recommendations
  3. Quality of code
  4. Clarity and quality of visualisations and written report
  5. Depth of discussion of ethics/privacy issues (including Indigenous people), value, benefits and recommendation for business

Assessment task 3: End-to-end Data Science Project

Intent:

The intent of this assessment is to help students gain hands-on experience with a real-world project within a team of data scientists. To this end, students will prepare, process and analyse a provided dataset, interpret the results and present their insights as a report. They will also reflect on the choices and decisions they made in selecting appropriate solutions to the given problem.

Objective(s):

1, 2, 3, 4 and 5

Type: Report
Groupwork: Group, group and individually assessed
Weight: 40%
Length:

Deliverables:

  1. All Python code used to generate the model.
  2. A comprehensive report articulating an understanding of the problem, the identification and breakdown of tasks relating to the solution process (as per CRISP-DM) as well as the technical choices made and the reasons for them. In addition to a detailed discussion of the results, the report should also contain a listing of the key assumptions and their implications. The appendix should list individual contributions to the projects.
Criteria:

Assignments will be assessed by the following criteria (see assessment brief for details)

  1. Soundness of justification for selected technique
  2. Quality of code and visualisations
  3. Accuracy of results and evidence supporting claims
  4. Breadth of evidence of collaborative work (e.g. meeting minutes, details of contributions etc.)
  5. Depth of discussion of ethics/privacy issues (including Indigenous people), value, benefits and recommendation for business
  6. Appropriateness of communication style to audience

Minimum requirements

Students must participate in all online and face-to-face requirements, as well as complete all assessment tasks.

Recommended texts

Hastie, T., Tibshirani, R. and Friedman, J. (2010), The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Second Edition), New York, NY: Springer-Verlag (More detailed and mathematically oriented than the textbook)

Géron, A. (2019), Hands-on machine learning with Scikit-Learn, Keras and TensorFlow: concepts, tools, and techniques to build intelligent systems (Second Edition), O'Reilly

So, A. (2020), The Data Science Workshop (Second Edition), Packt Publishing

References

Boyd, D., & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662-679.

Breiman, L. (2001). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science, 16(3), 199-231.

Wu, X., Kumar, V., Quinlan, J. R., Ghosh, J., Yang, Q., Motoda, H., ... & Zhou, Z. H. (2008). Top 10 algorithms in data mining. Knowledge and Information Systems, 14(1), 1-37.

Kirkpatrick, K. (2016). Battling algorithmic bias: How do we ensure algorithms treat us fairly? Communications of the ACM, 59(10), 16-17.

Goldman, E. (2005). Search engine bias and the demise of search engine utopianism. Yale Journal of Law & Technology, 8, 188.

Zarsky, T. (2016). The trouble with algorithmic decisions: An analytic road map to examine efficiency and fairness in automated and opaque decision making. Science, Technology, & Human Values, 41(1), 118-132.

Burrell, J. (2016). How the machine ‘thinks’: Understanding opacity in machine learning algorithms. Big Data & Society, 3(1), 1-12.

Note: Many of the above papers are available to UTS students via https://drr.lib.uts.edu.au/search.html?q=36106

Other resources

Flinders Guide - Appropriate Terminology, Representations and Protocols of Acknowledgement for Aboriginal and Torres Strait Islander Peoples.

Walter, M., Lovett, R., Maher, B., Williamson, B., Prehn, J., Bodkin-Andrews, G., & Lee, V. (2021). Indigenous data sovereignty in the era of big data and open data. Australian Journal of Social Issues, 56(2), 143-156.

Global Indigenous Data Alliance. (2022). ‘Indigenous Data Sovereignty and Governance.’

Global Indigenous Data Alliance: gida-global.org

United Nations Information on data collection and disaggregation for Indigenous peoples: https://www.un.org/development/desa/indigenouspeoples/mandated-areas1/data-and-indicators.html

ANU publications on Indigenous Data sovereignty: https://press.anu.edu.au/publications/series/caepr/indigenous-data-sovereignty

Lowitja Institute flyer on Indigenous Data Governance and Sovereignty: https://www.lowitja.org.au/icms_docs/328550_data-governance-and-sovereignty.pdf

Australian Institute of Aboriginal and Torres Strait Islander Studies (AIATSIS) publication on Delivering Indigenous Data Sovereignty: https://aiatsis.gov.au/publication/116530

AIATSIS Code of Ethics for Aboriginal and Torres Strait Islander Research

Working With Aboriginal Peoples and Communities

Lovett, R., Jones, R., & Maher, B. (2020). The intersection of Indigenous data sovereignty and Closing the Gap policy in Australia. In Indigenous Data Sovereignty and Policy (pp. 36-50). Routledge.