University of Technology Sydney

42028 Deep Learning and Convolutional Neural Network

Warning: The information on this page is indicative. The subject outline for a particular session, location and mode of offering is the authoritative source of all information about the subject for that offering. Required texts, recommended texts and references in particular are likely to change. Students will be provided with a subject outline once they enrol in the subject.

Subject handbook information prior to 2024 is available in the Archives.

UTS: Information Technology: Computer Science
Credit points: 6 cp

Subject level: Undergraduate and Postgraduate

Result type: Grade and marks

Requisite(s): 31250 Introduction to Data Analytics OR 32130 Fundamentals of Data Analytics OR 36106 Machine Learning Algorithms and Applications

Recommended studies: basics of statistics and probability, Python programming

Description

The subject focuses on state-of-the-art research on deep learning and convolutional neural networks (CNNs) with practical applications. Recent advances in neural network approaches have significantly increased the performance of state-of-the-art data analytics, image recognition and object detection systems. This subject presents the details of deep learning architectures with a focus on learning end-to-end models for tasks, particularly image classification and object detection. State-of-the-art software tools are discussed and used for the implementation of image classification systems. Labs focus on setting up deep learning libraries for image classification and object detection problems, and fine-tuning trained networks. Students learn to implement, train and test their own deep CNNs from scratch on GPUs. Students also explore how to deploy the trained models and build an AI system. Students demonstrate comprehension of state-of-the-art research individually, and then work in groups to apply it to the implementation of image classification and object detection systems.

Subject learning objectives (SLOs)

Upon successful completion of this subject students should be able to:

1. Compare and contrast the latest trends in Deep Learning strategies with traditional machine learning techniques. (D.1)
2. Build, train and test basic Convolutional Neural Networks for real image classification projects. (D.1)
3. Use Deep Learning software tools to explore state-of-the-art techniques and applications. (D.1)
4. Build, train and test customised object detection systems using Deep CNN-based techniques. (D.1)
5. Collaboratively develop the analysis, design and implementation of solutions to real-world computer vision problems. (C.1, E.1)

Course intended learning outcomes (CILOs)

This subject also contributes specifically to the development of the following Course Intended Learning Outcomes (CILOs):

  • Design Oriented: FEIT graduates apply problem solving, design thinking and decision-making methodologies in new contexts or to novel problems, to explore, test, analyse and synthesise complex ideas, theories or concepts. (C.1)
  • Technically Proficient: FEIT graduates apply theoretical, conceptual, software and physical tools and advanced discipline knowledge to research, evaluate and predict future performance of systems characterised by complexity. (D.1)
  • Collaborative and Communicative: FEIT graduates work as an effective member or leader of diverse teams, communicating effectively and operating autonomously within cross-disciplinary and cross-cultural contexts in the workplace. (E.1)

Contribution to the development of graduate attributes

Engineers Australia Stage 1 Competencies

This subject contributes to the development of the following Engineers Australia Stage 1 Competencies:

  • 1.2. Conceptual understanding of the mathematics, numerical analysis, statistics, and computer and information sciences which underpin the engineering discipline.
  • 1.3. In-depth understanding of specialist bodies of knowledge within the engineering discipline.
  • 2.2. Fluent application of engineering techniques, tools and resources.
  • 2.3. Application of systematic engineering synthesis and design processes.
  • 2.4. Application of systematic approaches to the conduct and management of engineering projects.
  • 3.3. Creative, innovative and pro-active demeanour.
  • 3.6. Effective team membership and team leadership.

Teaching and learning strategies

This subject is closely related to WHAT students have usually experienced as end users while always wondering HOW it works. Common examples include phones detecting faces (or smiling faces) while taking photos, self-driving cars and UAVs working without human assistance. This subject encourages students to begin to understand HOW Deep Learning and Neural Networks actually work. Students will become capable of building many different applications in a fun-filled learning environment.

In-class learning activities are combined with a traditional pedagogical approach to demystify neural network and CNN theory. Classes are planned under a central theme focusing on two elements in the learning experience: explicit visualisation and clear motivation. Each theme is strategically realised by building two pillars of successful learning:

  1. To master the fundamental concepts and skills in the theory, students will build corresponding state-of-the-art systems.
  2. To understand the implementation of the theory in the context of the latest research and developments, students will build their own image classification/object detection systems for practical application of the theory.

Theoretical aspects are delivered through in-class lectures and online slides, which are made available prior to the lecture session, alongside supplemental learning materials. The 1.5-hour face-to-face lecture covers the theoretical aspects of machine learning and deep learning, and builds the foundation for the tutorial/lab activities.

In the 2-hour tutorial/lab session, students work on tasks implementing state-of-the-art image classification and object detection systems. Each week’s task addresses a component needed to complete the subject’s project. Students preview part of the experiments on Canvas, then perform the experiments collaboratively with assistance during the tutorials, receiving formative feedback. The central theme of the planned experiments is to process images and categorise them. Students will be guided through the development of the subject’s project.

The subject contains three assessments: one demonstration/replication assignment, an individual project and a group project. A primary aim of this subject is for students to engage with neural network and CNN theory with the assistance of simple programming tools. The assessment structure promotes a strong understanding of neural networks. In the projects, students will build their own CNN architecture for image classification, object detection or other computer vision problems and apply MLOps to the project lifecycle. Students are provided hands-on experience in building custom CNN architectures for image classification and object detection. Detailed formative and summative feedback will be given for each of the assignments.

Tutorials/laboratories are closely related to the assignments. Tasks will be divided into multiple stages. Students can assess their progress and receive feedback promptly. Tutors will provide necessary help. In some tasks, students will collaborate with or challenge peers, so they can also help each other to learn.

Content (topics)

  • Deep Learning and neural network basics
  • Traditional supervised and unsupervised learning overview
  • Deep Learning and Convolutional Neural Networks (CNN) in detail
  • Software tools for deep neural networks
  • Image classification using Deep CNNs
  • Object detection and localization using Deep CNNs
  • Computer vision-based application development and deployment

Assessment

Assessment task 1: Assignment 1

Intent:

To ensure that the student has a firm understanding of the basics. This will facilitate the learning of advanced topics.

To allow students to self-assess whether/how the subject suits their learning objectives (this task will be done before the census date).

To guide students to start the course project and Task 2.

Objective(s):

This assessment task addresses the following subject learning objectives (SLOs):

1 and 2

This assessment task contributes to the development of the following Course Intended Learning Outcomes (CILOs):

D.1

Type: Report
Groupwork: Individual
Weight: 30%
Length:

No strict length limit applies. As a guide, the report should be no more than about 5 pages.

Assessment task 2: Assignment 2

Intent:

To ensure that the student has a firm understanding of CNNs and object detection algorithms. This will facilitate the learning of advanced topics for research and also assist in completing the project.

To allow students to self-assess whether/how the subject suits their learning objectives.

To guide students to start the course project (see Task 3).

Objective(s):

This assessment task addresses the following subject learning objectives (SLOs):

1 and 2

This assessment task contributes to the development of the following Course Intended Learning Outcomes (CILOs):

D.1

Type: Report
Groupwork: Individual
Weight: 30%
Length:

No strict length limit applies. As a guide, the report should be no more than about 10 pages.

Assessment task 3: Project

Intent:

To ensure that students have a clear understanding of the theoretical concepts, are able to implement them in a real-world problem context, and can independently/collaboratively design and implement solutions using these techniques.

Objective(s):

This assessment task addresses the following subject learning objectives (SLOs):

1, 2, 3, 4 and 5

This assessment task contributes to the development of the following Course Intended Learning Outcomes (CILOs):

C.1, D.1 and E.1

Type: Project
Groupwork: Group, group and individually assessed
Weight: 40%
Length:

No strict length limit applies. As a guide, the report should be no more than 20 pages.

Minimum requirements

In order to pass the subject, a student must achieve an overall mark of 50% or more.

Required texts

  1. Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, MIT Press (http://www.deeplearningbook.org/) (https://github.com/janishar/mit-deep-learning-book-pdf)
  2. Dive into Deep Learning (https://d2l.ai/index.html)

Recommended texts

1. Fundamentals of Deep Learning: Designing Next-Generation Machine Intelligence Algorithms
Buduma, Nikhil ; Locascio, Nicholas

2. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
Géron, Aurélien

References

R. Girshick. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, pages 1440-1448, 2015.

J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3431-3440, 2015.

S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems, pages 91-99, 2015.

Z.-Q. Zhao, P. Zheng, S.-T. Xu and X. Wu, "Object Detection With Deep Learning: A Review," in IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 11, pp. 3212-3232, Nov. 2019, doi: 10.1109/TNNLS.2018.2876865.

C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens and Z. Wojna, "Rethinking the Inception Architecture for Computer Vision," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2818-2826, doi: 10.1109/CVPR.2016.308.

K. He, X. Zhang, S. Ren and J. Sun, "Deep Residual Learning for Image Recognition," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770-778, doi: 10.1109/CVPR.2016.90.

Other resources

1. TensorFlow: https://www.tensorflow.org/learn
2. NumPy: https://numpy.org/learn/
3. PyTorch: https://pytorch.org/tutorials/
4. Google Colab: https://colab.research.google.com/notebooks/intro.ipynb
5. Automate the Boring Stuff with Python: https://automatetheboringstuff.com/
6. OpenCV tutorial and reference: https://github.com/dalgu90/opencv-tutorial
7. MLOps, ClearML: https://clear.ml/