94692 Data Science Practice
Warning: The information on this page is indicative. The subject outline for a
particular session, location and mode of offering is the authoritative source
of all information about the subject for that offering. Required texts, recommended texts and references in particular are likely to change. Students will be provided with a subject outline once they enrol in the subject.
Subject handbook information prior to 2025 is available in the Archives.
Credit points: 8 cp
Result type: Grade, no marks
Description
The subject covers the following topics in detail:
- introduction to programming concepts
- practical introduction to Python and R programming
- collaboration using version control with Git
- introduction to data stores and SQL querying language
- working with UNIX systems and Docker containers
Subject learning objectives (SLOs)
Upon successful completion of this subject students should be able to:
1. Participate in the development of data-related projects using popular programming languages including Python, SQL and R.
2. Articulate the strengths, weaknesses and use cases of common code control workflows, and demonstrate the ability to work collaboratively using these tools.
3. Interact with databases and query data sources using SQL.
4. Work confidently in a Unix environment, including the use of Docker and Bash via a command-line interface.
5. Understand basic programming concepts.
Course intended learning outcomes (CILOs)
This subject contributes specifically to the development of the following course intended learning outcomes:
- Identify and represent the human and technical elements and processes within complex systems and organise them within frameworks of relationships (1.1)
- Explore and test models and generalisations for describing the behaviour of sociotechnical systems and selecting data sources, taking into account the needs and values of different contexts and stakeholders (1.2)
- Critique contemporary trends and theoretical frameworks in data science for relevance to one's own practice (2.1)
- Explore, analyse, manipulate, interpret and visualise data using data science techniques, software and technologies to make sense of data rich environments (2.2)
- Understand and deal critically and openly with the uncertainty, ambiguity and complexity associated with people, systems and data (2.3)
- Apply and assess data science concepts, theories, practices and tools for designing and managing data discovery investigations in professional environments that draw upon diverse data sources, including efforts to shed light on underrepresented components (2.4)
- Develop a collaborative and team-oriented mindset to harness value for stakeholders to produce innovative solutions to challenges (3.3)
- Collaborate to develop and refine multimodal communication skills needed to successfully work in data science teams (4.1)
- Engage in active, reflective practice that supports flexible navigation of assumptions, alternatives and uncertainty in professional data science contexts (5.1)
- Take a leadership role in promoting positive change in data science contexts, recognising individual, organisational and community issues (5.3)
Contribution to the development of graduate attributes
The subject gives students a practical introduction to data science practices that are commonly used in industry. Popular technologies and practices are covered, and students are given opportunities to apply them in realistic settings. Students will come away with an understanding of how to work effectively in teams, an appreciation of how to “get things done” in corporate environments, and a familiarity with some of the common risks when deploying data science projects and the controls that can be used to minimize those risks.
The subject addresses the following graduate attributes (GA):
GA 1 Sociotechnical systems thinking
GA 2 Creative, analytical and rigorous sense making
GA 4 Persuasive and robust communication
GA 5 Ethical citizenship and leadership
Teaching and learning strategies
This subject is conducted in 7 face-to-face sessions on campus, with weekly activities and readings assigned between classes. The classes are a mix of lecture components and collaborative “lab” sessions in which students work on data science projects as a team. Each session runs for 3 hours on Wednesdays, as scheduled in the timetable.
The lab components involve two types of activities:
- ‘Code together’ sessions, in which the instructor and students build understanding by collaboratively coding solutions to problems or implementing theoretical concepts.
- Practical coding tasks for students to complete individually or in small groups.
Assignments are a mix of practical coding exercises, report writing (for a business audience), and solution design and implementation tasks. Through these, students gain deep exposure to historical and current industry trends and challenges while developing tangible skills to implement these technologies in a work context.
Because this field advances rapidly, it is important for students to develop skills in quickly absorbing, dissecting and understanding new technologies and their value to business problems. The assignments in this subject are designed to help students develop these critical skills.
Assessment
Assessment task 1: Building Currency Converter in Python
Intent: Develop a Python program that performs currency conversion using data fetched from an open-source API.
Objective(s): This task addresses the following subject learning objectives: 1 and 5. This assessment task contributes to the development of course intended learning outcome(s): 1.2, 2.3, 2.4 and 4.1.
Type: Project
Groupwork: Individual
Weight: 30%
Assessment task 2: Analysing Company Performance with SQL
Intent: Load data into a database and perform data analysis on the performance of a company using SQL.
Objective(s): This task addresses the following subject learning objectives: 3 and 5. This assessment task contributes to the development of course intended learning outcome(s): 2.2, 2.3, 4.1 and 5.3.
Type: Report
Groupwork: Individual
Weight: 35%
Assessment task 3: Collaborative Development of Data Explorer Web App
Intent: Collaborate as a team to develop a containerised web application in Python and analyse the content of a dataset.
Objective(s): This task addresses the following subject learning objectives: 1, 2, 4 and 5. This assessment task contributes to the development of course intended learning outcome(s): 1.1, 1.2, 2.1, 2.4, 3.3 and 5.1.
Type: Report
Groupwork: Group, group and individually assessed
Weight: 35%
Minimum requirements
Students must participate in all online requirements, as well as complete assessment tasks.
References
https://www.atlassian.com/git/tutorials/learn-git-with-bitbucket-cloud
Sculley, D. et al., 2014. Machine Learning: The High Interest Credit Card of Technical Debt. SE4ML: Software Engineering for Machine Learning (NIPS 2014 Workshop).
https://seankross.com/the-unix-workbench/
https://earlconf.com/archive/_downloads/london_speakers/EARL2018_-_London_-_Leanne_Fitzpatrick.pdf
https://jeroen.github.io/uros2018/#1
https://fivebooks.com/best-books/computer-science-data-science-hadley-wickham/