21 March 2018
The “Data Analysis in Python” course will introduce you to the most essential and practical applications of the Python programming language for data manipulation, management, analysis and basic visualisation. The course will provide you with practical skills in the general Python programming language and in a number of Python’s libraries designed for scientific computing and data analysis, e.g. NumPy, pandas, matplotlib, IPython and SciPy. During the course you will learn to:
Use Python’s Anaconda distribution and its integrated development environment Spyder with Jupyter Notebooks to manage, develop and share a Python analytics project,
Understand and differentiate between a variety of data structures within the core Python language as well as the highly efficient and optimised data structures from the NumPy and pandas libraries,
Perform basic mathematical and more advanced control flow operations,
Import and export data from/to various data file formats e.g. Excel spreadsheets, CSV, tab-delimited, text files, and also SQL databases,
Prepare, transform and manage datasets and their variables, add/delete rows, create samples and subsets, identify specific cases based on conditional search, sort cases, add/edit value and variable labels, deal with missing data, standardise, normalise and reshape data, merge datasets and use joins,
Carry out an extensive Exploratory Data Analysis (EDA): inspect the structure of datasets and their variables, calculate cross-tabulations and descriptive statistics to summarise the data e.g. pivot tables, summary tables and data aggregations,
Create introductory EDA plots and graphical visualisations: histograms, density plots, scatterplots, box plots, bar plots, line graphs etc.,
Perform simple hypothesis testing and inferential statistics (tests of differences and correlations): tests of normality assumptions, t-tests, analyses of variance (ANOVA), correlations and simple regressions,
Carry out simple data modelling tasks using multiple linear regression.
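To give a flavour of the data preparation and summary tasks listed above, here is a minimal sketch in pandas. It is not taken from the course materials: the two small tables, their column names (student, group, score, hours) and all values are hypothetical and invented purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented for illustration only.
scores = pd.DataFrame({
    "student": ["a", "b", "c", "d", "e", "f"],
    "group": ["x", "x", "x", "y", "y", "y"],
    "score": [10.0, 12.0, np.nan, 20.0, 22.0, 24.0],
})
hours = pd.DataFrame({
    "student": ["a", "b", "c", "d", "e", "f"],
    "hours": [1, 2, 3, 4, 5, 6],
})

# Deal with missing data: drop the rows that have no score.
clean = scores.dropna(subset=["score"])

# Identify cases by a conditional search, then sort them.
high = clean[clean["score"] > 11].sort_values("score", ascending=False)

# Merge the two datasets with an inner join on the shared key.
merged = clean.merge(hours, on="student", how="inner")

# Standardise a variable to z-scores (mean 0, sample standard deviation 1).
merged["score_z"] = (merged["score"] - merged["score"].mean()) / merged["score"].std()

# Aggregate: number of cases and mean score per group.
summary = merged.groupby("group")["score"].agg(["count", "mean"])

# Pearson correlation between two variables.
r = merged["score"].corr(merged["hours"])

print(high)
print(summary)
print(round(r, 3))
```

The same pattern (clean, subset, merge, transform, aggregate) carries over to larger datasets loaded with functions such as pd.read_csv or pd.read_excel.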