Professor Kerrie Mengersen, Queensland University of Technology & Associate Professor Tomasz Bednarz, Queensland University of Technology and CSIRO Data61
What is this thing called “big data”? What does it mean for the world in general, and for mathematical scientists in particular? What skills do mathematical scientists need to develop in order to engage effectively in the “big data era”? In this course we address these questions and learn some of the theory, methods and computational tools that are useful in the modelling, analysis and visualisation of big data. The course builds closely on the MOOCs developed by the Queensland University of Technology (QUT) and the Australian Research Council Centre of Excellence in Mathematical and Statistical Frontiers (ACEMS), delivered using the FutureLearn platform (https://www.futurelearn.com/courses/big-data-analytics).
The course content expands on three topics covered in the MOOCs in the Big Data Analytics series: (i) managing and manipulating big data, (ii) statistical inference and machine learning, and (iii) visualisation of big data. The content will be deeper than that provided in the MOOCs and will be commensurate with an Honours/Postgraduate level course. Each topic will involve theory, methodology, computation and application. Topics covered will include:
- Big data: what, where and why?
- The Big Data Wheel
- Big data papers that changed the world
- Managing big data
- Big data management tools 1: SQL, HDFS and Hadoop
- Big data management tools 2: MapReduce, Apache Pig, Apache Spark
- Big data methods: classification, clustering, regression and dimension reduction
- Popular algorithms for big data analysis
- Software tools for big data analysis
- Visualisation: scientific visualisation, information visualisation, communication, aesthetics and design approaches to visualisation
Please see the Timetable for scheduled class times during the AMSI Summer School 2017.
Please note: Mathematics and Statistics of Big Data is held concurrently with Harmonic Analysis and students cannot attend both courses in 2017.
Students taking the course should:
- have the equivalent of first year mathematics and statistics
- have confidence in dealing with data-related software packages; no specific knowledge is assumed about any of the software packages used in the course
- preferably have competency in a programming language such as R, Java or Python
- Three written assignments: 15% each
- Final examination: 55%
The course will use a variety of software packages. All of the packages can be downloaded freely from the web. Instructions on downloading the required packages will be given as part of the course.
Professor Kerrie Mengersen, Queensland University of Technology
Kerrie holds a Chair in Statistics at the Queensland University of Technology in Brisbane, Australia. Her long-term research interests have been primarily in Bayesian statistical modelling, computation and analysis. More recently she has become active in big data analytics. Her applied interests are in health, the environment and industry. Kerrie is an Australian Research Council (ARC) Laureate Fellow and a Deputy Director of the ARC Centre of Excellence in Mathematical and Statistical Frontiers for Big Data, Big Models and New Insights. At QUT, her Bayesian Research and Applications Group (BRAG, bragqut.wordpress.com) comprises around a dozen awesome postgraduate and postdoctoral researchers.
Associate Professor Tomasz Bednarz, Queensland University of Technology and CSIRO Data61
Tomasz holds a joint position as a Principal Research Fellow in ACEMS and the Institute for Future Environments at the Queensland University of Technology, and as a Team Leader in Data61 at CSIRO. He has extensive expertise in platforms for big data analytics and visual analytics, aiming to connect data analysis, statistical modelling, image analytics, machine learning and visualisation. His broad range of interests spans image analysis, numerical simulations and experiments with fluids, computer graphics, human-computer interaction and immersive visualisation. He actively promotes the use of computational and visualisation techniques for science and research, and art + science methodology.