
MacData promotes the engagement of researchers and students within McMaster as well as externally with industry, government and community to strengthen McMaster’s position as an international leader on all matters related to data. These include data analysis – AI, machine learning, statistical learning, and statistics – as well as collection, ethics, transformation, and visualization. MacData was created to identify synergies to foster collaboration among McMaster University’s many institutes, centres, and researchers whose work involves the many facets of data.

Our Goals

MacData’s goals are to:

  • Show leadership regarding best practices around data security, privacy, proprietary issues, and ethical collection and use of data. Identify, shape, and support training and experiential learning focused on data literacy and skills needed to harness big data.
  • Encourage and promote developments in data analytics, including work across and beyond campus in AI, machine learning, statistical learning, and statistics.
  • Improve efficiencies related to data creation, collection, management, and transformation.
  • Promote and support development of large-scale partnerships with private and public organizations.
  • Run seminars, research workshops and graduate training programs on big data.
  • Encourage researchers and students from diverse disciplines to explore synergies that can further data science and innovation.
  • Support the recruitment of talented researchers and students.

Our Mandate

We are a university-wide institute founded in 2015 and operate under the office of the Vice-President of Research. Our mandate is to promote innovative research and training programs on big data. We work with, and support, faculties and institutes on data issues and initiatives.

The institute’s mission can be summarized with reference to three general themes:


This relates to showcasing the scientific developments and successes that McMaster researchers and their students have achieved on issues related to data science locally, nationally, and internationally. The types of developments to be showcased include matters related to the creation, collection, and curation of data; the transformation of data for research and policy analysis; advances in AI and machine learning; developments in statistical analysis and visualization; advances in data storage; and responsible dissemination of results.


This focuses on coordination and collaboration across disciplines and methodologies to ensure researchers are supported and have opportunities to expand the scope of their research. The development of new techniques for analytics – including AI, machine learning, statistical learning, and statistics – and data development, for example, falls under this theme. Activities that enhance training opportunities and engage students in ways that build the skills they will need to traverse the big data terrain are also under this theme.


This focuses on the promotion of data literacy skills, training, and integration with industry and government to support the development of methodologies, technology, and analytics techniques that help commerce, governments, science, and public services.

Leading the Way on Fundamental Issues

Though every research project is unique, certain fundamental issues often arise when it comes to working with, and interpreting, big data. Broadly speaking, the issues can be grouped under the headings of creation, collection, management, transformation, and analysis.


Analysis relates to AI, machine learning, statistical learning, and statistics, and includes visualization as well as analysis techniques. In addition to the use of existing advanced techniques, the development of new analytical tools is crucial to permit the analysis of large, complex, and evolving data types.


Creation relates to the collection of structured and unstructured measures and data that are manually or electronically captured. For the creation of data, the skills and expertise needed relate to the development or integration of methodologies for capturing information, for example, how to improve survey taking, the development of recording devices, and the capture of information.


Collection relates to work done to get access to data that have been collected by third-party sources (for example, businesses or governments). Activities related to collection include database development, the tracking of data use, and gaining access to data.


Management relates to innovations tied to data storage (for example, server capacity, cloud solutions, secure housing of data), data archiving, data curation, and data access. The efficient use of servers and network infrastructures and the development of firmware, software, and processes all relate to data management.


Transformation relates to the use of measures from primary and secondary data sources to create data sets that permit analysis. Before data can be analyzed, many issues must be addressed, including assessing the data quality, identifying rules for data development, transformation of information, linking of records, de-identification and encryption of records/measures, and archiving.


McMaster is already known for its interdisciplinary and inter-sectoral collaborations, and MacData will help further this reputation by promoting an integrated approach to all aspects of data creation, collection, management, transformation, and analysis.