
MacData promotes the engagement of researchers and students with industry, government and community to strengthen McMaster’s position as an international leader on all matters related to data. MacData was created to identify synergies and foster collaboration among McMaster University’s many institutes, centres, and researchers whose work involves the many facets of data (collection, transformation, analysis, visualization). We are not a data warehouse, nor do we collect or create data. We’re devoted to thinking about data and to working with researchers, policymakers, industry experts and the community to help harness the tremendous potential of data.

Our Goals

MacData’s goals are to:

  • Show leadership regarding best practices around data security, privacy, proprietary issues, and ethical collection and use of data.
  • Identify, shape, and support training and experiential learning focused on data literacy and the skills needed to harness big data.
  • Improve efficiencies related to data: creation, collection, management, transformation and analyses.
  • Promote and support development of large-scale partnerships with private and public organizations.
  • Run seminars, research workshops and graduate training programs on big data.
  • Encourage researchers and students from diverse disciplines to explore synergies that can further data science and innovation.
  • Support the recruitment of talented researchers and students.

Our Mandate

We are a university-wide institute founded in 2015 and operate under the office of the Vice-President of Research. Our mandate is to promote innovative research and training programs on big data. We work with and support faculties and institutes on data issues and initiatives.

The institute’s mission can be summarized with reference to three general themes:

Visibility

This relates to showcasing the scientific developments and successes that McMaster researchers and their students have achieved on issues related to big data locally, nationally, and internationally. The types of developments to be showcased include matters related to the creation, collection, and curation of data; the transformation of data for research and policy analysis; advances in data storage; and statistical analysis, visualization, and dissemination of results.

Integration

This focuses on coordination and collaboration across disciplines and methodologies to ensure researchers are supported and have opportunities to expand the scope of their research. The development of new techniques for analysis and data development, for example, falls under this theme. So do activities that enhance training opportunities and involve students in ways that build the skills they’ll need to traverse the big data terrain.

Enabling

This focuses on promotion of data literacy skills, training, and integration with industry and government to support the development of methods, technology, and analyses to help commerce, governments, science and public services.

Leading the Way on Fundamental Issues

Though every research project is unique, certain fundamental issues often arise when it comes to working with and interpreting big data. Broadly speaking, the issues can be grouped under the headings of creation, collection, management, transformation, and analysis.

Creation

This relates to the collection of structured and unstructured measures and data that are manually or electronically captured (for example, surveys, experiments, simulations, tracking systems). The skills and expertise needed for data creation relate to the development or integration of methodologies for capturing information, for example, how to improve survey taking, the development of recording devices, and the capturing of information from the internet and other devices.

Collection

This relates to work done to get access to data that have been collected by third-party sources (for example, businesses or governments). Activities related to collection include database development, the tracking of data use, and gaining access to data.

Management

This relates to innovations in data storage (for example, server capacity, cloud solutions, secure housing of data), data archiving, data curation, and data access. The efficient use of servers and network infrastructures and the development of firmware, software, and processes all relate to data management.

Transformation

This relates to the use of measures from primary and secondary data sources to create data sets that permit analysis. Before data can be analyzed, many issues must be addressed, including assessing the data quality, identifying rules for data development, transformation of information, linking of records, de-identification and encryption of records/measures, and the archiving of information. The activities and expertise needed for data transformation include the development of algorithms, software, and other processes.

Analysis

This relates to the use of information and data, from statistical analysis and visualization to the development of new analytical tools that permit the analysis of large and complex data sets.

McMaster is already known for its interdisciplinary and inter-sectoral collaborations, and MacData will help further this reputation by promoting an integrated approach to all aspects of data creation, collection, management, transformation, and analysis.

Big Data

The fact that “Big Data” has been in the news over the past few years is really just a sign of the times. It’s a reflection of the unprecedented amount of information, or “data”, that’s being collected in all sorts of ways. Website clicks, tweets, purchases charged to debit and credit cards, thermostats that track energy consumption in real time, and information recorded by personal fitness trackers are just some examples familiar to most of us.

One of the earliest definitions of big data can be traced back to the early 2000s, when analyst Doug Laney defined big data using three terms, commonly referred to as the 3Vs:

Volume

Big data is voluminous – think terabytes and more!

Velocity

The flow of data is fast: nanoseconds is too slow a measure!

Variety

The data comes in all kinds of formats, not just structured ones.

Though big data can take many forms, there are certain fundamental issues common to all types of data – issues related to creation, collection, processing, storage, and analysis. Advancing techniques and methods related to these issues is what MacData is all about.

What big data means to researchers and policymakers

Data itself has little inherent value. It becomes valuable only if someone does something with it. That’s where researchers and policymakers come in, and where the Institute has an important role to play in encouraging innovative research and collaboration across disciplines and in fostering the development of new techniques for analysis and data development.

Recognizing that data literacy, the ability to analyze data, is becoming a foundational skill for researchers and policymakers, the Institute will also help provide critical support and data literacy training.