A New Materials Data Platform in MI2I

MI2I, the Materials Research by Information Integration Initiative, is a project that aims to promote research on materials informatics and to establish a system for it. Researchers from both materials science and data science take part in the project.

Maintaining a materials database is particularly important for advancing materials informatics research. MI2I uses MatNavi, the large-scale materials database operated by NIMS. MatNavi, however, was developed with a user interface through which people access the data, so MI2I is currently developing an application programming interface (API) that allows application programs to access the data in the database.

Recent achievements of the initiative include the thermal design of compound semiconductors using Bayesian optimization, and a function that searches for new compounds using Bayesian inference.

Although less data is available on materials than in biotechnology fields, materials science has one advantage: data can be produced without experiments by solving fundamental equations on a computer. It is for this reason that we are seeing a worldwide movement to create computational databases similar to MatNavi. Here, proper preparation of the data, including the methods of approximation used, will be essential.

Materials with the same composition may exhibit different properties when synthesized by different processes. Therefore, it is necessary to collect data on those processes as well. The key issue is how to collect the data that researchers acquire every day in the lab. Last year we began a project at NIMS to compile such data into a database. We found that it is important both to develop a method of collecting data without overburdening the researchers, and to present the data in a way that anyone can understand. We also believe that researchers have more incentive to participate in data collection when the value of the data is enhanced, for example when spectrum analyses based on data science and other added value are included.

We are also considering a system for incorporating data obtained through collaboration with large-scale research facilities, such as SPring-8 and J-PARC, and for mining materials data from published papers and institutional repositories. It is hoped that industrial companies will study these techniques and apply them to “real” data, which in turn will lead to further business development. While real data held by individual companies is not likely to become public right away, information in industrial patents may at some stage be entered into the data platform. Current discussions at NIMS indicate that full-scale design and development of such a framework could begin in April.

Concept of a next-generation materials data platform

Satoshi Itoh

National Institute for Materials Science (NIMS)