Research | 中山優吾

Introduction

In recent years, the volume of electronically processable data, often referred to as “big data”, has grown explosively, alongside improvements in computer processing power and technological innovations such as artificial intelligence. Making effective use of data will become increasingly important in the future.

One characteristic of big data is its high dimensionality. In particular, a challenging setting is the high-dimensional low-sample-size (HDLSS) setting, where the dimensionality is much larger than the sample size. For example, microarray data has tens of thousands to millions of genes (dimensions) but only tens of samples.

Conventional statistical methods assume that the sample size is sufficiently large relative to the dimensionality, so they cannot be applied directly in the HDLSS setting. In some cases, applying them to HDLSS data causes serious problems due to the curse of dimensionality.
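One concrete instance of this problem is that the sample covariance matrix of n observations has rank at most n - 1, so it is singular whenever the dimensionality d is at least n, and any method that needs its inverse breaks down. The following is a minimal sketch of that fact, using simulated Gaussian data and only the Python standard library. The sizes n = 5 and d = 20 and the helper names `sample_covariance` and `matrix_rank` are assumptions made for illustration and do not come from my own work.

```python
import random


def sample_covariance(rows):
    """Return the d x d sample covariance matrix of n observations (one per row)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def matrix_rank(matrix, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]  # work on a copy
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


rng = random.Random(0)  # fixed seed so the run is reproducible
n, d = 5, 20            # hypothetical HDLSS sizes: d is larger than n
data = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]

cov = sample_covariance(data)
rank = matrix_rank(cov)
print(f"d = {d}, n = {n}, rank of sample covariance = {rank} (at most n - 1 = {n - 1})")
```

Because the rank never exceeds n - 1, the d x d matrix cannot be inverted when d is larger than n, which is why quantities such as the Mahalanobis distance are not available in their usual form.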

My Research

My research group and I have proposed methods for analyzing HDLSS data based on geometric representations of the data space and its dual space. Our research has been highly evaluated and has received various awards.
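As a rough illustration of what a geometric representation can look like, the sketch below simulates a well-known phenomenon in the HDLSS literature: for independent standard normal data, all pairwise distances concentrate around sqrt(2d) as the dimensionality d grows, so the sample points sit approximately at the vertices of a regular simplex. This is only a simplified illustration under an i.i.d. Gaussian assumption, not the methods proposed in our papers. The function name `pairwise_distance_ratios` and the chosen sizes are hypothetical.

```python
import math
import random


def pairwise_distance_ratios(n, d, rng):
    """Pairwise Euclidean distances of n standard normal points in d dimensions,
    each divided by sqrt(2 * d)."""
    points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    ratios = []
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(points[i], points[j])
            ratios.append(dist / math.sqrt(2 * d))
    return ratios


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5                   # small sample size, kept fixed while d grows
results = {}
for d in (10, 1000, 10000):
    ratios = pairwise_distance_ratios(n, d, rng)
    results[d] = (min(ratios), max(ratios))
    print(f"d = {d:>5}: scaled distances range from "
          f"{results[d][0]:.3f} to {results[d][1]:.3f}")
```

As d increases with n fixed, the printed range should tighten around 1, meaning that every pair of points is nearly the same distance apart.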

Current Research

While I have been conducting theoretical research in high-dimensional statistics, I am currently working at Nissan Motor Co.’s Research Institute, applying machine learning and deep learning to practical problems.

Future Directions

I believe that educational systems and research environments that allow students to learn statistics and machine learning on a rigorous mathematical basis will become even more important in the future. In addition to advancing research in my current position, I would like to contribute to society by drawing on my experience in high school education and in joint research with universities.