Stef van Buuren
Data are the new gold. Real data are always incomplete. Sometimes we can derive valid conclusions by just ignoring the missing data. More typically, the implications caused by the unknown data simply fail to evaporate. I have always been fascinated by the question of how the limited scope of information affects our judgement. Had the missing data been known, what would our conclusion have been?
During my career at TNO and in academia, I've pioneered quantitative algorithms for "filling in the missing data" (imputation). These methods learn plausible values from the observed data. Nowadays, the MICE algorithm is the de facto international standard for the analysis of incomplete data. Investigators across all sciences rely on MICE.
I've employed MICE and related methods in many TNO projects, especially in child growth, child development and healthy living. In the coming years, I would like to implement and publish novel quantitative methods as JAMES web services. This will make growth chart prediction and the D-score, a new system for expressing child development on a quantitative scale, available to investigators worldwide. I collaborate with the World Health Organization and the Bill & Melinda Gates Foundation.
- In November 2018, a group of 40 experts headed by the World Health Organization selected the D-score as the most promising methodology for creating a worldwide usable instrument to measure child development.
- In 2018, the second edition of Flexible Imputation of Missing Data appeared. This edition also includes a free and complete online version, with all R code needed to calculate the results.
- In 2019, the MICE paper was being cited at a rate of over 1000 citations per year. The current download rate of the MICE software is 46,000 downloads per month.
- In 2019, an experimental version of JAMES (Joint Anthropometric Measurement and Evaluation System) saw the light. JAMES is a web service that automates prediction and filtering based on child growth and development.
- In 2019, the first draft of the D-score booklet Turning milestones into measurement appeared.
- In 2020, we published six releases of our flagship product MICE on [CRAN](https://CRAN.R-project.org/package=mice). Major [improvements](https://amices.org/mice/news/index.html) include: 1. Predictive mean matching is up to 600 times faster; 2. New NARFCS method for MNAR data; 3. New ignore facility to enhance machine learning. MICE is currently downloaded over 80,000 times per month.
- We investigated the statistical properties of estimates calculated after the MICE algorithm for missing data imputation. We found that we may stop MICE after 5-10 iterations, much earlier than alternative measures that typically suggest 50 or more.
- We published a new package brokenstick on [CRAN](https://CRAN.R-project.org/package=brokenstick) in Nov 2020. The broken stick method excels at combining, analysing and predicting individual health trajectories.
- The experimental `shinyMice` [package](https://github.com/amices/shinyMice) by Hanne Oberman offers interactive diagnostics for missing data imputation.
- We set up new GitHub organisations for [missing data imputation](https://github.com/amices), for the [D-score](https://github.com/d-score) and for the [Joint Automatic Measurement and Evaluation System (JAMES)](https://github.com/growthcharts).
- PhD [defence](https://www.uu.nl/agenda/promotie-wietze-pasma-use-of-anesthesia-data-for-research) by Wietze Pasma. Role: 2nd
- [4000+ citations](https://scholar.google.nl/citations?user=_3y5C0UAAAAJ) in 2020.
- Mingyang Cai (Privately funded)
- Wietze Pasma (UMCU)
- Hanne Oberman
- Weber, A. M., Rubio-Codina, M., Walker, S. P., van Buuren, S., Eekhout, I., Grantham-McGregor, S. M., . . . Hamadani, J. D. (2019). The D-score: a metric for interpreting the early development of infants and toddlers across global settings. BMJ Global Health, 4(6). Retrieved from https://gh.bmj.com/content/4/6/e001724.abstract
- van Buuren, S. (2018). Flexible Imputation of Missing Data. Second Edition. Boca Raton, FL: Chapman & Hall/CRC Press.
- Audigier, V., White, I. R., Jolani, S., Debray, T. P. A., Quartagno, M., Carpenter, J., van Buuren, S., Resche-Rigon, M. (2018). Multiple imputation for multilevel data with continuous and binary variables. Statistical Science, 33(2), 160-183. Retrieved from https://projecteuclid.org/download/pdfview_1/euclid.ss/1525313140
- Personal page: https://stefvanbuuren.name
- Google Scholar: https://scholar.google.nl/citations?user=_3y5C0UAAAAJ&hl=nl&oi=ao#
- GitHub: https://github.com/stefvanbuuren
- University page: https://www.uu.nl/staff/SvanBuuren/Profile