Researchers have automated the labeling of brain MRI images, a step needed to train machine learning image recognition models, by deriving important labels from radiology reports and accurately assigning them to the corresponding MRI examinations. The method allows more than 100,000 MRI examinations to be labeled in less than half an hour.
This was the first study that allowed researchers at King's College London (London, UK) to label complex MRI image datasets at scale. The researchers say it would take years to label more than 100,000 MRI examinations manually. Deep learning typically requires tens of thousands of labeled images to achieve the best possible performance in image recognition tasks. This need for labeled data is a bottleneck in the development of deep learning systems for complex image datasets, particularly MRI, which is fundamental to the detection of neurological abnormalities.
"By overcoming this bottleneck, we have massively facilitated future deep learning image recognition tasks and this will almost certainly accelerate the arrival into the clinic of automated brain MRI readers. The potential for patient benefit through, ultimately, timely diagnosis, is enormous," said senior author, Dr. Tom Booth from the School of Biomedical Engineering & Imaging Sciences at King's College London.
"This study builds on recent breakthroughs in natural language processing, particularly the release of large transformer-based models such as BERT and BioBERT which have been trained on huge collections of unlabeled text such as all of English Wikipedia, and all PubMed Central abstracts and full-text articles; in the spirit of open-access science, we have also made our code and models available to other researchers to ensure that as many people benefit from this work as possible," added lead author, Dr. David Wood from the School of Biomedical Engineering & Imaging Sciences.
According to the researchers, while one barrier has now been overcome, two further challenges remain. The first is to perform the deep learning image recognition tasks themselves, which pose multiple technical challenges of their own. The second, once this is achieved, is to ensure that the developed models still perform accurately across different hospitals using different scanners.
King's College London