My summer project was to build an interface for browsing very large amounts of medical data. I created a MongoDB database holding information from more than 270,000 clinical trials registered on ClinicalTrials.gov. After minimal cleaning of the data, I built and deployed a MERN (MongoDB, Express, React, Node.js) app on top of it. The site lets users search by condition and view all the interventions, outcomes, and publications associated with studies of that condition.
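The condition search can be illustrated with a small in-memory sketch. This is not the deployed app's code: the `Trial` shape, the field names, and the sample records are all assumptions made for illustration, and a real version would query the database rather than filter an array.

```typescript
// Hypothetical shape of one trial record; the real schema is not given here.
interface Trial {
  id: string;
  conditions: string[];
  interventions: string[];
  outcomes: string[];
  publications: string[];
}

interface ConditionSummary {
  trialCount: number;
  interventions: string[];
  outcomes: string[];
  publications: string[];
}

// Append each value to `target` once, preserving first-seen order.
function addUnique(target: string[], values: string[]): void {
  for (const value of values) {
    if (target.indexOf(value) === -1) {
      target.push(value);
    }
  }
}

// Gather everything attached to trials that list a condition containing
// `query` (case-insensitive substring match). An empty query matches nothing.
function summarizeCondition(trials: Trial[], query: string): ConditionSummary {
  const needle = query.trim().toLowerCase();
  const summary: ConditionSummary = {
    trialCount: 0,
    interventions: [],
    outcomes: [],
    publications: [],
  };
  if (needle === "") {
    return summary;
  }
  for (const trial of trials) {
    const matches = trial.conditions.some(
      (condition) => condition.toLowerCase().indexOf(needle) !== -1
    );
    if (!matches) {
      continue;
    }
    summary.trialCount += 1;
    addUnique(summary.interventions, trial.interventions);
    addUnique(summary.outcomes, trial.outcomes);
    addUnique(summary.publications, trial.publications);
  }
  return summary;
}

// Made-up sample records, not real ClinicalTrials.gov entries.
const sampleTrials: Trial[] = [
  {
    id: "EXAMPLE-1",
    conditions: ["Type 2 Diabetes"],
    interventions: ["Metformin"],
    outcomes: ["HbA1c change"],
    publications: ["Example publication A"],
  },
  {
    id: "EXAMPLE-2",
    conditions: ["Diabetes Mellitus", "Obesity"],
    interventions: ["Metformin", "Insulin"],
    outcomes: ["HbA1c change", "Body weight"],
    publications: [],
  },
  {
    id: "EXAMPLE-3",
    conditions: ["Asthma"],
    interventions: ["Albuterol"],
    outcomes: ["FEV1"],
    publications: ["Example publication B"],
  },
];

const diabetesSummary = summarizeCondition(sampleTrials, "diabetes");
console.log(diabetesSummary.trialCount, diabetesSummary.interventions);
```

With the sample data, searching for "diabetes" matches the first two records and merges their interventions, outcomes, and publications without duplicates. At the scale of the full dataset, the same grouping would more plausibly be pushed into a database query than done in application memory.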
Future work includes cleaning the data further, to collapse interventions and outcomes that are effectively the same but differ slightly in their string values, and adding data from other sources such as PubMed. Because texts from those sources are not structured like the XML files from ClinicalTrials.gov, NLP techniques will be needed to extract the PICO (population, intervention, comparison, outcome) information from them. The interface will also continue to improve and gain features.
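One simple starting point for collapsing near-duplicate intervention and outcome strings is to group them by a normalized key. The sketch below is an assumption about how that cleaning step might begin, not a description of planned code: `normalizeLabel` and `groupLabels` are hypothetical names, and the rule (lowercase, replace runs of non-alphanumeric characters with a single space) only catches differences in case, punctuation, and spacing.

```typescript
// Build a comparison key: lowercase, turn every run of characters other than
// a-z and 0-9 into one space, then trim. Non-ASCII letters are dropped by
// this rule, which is a known limitation of the sketch.
function normalizeLabel(label: string): string {
  return label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, " ")
    .trim();
}

interface LabelGroup {
  canonical: string; // first spelling seen, used as the display label
  variants: string[]; // every distinct raw spelling mapped to this group
}

// Group raw labels that share a normalized key, in first-seen order.
function groupLabels(labels: string[]): LabelGroup[] {
  const groups: LabelGroup[] = [];
  const indexByKey: { [key: string]: number } = {};
  for (const raw of labels) {
    const key = normalizeLabel(raw);
    if (key === "") {
      continue; // skip labels with no ASCII letters or digits
    }
    if (Object.prototype.hasOwnProperty.call(indexByKey, key)) {
      const group = groups[indexByKey[key]];
      if (group.variants.indexOf(raw) === -1) {
        group.variants.push(raw);
      }
    } else {
      indexByKey[key] = groups.length;
      groups.push({ canonical: raw, variants: [raw] });
    }
  }
  return groups;
}

// Made-up example labels.
const exampleGroups = groupLabels([
  "Aspirin 81 mg",
  "aspirin 81 MG",
  "Aspirin  81 mg.",
  "Placebo",
  "placebo ",
]);
console.log(exampleGroups);
```

Here the three aspirin spellings fall into one group and the two placebo spellings into another. Labels that differ in wording rather than formatting (for example "81mg" versus "81 mg", or synonyms) are not merged by this rule and would need fuzzy matching or a controlled vocabulary.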