Evelina is a data scientist, machine learning researcher, and an avid conference speaker.
She currently does most of her programming in R and F#, and she received the Microsoft MVP award for her work in the F# community. She started out as a programmer but became interested in machine learning early on and completed a mathematics PhD at the University of Cambridge, developing new machine learning methods to analyze complex biomedical datasets.
After that, she worked on data analysis in cancer research, and she is now joining the Alan Turing Institute as a data scientist.
Data science is emerging as a hot topic across many areas in both industry and academia. In my research, I’m using machine learning methods to build mathematical models of cancer cell behaviours. But using today’s data science tools is hard – we waste a lot of time figuring out what format different CSV files use or how JSON or XML files are structured. Often, we need to switch between Python, Matlab, R and other tools to call functions that are missing elsewhere. And why do many programming languages used in data science lack tools that are standard in modern software engineering?
In this talk I’ll look at data science tools in F# and how they simplify the life of a modern scientist who relies heavily on data analytics. F# provides a unique way of integrating external data sources and tools into a single environment. This means that you can seamlessly access not only data, but also R statistical and visualization packages, without leaving that environment. Compile-time static checking and rich interactive tooling give you many of the standard tools known from software engineering, while keeping the exploratory nature of simple scripting languages.
Using examples from my own research in bioinformatics, I’ll show how to do data analysis in F# with various type providers and other tools available in the language.