Birth of the Data Science Team at Exploristics

By Exploristics CDSO, Kimberley Hacquoil

From the outset, Exploristics have embedded innovation into the fabric and culture of the company. Indeed, we invest more than one third of our revenue every year in research and development of new technology and statistical innovation, as evidenced by the unique tools and services that we offer.

One of these tools is our flagship product, KerusCloud, a state-of-the-art simulation and analytics software package which generates realistic clinical study simulations in a virtual environment. It allows multiple study parameters to be evaluated simultaneously, giving a better understanding of the complex interplay between factors that may affect study outcomes. The key drivers of study success can then be pinpointed in silico so that the best design and analysis approach can be selected for real studies.
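
To make the idea of evaluating several study parameters at once more concrete, here is a minimal sketch in Python. It is not KerusCloud code and does not reflect its interface. It assumes a simple two-arm trial with a normally distributed outcome, a known standard deviation and a two-sided z-test at the 5% level. The function names (`simulate_trial`, `estimate_power`, `power_grid`) and all parameter values are hypothetical and chosen purely for illustration.

```python
import random
import statistics
from itertools import product


def simulate_trial(n_per_arm, effect, sd, rng):
    """Simulate one two-arm trial; return True if the result is significant."""
    control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
    treated = [rng.gauss(effect, sd) for _ in range(n_per_arm)]
    diff = statistics.fmean(treated) - statistics.fmean(control)
    # Standard error of the difference in means, assuming a known SD.
    se = sd * (2.0 / n_per_arm) ** 0.5
    # Two-sided z-test at the 5% significance level.
    return abs(diff / se) > 1.96


def estimate_power(n_per_arm, effect, sd=1.0, n_sims=500, seed=0):
    """Estimate power as the share of simulated trials that are significant."""
    rng = random.Random(seed)
    hits = sum(simulate_trial(n_per_arm, effect, sd, rng) for _ in range(n_sims))
    return hits / n_sims


def power_grid(sample_sizes, effects, **kwargs):
    """Estimate power for every combination of sample size and effect size."""
    return {
        (n, e): estimate_power(n, e, **kwargs)
        for n, e in product(sample_sizes, effects)
    }


if __name__ == "__main__":
    # Illustrative grid only; these values are not taken from any real study.
    grid = power_grid(sample_sizes=[20, 50, 100], effects=[0.2, 0.5])
    for (n, e), power in sorted(grid.items()):
        print(f"n per arm = {n:>3}, effect = {e}: estimated power = {power:.2f}")
```

Scanning the resulting grid shows how sample size and assumed effect size interact, which is a much simplified version of the kind of interplay between study factors described above.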

A question of data?

As we started to showcase KerusCloud to potential customers, a common question arose: “But where do you get your assumptions for the simulations?” This is a very valid question, and often the answer was, “Multiple sources, wherever we can get reliable information to inform the clinical trial design”. But, being statisticians, we didn’t feel that this was a robust enough answer. It also identified an opportunity to do things better. From this, the Data Science team was born, with the aim of building a more structured process and a toolset which would facilitate the exploitation of external datasets and provide an extensive overview of evidence within a disease area. We were awarded two Innovate UK Smart Grants to turn these innovative ideas into a commercial reality.

So where do we get our assumptions from?

Exploristics utilise data from a wide range of sources, including clinical trial data repositories and registries, disease registries, historical in-house clinical trial and registry data, open-source data repositories, commercial clinical trial and -omics databases, and the scientific literature, with support from automated natural language processing (NLP) searches.

The Data Science team have built an evidence synthesis pipeline that brings together data, tools and expertise to construct Clinical Trial Disease Models, which provide disease-specific information including:

  • eligibility criteria
  • population subgroups
  • core endpoints (at baseline and across multiple timepoints)
  • a variety of interventions, including standard of care

The Disease Models form the basis for more realistic clinical trial simulation in KerusCloud and for generating virtual cohorts and synthetic data. They are part of a broader ecosystem that Exploristics is building to make better use of information and evidence in the design and analysis of clinical trials.
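
As a rough illustration of how a disease model of this kind might be represented and used to generate a virtual cohort, consider the following sketch in Python. It is an assumption-laden example, not the Exploristics implementation: the `DiseaseModel` and `Endpoint` classes, the `generate_virtual_cohort` function and every value in the example are hypothetical placeholders, and the sampling (uniform ages, independent normally distributed baseline endpoints) is deliberately simplistic.

```python
import random
from dataclasses import dataclass


@dataclass
class Endpoint:
    """A core endpoint, summarised here only by its baseline distribution."""
    name: str
    baseline_mean: float
    baseline_sd: float


@dataclass
class DiseaseModel:
    """Hypothetical container for the disease-specific information listed above."""
    disease: str
    eligibility: dict    # e.g. {"min_age": 18, "max_age": 75}
    subgroups: dict      # subgroup name -> assumed prevalence
    endpoints: list      # list of Endpoint objects
    interventions: list  # intervention names, including standard of care


def generate_virtual_cohort(model, n, seed=0):
    """Draw n virtual patients consistent with the model's assumptions."""
    rng = random.Random(seed)
    names = list(model.subgroups)
    weights = list(model.subgroups.values())
    cohort = []
    for patient_id in range(n):
        patient = {
            "id": patient_id,
            # Age is drawn uniformly within the eligibility window.
            "age": rng.randint(model.eligibility["min_age"],
                               model.eligibility["max_age"]),
            # Subgroup membership follows the assumed prevalences.
            "subgroup": rng.choices(names, weights=weights, k=1)[0],
        }
        for endpoint in model.endpoints:
            patient[endpoint.name] = rng.gauss(endpoint.baseline_mean,
                                               endpoint.baseline_sd)
        cohort.append(patient)
    return cohort


if __name__ == "__main__":
    # Placeholder values for illustration only; not real disease data.
    example = DiseaseModel(
        disease="example disease",
        eligibility={"min_age": 18, "max_age": 75},
        subgroups={"mild": 0.6, "severe": 0.4},
        endpoints=[Endpoint("symptom_score", 50.0, 10.0)],
        interventions=["standard of care", "investigational treatment"],
    )
    for patient in generate_virtual_cohort(example, n=3):
        print(patient)
```

In practice the assumptions in such a structure would come from the synthesised evidence rather than hand-entered placeholders, and realistic models would also need to capture correlations between endpoints and changes over multiple timepoints.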

Growing to be so much more

The team acquired in-depth expertise in sifting through disparate data sources, using a suite of software tools developed in-house to extract key information and create disease-specific data models for use in KerusCloud. From this, the wider application of these skills and tools became apparent. This powerful capability can be used to deliver a semi-automated pipeline of relevant data for customers seeking to answer more bespoke clinical questions.

The team can, for example, investigate specific endpoints, such as exploratory biomarkers, across disease areas and within pathologies. This helps trial design reflect aspects of treatment mechanism and treatment potential, regardless of disease area. The pipeline of current and emerging clinical information provided by our team can also deliver pertinent data insights to support decisions, clarify expectations regarding clinical trial endpoints or trial outcomes, and provide systematic reviews and evidence dossiers for regulatory interactions.

The Data Science team continue to expand, and they are now a critical part of the offering at Exploristics, integrating existing and emerging knowledge and data sources into study simulations. They provide the foundations for robust evidence synthesis and are key to the success of better-informed clinical trial design through in silico simulations.