PDA Letter Article

New Technology Meets Old Data Integrity Challenges

by Monica Cahilly, Green Mountain Quality Assurance, Peter Baker, Green Mountain Quality Assurance, and Kir Henrici, The Henrici Group

Big Data

The ecosystem of life science data has experienced a seismic shift. Industry 4.0, the Internet of Things and next generation intelligence have enabled unprecedented capabilities in using data to support product development, process excellence, compliance and innovation. We are now in a new era suffused with promise for health and well-being.

But as this big data revolution sends shockwaves through every corner of our industry, and inspired futurists flurry to adopt new technologies, build analytical models and tinker with artificial intelligence (AI), stewards of data integrity might find themselves reeling.

The renewed focus on established data integrity requirements has resulted from regulatory inspections identifying critical and, at times, fraudulent data integrity breaches. Organizations have rallied, tilting efforts toward remediation and process improvements, while ALCOA, standing for “attributable, legible, contemporaneous, original and accurate,” has become the abiding standard for ensuring the integrity of GMP data.

In this emerging landscape, the industry can be divided into proactive and reactive players, further subdivided by size and technology, with, unfortunately, the occasional miscreant, leaving data integrity champions with their hands full. Regulators, experts and quality assurance (QA) leaders all wrestle with creating cohesive rules, assessment strategies, communications and training to enable data integrity. As manufacturing processes have become more complex due to the range of paper-based and automated systems, hybrid systems and complex data acquisition models, new perspectives are desperately needed.

QA Meets IT: A Success Story?

Some companies have turned to ITQA functions to ensure the integrity of data managed by computerized systems. In theory, enabling the IT department to manage data integrity compliance in an increasingly digitalized environment would better ensure the integrity of computerized data. Yet manufacturers are finding that ITQA departments lack the GMP expertise necessary to assess the data lifecycle and its impact on product quality and patient safety.

Compliance gaps and breaches continue to occur. In the best of cases, traditional QA departments learn computer system validation principles on the fly through regulatory and industry guidance documents. In the worst of cases, QA finds itself assessing product quality in response to a breach. In either scenario, it can be argued that innovation has outpaced the rules.

Meanwhile, big keeps getting bigger.

In the life sciences, data is being generated at an exponential pace. Wearables, sensors, smartphone wellness applications, smart manufacturing, digital technologies and software programs have enabled data exchange, creating a borderless data ecosystem in the cloud. Data is now an asset. In the big data revolution, this burgeoning, disparate and, at times, ominous collection of data is just the beginning. The value is in the intelligence, or analytics. Data modelling, algorithms and deep learning will drive next generation life science innovation.

Catching Up to New Technology

From the perspective of data integrity compliance, the situation is thorny. The big data revolution has been the stomping ground of IT, engineers and computational scientists. For QA, knowledge and experience of big data and AI have been thin. QA can no longer simply rely on an ALCOA checklist. “Wranglers” of big data repositories must manipulate data for usability. When software platforms perform the “wrangling,” the programs learn from the decision-making of the user and complete the “manipulations” independently. Such manipulations, deletions in particular, may not be visible to QA.

Although data lineage is a data quality attribute among data scientists, and in theory, connects the output to its raw data point, when it comes to big data, this pathway is circuitous, and data may be “scrubbed” prior to review. “Scrubbing,” i.e., well-intentioned decisions to delete information considered erroneous, may itself introduce risks.
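One way to picture the visibility problem is a scrubbing step that keeps a record of what it removes. The following is a minimal Python sketch under stated assumptions, not a description of any particular platform: the record layout (`id`, `value`), the function name `scrub_with_audit`, the `AuditEntry` fields and the sample readings are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One removed record, preserved so a reviewer can see what was deleted."""
    record_id: str
    original_value: float
    reason: str
    removed_by: str
    removed_at: str  # ISO 8601 timestamp in UTC


def scrub_with_audit(records, is_erroneous, reason, user):
    """Split records into kept data and an audit trail of removed data.

    Nothing is silently discarded: every record judged erroneous is
    copied into the audit trail with who removed it, when, and why.
    """
    kept = []
    audit_trail = []
    for record in records:
        if is_erroneous(record):
            audit_trail.append(AuditEntry(
                record_id=record["id"],
                original_value=record["value"],
                reason=reason,
                removed_by=user,
                removed_at=datetime.now(timezone.utc).isoformat(),
            ))
        else:
            kept.append(record)
    return kept, audit_trail


# Hypothetical sensor readings; -999.0 stands in for a fault code.
readings = [
    {"id": "S-001", "value": 7.1},
    {"id": "S-002", "value": -999.0},
    {"id": "S-003", "value": 7.3},
]

kept, audit_trail = scrub_with_audit(
    readings,
    is_erroneous=lambda r: r["value"] < 0,
    reason="negative reading treated as sensor fault",
    user="analyst01",
)
```

In this sketch the kept records and the audit trail together always account for every input record, so a reviewer can trace an output back to its raw data and question any deletion. Whether a given wrangling tool exposes comparable information is something QA would need to confirm for each system.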

Another challenge is the large number of algorithms, including some that operate without human intervention and do not afford visibility into the decision-making process.

All of these challenges fuel a call to action. This is an opportunity (and some might say responsibility) to pioneer next generation data integrity. Champions of data integrity can swap shy for savvy and win the interdisciplinary skills necessary to enhance the QA paradigm and enable agile compliance in tenor with big data and AI disruption, thus empowering innovation.

And now for some good news.

Global regulators are recognizing the profound benefits of big data and AI, and the subsequent need for regulatory change. The U.S. FDA, for example, has been a leader and early adopter with programs such as the Sentinel Initiative for postmarketing surveillance of big data datasets generated by health insurers, and its INFORMED initiative aimed at implementing big data analytics to inform oncological regulatory science. The Agency is also data mining via AI tools and methodology to detect “signals” across internal databases and collect safety data. These initiatives offer insight into FDA’s thinking in terms of compliance, for example, the forming of multidisciplinary teams, the need to standardize data and analytical models and the value of data sharing. According to former FDA Commissioner Scott Gottlieb, in an FDA Voices blog post from August 2018, FDA is actively “developing a new regulatory framework to promote innovation in this space and support the use of AI-based technologies” (1). Less than a year later, FDA released a discussion paper and request for feedback, proposing a regulatory framework for changes in algorithms that may impact technological efficacy and patient safety (2).

Additionally, EMA and the Heads of Medicines Agencies (HMA) have established a joint task force to “investigate the potential role of ‘big data’ in the context of medicines development and regulation in the European Union” (3). In February 2019, the task force published the HMA-EMA Joint Big Data Taskforce Summary Report, a seminal body of work that acknowledges that in regard to big data and AI the “uncertainties about the quality of the data, the models and the level of quality management used undermine the confidence in the validity and reliability of the evidence generated” (4). The good news is the summary report is conclusive on the benefits of big data and AI, offering robust learnings, recommendations and strategic objectives to support a roadmap.

As global regulators continue to cite data integrity issues during GMP inspections, the industry can expect to see further requirements that address data integrity for big data technologies. Recent FDA Warning Letters evidence that the Agency maintains little tolerance for intentional or unintentional breaches in data integrity. We can expect that FDA will retain this mode of thinking as applied to emerging big data technologies, considering the Agency’s overall mission to both protect and promote public health.

We are privileged, as data integrity pioneers in spirited collaboration with our industry partners, to support global regulators in adopting big data and AI for innovative healthcare, in commitment to the lives and well-being of patients.

References

  1. Gottlieb, S. “FDA’s Comprehensive Effort to Advance New Innovations: Initiatives to Modernize for Innovation.” FDA Voices (Aug. 29, 2018) https://www.fda.gov/news-events/fda-voices-perspectives-fda-experts/fdas-comprehensive-effort-advance-new-innovations-initiatives-modernize-innovation
  2. “Artificial Intelligence and Machine Learning in Software as a Medical Device.” U.S. FDA, April 2, 2019 https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-software-medical-device
  3. “Big data.” European Medicines Agency website. www.ema.europa.eu. (accessed June 4, 2019) https://www.ema.europa.eu/en/about-us/how-we-work/big-data
  4. HMA-EMA Joint Big Data Taskforce Summary Report. Feb. 13, 2019. https://www.ema.europa.eu/en/documents/minutes/hma/ema-joint-taskforce-big-data-summary-report_en.pdf