From data chaos to breakthroughs

The role of artificial intelligence in identifying drug targets and unmet medical needs.
by Gillian Hertlein

Introduction

In the fast-paced world of pharmaceutical research and development, finding unmet needs and identifying targets is like the proverbial search for a needle in a haystack. The search for breakthrough treatments requires a deep understanding of disease, a trained eye for innovative solutions, and a firm commitment to improving patient outcomes. However, the sheer volume of data and the complexities involved have made this task challenging for scientists and researchers.

With its ability to analyze large amounts of data, identify hidden patterns, and make intelligent predictions, artificial intelligence (AI) has taken on a revolutionary role in the pharmaceutical industry. Through machine learning algorithms and advanced analytics, AI is transforming the entire drug development process. In this series of articles, we will look at practical applications of machine learning in the pharmaceutical industry, starting with this first part, which focuses on the first phase: identifying unmet needs and targets.

Overview of the pharmaceutical value chain

Figure 1: Pharmaceutical value chain

The estimated cost of developing an approved drug or therapy is $9 billion, with a development time of over ten years. Behind each approved drug are 5 million active ingredients that failed along the way. This long and intensive research consumes not only materials and money but also a significant amount of working life: an army of researchers puts over 200,000 hours of their time into the process. A cornerstone of drug development is identifying an unmet need and the optimal target for it, one that will endure through the estimated 10-year development period and generate sufficient revenue. Once the unmet need and the drug development strategy are defined, the process begins with early research and development (compound identification, validation, optimization, and toxicology screening), followed by clinical development, regulation, reimbursement, manufacturing, distribution, and commercialization. Across the value chain, AI/ML is expected to create significant value by saving costs and time and by increasing the probability of success (POS). Although quantifying this value is challenging, the firm belief of major pharmaceutical companies and investors has resulted in a total of $60.2 billion invested in companies applying AI to drug development (Deep Pharma Intelligence).

Challenges in identifying unmet needs

Pharmaceutical companies face a multitude of complexities as they navigate vast amounts of data in search of breakthrough treatments. Selecting the right target and indication is the cornerstone of treatment development. It is critical to success because it determines everything else: the obstacles during development, the costs of clinical development, the risk from competitors, and the limits on potential return on investment. Making this choice means analyzing the status quo, understanding the reasons for current shortcomings, and predicting future developments in the field. Although a wealth of information is available in scientific publications, patents, clinical trial data, and real-world evidence, traditional manual approaches cannot use it efficiently. Many commercial databases offer information and even analysis on portions of the data, but technical hurdles such as limited batch downloads and licensing restrictions prevent automated data linkage and machine-learning analysis of the full data. As a result, an analyst's path is paved with copy/paste and customized solutions with high manual overhead. The main challenges are:

  1. Data, data, data
    The amount of data available to researchers has increased dramatically in recent years. The scientific literature alone is expanding so rapidly that it is nearly impossible for human researchers to keep up with the influx of new information. In addition, data extends beyond published papers to patents, clinical trial results, electronic patient records, conference presentations, and more. Results that should be published are often kept opaque or made difficult to find in order to protect intellectual property or avoid unfavorable publicity.
  2. Information silos
    Data is spread across multiple sources and formats, and the download and licensing restrictions of commercial databases keep those sources apart. This fragmentation creates "islands of information" that hinder seamless integration and comprehensive analysis. In addition, researchers often struggle to continuously find, monitor, integrate, and validate information from newer data sources such as Twitter and LinkedIn, which prevents or delays knowledge generation.
  3. Limited scalability
    Manual methods for analyzing large data sets are time-consuming and resource-intensive, and they cannot keep up with the exponential growth of data. One current workaround is to rely on key opinion leaders, who are assumed to have an overview based on their network and experience. However, their view can be influenced by personal beliefs, and information can easily be overlooked. Researchers need faster and more innovative approaches to work through massive data sets, identify patterns, and make data-based decisions.

The role of AI in addressing the challenges

AI is being used to address these challenges. With its ability to process and analyze large volumes of structured and unstructured data, AI offers clear advantages in the identification phase of the pharmaceutical value chain.

AI-driven solutions

  1. Mastering the data
    AI has the ability to harness, process, and analyze the vast amount of multimodal data available to researchers. Unlike manual methods, AI-driven systems can efficiently link and extract relevant information, identify key insights, and summarize large amounts of data in a very short time.

  2. Bringing data together
    AI can help break down the islands of information that hinder seamless integration and comprehensive data analysis. Data platforms can provide researchers with valuable information. However, researchers often face up to 10 different data platforms plus additional scattered and poorly linked data sources. The result is a puzzle that has to be assembled from the researcher's initial knowledge, so the quality of the outcome depends heavily on the time spent assembling it and on that initial knowledge base. AI systems can link data from different sources and fill gaps between knowledge domains.

  3. Improved scalability
    AI-driven systems can quickly process huge datasets, identify relevant information, and filter out noise. This enables researchers to explore a broader range of possibilities within shorter timeframes. The scalability of AI enables researchers to analyze large data sets and keep pace with the exponential growth of information.

  4. Pattern recognition
    One of the most remarkable capabilities of AI is its ability to identify patterns, correlations, and potential relationships in large data sets. Using machine learning, it can detect subtle and non-obvious connections and uncover relationships that were previously hidden.

Data integration in pharmaceutical research

In the ever-evolving landscape of pharmaceutical research, access to and efficient integration of data is becoming increasingly critical. Companies in the pharmaceutical industry are striving for FAIR (Findable, Accessible, Interoperable, Reusable) data to overcome data islands and improve findability and accessibility.

One of the main goals of data integration is to automate the process of exporting and linking data, resulting in improved interoperability and reusability. By seamlessly connecting internal and external data sources, analysts can generate comprehensive summaries and leverage existing knowledge in both areas. Data platforms play a critical role in accelerating the linking process and enable efficient exploration of multiple data sources.
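The linking step described above can be sketched in a few lines. This is a minimal illustration, not a description of any particular data platform: the record layout, the `normalize` rule, and all records are made up for the example, and real pipelines would map names to curated identifiers rather than cleaned-up strings.

```python
from collections import defaultdict


def normalize(symbol: str) -> str:
    """Normalize a target symbol so records from different sources can match.

    Assumption: stripping whitespace, upper-casing and dropping hyphens is
    enough for this toy data. Real data needs a curated identifier mapping.
    """
    return symbol.strip().upper().replace("-", "")


def link_records(internal, external):
    """Group internal and external records under one normalized target key."""
    linked = defaultdict(lambda: {"internal": [], "external": []})
    for rec in internal:
        linked[normalize(rec["target"])]["internal"].append(rec)
    for rec in external:
        linked[normalize(rec["target"])]["external"].append(rec)
    return dict(linked)


# Hypothetical records, invented for illustration only.
internal_assays = [
    {"target": "egfr", "assay": "binding", "result": "active"},
    {"target": "KRAS", "assay": "binding", "result": "inactive"},
]
external_publications = [
    {"target": "EGFR ", "source": "publication", "id": "pub-001"},
    {"target": "TP-53", "source": "patent", "id": "pat-007"},
]

linked = link_records(internal_assays, external_publications)

# Targets with both internal and external evidence, and targets that only
# appear in external sources (candidate gaps in internal knowledge).
covered_by_both = sorted(k for k, v in linked.items() if v["internal"] and v["external"])
external_only = sorted(k for k, v in linked.items() if not v["internal"])

print("internal + external:", covered_by_both)
print("external only:", external_only)
```

Grouping everything under one key per target is what makes the later steps (summaries, gap analysis) a simple lookup instead of repeated manual cross-referencing.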

The strength of data integration lies in its ability to enable multimodal analyses. For example, by combining internal tandem mass spectrometry data with public protein databases, important insights into active sites and molecular interactions can be gained. These synergistic analyses enable a better understanding of drug targets and support targeted research efforts.

Data integration also makes it easier to predict future research trends from historical indication and target information. By tracking the growing number of articles and publications in a given area, analysts can anticipate emerging research areas and adjust their strategies accordingly. Automatically linking this information to key opinion leaders in the industry further fosters collaboration and knowledge sharing, driving progress in pharmaceutical research.
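As a rough illustration of tracking publication growth, the sketch below fits a least-squares line to yearly publication counts and ranks topics by slope. The topic names and counts are invented, and a plain linear slope is only one simple choice of trend measure; it is an assumption of this example, not a method named in the article.

```python
def trend_slope(counts_by_year):
    """Least-squares slope of publication count versus year.

    The result is in publications per year; positive means growing interest.
    """
    years = sorted(counts_by_year)
    if len(years) < 2:
        raise ValueError("need counts for at least two years")
    n = len(years)
    mean_year = sum(years) / n
    mean_count = sum(counts_by_year[y] for y in years) / n
    numerator = sum((y - mean_year) * (counts_by_year[y] - mean_count) for y in years)
    denominator = sum((y - mean_year) ** 2 for y in years)
    return numerator / denominator


def rank_emerging(topics):
    """Order topic names from fastest-growing to fastest-shrinking."""
    return sorted(topics, key=lambda name: trend_slope(topics[name]), reverse=True)


# Hypothetical yearly publication counts, invented for illustration only.
topics = {
    "target_a": {2019: 10, 2020: 20, 2021: 30, 2022: 40},
    "target_b": {2019: 50, 2020: 50, 2021: 50, 2022: 50},
    "target_c": {2019: 40, 2020: 35, 2021: 30, 2022: 25},
}

for name in rank_emerging(topics):
    print(name, trend_slope(topics[name]))
```

Note that a topic with few publications but a steep slope ranks above an established topic with flat output, which is the behavior one wants when looking for emerging areas rather than large ones.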

In summary, data integration is revolutionizing pharmaceutical research by improving the discoverability, accessibility and interoperability of critical information. By harnessing the power of data platforms and advanced analytics, companies can gain new insights, accelerate the drug discovery process, and ultimately contribute to the development of innovative therapies for unmet medical needs.

Outlook and conclusion

The future of AI in identifying unmet needs and targets in the pharmaceutical industry is promising and transformative. As AI continues to evolve and mature, opportunities exist to further enhance its capabilities and revolutionize drug discovery and development.

Technology-enabled drug developers such as Exscientia, Insitro, Evotec, and Recursion are at the forefront of integrating AI into the entire drug development process, from analyzing unmet needs and identifying targets to generating drug candidates, optimizing compounds, toxicology screening, and clinical development. Their goal is to redefine drug development by designing data generation for AI applications, rather than integrating AI solutions into existing approaches. The hope is to narrow the funnel quickly by identifying failures early in the research cycle, when they are relatively inexpensive, rather than limiting the scope to the best-known approaches from the outset.

In particular, large language models (LLMs) can help increase the speed and efficiency of identifying and tracking unmet needs. They enable researchers to extract relevant data faster and to link data together without programming skills. Ferma.ai, one of the first commercial applications to integrate LLMs for broader pharmaceutical and research use, is already creating buzz by letting researchers query conference abstracts and publications in natural language and answer complex questions with referenced knowledge.
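The idea of answering questions "with referenced knowledge" rests on a retrieval step that returns the source of each passage alongside its content. The sketch below shows only that step, under strong simplifying assumptions: word overlap stands in for the semantic search an LLM-based tool would use, the abstracts and their IDs are invented, and nothing here reflects how Ferma.ai or any other product is actually built.

```python
import re

# Small stopword list, chosen for this toy example only.
STOPWORDS = {"a", "an", "the", "in", "of", "for", "and"}


def tokenize(text):
    """Lower-case a text and return its set of non-stopword tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def retrieve(question, documents, top_k=2):
    """Return (doc_id, score) pairs ranked by word overlap with the question.

    Keeping the doc_id with every hit is what lets a downstream answer cite
    the abstract or publication it was drawn from.
    """
    question_tokens = tokenize(question)
    scored = []
    for doc_id, text in documents.items():
        overlap = len(question_tokens & tokenize(text))
        if overlap:
            scored.append((doc_id, overlap))
    # Highest score first; ties broken by doc_id for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical abstracts, invented for illustration only.
documents = {
    "abstract-01": "Phase 2 trial results for a kinase inhibitor in lung cancer",
    "abstract-02": "Survey of unmet needs in rare metabolic disease",
    "abstract-03": "Kinase inhibitor resistance mechanisms in lung cancer patients",
}

hits = retrieve("kinase inhibitor resistance in lung cancer", documents)
for doc_id, score in hits:
    print(doc_id, score)
```

In a full system, the retrieved passages and their IDs would be passed to a language model, which is asked to answer using only those passages and to cite the IDs it relied on.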

In summary, AI has already demonstrated its value in identifying unmet needs and ideal targets. Researchers can use AI algorithms and data platforms to navigate large data sets efficiently, identify patterns, predict drug targets, and repurpose existing drugs. AI brings efficiency, scalability, and accuracy to the identification phase, addressing the challenges of data overload, information islands, and limited scalability. To harness this potential, it is important to continue to invest in research, foster collaboration, and ensure the ethical and responsible use of AI technologies.

Stay tuned for part two!


