ToxPipe Logo

ToxPipe: Semi-autonomous AI integration of diverse toxicological data streams

🗂️ Table of Contents
  1. What is ToxPipe?
  2. Approach
  3. System Architecture
  4. Deployment
  5. Useful Links
  6. Repo Structure
  7. 🛠️ Built With
  8. Funding Sources

🤔 What is ToxPipe?

ToxPipe aims to explore the use of expert-entrained, AI-based systems for the rapid analysis and interpretation of the toxicological properties of various compounds. By leveraging semi-autonomous AI systems, ToxPipe will enable scientists and toxicologists to explore diverse types of toxicologically relevant data through natural language instructions. Further, through expert entrainment, ToxPipe will generate context that acts as a guide to novel, contemporary data streams that were previously challenging to access and integrate into toxicological characterization.

ToxPipe is meant to be a platform for interacting with various toxicological data streams. It comprises multiple components and, like any agentic retrieval-augmented generation (RAG) system, requires managing agents, state, prompts, database connections, APIs, and other systems.

↑ Back to Top ↑

Approach

Large language models (LLMs), such as OpenAI’s GPT-based models, can be used to solve complicated tasks with natural language as a generic interface. By using techniques like retrieval augmented generation (RAG), LLMs can be given a set of instructions and can (semi-)autonomously explore various data sources. The LLMs will then generate responses or interpretations based on information stored inside the models along with the contextual data retrieved through RAG.
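The retrieve-then-generate flow described above can be illustrated with a minimal sketch. It is not ToxPipe's implementation: the function names, the keyword-overlap scoring, and the placeholder records are all assumptions made for illustration, and the final call to an LLM is deliberately left out. A real system would typically use embedding-based search rather than token overlap.

```python
import re
from collections import Counter

# Hypothetical placeholder records; these are NOT real toxicological data.
DOCS = [
    "Placeholder record: compound alpha was active in a liver cytotoxicity assay.",
    "Placeholder record: compound beta has no assay results on file.",
    "Placeholder record: the cafeteria menu changes on Mondays.",
]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, document: str) -> int:
    """Count occurrences in the document of each distinct query token."""
    query_tokens = set(tokenize(query))
    doc_counts = Counter(tokenize(document))
    return sum(doc_counts[token] for token in query_tokens)


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return up to k documents with a non-zero overlap score, best first."""
    ranked = sorted(documents, key=lambda d: score(query, d), reverse=True)
    return [d for d in ranked[:k] if score(query, d) > 0]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the text that would be sent to an LLM (the call itself is omitted)."""
    context_block = "\n".join(f"- {snippet}" for snippet in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}\n"
    )


if __name__ == "__main__":
    question = "Which compound was active in a cytotoxicity assay?"
    print(build_prompt(question, retrieve(question, DOCS)))
```

The prompt produced by `build_prompt` is where retrieved context meets the model's stored knowledge: the model sees both the question and the snippets that the retrieval step selected.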

ToxPipe aims to repurpose (semi-)autonomous AI agents for AI-augmented exploration of existing toxicological data and literature. Some of the tasks that we believe are possible with autonomous agents and RAG are:

  • Generation of toxicological narratives with deep explanatory context
  • Analysis of chemical structure
  • Analysis of biological assay results
  • Summarization of journal abstracts
  • Biological database exploration using text-to-SQL AI models
  • A variety of other tasks that currently require large amounts of human time and labor.
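As one illustration of the text-to-SQL item in the list above, the sketch below shows a conservative guard that a system might place between a model's generated SQL and the database. Everything here is assumed for the example: the table name `assay_results`, its columns, the placeholder rows, and the hard-coded query standing in for model output. None of it reflects ToxPipe's actual schema or code.

```python
import sqlite3


def is_single_select(sql: str) -> bool:
    """Accept only one statement that begins with SELECT.

    This check is deliberately conservative: it also rejects a valid
    query that contains a semicolon inside a string literal.
    """
    statement = sql.strip().rstrip(";").strip()
    return statement.lower().startswith("select") and ";" not in statement


def run_generated_query(conn: sqlite3.Connection, sql: str) -> list[tuple]:
    """Run model-generated SQL only if it passes the read-only guard."""
    if not is_single_select(sql):
        raise ValueError("Only a single SELECT statement is allowed")
    return conn.execute(sql).fetchall()


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical table and placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE assay_results (compound TEXT, assay TEXT, active INTEGER)"
    )
    conn.executemany(
        "INSERT INTO assay_results VALUES (?, ?, ?)",
        [("compound_alpha", "assay_1", 1), ("compound_beta", "assay_1", 0)],
    )
    return conn


if __name__ == "__main__":
    # Stand-in for the SQL a text-to-SQL model might return.
    generated_sql = "SELECT compound FROM assay_results WHERE active = 1;"
    print(run_generated_query(make_demo_db(), generated_sql))
```

A string-prefix check like this is only a first line of defense. Connecting with a database account that has read-only permissions is a sturdier safeguard for the same goal.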

Offloading these tasks to ToxPipe would allow toxicologists to redirect their time toward the higher-level cognitive work of directing the AI toward specific outputs.

↑ Back to Top ↑

System Architecture

The following diagram shows the overall structure of ToxPipe. This structure is subject to change as the project develops.

ToxPipe Overview

Architecture documentation is in docs/architecture. Stack decisions, along with the reasoning behind them, are documented in docs/decisions.

↑ Back to Top ↑

Deployment

Deployment information is contained in docs/deployment.

↑ Back to Top ↑

Related Repositories

↑ Back to Top ↑

Repo Structure

  • docs: Documentation and guides
  • examples: Example code and vignettes for common use cases
  • src: Source code for the backend and frontend components of ToxPipe
    • web: Containers (Docker) and configuration for ToxPipe's constituent services (LibreChat, LiteLLM, Langflow, Langfuse, Ollama, etc.)
    • toxpipe-api: Source code for the ToxPipe FastAPI web API and RAG/literature search features

↑ Back to Top ↑

🛠️ Built With

FastAPI Badge

↑ Back to Top ↑

Funding Sources

This work was funded by the National Institutes of Health (NIH) under the following grants:

About

Series of tools used for LLM interaction with toxicological data.
