FANTASIA Deployment Guide¶
This guide provides a step-by-step process for deploying FANTASIA locally.
Prerequisites¶
Before proceeding, ensure you have the following dependencies installed:
System Requirements¶
Operating System: Linux (Ubuntu recommended)
Python: Version 3.10 or higher
Docker: Installed and running. If not installed, follow the Docker installation guide and the post-installation steps to run Docker without sudo.
CD-HIT: Must be installed and available in the system PATH. You can install it from your package manager (e.g., sudo apt install cd-hit) or compile it from source at the [CD-HIT website](http://weizhong-lab.ucsd.edu/cd-hit/).
Machine Learning Dependencies¶
NVIDIA Driver: Version 550.120 or newer (verify using nvidia-smi).
CUDA: Version 12.4 or newer (verify using nvcc --version).
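The requirements listed so far can be checked in one pass. The following is a minimal sketch, not part of FANTASIA: the helper names have_cmd and check_tools are hypothetical, and it assumes the CD-HIT binary is installed as cd-hit. It only reports whether each tool is on the PATH; versions still need to be checked with the commands given above.

```shell
#!/usr/bin/env bash
# Report which prerequisite tools are on the PATH.
# Assumption: the CD-HIT executable is named "cd-hit".

# have_cmd NAME: succeeds if NAME resolves to a command in the current shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# check_tools NAME...: prints one "ok" or "MISSING" line per tool and
# returns the number of missing tools as its exit status.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if have_cmd "$tool"; then
            printf 'ok       %s\n' "$tool"
        else
            printf 'MISSING  %s\n' "$tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

check_tools python3 docker cd-hit nvidia-smi nvcc \
    || echo "Install the tools marked MISSING before continuing."
```

A non-zero exit status from check_tools equals the number of missing tools, so the function can also be used as a gate in a setup script.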
Database Dependencies¶
PostgreSQL Client: Version 16 or later, required to restore database backups without compatibility issues.
Warning
🚨 Important for Ubuntu 22.04 and older 🚨
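PostgreSQL 16 is not available in the default repositories for Ubuntu 22.04 and earlier. If you try to restore a backup using the older pg_restore that those repositories provide, you may encounter incompatibility issues. In that case, install the PostgreSQL 16 client from another source, such as the official PostgreSQL apt repository.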
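To see which client version is installed, the major version can be read from the output of pg_restore --version. This is a sketch under assumptions: pg_major is a hypothetical helper, and the commented commands at the end follow the PostgreSQL project's apt repository instructions, which may need adjusting for your release.

```shell
#!/usr/bin/env bash
# pg_major "VERSION STRING": extracts the major version number from
# output such as "pg_restore (PostgreSQL) 16.4".
pg_major() {
    printf '%s\n' "$1" | sed -n 's/.*(PostgreSQL) \([0-9][0-9]*\).*/\1/p'
}

if command -v pg_restore >/dev/null 2>&1; then
    major=$(pg_major "$(pg_restore --version)")
    if [ "${major:-0}" -ge 16 ]; then
        echo "pg_restore major version $major: OK"
    else
        echo "pg_restore major version ${major:-unknown} is older than 16"
    fi
else
    echo "pg_restore not found"
fi

# One possible fix on Ubuntu 22.04 and older (assumption: you are willing to
# add the PostgreSQL apt repository). Uncomment to run:
# sudo apt install -y postgresql-common
# sudo /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh
# sudo apt install -y postgresql-client-16
```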
Python Environment¶
Poetry: Used for dependency management.
curl -sSL https://install.python-poetry.org | python3 -
export PATH="$HOME/.local/bin:$PATH"
source ~/.bashrc  # or source ~/.zshrc
Cloning the Repository¶
Clone the repository and navigate into the project directory:
git clone https://github.com/CBBIO/FANTASIA.git
cd FANTASIA
Creating and Activating the Virtual Environment¶
Use Poetry to manage the virtual environment. Follow these steps:
Ensure Poetry is installed and up to date:
poetry self update
If using Poetry 2.0 or later, where the shell command is no longer built in, install the shell plugin:
poetry self add poetry-plugin-shell
Create and activate the virtual environment:
poetry env use <python_version>  # Specify the desired Python version (e.g., 3.12)
poetry install
eval $(poetry env activate)  # poetry env activate only prints the activation command; eval runs it
Note
If using Conda, avoid managing environments with both Poetry and Conda simultaneously to prevent dependency conflicts.
We recommend using PyCharm for development due to its seamless integration with Poetry, making environment management and package handling more intuitive.
Starting Required Services¶
Start PostgreSQL (with the pgvector extension) and RabbitMQ as Docker containers:
docker run -d --name pgvectorsql \
-e POSTGRES_USER=usuario \
-e POSTGRES_PASSWORD=clave \
-e POSTGRES_DB=BioData \
-p 5432:5432 \
pgvector/pgvector:pg16
docker run -d --name rabbitmq \
-p 15672:15672 \
-p 5672:5672 \
rabbitmq:management
You can access the RabbitMQ management interface at:
http://localhost:15672
(Default credentials: guest/guest).
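Both containers need a few seconds before they accept connections. The sketch below polls them until they respond. The retry helper is hypothetical; pg_isready and rabbitmq-diagnostics are the tools shipped inside the respective images, and the container names, user, and database are the ones used in the docker run commands above. The attempt count and delay are arbitrary choices.

```shell
#!/usr/bin/env bash
# retry ATTEMPTS DELAY CMD...: runs CMD until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds between tries. Fails if CMD never succeeds.
retry() {
    local attempts=$1 delay=$2 i=1
    shift 2
    while ! "$@"; do
        if [ "$i" -ge "$attempts" ]; then
            return 1
        fi
        i=$((i + 1))
        sleep "$delay"
    done
}

# Only poll containers that actually exist on this machine.
if command -v docker >/dev/null 2>&1; then
    if docker container inspect pgvectorsql >/dev/null 2>&1; then
        retry 10 2 docker exec pgvectorsql pg_isready -U usuario -d BioData \
            || echo "PostgreSQL is not ready"
    fi
    if docker container inspect rabbitmq >/dev/null 2>&1; then
        retry 10 2 docker exec rabbitmq rabbitmq-diagnostics -q ping \
            || echo "RabbitMQ is not ready"
    fi
fi
```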
Configuration¶
Before proceeding, create the necessary directories with proper permissions:
mkdir -p ~/fantasia/dumps ~/fantasia/embeddings ~/fantasia/results ~/fantasia/redundancy
chmod -R 755 ~/fantasia
Ensure the following parameters in fantasia/config.yaml match the values used when starting the containers:
DB_USERNAME: usuario
DB_PASSWORD: clave
DB_HOST: pgvectorsql
DB_PORT: 5432
DB_NAME: BioData
rabbitmq_host: rabbitmq
rabbitmq_user: guest
rabbitmq_password: guest
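A quick way to confirm that none of these parameters is missing is to search the file for each key. This is a sketch only: missing_keys is a hypothetical helper, and it assumes each parameter appears as an unindented "KEY: value" line. Note also that container names such as pgvectorsql and rabbitmq resolve only where Docker's name resolution applies (for example, from another container on the same user-defined network); if FANTASIA runs directly on the host and connects through the published ports, localhost may be the value that works.

```shell
#!/usr/bin/env bash
# missing_keys FILE KEY...: prints every KEY that has no "KEY:" entry
# at the start of a line in FILE.
missing_keys() {
    local file=$1 key
    shift
    for key in "$@"; do
        grep -q "^${key}:" "$file" || printf '%s\n' "$key"
    done
}

config=./fantasia/config.yaml
if [ -f "$config" ]; then
    missing=$(missing_keys "$config" \
        DB_USERNAME DB_PASSWORD DB_HOST DB_PORT DB_NAME \
        rabbitmq_host rabbitmq_user rabbitmq_password)
    if [ -n "$missing" ]; then
        echo "Missing from $config:"
        echo "$missing"
    else
        echo "All expected parameters are present in $config"
    fi
else
    echo "$config not found; run this from the repository root"
fi
```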
Initialization¶
Download embeddings and initialize the database:
python fantasia/main.py initialize --config ./fantasia/config.yaml
Verify that the embeddings are loaded into:
The directory specified in base_directory.
The configured PostgreSQL database.
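Both checks can be done from the shell. The sketch below makes two assumptions that are not confirmed by this guide: that base_directory points to ~/fantasia (the directory created in the Configuration step), and that listing the tables with psql is enough to see whether initialization populated the database. The dir_has_files helper is hypothetical.

```shell
#!/usr/bin/env bash
# dir_has_files DIR: succeeds if DIR exists and contains at least one
# regular file at any depth.
dir_has_files() {
    [ -d "$1" ] && [ -n "$(find "$1" -type f 2>/dev/null | head -n 1)" ]
}

base_dir="$HOME/fantasia"   # assumption: the value of base_directory

if dir_has_files "$base_dir/embeddings"; then
    echo "Files found under $base_dir/embeddings"
else
    echo "No files under $base_dir/embeddings"
fi

# List the tables in the database, using the container name and
# credentials from the docker run command in this guide.
if command -v docker >/dev/null 2>&1 \
    && docker container inspect pgvectorsql >/dev/null 2>&1; then
    docker exec pgvectorsql psql -U usuario -d BioData -c '\dt' \
        || echo "Could not query the BioData database"
fi
```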
Running the Pipeline¶
To list the available commands and options, run:
fantasia --help