Project

The whole project was managed through GitHub:

  • project plan + roadmap
  • issues

We used a flavor of Extreme Programming (XP) to enhance collaboration and communication: three meetings per week, including dedicated one-to-one meetings when necessary.

We focused on short delivery cycles (more than five deliveries per week) to get rapid feedback and to keep motivation up!

Note

See the GitHub statistics

The following sections summarize how the project was set up.

Project Setup and Environment Configuration

Tasks

  • Set up the development environment (Docker, virtual environments, etc.).
  • Install necessary libraries and frameworks (Airflow, MLflow, etc.).
  • Set up GitHub repository for version control.

Deliverables

  • Dockerfile and docker-compose.yml for each service
  • Environment configuration files
  • Initial commit to GitHub

Data Collection and Preprocessing

Tasks

  • Create Airflow DAG for data extraction from Kraken.
  • Preprocess the data (cleaning, normalization, etc.).
  • Store processed data in PostgreSQL database.

Deliverables

  • Airflow DAG script for data collection
  • Data preprocessing Python scripts
  • PostgreSQL schema and data
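
As an illustration of the cleaning and normalization step, here is a minimal sketch in plain Python. It assumes the extracted data has already been reduced to `(timestamp, price)` rows; the function names `clean_prices` and `min_max_scale` are hypothetical and are not taken from the project code, which may use different rules or a library scaler.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def clean_prices(rows: Iterable[Tuple[int, Optional[float]]]) -> List[Tuple[int, float]]:
    """Drop missing or non-positive prices, de-duplicate timestamps, sort by time."""
    by_time: Dict[int, float] = {}
    for timestamp, price in rows:
        if price is None or price <= 0:
            continue  # skip rows that cannot be a valid price
        by_time[timestamp] = price  # the last value wins for a repeated timestamp
    return sorted(by_time.items())


def min_max_scale(values: List[float]) -> Tuple[List[float], float, float]:
    """Scale values to [0, 1]; also return the min and max used for scaling."""
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no range information; map it to zeros.
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi


# Hypothetical sample rows: one missing price, one duplicate, one invalid value.
raw = [(3, 110.0), (1, 100.0), (2, None), (3, 120.0), (4, -5.0)]
cleaned = clean_prices(raw)            # [(1, 100.0), (3, 120.0)]
scaled, lo, hi = min_max_scale([price for _, price in cleaned])  # [0.0, 1.0]
```

Returning `lo` and `hi` makes it possible to reuse the same scaling later, for example to fit the scaler on the training set only and apply it unchanged to validation and test data, which avoids leaking future information into training.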

Model Development

Tasks

  • Develop the LSTM model for Bitcoin price prediction.
  • Split data into training, validation, and test sets.
  • Train the LSTM model and evaluate its performance.

Deliverables

  • LSTM model script
  • Trained model artifacts
  • Model evaluation metrics
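
The split into training, validation, and test sets can be sketched as follows. For a price time series the split is usually chronological (no shuffling), so that the model is never evaluated on data older than what it was trained on. This is a minimal sketch under that assumption; the helper names, the fractions, and the window size are hypothetical and do not come from the project code.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], train_frac: float, val_frac: float
) -> Tuple[List[float], List[float], List[float]]:
    """Split a time-ordered series into train/validation/test without shuffling."""
    if not (0 < train_frac < 1 and 0 <= val_frac < 1 and train_frac + val_frac < 1):
        raise ValueError("fractions must be positive and sum to less than 1")
    n = len(series)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return list(series[:train_end]), list(series[train_end:val_end]), list(series[val_end:])


def make_windows(series: Sequence[float], window: int) -> Tuple[List[List[float]], List[float]]:
    """Build (input window, next value) pairs, the usual input shape for an LSTM."""
    inputs: List[List[float]] = []
    targets: List[float] = []
    for start in range(len(series) - window):
        inputs.append(list(series[start:start + window]))
        targets.append(series[start + window])
    return inputs, targets


prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
train, val, test = chronological_split(prices, train_frac=0.5, val_frac=0.25)
# train == [10.0, 11.0, 12.0, 13.0], val == [14.0, 15.0], test == [16.0, 17.0]
x_train, y_train = make_windows(train, window=2)
# x_train == [[10.0, 11.0], [11.0, 12.0]], y_train == [12.0, 13.0]
```

Windows are built separately inside each subset, so no training window contains values from the validation or test period.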

Model Tracking and Versioning

Tasks

  • Integrate MLflow for model tracking and versioning.
  • Log model parameters, metrics, and artifacts in MLflow.

Deliverables

  • MLflow integration script
  • MLflow server setup
  • Model version tracking in MLflow

API Development

Tasks

  • Develop a private API for sensitive operations (training, etc.)
  • Develop a bridge API for communication with the frontend, the private API and other services
  • Create endpoints for data input and output.

Deliverables

  • FastAPI application for model management (prediction, training, etc.)
  • FastAPI application for authentication and communication with the frontend
  • API documentation

Monitoring and Logging

Tasks

  • Set up Prometheus for monitoring logs and metrics.
  • Configure Prometheus to trigger Airflow DAGs based on log metrics.
  • Integrate Grafana for log visualization and alerting.

Deliverables

  • Prometheus configuration files
  • Grafana dashboards
  • Alerts and triggers configuration
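
This page does not describe how an alert leads to an Airflow DAG run. One common pattern, assumed here, is that Alertmanager posts a webhook to a small service, which then calls the Airflow REST API. The sketch below only builds the requests, it does not send them. It assumes the Airflow 2 stable REST API path `/api/v1/dags/{dag_id}/dagRuns` and the standard Alertmanager webhook payload with an `alerts` list; the base URL, the DAG id `retrain_model`, and the alert name `ModelDrift` are hypothetical.

```python
import json
from typing import Any, Dict, List, Tuple


def build_dag_run_requests(
    webhook_payload: Dict[str, Any], airflow_base_url: str, dag_id: str
) -> List[Tuple[str, str]]:
    """Return one (url, json_body) pair per firing alert in the webhook payload."""
    url = f"{airflow_base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    requests: List[Tuple[str, str]] = []
    for alert in webhook_payload.get("alerts", []):
        if alert.get("status") != "firing":
            continue  # resolved alerts should not start a new run
        labels = alert.get("labels", {})
        # Pass the alert context to the DAG through the run's "conf" field.
        body = json.dumps({"conf": {"alertname": labels.get("alertname"), "labels": labels}})
        requests.append((url, body))
    return requests


# Hypothetical payload in the shape Alertmanager sends to a webhook receiver.
payload = {
    "status": "firing",
    "alerts": [
        {"status": "firing", "labels": {"alertname": "ModelDrift", "severity": "warning"}},
        {"status": "resolved", "labels": {"alertname": "ApiLatency"}},
    ],
}
pending = build_dag_run_requests(payload, "http://airflow:8080/", "retrain_model")
```

Each pair would then be sent as an authenticated `POST` with a JSON content type; authentication and retries are left out of this sketch.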

Visualization Dashboard

Tasks

  • Develop a Streamlit dashboard to visualize model predictions and metrics and to manage users and assets.
  • Integrate Streamlit with Gateway API to access backend services.

Deliverables

  • Streamlit application.

Testing and Validation

Tasks

  • Perform end-to-end testing of the MLOps pipeline.
  • Validate the accuracy and reliability of predictions.
  • Gather feedback from stakeholders and make necessary adjustments.

Deliverables

  • Test cases and results
  • Validation report

Deployment

Tasks

  • Deploy the entire MLOps pipeline to a development environment, including unit tests.
  • Deploy the entire MLOps pipeline to a production environment (VM EC2 DataScientest).
  • Ensure all components are correctly integrated and functional.

Deliverables

  • Deployment scripts run by GitHub Actions
  • Live development system with tests
  • Live production system

Documentation

Tasks

  • Document the entire project, including setup instructions, usage guidelines, and technical details.
  • Prepare a presentation for the jury explaining the project objectives, process, and outcomes.

Deliverables

  • Comprehensive project documentation
  • Presentation document built with MkDocs

Project Timeline

  • Week 1-2: Project Setup and Environment Configuration
  • Week 3-4: Data Collection and Preprocessing
  • Week 5-6: Model Development
  • Week 7-8: Model Tracking and Versioning
  • Week 9: API Development
  • Week 10-11: Monitoring and Logging
  • Week 12: Visualization Dashboard
  • Week 13-14: Testing and Validation
  • Week 15: Deployment 🚀
  • Week 16: Documentation

Note

GitHub project