LibreDataHub

A comprehensive, self-hosted data science and AI platform designed to democratize access to modern data analytics tools for small to medium organizations while maintaining enterprise-grade security and governance.

LibreDataHub integrates a suite of open-source tools for data management, analysis, AI (machine learning/deep learning), statistics, and data warehousing, all accessible from a standard Linux server.

LibreDataHub can be freely installed on a Linux machine. It is a platform where everyday data science tools (Python, R, SQL, orchestrators, and more) are ready to use for individual users and data teams. It also serves as a collaborative space for working, exchanging code, and developing research ideas, with centralized management of user access and applications.

Try the demo | Installation Guide

Core Features

  • 🔒 Secure Multi-tenant Environment: Project-based isolation with role-based access control (RBAC) and enterprise authentication (OIDC/Keycloak)
  • 🚀 Complete Data Science Ecosystem: Integrated Jupyter, RStudio, VS Code, and specialized tools in a unified platform
  • 🤖 AI-Ready Infrastructure: Built-in LLM support via Ollama with GPU acceleration for modern AI workflows
  • 📊 End-to-End Analytics Pipeline: From data exploration to dashboard publishing, with workflow orchestration
  • 🗄️ Unified Database Integration: Native PostgreSQL with CitusDB extension, DuckDB support, and seamless connectivity across all tools
  • 🔄 Real-time Data Operations: Automated backup strategies with S3-compatible storage and pgBackRest integration
  • 🏗️ Vertical Scalability: Optimized for small research teams to medium organizations through efficient resource utilization
  • 🔧 Open Source First: Integration of best-in-class open-source tools with seamless interoperability
  • ⚖️ Fair Resource Management: Intelligent resource sharing and monitoring across users and projects on a single robust server
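
As an illustration of the Ollama-based LLM support listed above, the following is a minimal sketch of calling a local Ollama server from Python, for example inside a Jupyter notebook. It uses Ollama's documented `/api/generate` endpoint. The URL is Ollama's default local address and the model name `llama3` is only a placeholder; both are assumptions, and a given LibreDataHub deployment may expose the service under a different host name or ship different models.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; a deployment may use another host or port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint.

    The default model name is a placeholder, not something the platform guarantees.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3", timeout: float = 60.0) -> str:
    """Send the prompt to the Ollama server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False, Ollama returns one JSON object with a "response" field.
    return body["response"]
```

With a model already pulled on the server, `generate("Explain what a data warehouse is in one sentence.")` would return the model's answer as a string. Building the request in a separate function keeps the payload easy to inspect without contacting the server.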

Key Differentiators

  • Zero-Config Data Science: Pre-configured environments with persistent user libraries and settings
  • Integrated AI/ML Workflows: Local LLM inference capabilities with GPU support for AI-powered analytics
  • Project-Centric Security: Granular access controls with personal, private, and public data spaces per database schema
  • Universal Database Access: Same database connections available across Jupyter, RStudio, Code-server, and CloudBeaver
  • Dashboard Publishing: Direct notebook-to-dashboard conversion via MyST integration
  • Enterprise Data Protection: Automated backup and archival with configurable retention policies
  • Single-Server Efficiency: Maximum productivity from one powerful machine rather than complex distributed setups
  • Simple Docker Compose Deployment: Straightforward installation and management via Docker Compose without complex orchestration
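
To make the idea of private and public data spaces per database schema more concrete, here is a minimal sketch that generates PostgreSQL statements for one project. The schema naming (`<project>_private`, `<project>_public`), the role names, and the helper functions are hypothetical and chosen for illustration only; the layout LibreDataHub actually uses may differ, and personal per-user spaces are left out for brevity.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def project_space_sql(project: str, member_role: str, reader_role: str) -> List[str]:
    """Return SQL statements that set up a private and a public schema for a project.

    Hypothetical layout: project members can work in both schemas, while the
    reader role can only read tables in the public one.
    """
    private = quote_ident(f"{project}_private")
    public = quote_ident(f"{project}_public")
    members = quote_ident(member_role)
    readers = quote_ident(reader_role)
    return [
        f"CREATE SCHEMA IF NOT EXISTS {private};",
        f"CREATE SCHEMA IF NOT EXISTS {public};",
        # Members may create and use objects in both spaces.
        f"GRANT USAGE, CREATE ON SCHEMA {private} TO {members};",
        f"GRANT USAGE, CREATE ON SCHEMA {public} TO {members};",
        # Readers get read-only access to the public space only.
        f"GRANT USAGE ON SCHEMA {public} TO {readers};",
        # Note: this covers tables that already exist, not ones created later.
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {public} TO {readers};",
    ]
```

For example, `project_space_sql("demo", "demo_members", "demo_readers")` yields six statements that an administrator could review and run with any PostgreSQL client. Generating plain SQL rather than executing it directly keeps the access model auditable.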