
github.com/alvinreal/awesome-opensource-ai @main sqlite

17 symbols · 49 edges · 1 file · 0 documented (0%)
README

Awesome Open Source AI

Curated open-source artificial intelligence models, libraries, infrastructure, and developer tools.


About this list

Awesome Open Source AI is a curated list of open-source projects for people building with AI.

The goal is to help readers find useful models, libraries, tools, infrastructure, datasets, and learning resources without sorting through a directory dump.

Projects do not need a minimum number of GitHub stars to be included. Stars can be useful context, but they are only one signal. A smaller project may belong here if it is useful, well-maintained, technically interesting, clearly documented, or important to a specific part of the AI ecosystem.

Good entries should have a clear reason to exist. They should help people build, study, run, evaluate, or understand AI systems.


1. Core Frameworks & Libraries

Core libraries and frameworks used to build, train, and run AI and machine learning systems.

Deep Learning Frameworks

  • PyTorch - Dynamic computation graphs, Pythonic API, dominant in research and production. The current standard for most frontier AI work.
  • TensorFlow - End-to-end platform with excellent production deployment, TPU support, and large-scale serving tools.
  • JAX - High-performance numerical computing with composable transformations (JIT, vmap, grad). Rising favorite for research and scientific ML. Often paired with Flax.
  • dm-haiku - JAX-based neural network library from Google DeepMind. Elegant functional API with state management, widely used in DeepMind's research. Apache 2.0 licensed.
  • Equinox - Elegant, easy-to-use neural networks and scientific computing in JAX. Callable PyTrees with filtered transformations and seamless interoperability with the JAX ecosystem. Apache 2.0 licensed.
  • Diffrax - Numerical differential equation solvers in JAX. Autodifferentiable and GPU-capable ODE/SDE/CDE solvers for scientific machine learning and neural differential equations. Apache 2.0 licensed.
  • vit-pytorch - Comprehensive Vision Transformer (ViT) implementations in PyTorch. Reference implementations of all major vision transformer variants including ViT, DeiT, Swin, and more. MIT licensed.
  • NumPyro - Probabilistic programming with NumPy powered by JAX for autograd and JIT compilation. Bayesian modeling and inference at scale.
  • Keras - High-level, beginner-friendly API that now runs on multiple backends (TensorFlow, JAX, PyTorch). Well suited to rapid experimentation.
  • tinygrad - Minimalist deep learning framework with a tiny code footprint. The "you like PyTorch? you like micrograd? you love tinygrad!" philosophy - simple yet powerful.
  • PaddlePaddle - Industrial deep learning platform from Baidu serving 23+ million developers and 760,000+ companies. China's first independently developed deep learning framework, with advanced distributed training and deployment capabilities.
  • PyTorch Geometric - Library for deep learning on irregular input data such as graphs, point clouds, and manifolds. Part of the PyTorch ecosystem.
  • timm (PyTorch Image Models) - The largest collection of PyTorch image encoders and backbones. 900+ pretrained models including ResNet, EfficientNet, Vision Transformer, ConvNeXt, and more with training and inference scripts. Apache 2.0 licensed.
  • Triton - Language and compiler for writing highly efficient custom deep-learning primitives. Powers kernel optimizations in PyTorch, JAX, and other frameworks. MIT licensed.
  • GGML - Tensor library for machine learning. The foundational C/C++ library powering llama.cpp and many on-device inference engines. MIT licensed.
  • MLX - Array framework for machine learning on Apple silicon. Efficient unified memory design with NumPy-like API, automatic differentiation, and multi-device support. MIT licensed.

High-Performance Compute Libraries

  • oneDNN - oneAPI Deep Neural Network Library. Cross-platform performance library of basic building blocks for deep learning, optimized for Intel CPUs, GPUs, and Arm architectures. Apache 2.0 licensed.
  • ONNX - Open standard for machine learning interoperability. Open Neural Network Exchange provides an open ecosystem that empowers AI developers to choose the right tools as their project evolves. Apache 2.0 licensed.
  • IREE - Retargetable MLIR-based machine learning compiler and runtime toolkit. Lowers ML models to unified IR that scales from datacenter to mobile and edge deployments. Apache 2.0 licensed.

Rust ML Frameworks

  • Burn - Next-generation deep learning framework in Rust. Backend-agnostic with CPU, GPU, and WebAssembly support.
  • Candle (Hugging Face) - Minimalist ML framework for Rust. PyTorch-like API with a focus on performance and simplicity.
  • linfa - Comprehensive Rust ML toolkit with classical algorithms. A scikit-learn equivalent for Rust with clustering, regression, and preprocessing.

Julia ML Frameworks

  • Flux.jl - 100% pure-Julia ML stack with lightweight abstractions on top of native GPU and AD support. Elegant, hackable, and fully integrated with Julia's scientific computing ecosystem.
  • MLJ.jl - Comprehensive Julia machine learning framework providing a unified interface to 200+ models with meta-algorithms for selection, tuning, and evaluation. MIT licensed.
  • ModelingToolkit.jl - High-performance symbolic-numeric modeling framework for scientific machine learning. Automatically generates fast functions for model components like Jacobians and Hessians with automatic sparsification and parallelization. MIT licensed.

NLP & Transformers

  • spaCy (Explosion AI) - Industrial-strength natural language processing with 75+ languages, transformer pipelines, and production-grade NER, parsing, and text classification.
  • Transformers (Hugging Face) - The de facto standard library for pretrained NLP models. 1M+ models, 250,000+ downloads/day. BERT, GPT, Llama, Qwen, and hundreds more.
  • sentence-transformers - Classic library for sentence and image embeddings.
  • tokenizers (Hugging Face) - Fast state-of-the-art tokenizers for training and inference.
  • fairseq2 - FAIR Sequence Modeling Toolkit 2. Complete rewrite of fairseq with modern PyTorch APIs, native support for LLM training (70B+ models), vLLM integration, and first-party recipes for instruction finetuning and preference optimization. MIT licensed.

Data Processing & Manipulation

  • Pandas - The gold standard for data analysis and manipulation in Python.
  • Polars - Blazing-fast DataFrame library (Rust backend) - modern alternative to Pandas for large-scale workloads.
  • cuDF - GPU DataFrame library from RAPIDS. Accelerates Pandas workflows on NVIDIA GPUs with zero code changes using the cudf.pandas accelerator mode.
  • Modin - Parallel Pandas DataFrames. Scale Pandas workflows by changing a single line of code - distributes data and computation automatically.
  • Dask - Parallel computing for big data - scales Pandas/NumPy/scikit-learn to clusters.
  • NumPy - Fundamental array computing library that powers almost every AI stack.
  • SciPy - Scientific computing algorithms (optimization, linear algebra, statistics, signal processing).
  • CuPy - NumPy and SciPy-compatible array library for GPU-accelerated computing in Python.
  • NetworkX - Creation, manipulation, and study of complex networks. The foundational graph analysis library for Python data science.
  • cuGraph - GPU graph analytics library with NetworkX-compatible API. 10-100x faster than CPU for large-scale graph algorithms. Apache 2.0 licensed.
  • Vaex - Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python. Visualize and explore billion-row datasets at millions of rows per secon

Core symbols most depended-on inside this repo

  • read_lines (tools/validate_awesome.py) - called by 2
  • graphql_literal (tools/validate_awesome.py) - called by 2
  • print_report (tools/validate_awesome.py) - called by 2
  • github_anchor_slug (tools/validate_awesome.py) - called by 1
  • parse_repo_ref (tools/validate_awesome.py) - called by 1
  • parse_entries (tools/validate_awesome.py) - called by 1
  • validate_toc (tools/validate_awesome.py) - called by 1
  • validate_duplicates (tools/validate_awesome.py) - called by 1
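
The symbol names above suggest a validator that parses list entries, derives heading anchors for the table of contents, and flags duplicate entries. The source of tools/validate_awesome.py is not shown here, so the following is only a minimal sketch of how such checks might look. The names anchor_slug, parse_entry_names, and find_duplicates are hypothetical, the entry pattern assumes the "• Name - description" layout used in this list, and the slug rule is an approximation of GitHub-style heading anchors rather than the repository's own code.

```python
import re
from collections import Counter

# Assumed entry layout: optional indent, a bullet, the name, " - ", the description.
ENTRY_RE = re.compile(r"^\s*[•*-]\s+(?P<name>.+?)\s+-\s+(?P<desc>.+)$")


def anchor_slug(heading: str) -> str:
    """Approximate a GitHub-style heading anchor (hypothetical helper)."""
    slug = heading.strip().lower()
    # Drop punctuation but keep word characters, hyphens, and spaces.
    slug = re.sub(r"[^\w\- ]", "", slug)
    # Each space becomes a hyphen, so a removed "&" leaves a double hyphen.
    return slug.replace(" ", "-")


def parse_entry_names(lines):
    """Return the entry names found in bullet lines, skipping headings and prose."""
    names = []
    for line in lines:
        match = ENTRY_RE.match(line)
        if match:
            names.append(match.group("name"))
    return names


def find_duplicates(names):
    """Return case-folded names that occur more than once, sorted."""
    counts = Counter(name.casefold() for name in names)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    sample = [
        "Deep Learning Frameworks",
        "  • PyTorch - Dynamic computation graphs.",
        "  • JAX - Composable transformations.",
        "  • pytorch - Accidental duplicate entry.",
    ]
    print(anchor_slug("1. Core Frameworks & Libraries"))
    print(find_duplicates(parse_entry_names(sample)))
```

Comparing names case-insensitively is a design choice in this sketch: it catches entries that differ only in capitalization, at the cost of treating distinct projects with identically spelled names as duplicates.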

Shape

Function 12
Class 4
Method 1

Languages

Python 100%

Modules by API surface

tools/validate_awesome.py - 17 symbols

For agents

$ claude mcp add awesome-opensource-ai \
  -- python -m otcore.mcp_server <graph>
