NVIDIA repositories

Megatron-LM

Public

Ongoing research training transformer models at scale

transformers model-para large-language-models

Python

•

Other

•3.4k•14k•330•239•Updated

Dec 7, 2025

tilus

Public

Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.

tile programming kernel cuda

Python

•

Apache License 2.0

•12•414•8•0•Updated

Dec 7, 2025

KAI-Scheduler

Public

KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale

Go

•

Apache License 2.0

•114•956•23•36•Updated

Dec 7, 2025

torch-harmonics

Public

Differentiable signal processing on the sphere for PyTorch

machine-learning signal-processing sphere pytorch

Jupyter Notebook

•

Other

•62•610•3•4•Updated

Dec 7, 2025

aistore

Public

AIStore: scalable storage for AI applications

kubernetes high-performance distributed-storage high-availability object-storage multi-cloud batch-jobs s3-compatible multipart-upload ml-training

Go

•

MIT License

•228•1.7k•0•0•Updated

Dec 7, 2025

Model-Optimizer

Public

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.

Python

•

Apache License 2.0

•206•1.6k•68•44•Updated

Dec 7, 2025

cuda-python

Public

CUDA Python: Performance meets Productivity

Python

•

Other

•227•3.1k•205•15•Updated

Dec 7, 2025

nvidia-resiliency-ext

Public

NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to failures and interruptions.

Python

•

Other

•37•239•1•15•Updated

Dec 7, 2025

accelerated-computing-hub

Public

NVIDIA curated collection of educational resources related to general purpose GPU programming.

Jupyter Notebook

•

Other

•167•926•13•4•Updated

Dec 7, 2025

Fuser

Public

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++

•

Other

•69•364•205•212•Updated

Dec 7, 2025

TensorRT-LLM

Public

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.

cuda pytorch moe blackwell llm-serving

Python

•

Other

•1.9k•12k•618•453•Updated

Dec 7, 2025

k8s-dra-driver-gpu

Public

NVIDIA DRA Driver for GPUs

Go

•

Apache License 2.0

•101•504•94•27•Updated

Dec 7, 2025

TensorRT-Incubator

Public

Experimental projects related to TensorRT

MLIR

•19•116•37•12•Updated

Dec 7, 2025

k8s-nim-operator

Public

An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.

Go

•

Apache License 2.0

•34•140•7•29•Updated

Dec 7, 2025

NV-Kernels

Public

Ubuntu kernels which are optimized for NVIDIA server systems

C

•

Other

•43•70•0•11•Updated

Dec 7, 2025

JAX-Toolbox

Public

JAX-Toolbox

Python

•

Apache License 2.0

•67•367•80•43•Updated

Dec 7, 2025

TileGym

Public

Helpful kernel tutorials and examples for tile-based GPU programming

Python

•

Other

•9•252•0•0•Updated

Dec 7, 2025

go-gpuallocator

Public

Go Abstraction for Allocating NVIDIA GPUs with Custom Policies

Go

•

Apache License 2.0

•26•119•5•7•Updated

Dec 7, 2025

nvidia-container-toolkit

Public

Build and run containers leveraging NVIDIA GPUs

Go

•

Apache License 2.0

•446•3.9k•117•23•Updated

Dec 7, 2025

mig-parted

Public

MIG Partition Editor for NVIDIA GPUs

Go

•

Apache License 2.0

•52•231•22•16•Updated

Dec 7, 2025

NVSentinel

Public

NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated computing environments

Go

•

Apache License 2.0

•25•99•40•7•Updated

Dec 7, 2025

k8s-device-plugin

Public

NVIDIA device plugin for Kubernetes

kubernetes

Go

•

Apache License 2.0

•758•3.6k•75•34•Updated

Dec 7, 2025

cuCollections

Public

datastructures cpp gpu cuda hashmap cpp17 hashset hashtable

C++

•

Apache License 2.0

•101•598•55•13•Updated

Dec 7, 2025

gpu-operator

Public

NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes

kubernetes gpu cuda nvidia

Go

•

Apache License 2.0

•420•2.4k•94•67•Updated

Dec 7, 2025

cuopt

Public

GPU accelerated decision optimization

gpu optimization cuda linear-programming

Cuda

•

Apache License 2.0

•100•601•72•26•Updated

Dec 7, 2025

cloud-native-docs

Public

Documentation repository for NVIDIA Cloud Native Technologies

kubernetes containers kubernetes-operator

PowerShell

•

Apache License 2.0

•33•31•5•10•Updated

Dec 7, 2025

vgpu-device-manager

Public

NVIDIA vGPU Device Manager manages NVIDIA vGPU devices on top of Kubernetes

Go

•

Apache License 2.0

•22•152•0•16•Updated

Dec 7, 2025

stdexec

Public

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

C++

•

Apache License 2.0

•217•2.1k•112•11•Updated

Dec 6, 2025

nvbench

Public

CUDA Kernel Benchmarking Library

benchmark performance gpu cuda nvidia cuda-kernels kernel-benchmark

Cuda

•

Apache License 2.0

•94•773•53•8•Updated

Dec 6, 2025

cccl

Public

CUDA Core Compute Libraries

cpp hpc gpu modern-cpp parallel-computing cuda nvidia gpu-acceleration cuda-kernels gpu-computing

C++

•

Other

•297•2.1k•1.1k•195•Updated

Dec 6, 2025

639 repositories