
axlearn

Public repository · 1,789 stars · 233 forks · 33 issues

Commits

Recent commits on branch main (all verified):

- dd2e4492c6efe3f2152270a207c1171dd20f2092 · Add CuDNN fused MHA kernel to axlearn (#705) · kelvin-zou, 2 hours ago
- eaa023af7c6cf40a7260dea731d1f0159914e002 · add v6e system characteristics (#691) · ehorning, a day ago
- c45a018997d8eef460d2f7e68fe86ae30da59510 · Fix race condition while doing async tf save. (#704) · hanzhi713, 2 days ago
- 70a2bd7e646313c44ce49d1b670fe23781fce948 · Fix a few issues for AOT compilation (#701) · changlan, 3 days ago
- f2326c539d36b7b98f66231faba9d27099a7ad52 · Support customized mesh rules to support different HWs (#696) · kelvin-zou, 6 days ago
- 48cd51983a2bde1986cffbda848aaaece96bc404 · Add segment_ids to the forward pass of a Causal LM (#700) · changlan, 7 days ago

README


The AXLearn Library for Deep Learning

This library is under active development and the API is subject to change.

Table of Contents

- Introduction: What is AXLearn?
- Getting Started: Getting up and running with AXLearn.
- Concepts: Core concepts and design principles.
- CLI User Guide: How to use the CLI.
- Infrastructure: Core infrastructure components.

Introduction

AXLearn is a library built on top of JAX and XLA to support the development of large-scale deep learning models.

AXLearn takes an object-oriented approach to the software engineering challenges that arise from building, iterating, and maintaining models. The configuration system of the library lets users compose models from reusable building blocks and integrate with other libraries such as Flax and Hugging Face transformers.
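To illustrate the idea of composing models from reusable, overridable building blocks, here is a minimal sketch in plain Python. The class and field names are hypothetical and the sketch does not use AXLearn's actual config API; it only demonstrates the object-oriented pattern of building a model variant by overriding nested defaults rather than duplicating configuration.

```python
from dataclasses import dataclass, field, replace

# Hypothetical building-block configs; names are illustrative, not AXLearn's API.
@dataclass(frozen=True)
class AttentionConfig:
    num_heads: int = 8
    head_dim: int = 64

@dataclass(frozen=True)
class LayerConfig:
    hidden_dim: int = 512
    attention: AttentionConfig = field(default_factory=AttentionConfig)

@dataclass(frozen=True)
class ModelConfig:
    num_layers: int = 12
    layer: LayerConfig = field(default_factory=LayerConfig)

def large_model() -> ModelConfig:
    # Compose a variant by overriding nested defaults instead of copying code.
    base = ModelConfig()
    return replace(
        base,
        num_layers=24,
        layer=replace(
            base.layer,
            attention=replace(base.layer.attention, num_heads=16),
        ),
    )

cfg = large_model()
print(cfg.num_layers, cfg.layer.attention.num_heads)  # 24 16
```

Because each block carries its own defaults, variants only state what differs from the base, which keeps large families of model configurations maintainable.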

AXLearn is built to scale. It supports the training of models with up to hundreds of billions of parameters across thousands of accelerators at high utilization. It is also designed to run on public clouds and provides tools to deploy and manage jobs and data. Built on top of GSPMD, AXLearn adopts a global computation paradigm to allow users to describe computation on a virtual global computer rather than on a per-accelerator basis.
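The global computation paradigm can be sketched with standard JAX sharding primitives (which sit on top of GSPMD/XLA): the program is written against a single logical array, and mesh plus sharding annotations tell the compiler how to partition it across accelerators. This is plain JAX, not AXLearn-specific code, and it runs on any backend; with a single device the mesh simply has one entry.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Arrange whatever devices are available into a 2-D logical mesh.
devices = jax.devices()
mesh = Mesh(np.array(devices).reshape(-1, 1), axis_names=("data", "model"))

# One logically-global array; the sharding describes how it is laid out
# across the mesh, not how to write per-device code.
x = jnp.ones((8, 4))
sharded = jax.device_put(x, NamedSharding(mesh, P("data", "model")))

@jax.jit
def f(a):
    # Written as if on one virtual global computer; the compiler inserts
    # the cross-device communication needed for the reduction.
    return (a * 2.0).sum()

print(f(sharded))  # global sum = 64.0
```

The same program runs unchanged on one device or thousands; only the mesh shape and sharding annotations change.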

AXLearn supports a wide range of applications, including natural language processing, computer vision, and speech recognition, and contains baseline configurations for training state-of-the-art models.

Please see Concepts for more details on the core components and design of AXLearn, or Getting Started if you want to get your hands dirty.