Releases: hpcaitech/ColossalAI
Version v0.2.6 Release Today!
What's Changed
Doc
- [doc] moved doc test command to bottom (#3075) by Frank Lee
- [doc] specified operating system requirement (#3019) by Frank Lee
- [doc] update nvme offload doc (#3014) by ver217
- [doc] add ISC tutorial (#2997) by binmakeswell
- [doc] add deepspeed citation and copyright (#2996) by ver217
- [doc] added reference to related works (#2994) by Frank Lee
- [doc] update news (#2983) by binmakeswell
- [doc] fix chatgpt inference typo (#2964) by binmakeswell
- [doc] add env scope (#2933) by binmakeswell
- [doc] added readme for documentation (#2935) by Frank Lee
- [doc] removed read-the-docs (#2932) by Frank Lee
- [doc] update installation for GPT (#2922) by binmakeswell
- [doc] add os scope, update tutorial install and tips (#2914) by binmakeswell
- [doc] fix GPT tutorial (#2860) by dawei-wang
- [doc] fix typo in opt inference tutorial (#2849) by Zheng Zeng
- [doc] update OPT serving (#2804) by binmakeswell
- [doc] update example and OPT serving link (#2769) by binmakeswell
- [doc] add opt service doc (#2747) by Frank Lee
- [doc] fixed a typo in GPT readme (#2736) by cloudhuang
- [doc] updated documentation version list (#2730) by Frank Lee
Workflow
- [workflow] fixed doc build trigger condition (#3072) by Frank Lee
- [workflow] supported conda package installation in doc test (#3028) by Frank Lee
- [workflow] fixed the post-commit failure when no formatting needed (#3020) by Frank Lee
- [workflow] added auto doc test on PR (#2929) by Frank Lee
- [workflow] moved pre-commit to post-commit (#2895) by Frank Lee
Example
- [example] fix redundant note (#3065) by binmakeswell
- [example] fixed opt model downloading from huggingface by Tomek
- [example] add LoRA support (#2821) by Haofan Wang
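For readers unfamiliar with the technique named in the LoRA entry above, the idea can be sketched in a few lines: a frozen weight W plus a trainable low-rank update scale * (A @ B). This is an illustrative toy only; the class and argument names are hypothetical, not the actual ColossalAI or PEFT API.

```python
# Illustrative-only sketch of the LoRA idea: freeze W, train only the
# low-rank factors A and B. Names here are hypothetical.

def matmul(a, b):
    """Plain-Python matrix multiply for small illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

class LoRALinear:
    def __init__(self, weight, a, b, alpha=1.0):
        self.weight = weight            # frozen base weight, shape (in, out)
        self.a = a                      # trainable, shape (in, rank)
        self.b = b                      # trainable, shape (rank, out)
        self.scale = alpha / len(b)     # alpha / rank

    def effective_weight(self):
        # W' = W + scale * (A @ B); only A and B receive gradients.
        delta = matmul(self.a, self.b)
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.weight, delta)]

    def forward(self, x):
        return matmul(x, self.effective_weight())
```

With B initialized to zeros the adapter starts as a no-op, so fine-tuning begins from the pretrained layer's behavior.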
Autochunk
- [autochunk] refactor chunk memory estimation (#2762) by Xuanlei Zhao
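The chunk memory estimation mentioned above can be illustrated with a back-of-envelope calculation: computing an intermediate activation one chunk of rows at a time bounds peak memory by the chunk size rather than the full sequence length. The formula and names below are illustrative assumptions, not ColossalAI's actual estimator.

```python
# Back-of-envelope sketch of an activation-chunking memory estimate.
# Illustrative only; not the real autochunk estimator.

def chunked_peak_activation(seq_len, hidden, chunk_size, dtype_bytes=2):
    """Compare full vs. chunked peak bytes for one (seq_len, hidden)
    fp16 intermediate computed chunk_size rows at a time."""
    full_bytes = seq_len * hidden * dtype_bytes
    n_chunks = -(-seq_len // chunk_size)              # ceil division
    peak_bytes = chunk_size * hidden * dtype_bytes    # one chunk alive at a time
    return n_chunks, peak_bytes, full_bytes
```

For seq_len=1024, hidden=64, chunk_size=128, the intermediate shrinks from 128 KiB to 16 KiB at the cost of 8 sequential chunk passes.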
Chatgpt
- [chatgpt] change critic input as state (#3042) by wenjunyang
- [chatgpt] fix readme (#3025) by BlueRum
- [chatgpt] Add saving ckpt callback for PPO (#2880) by LuGY
- [chatgpt] fix inference model load (#2988) by BlueRum
- [chatgpt] allow shard init and display warning (#2986) by ver217
- [chatgpt] fix lora gemini conflict in RM training (#2984) by BlueRum
- [chatgpt] making experience support dp (#2971) by ver217
- [chatgpt] fix lora bug (#2974) by BlueRum
- [chatgpt] fix inference demo loading bug (#2969) by BlueRum
- [ChatGPT] fix README (#2966) by Fazzie-Maqianli
- [chatgpt] add inference example (#2944) by BlueRum
- [chatgpt] support opt & gpt for rm training (#2876) by BlueRum
- [chatgpt] Support saving ckpt in examples (#2846) by BlueRum
- [chatgpt] fix rm eval (#2829) by BlueRum
- [chatgpt] add test checkpoint (#2797) by ver217
- [chatgpt] update readme about checkpoint (#2792) by ver217
- [chatgpt] strategy add prepare method (#2766) by ver217
- [chatgpt] disable shard init for colossalai (#2767) by ver217
- [chatgpt] support colossalai strategy to train rm (#2742) by BlueRum
- [chatgpt] fix train_rm bug with lora (#2741) by BlueRum
Dtensor
- [DTensor] refactor CommSpec (#3034) by YuliangLiu0306
- [DTensor] refactor sharding spec (#2987) by YuliangLiu0306
- [DTensor] implementation of dtensor (#2946) by YuliangLiu0306
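The DTensor entries above revolve around sharding specs: a description of how each tensor dimension is laid out across a device mesh. A toy model, with illustrative notation only (each dimension is replicated "R" or sharded along one mesh axis "S0", "S1", ...; this is not ColossalAI's actual API):

```python
# Toy sharding-spec model: compute the per-device shard shape implied by
# a spec over a device mesh. Illustrative only.

def local_shape(global_shape, spec, mesh_shape):
    """Per-device shard shape implied by a sharding spec."""
    shape = list(global_shape)
    for dim, placement in enumerate(spec):
        if placement.startswith("S"):
            mesh_axis = int(placement[1:])
            # assume even divisibility for the sketch
            assert shape[dim] % mesh_shape[mesh_axis] == 0
            shape[dim] //= mesh_shape[mesh_axis]
    return tuple(shape)
```

On a 4x2 mesh, the spec ("S0", "R") splits dim 0 across the 4-way mesh axis and replicates dim 1 on the 2-way axis.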
Hotfix
- [hotfix] skip auto checkpointing tests (#3029) by YuliangLiu0306
- [hotfix] add shard dim to avoid backward communication error (#2954) by YuliangLiu0306
- [hotfix]: Remove math.prod dependency (#2837) by Jiatong (Julius) Han
- [hotfix] fix autoparallel compatibility test issues (#2754) by YuliangLiu0306
- [hotfix] fix chunk size can not be divided (#2867) by HELSON
- Hotfix/auto parallel zh doc (#2820) by YuliangLiu0306
- [hotfix] add copyright for solver and device mesh (#2803) by YuliangLiu0306
- [hotfix] add correct device for fake_param (#2796) by HELSON
Revert
- [Revert] recover "[refactor]"
Format
- [format] applied code formatting on changed files in pull request 3025 (#3026) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 2997 (#3008) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 2933 (#2939) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 2922 (#2923) by github-actions[bot]
Pipeline
- [pipeline] Add Simplified Alpa DP Partition (#2507) by Ziyue Jiang
Fx
- [fx] remove depreciated algorithms. (#2312) (#2313) by Super Daniel
Refactor
- [refactor] restructure configuration files (#2977) by Saurav Maheshkar
Autoparallel
- [autoparallel] apply repeat block to reduce solving time (#2912) by YuliangLiu0306
- [autoparallel] find repeat blocks (#2854) by YuliangLiu0306
- [autoparallel] Patch meta information for nodes that will not be handled by SPMD solver (#2823) by Boyuan Yao
- [autoparallel] Patch meta information of torch.where (#2822) by Boyuan Yao
- [autoparallel] Patch meta information of torch.tanh() and torch.nn.Dropout (#2773) by Boyuan Yao
- [autoparallel] Patch tensor related operations meta information (#2789) by Boyuan Yao
- [autoparallel] rotor solver refactor (#2813) by Boyuan Yao
- [autoparallel] Patch meta information of torch.nn.Embedding (#2760) by Boyuan Yao
- [autoparallel] distinguish different parallel strategies (#2699) by YuliangLiu0306
Zero
- [zero] trivial zero optimizer refactoring (#2869) by YH
- [zero] fix wrong import (#2777) by Boyuan Yao
Nfc
- [NFC] polish colossalai/engine/schedule/_pipeline_schedule.py code style (#2744) by Michelle
- [NFC] polish code format by binmakeswell
- [NFC] polish colossala...
Version v0.2.5 Release Today!
What's Changed
Autoparallel
- [autoparallel] add shard option (#2696) by YuliangLiu0306
- [autoparallel] fix parameters sharding bug (#2716) by YuliangLiu0306
- [autoparallel] refactor runtime pass (#2644) by YuliangLiu0306
- [autoparallel] remove deprecated codes (#2664) by YuliangLiu0306
- [autoparallel] test compatibility for gemini and auto parallel (#2700) by YuliangLiu0306
Doc
- [doc] updated documentation version list (#2715) by Frank Lee
- [doc] add open-source contribution invitation (#2714) by binmakeswell
- [doc] add Quick Preview (#2706) by binmakeswell
- [doc] resize figure (#2705) by binmakeswell
- [doc] add ChatGPT (#2703) by binmakeswell
App
- [app] fix ChatGPT requirements (#2704) by binmakeswell
- [app] add chatgpt application (#2698) by ver217
Full Changelog: v0.2.5...v0.2.4
Version v0.2.4 Release Today!
What's Changed
Doc
- [doc] update auto parallel paper link (#2686) by binmakeswell
- [doc] added documentation sidebar translation (#2670) by Frank Lee
Gemini
- [gemini] fix colo_init_context (#2683) by ver217
- [gemini] add fake_release_chunk for keep-gathered chunk in the inference mode (#2671) by HELSON
Workflow
- [workflow] fixed community report ranking (#2680) by Frank Lee
- [workflow] added trigger to build doc upon release (#2678) by Frank Lee
- [workflow] added doc build test (#2675) by Frank Lee
Autoparallel
- [autoparallel] Patch meta information of torch.nn.functional.softmax and torch.nn.Softmax (#2674) by Boyuan Yao
Full Changelog: v0.2.4...v0.2.3
Version v0.2.3 Release Today!
What's Changed
Doc
- [doc] add CVPR tutorial (#2666) by binmakeswell
Docs
- [Docs] layout converting management (#2665) by YuliangLiu0306
Autoparallel
- [autoparallel] Patch meta information of torch.nn.LayerNorm (#2647) by Boyuan Yao
Full Changelog: v0.2.3...v0.2.2
Version v0.2.2 Release Today!
What's Changed
Workflow
- [workflow] fixed gpu memory check condition (#2659) by Frank Lee
- [workflow] fixed the test coverage report (#2614) by Frank Lee
- [workflow] fixed test coverage report (#2611) by Frank Lee
Example
- [example] Polish README.md (#2658) by Jiatong (Julius) Han
Doc
- [doc] fixed compatibility with docusaurus (#2657) by Frank Lee
- [doc] added docusaurus-based version control (#2656) by Frank Lee
- [doc] migrate the markdown files (#2652) by Frank Lee
- [doc] fix typo of BLOOM (#2643) by binmakeswell
- [doc] removed pre-built wheel installation from readme (#2637) by Frank Lee
- [doc] updated the sphinx theme (#2635) by Frank Lee
- [doc] fixed broken badge (#2623) by Frank Lee
Autoparallel
- [autoparallel] refactor handlers which reshape input tensors (#2615) by YuliangLiu0306
- [autoparallel] adapt autoparallel tests with latest api (#2626) by YuliangLiu0306
- [autoparallel] Patch meta information of torch.matmul (#2584) by Boyuan Yao
Tutorial
- [tutorial] added energonai to opt inference requirements (#2625) by Frank Lee
- [tutorial] add video link (#2619) by binmakeswell
Full Changelog: v0.2.2...v0.2.1
Version v0.2.1 Release Today!
What's Changed
Workflow
- [workflow] fixed broken release workflows (#2604) by Frank Lee
- [workflow] added cuda extension build test before release (#2598) by Frank Lee
- [workflow] hooked pypi release with lark (#2596) by Frank Lee
- [workflow] hooked docker release with lark (#2594) by Frank Lee
- [workflow] added test-pypi check before release (#2591) by Frank Lee
- [workflow] fixed the typo in the example check workflow (#2589) by Frank Lee
- [workflow] hook compatibility test failure to lark (#2586) by Frank Lee
- [workflow] hook example test alert with lark (#2585) by Frank Lee
- [workflow] added notification if scheduled build fails (#2574) by Frank Lee
- [workflow] added discussion stats to community report (#2572) by Frank Lee
- [workflow] refactored compatibility test workflow for maintainability (#2560) by Frank Lee
- [workflow] adjust the GPU memory threshold for scheduled unit test (#2558) by Frank Lee
- [workflow] fixed example check workflow (#2554) by Frank Lee
- [workflow] fixed typos in the leaderboard workflow (#2567) by Frank Lee
- [workflow] added contributor and user-engagement report (#2564) by Frank Lee
- [workflow] only report coverage for changed files (#2524) by Frank Lee
- [workflow] fixed the precommit CI (#2525) by Frank Lee
- [workflow] fixed changed file detection (#2515) by Frank Lee
- [workflow] fixed the skip condition of example weekly check workflow (#2481) by Frank Lee
- [workflow] automated bdist wheel build (#2459) by Frank Lee
- [workflow] automated the compatibility test (#2453) by Frank Lee
- [workflow] fixed the on-merge condition check (#2452) by Frank Lee
- [workflow] make test coverage report collapsable (#2436) by Frank Lee
- [workflow] report test coverage even if below threshold (#2431) by Frank Lee
- [workflow] auto comment with test coverage report (#2419) by Frank Lee
- [workflow] auto comment if precommit check fails (#2417) by Frank Lee
- [workflow] added translation for non-english comments (#2414) by Frank Lee
- [workflow] added precommit check for code consistency (#2401) by Frank Lee
- [workflow] refactored the example check workflow (#2411) by Frank Lee
- [workflow] added nightly release to pypi (#2403) by Frank Lee
- [workflow] added missing file change detection output (#2387) by Frank Lee
- [workflow] New version: Create workflow files for examples' auto check (#2298) by ziyuhuang123
- [workflow] fixed pypi release workflow error (#2328) by Frank Lee
- [workflow] fixed pypi release workflow error (#2327) by Frank Lee
- [workflow] added workflow to release to pypi upon version change (#2320) by Frank Lee
- [workflow] removed unused assign reviewer workflow (#2318) by Frank Lee
- [workflow] rebuild cuda kernels when kernel-related files change (#2317) by Frank Lee
Doc
- [doc] updated readme for CI/CD (#2600) by Frank Lee
- [doc] fixed issue link in pr template (#2577) by Frank Lee
- [doc] updated the CHANGE_LOG.md for github release page (#2552) by Frank Lee
- [doc] fixed the typo in pr template (#2556) by Frank Lee
- [doc] added pull request template (#2550) by Frank Lee
- [doc] update example link (#2520) by binmakeswell
- [doc] update opt and tutorial links (#2509) by binmakeswell
- [doc] added documentation for CI/CD (#2420) by Frank Lee
- [doc] updated kernel-related optimisers' docstring (#2385) by Frank Lee
- [doc] updated readme regarding pypi installation (#2406) by Frank Lee
- [doc] hotfix #2377 by Jiarui Fang
- [doc] hotfix #2377 by jiaruifang
- [doc] update stable diffusion link (#2322) by binmakeswell
- [doc] update diffusion doc (#2296) by binmakeswell
- [doc] update news (#2295) by binmakeswell
- [doc] update news by binmakeswell
Setup
- [setup] fixed inconsistent version meta (#2578) by Frank Lee
- [setup] refactored setup.py for dependency graph (#2413) by Frank Lee
- [setup] support pre-build and jit-build of cuda kernels (#2374) by Frank Lee
- [setup] make cuda extension build optional (#2336) by Frank Lee
- [setup] remove torch dependency (#2333) by Frank Lee
- [setup] removed the build dependency on colossalai (#2307) by Frank Lee
Tutorial
- [tutorial] polish README (#2568) by binmakeswell
- [tutorial] update fastfold tutorial (#2565) by oahzxl
Polish
- [polish] polish ColoTensor and its submodules (#2537) by HELSON
- [polish] polish code for get_static_torch_model (#2405) by HELSON
Hotfix
- [hotfix] fix zero ddp warmup check (#2545) by ver217
- [hotfix] fix autoparallel demo (#2533) by YuliangLiu0306
- [hotfix] fix lightning error (#2529) by HELSON
- [hotfix] meta tensor default device. (#2510) by Super Daniel
- [hotfix] gpt example titans bug #2493 (#2494) by Jiarui Fang
- [hotfix] gpt example titans bug #2493 by jiaruifang
- [hotfix] add norm clearing for the overflow step (#2416) by HELSON
- [hotfix] add DISTPAN argument for benchmark (#2412) by HELSON
- [hotfix] fix gpt gemini example (#2404) by HELSON
- [hotfix] issue #2388 by Jiarui Fang
- [hotfix] issue #2388 by jiaruifang
- [hotfix] fix implement error in diffusers by Jiarui Fang
- [hotfix] fix implement error in diffusers by 1SAA
Autochunk
- [autochunk] add benchmark for transformer and alphafold (#2543) by oahzxl
- [autochunk] support multi outputs chunk search (#2538) by oahzxl
- [autochunk] support transformer (#2526) by oahzxl
- [autochunk] support parsing blocks (#2506) by oahzxl
- [autochunk] support autochunk on evoformer (#2497) by oahzxl
- [autochunk] support evoformer tracer (#2485) by oahzxl
- [autochunk] add autochunk feature by Jiarui Fang
Git
- [git] remove invalid submodule (#2540) by binmakeswell
Gemini
- [gemini] add profiler in the demo (#2534) by HELSON
- [gemini] update the gpt example (#2527) by HELSON
- [gemini] update ddp strict mode (#2518) by HELSON
- [gemini] add get static torch model (#2356) by HELSON
Example
- [example] Add fastfold tutorial (#2528) by [LuGY]...
Version v0.2.0 Release Today!
What's Changed
Version
- [version] 0.1.14 -> 0.2.0 (#2286) by Jiarui Fang
Examples
- [examples] using args and combining two versions for PaLM (#2284) by ZijianYY
- [examples] replace einsum with matmul (#2210) by ZijianYY
Doc
- [doc] add feature diffusion v2, bloom, auto-parallel (#2282) by binmakeswell
- [doc] updated the stable diffusion on docker usage (#2244) by Frank Lee
Zero
- [zero] polish low level zero optimizer (#2275) by HELSON
- [zero] fix error for BEiT models (#2169) by HELSON
Example
- [example] add benchmark (#2276) by Ziyue Jiang
- [example] fix save_load bug for dreambooth (#2280) by BlueRum
- [example] GPT polish readme (#2274) by Jiarui Fang
- [example] fix gpt example with 0.1.10 (#2265) by HELSON
- [example] clear diffuser image (#2262) by Fazzie-Maqianli
- [example] diffusion install from docker (#2239) by Jiarui Fang
- [example] fix benchmark.sh for gpt example (#2229) by HELSON
- [example] make palm + GeminiDPP work (#2227) by Jiarui Fang
- [example] Palm adding gemini, still has bugs (#2221) by ZijianYY
- [example] update gpt example (#2225) by HELSON
- [example] add benchmark.sh for gpt (#2226) by Jiarui Fang
- [example] update gpt benchmark (#2219) by HELSON
- [example] update GPT example benchmark results (#2212) by Jiarui Fang
- [example] update gpt example for larger model scale (#2211) by Jiarui Fang
- [example] update gpt readme with performance (#2206) by Jiarui Fang
- [example] polish doc (#2201) by ziyuhuang123
- [example] Change some training settings for diffusion (#2195) by BlueRum
- [example] support Dreamblooth (#2188) by Fazzie-Maqianli
- [example] gpt demo more accuracy tflops (#2178) by Jiarui Fang
- [example] add palm pytorch version (#2172) by Jiarui Fang
- [example] update vit readme (#2155) by Jiarui Fang
- [example] add zero1, zero2 example in GPT examples (#2146) by HELSON
Hotfix
- [hotfix] fix fp16 optimizer bug (#2273) by YuliangLiu0306
- [hotfix] fix error for torch 2.0 (#2243) by xcnick
- [hotfix] Fixing the bug related to ipv6 support by Tongping Liu
- [hotfix] correct cpu_optim runtime compilation (#2197) by Jiarui Fang
- [hotfix] add kwargs for colo_addmm (#2171) by Tongping Liu
- [hotfix] Jit type hint #2161 (#2164) by アマデウス
- [hotfix] fix auto policy of test_sharded_optim_v2 (#2157) by Jiarui Fang
- [hotfix] fix aten default bug (#2158) by YuliangLiu0306
Autoparallel
- [autoparallel] fix spelling error (#2270) by YuliangLiu0306
- [autoparallel] gpt2 autoparallel examples (#2267) by YuliangLiu0306
- [autoparallel] patch torch.flatten metainfo for autoparallel (#2247) by Boyuan Yao
- [autoparallel] autoparallel initialize (#2238) by YuliangLiu0306
- [autoparallel] fix construct meta info. (#2245) by Super Daniel
- [autoparallel] record parameter attribute in colotracer (#2217) by YuliangLiu0306
- [autoparallel] Attach input, buffer and output tensor to MetaInfo class (#2162) by Boyuan Yao
- [autoparallel] new metainfoprop based on metainfo class (#2179) by Boyuan Yao
- [autoparallel] update getitem handler (#2207) by YuliangLiu0306
- [autoparallel] update_getattr_handler (#2193) by YuliangLiu0306
- [autoparallel] add gpt2 performance test code (#2194) by YuliangLiu0306
- [autoparallel] integrate_gpt_related_tests (#2134) by YuliangLiu0306
- [autoparallel] memory estimation for shape consistency (#2144) by Boyuan Yao
- [autoparallel] use metainfo in handler (#2149) by YuliangLiu0306
Gemini
- [Gemini] fix the convert_to_torch_module bug (#2269) by Jiarui Fang
Pipeline middleware
- [Pipeline Middleware] Reduce comm redundancy by getting accurate output (#2232) by Ziyue Jiang
Builder
- [builder] builder for scaled_upper_triang_masked_softmax (#2234) by Jiarui Fang
- [builder] polish builder with better base class (#2216) by Jiarui Fang
- [builder] raise Error when CUDA_HOME is not set (#2213) by Jiarui Fang
- [builder] multihead attn runtime building (#2203) by Jiarui Fang
- [builder] unified cpu_optim fused_optim inferface (#2190) by Jiarui Fang
- [builder] use runtime builder for fused_optim (#2189) by Jiarui Fang
- [builder] runtime adam and fused_optim builder (#2184) by Jiarui Fang
- [builder] use builder() for cpu adam and fused optim in setup.py (#2187) by Jiarui Fang
Logger
- [logger] hotfix, missing _FORMAT (#2231) by Super Daniel
NFC
- [NFC] fix some typos (#2175) by ziyuhuang123
- [NFC] update news link (#2191) by binmakeswell
- [NFC] fix a typo 'stable-diffusion-typo-fine-tune' by Arsmart1
Example
- [example] diffuser, support quant inference for stable diffusion (#2186) by BlueRum
- [example] add vit missing functions (#2154) by Jiarui Fang
Pipeline middleware
- [Pipeline Middleware] Fix deadlock when num_microbatch=num_stage (#2156) by Ziyue Jiang
Full Changelog: v0.2.0...v0.1.13
Version v0.1.13 Release Today!
What's Changed
Version
- [version] 0.1.13 (#2152) by Jiarui Fang
- Revert "[version] version to v0.1.13 (#2139)" (#2153) by Jiarui Fang
- [version] version to v0.1.13 (#2139) by Jiarui Fang
Gemini
- [Gemini] GeminiDPP convert to PyTorch Module. (#2151) by Jiarui Fang
- [Gemini] Update coloinit_ctx to support meta_tensor (#2147) by BlueRum
- [Gemini] revert ZeROInitCtx related tracer (#2138) by Jiarui Fang
- [Gemini] update API of the chunkmemstatscollector. (#2129) by Jiarui Fang
- [Gemini] update the non model data record method in runtime memory tracer (#2128) by Jiarui Fang
- [Gemini] test step-tensor mapping using repeated_computed_layers.py (#2127) by Jiarui Fang
- [Gemini] update non model data calculation method (#2126) by Jiarui Fang
- [Gemini] hotfix the unittest bugs (#2125) by Jiarui Fang
- [Gemini] mapping of preop timestep and param (#2124) by Jiarui Fang
- [Gemini] chunk init using runtime visited param order (#2115) by Jiarui Fang
- [Gemini] chunk init use OrderedParamGenerator (#2110) by Jiarui Fang
Nfc
- [NFC] remove useless graph node code (#2150) by Jiarui Fang
- [NFC] update chunk manager API (#2119) by Jiarui Fang
- [NFC] polish comments for Chunk class (#2116) by Jiarui Fang
Autoparallel
- [autoparallel] process size nodes in runtime pass (#2130) by YuliangLiu0306
- [autoparallel] implement softmax handler (#2132) by YuliangLiu0306
- [autoparallel] gpt2lp runtime test (#2113) by YuliangLiu0306
Example
- Merge pull request #2120 from Fazziekey/example/stablediffusion-v2 by Fazzie-Maqianli
Pp middleware
- [PP Middleware] Add bwd and step for PP middleware (#2111) by Ziyue Jiang
Full Changelog: v0.1.13...v0.1.12
Version v0.1.12 Release Today!
What's Changed
Gemini
- [gemini] get the param visited order during runtime (#2108) by Jiarui Fang
- [Gemini] NFC, polish search_chunk_configuration (#2107) by Jiarui Fang
- [Gemini] gemini use the runtime memory tracer (RMT) (#2099) by Jiarui Fang
- [Gemini] make RuntimeMemTracer work correctly (#2096) by Jiarui Fang
- [Gemini] remove eval in gemini unittests! (#2092) by Jiarui Fang
- [Gemini] remove GLOBAL_MODEL_DATA_TRACER (#2091) by Jiarui Fang
- [Gemini] remove GLOBAL_CUDA_MEM_INFO (#2090) by Jiarui Fang
- [Gemini] use MemStats in Runtime Memory tracer (#2088) by Jiarui Fang
- [Gemini] use MemStats to store the tracing data. Separate it from Collector. (#2084) by Jiarui Fang
- [Gemini] remove static tracer (#2083) by Jiarui Fang
- [Gemini] ParamOpHook -> ColoParamOpHook (#2080) by Jiarui Fang
- [Gemini] polish runtime tracer tests (#2077) by Jiarui Fang
- [Gemini] rename hooks related to runtime mem tracer (#2076) by Jiarui Fang
- [Gemini] add albert in test models. (#2075) by Jiarui Fang
- [Gemini] rename ParamTracerWrapper -> RuntimeMemTracer (#2073) by Jiarui Fang
- [Gemini] remove not used MemtracerWrapper (#2072) by Jiarui Fang
- [Gemini] fix grad unreleased issue and param recovery issue (#2052) by Zihao
Hotfix
- [hotfix] fix a typo in ColoInitContext (#2106) by Jiarui Fang
- [hotfix] update test for latest version (#2060) by YuliangLiu0306
- [hotfix] skip gpt tracing test (#2064) by YuliangLiu0306
Colotensor
- [ColoTensor] throw error when ColoInitContext meets meta parameter. (#2105) by Jiarui Fang
Autoparallel
- [autoparallel] support linear function bias addition (#2104) by YuliangLiu0306
- [autoparallel] support addbmm computation (#2102) by YuliangLiu0306
- [autoparallel] add sum handler (#2101) by YuliangLiu0306
- [autoparallel] add bias addition function class (#2098) by YuliangLiu0306
- [autoparallel] complete gpt related module search (#2097) by YuliangLiu0306
- [autoparallel] add embedding handler (#2089) by YuliangLiu0306
- [autoparallel] add tensor constructor handler (#2082) by YuliangLiu0306
- [autoparallel] add non_split linear strategy (#2078) by YuliangLiu0306
- [autoparallel] Add F.conv metainfo (#2069) by Boyuan Yao
- [autoparallel] complete gpt block searching (#2065) by YuliangLiu0306
- [autoparallel] add binary elementwise metainfo for auto parallel (#2058) by Boyuan Yao
- [autoparallel] fix forward memory calculation (#2062) by Boyuan Yao
- [autoparallel] adapt solver with self attention (#2037) by YuliangLiu0306
Version
- [version] 0.1.11rc5 -> 0.1.12 (#2103) by Jiarui Fang
Pipeline middleware
- [Pipeline Middleware] fix data race in Pipeline Scheduler for DAG (#2087) by Ziyue Jiang
- [Pipeline Middleware] Adapt scheduler for Topo (#2066) by Ziyue Jiang
Fx
- [fx] An experimental version of ColoTracer. (#2002) by Super Daniel
Device
- [device] update flatten device mesh usage (#2079) by YuliangLiu0306
Test
- [test] bert test in non-distributed way (#2074) by Jiarui Fang
Pipeline
- [Pipeline] Add Topo Class (#2059) by Ziyue Jiang
Examples
- [examples] update autoparallel demo (#2061) by YuliangLiu0306
Full Changelog: v0.1.12...v0.1.11rc5
Version v0.1.11rc5 Release Today!
What's Changed
Gemini
- [gemini] fix init bugs for modules (#2047) by HELSON
- [gemini] add arguments (#2046) by HELSON
- [Gemini] free and allocate cuda memory by tensor.storage, add grad hook (#2040) by Zihao
- [Gemini] more tests for Gemini (#2038) by Jiarui Fang
- [Gemini] more rigorous unit tests for run_fwd_bwd (#2034) by Jiarui Fang
- [Gemini] paramWrapper paramTracerHook unit test (#2030) by Zihao
- [Gemini] patch for supporting torch.add_ function for ColoTensor (#2003) by Jiarui Fang
- [gemini] param_trace_hook (#2020) by Zihao
- [Gemini] add unit tests to check gemini correctness (#2015) by Jiarui Fang
- [Gemini] ParamMemHook (#2008) by Zihao
- [Gemini] param_tracer_wrapper and test case (#2009) by Zihao
Test
- [test] align model name with the file name. (#2045) by Jiarui Fang
Hotfix
- [hotfix] hotfix Gemini for no leaf modules bug (#2043) by Jiarui Fang
- [hotfix] add bert test for gemini fwd bwd (#2035) by Jiarui Fang
- [hotfix] revert bug PRs (#2016) by Jiarui Fang
Zero
- [zero] fix testing parameters (#2042) by HELSON
- [zero] fix unit-tests (#2039) by HELSON
- [zero] test gradient accumulation (#1964) by HELSON
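The gradient-accumulation entry above refers to a standard training pattern: sum gradients over several micro-batches, then apply a single optimizer step. A pure-Python toy of that pattern (names are illustrative, not the ColossalAI API):

```python
# Minimal sketch of gradient accumulation: accumulate micro-batch
# gradients, then take one SGD step. Illustrative only.

def accumulate_and_step(params, micro_batch_grads, lr=0.1):
    """SGD step using gradients summed over micro-batches."""
    acc = [0.0] * len(params)
    for grads in micro_batch_grads:        # one gradient list per micro-batch
        for i, g in enumerate(grads):
            acc[i] += g
    return [p - lr * a for p, a in zip(params, acc)]
```

Two micro-batches of gradient 0.5 behave exactly like one batch of gradient 1.0, which is the equivalence such a test typically asserts.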
Rpc
- [rpc] split with dag (#2028) by Ziyue Jiang
Autoparallel
- [autoparallel] add split handler (#2032) by YuliangLiu0306
- [autoparallel] add experimental permute handler (#2029) by YuliangLiu0306
- [autoparallel] add runtime pass and numerical test for view handler (#2018) by YuliangLiu0306
- [autoparallel] add experimental view handler (#2011) by YuliangLiu0306
- [autoparallel] mix gather (#1977) by Genghan Zhang
Fx
- [fx] Split partition with DAG information (#2025) by Ziyue Jiang
Github
- [GitHub] update issue template (#2023) by binmakeswell
Full Changelog: v0.1.11rc5...v0.1.11rc4