Releases: hpcaitech/ColossalAI
Version v0.2.6 Release Today!
What's Changed
Doc
- [doc] moved doc test command to bottom (#3075) by Frank Lee
- [doc] specified operating system requirement (#3019) by Frank Lee
- [doc] update nvme offload doc (#3014) by ver217
- [doc] add ISC tutorial (#2997) by binmakeswell
- [doc] add deepspeed citation and copyright (#2996) by ver217
- [doc] added reference to related works (#2994) by Frank Lee
- [doc] update news (#2983) by binmakeswell
- [doc] fix chatgpt inference typo (#2964) by binmakeswell
- [doc] add env scope (#2933) by binmakeswell
- [doc] added readme for documentation (#2935) by Frank Lee
- [doc] removed read-the-docs (#2932) by Frank Lee
- [doc] update installation for GPT (#2922) by binmakeswell
- [doc] add os scope, update tutorial install and tips (#2914) by binmakeswell
- [doc] fix GPT tutorial (#2860) by dawei-wang
- [doc] fix typo in opt inference tutorial (#2849) by Zheng Zeng
- [doc] update OPT serving (#2804) by binmakeswell
- [doc] update example and OPT serving link (#2769) by binmakeswell
- [doc] add opt service doc (#2747) by Frank Lee
- [doc] fixed a typo in GPT readme (#2736) by cloudhuang
- [doc] updated documentation version list (#2730) by Frank Lee
Workflow
- [workflow] fixed doc build trigger condition (#3072) by Frank Lee
- [workflow] supported conda package installation in doc test (#3028) by Frank Lee
- [workflow] fixed the post-commit failure when no formatting needed (#3020) by Frank Lee
- [workflow] added auto doc test on PR (#2929) by Frank Lee
- [workflow] moved pre-commit to post-commit (#2895) by Frank Lee
Example
- [example] fix redundant note (#3065) by binmakeswell
- [example] fixed opt model downloading from huggingface by Tomek
- [example] add LoRA support (#2821) by Haofan Wang
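For readers unfamiliar with the technique named in the LoRA entry above, the idea can be sketched in a few lines: a frozen weight W plus a trainable low-rank update scale * (A @ B). This is an illustrative toy only; the class and argument names are hypothetical, not the actual ColossalAI or PEFT API.

```python
# Illustrative-only sketch of the LoRA idea: freeze W, train only the
# low-rank factors A and B. Names here are hypothetical.

def matmul(a, b):
    """Plain-Python matrix multiply for small illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

class LoRALinear:
    def __init__(self, weight, a, b, alpha=1.0):
        self.weight = weight            # frozen base weight, shape (in, out)
        self.a = a                      # trainable, shape (in, rank)
        self.b = b                      # trainable, shape (rank, out)
        self.scale = alpha / len(b)     # alpha / rank

    def effective_weight(self):
        # W' = W + scale * (A @ B); only A and B receive gradients.
        delta = matmul(self.a, self.b)
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.weight, delta)]

    def forward(self, x):
        return matmul(x, self.effective_weight())
```

With B initialized to zeros the adapter starts as a no-op, so fine-tuning begins from the pretrained layer's behavior.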
Autochunk
- [autochunk] refactor chunk memory estimation (#2762) by Xuanlei Zhao
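The chunk memory estimation mentioned above can be illustrated with a back-of-envelope calculation: computing an intermediate activation one chunk of rows at a time bounds peak memory by the chunk size rather than the full sequence length. The formula and names below are illustrative assumptions, not ColossalAI's actual estimator.

```python
# Back-of-envelope sketch of an activation-chunking memory estimate.
# Illustrative only; not the real autochunk estimator.

def chunked_peak_activation(seq_len, hidden, chunk_size, dtype_bytes=2):
    """Compare full vs. chunked peak bytes for one (seq_len, hidden)
    fp16 intermediate computed chunk_size rows at a time."""
    full_bytes = seq_len * hidden * dtype_bytes
    n_chunks = -(-seq_len // chunk_size)              # ceil division
    peak_bytes = chunk_size * hidden * dtype_bytes    # one chunk alive at a time
    return n_chunks, peak_bytes, full_bytes
```

For seq_len=1024, hidden=64, chunk_size=128, the intermediate shrinks from 128 KiB to 16 KiB at the cost of 8 sequential chunk passes.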
Chatgpt
- [chatgpt] change critic input as state (#3042) by wenjunyang
- [chatgpt] fix readme (#3025) by BlueRum
- [chatgpt] Add saving ckpt callback for PPO (#2880) by LuGY
- [chatgpt] fix inference model load (#2988) by BlueRum
- [chatgpt] allow shard init and display warning (#2986) by ver217
- [chatgpt] fix lora gemini conflict in RM training (#2984) by BlueRum
- [chatgpt] making experience support dp (#2971) by ver217
- [chatgpt] fix lora bug (#2974) by BlueRum
- [chatgpt] fix inference demo loading bug (#2969) by BlueRum
- [ChatGPT] fix README (#2966) by Fazzie-Maqianli
- [chatgpt] add inference example (#2944) by BlueRum
- [chatgpt] support opt & gpt for rm training (#2876) by BlueRum
- [chatgpt] Support saving ckpt in examples (#2846) by BlueRum
- [chatgpt] fix rm eval (#2829) by BlueRum
- [chatgpt] add test checkpoint (#2797) by ver217
- [chatgpt] update readme about checkpoint (#2792) by ver217
- [chatgpt] strategy add prepare method (#2766) by ver217
- [chatgpt] disable shard init for colossalai (#2767) by ver217
- [chatgpt] support colossalai strategy to train rm (#2742) by BlueRum
- [chatgpt] fix train_rm bug with lora (#2741) by BlueRum
Dtensor
- [DTensor] refactor CommSpec (#3034) by YuliangLiu0306
- [DTensor] refactor sharding spec (#2987) by YuliangLiu0306
- [DTensor] implementation of dtensor (#2946) by YuliangLiu0306
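The DTensor entries above revolve around sharding specs: a description of how each tensor dimension is laid out across a device mesh. A toy model, with illustrative notation only (each dimension is replicated "R" or sharded along one mesh axis "S0", "S1", ...; this is not ColossalAI's actual API):

```python
# Toy sharding-spec model: compute the per-device shard shape implied by
# a spec over a device mesh. Illustrative only.

def local_shape(global_shape, spec, mesh_shape):
    """Per-device shard shape implied by a sharding spec."""
    shape = list(global_shape)
    for dim, placement in enumerate(spec):
        if placement.startswith("S"):
            mesh_axis = int(placement[1:])
            # assume even divisibility for the sketch
            assert shape[dim] % mesh_shape[mesh_axis] == 0
            shape[dim] //= mesh_shape[mesh_axis]
    return tuple(shape)
```

On a 4x2 mesh, the spec ("S0", "R") splits dim 0 across the 4-way mesh axis and replicates dim 1 on the 2-way axis.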
Hotfix
- [hotfix] skip auto checkpointing tests (#3029) by YuliangLiu0306
- [hotfix] add shard dim to avoid backward communication error (#2954) by YuliangLiu0306
- [hotfix]: Remove math.prod dependency (#2837) by Jiatong (Julius) Han
- [hotfix] fix autoparallel compatibility test issues (#2754) by YuliangLiu0306
- [hotfix] fix chunk size can not be divided (#2867) by HELSON
- Hotfix/auto parallel zh doc (#2820) by YuliangLiu0306
- [hotfix] add copyright for solver and device mesh (#2803) by YuliangLiu0306
- [hotfix] add correct device for fake_param (#2796) by HELSON
Revert
- [Revert] recover "[refactor]"
Format
- [format] applied code formatting on changed files in pull request 3025 (#3026) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 2997 (#3008) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 2933 (#2939) by github-actions[bot]
- [format] applied code formatting on changed files in pull request 2922 (#2923) by github-actions[bot]
Pipeline
- [pipeline] Add Simplified Alpa DP Partition (#2507) by Ziyue Jiang
Fx
- [fx] remove depreciated algorithms. (#2312) (#2313) by Super Daniel
Refactor
- [refactor] restructure configuration files (#2977) by Saurav Maheshkar
Autoparallel
- [autoparallel] apply repeat block to reduce solving time (#2912) by YuliangLiu0306
- [autoparallel] find repeat blocks (#2854) by YuliangLiu0306
- [autoparallel] Patch meta information for nodes that will not be handled by SPMD solver (#2823) by Boyuan Yao
- [autoparallel] Patch meta information of torch.where (#2822) by Boyuan Yao
- [autoparallel] Patch meta information of torch.tanh() and torch.nn.Dropout (#2773) by Boyuan Yao
- [autoparallel] Patch tensor related operations meta information (#2789) by Boyuan Yao
- [autoparallel] rotor solver refactor (#2813) by Boyuan Yao
- [autoparallel] Patch meta information of torch.nn.Embedding (#2760) by Boyuan Yao
- [autoparallel] distinguish different parallel strategies (#2699) by YuliangLiu0306
Zero
- [zero] trivial zero optimizer refactoring (#2869) by YH
- [zero] fix wrong import (#2777) by Boyuan Yao
Nfc
- [NFC] polish colossalai/engine/schedule/_pipeline_schedule.py code style (#2744) by Michelle
- [NFC] polish code format by binmakeswell
- [NFC] polish colossala...
Version v0.2.5 Release Today!
What's Changed
Autoparallel
- [autoparallel] add shard option (#2696) by YuliangLiu0306
- [autoparallel] fix parameters sharding bug (#2716) by YuliangLiu0306
- [autoparallel] refactor runtime pass (#2644) by YuliangLiu0306
- [autoparallel] remove deprecated codes (#2664) by YuliangLiu0306
- [autoparallel] test compatibility for gemini and auto parallel (#2700) by YuliangLiu0306
Doc
- [doc] updated documentation version list (#2715) by Frank Lee
- [doc] add open-source contribution invitation (#2714) by binmakeswell
- [doc] add Quick Preview (#2706) by binmakeswell
- [doc] resize figure (#2705) by binmakeswell
- [doc] add ChatGPT (#2703) by binmakeswell
App
- [app] fix ChatGPT requirements (#2704) by binmakeswell
- [app] add chatgpt application (#2698) by ver217
Full Changelog: v0.2.5...v0.2.4
Version v0.2.4 Release Today!
What's Changed
Doc
- [doc] update auto parallel paper link (#2686) by binmakeswell
- [doc] added documentation sidebar translation (#2670) by Frank Lee
Gemini
- [gemini] fix colo_init_context (#2683) by ver217
- [gemini] add fake_release_chunk for keep-gathered chunk in the inference mode (#2671) by HELSON
Workflow
- [workflow] fixed community report ranking (#2680) by Frank Lee
- [workflow] added trigger to build doc upon release (#2678) by Frank Lee
- [workflow] added doc build test (#2675) by Frank Lee
Autoparallel
- [autoparallel] Patch meta information of torch.nn.functional.softmax and torch.nn.Softmax (#2674) by Boyuan Yao
Full Changelog: v0.2.4...v0.2.3
Version v0.2.3 Release Today!
What's Changed
Doc
- [doc] add CVPR tutorial (#2666) by binmakeswell
Docs
- [Docs] layout converting management (#2665) by YuliangLiu0306
Autoparallel
- [autoparallel] Patch meta information of torch.nn.LayerNorm (#2647) by Boyuan Yao
Full Changelog: v0.2.3...v0.2.2
Version v0.2.2 Release Today!
What's Changed
Workflow
- [workflow] fixed gpu memory check condition (#2659) by Frank Lee
- [workflow] fixed the test coverage report (#2614) by Frank Lee
- [workflow] fixed test coverage report (#2611) by Frank Lee
Example
- [example] Polish README.md (#2658) by Jiatong (Julius) Han
Doc
- [doc] fixed compatibility with docusaurus (#2657) by Frank Lee
- [doc] added docusaurus-based version control (#2656) by Frank Lee
- [doc] migrate the markdown files (#2652) by Frank Lee
- [doc] fix typo of BLOOM (#2643) by binmakeswell
- [doc] removed pre-built wheel installation from readme (#2637) by Frank Lee
- [doc] updated the sphinx theme (#2635) by Frank Lee
- [doc] fixed broken badge (#2623) by Frank Lee
Autoparallel
- [autoparallel] refactor handlers which reshape input tensors (#2615) by YuliangLiu0306
- [autoparallel] adapt autoparallel tests with latest api (#2626) by YuliangLiu0306
- [autoparallel] Patch meta information of torch.matmul (#2584) by Boyuan Yao
Tutorial
- [tutorial] added energonai to opt inference requirements (#2625) by Frank Lee
- [tutorial] add video link (#2619) by binmakeswell
Full Changelog: v0.2.2...v0.2.1
Version v0.2.1 Release Today!
What's Changed
Workflow
- [workflow] fixed broken release workflows (#2604) by Frank Lee
- [workflow] added cuda extension build test before release (#2598) by Frank Lee
- [workflow] hooked pypi release with lark (#2596) by Frank Lee
- [workflow] hooked docker release with lark (#2594) by Frank Lee
- [workflow] added test-pypi check before release (#2591) by Frank Lee
- [workflow] fixed the typo in the example check workflow (#2589) by Frank Lee
- [workflow] hook compatibility test failure to lark (#2586) by Frank Lee
- [workflow] hook example test alert with lark (#2585) by Frank Lee
- [workflow] added notification if scheduled build fails (#2574) by Frank Lee
- [workflow] added discussion stats to community report (#2572) by Frank Lee
- [workflow] refactored compatibility test workflow for maintainability (#2560) by Frank Lee
- [workflow] adjust the GPU memory threshold for scheduled unit test (#2558) by Frank Lee
- [workflow] fixed example check workflow (#2554) by Frank Lee
- [workflow] fixed typos in the leaderboard workflow (#2567) by Frank Lee
- [workflow] added contributor and user-engagement report (#2564) by Frank Lee
- [workflow] only report coverage for changed files (#2524) by Frank Lee
- [workflow] fixed the precommit CI (#2525) by Frank Lee
- [workflow] fixed changed file detection (#2515) by Frank Lee
- [workflow] fixed the skip condition of example weekly check workflow (#2481) by Frank Lee
- [workflow] automated bdist wheel build (#2459) by Frank Lee
- [workflow] automated the compatibility test (#2453) by Frank Lee
- [workflow] fixed the on-merge condition check (#2452) by Frank Lee
- [workflow] make test coverage report collapsable (#2436) by Frank Lee
- [workflow] report test coverage even if below threshold (#2431) by Frank Lee
- [workflow] auto comment with test coverage report (#2419) by Frank Lee
- [workflow] auto comment if precommit check fails (#2417) by Frank Lee
- [workflow] added translation for non-english comments (#2414) by Frank Lee
- [workflow] added precommit check for code consistency (#2401) by Frank Lee
- [workflow] refactored the example check workflow (#2411) by Frank Lee
- [workflow] added nightly release to pypi (#2403) by Frank Lee
- [workflow] added missing file change detection output (#2387) by Frank Lee
- [workflow] New version: Create workflow files for examples' auto check (#2298) by ziyuhuang123
- [workflow] fixed pypi release workflow error (#2328) by Frank Lee
- [workflow] fixed pypi release workflow error (#2327) by Frank Lee
- [workflow] added workflow to release to pypi upon version change (#2320) by Frank Lee
- [workflow] removed unused assign reviewer workflow (#2318) by Frank Lee
- [workflow] rebuild cuda kernels when kernel-related files change (#2317) by Frank Lee
Doc
- [doc] updated readme for CI/CD (#2600) by Frank Lee
- [doc] fixed issue link in pr template (#2577) by Frank Lee
- [doc] updated the CHANGE_LOG.md for github release page (#2552) by Frank Lee
- [doc] fixed the typo in pr template (#2556) by Frank Lee
- [doc] added pull request template (#2550) by Frank Lee
- [doc] update example link (#2520) by binmakeswell
- [doc] update opt and tutorial links (#2509) by binmakeswell
- [doc] added documentation for CI/CD (#2420) by Frank Lee
- [doc] updated kernel-related optimisers' docstring (#2385) by Frank Lee
- [doc] updated readme regarding pypi installation (#2406) by Frank Lee
- [doc] hotfix #2377 by Jiarui Fang
- [doc] hotfix #2377 by jiaruifang
- [doc] update stable diffusion link (#2322) by binmakeswell
- [doc] update diffusion doc (#2296) by binmakeswell
- [doc] update news (#2295) by binmakeswell
- [doc] update news by binmakeswell
Setup
- [setup] fixed inconsistent version meta (#2578) by Frank Lee
- [setup] refactored setup.py for dependency graph (#2413) by Frank Lee
- [setup] support pre-build and jit-build of cuda kernels (#2374) by Frank Lee
- [setup] make cuda extension build optional (#2336) by Frank Lee
- [setup] remove torch dependency (#2333) by Frank Lee
- [setup] removed the build dependency on colossalai (#2307) by Frank Lee
Tutorial
- [tutorial] polish README (#2568) by binmakeswell
- [tutorial] update fastfold tutorial (#2565) by oahzxl
Polish
- [polish] polish ColoTensor and its submodules (#2537) by HELSON
- [polish] polish code for get_static_torch_model (#2405) by HELSON
Hotfix
- [hotfix] fix zero ddp warmup check (#2545) by ver217
- [hotfix] fix autoparallel demo (#2533) by YuliangLiu0306
- [hotfix] fix lightning error (#2529) by HELSON
- [hotfix] meta tensor default device. (#2510) by Super Daniel
- [hotfix] gpt example titans bug #2493 (#2494) by Jiarui Fang
- [hotfix] gpt example titans bug #2493 by jiaruifang
- [hotfix] add norm clearing for the overflow step (#2416) by HELSON
- [hotfix] add DISTPAN argument for benchmark (#2412) by HELSON
- [hotfix] fix gpt gemini example (#2404) by HELSON
- [hotfix] issue #2388 by Jiarui Fang
- [hotfix] issue #2388 by jiaruifang
- [hotfix] fix implement error in diffusers by Jiarui Fang
- [hotfix] fix implement error in diffusers by 1SAA
Autochunk
- [autochunk] add benchmark for transformer and alphafold (#2543) by oahzxl
- [autochunk] support multi outputs chunk search (#2538) by oahzxl
- [autochunk] support transformer (#2526) by oahzxl
- [autochunk] support parsing blocks (#2506) by oahzxl
- [autochunk] support autochunk on evoformer (#2497) by oahzxl
- [autochunk] support evoformer tracer (#2485) by oahzxl
- [autochunk] add autochunk feature by Jiarui Fang
Git
- [git] remove invalid submodule (#2540) by binmakeswell
Gemini
- [gemini] add profiler in the demo (#2534) by HELSON
- [gemini] update the gpt example (#2527) by HELSON
- [gemini] update ddp strict mode (#2518) by HELSON
- [gemini] add get static torch model (#2356) by HELSON
Example
- [example] Add fastfold tutorial (#2528) by [LuGY]...
Version v0.2.0 Release Today!
What's Changed
Version
- [version] 0.1.14 -> 0.2.0 (#2286) by Jiarui Fang
Examples
- [examples] using args and combining two versions for PaLM (#2284) by ZijianYY
- [examples] replace einsum with matmul (#2210) by ZijianYY
Doc
- [doc] add feature diffusion v2, bloom, auto-parallel (#2282) by binmakeswell
- [doc] updated the stable diffusion on docker usage (#2244) by Frank Lee
Zero
- [zero] polish low level zero optimizer (#2275) by HELSON
- [zero] fix error for BEiT models (#2169) by HELSON
Example
- [example] add benchmark (#2276) by Ziyue Jiang
- [example] fix save_load bug for dreambooth (#2280) by BlueRum
- [example] GPT polish readme (#2274) by Jiarui Fang
- [example] fix gpt example with 0.1.10 (#2265) by HELSON
- [example] clear diffuser image (#2262) by Fazzie-Maqianli
- [example] diffusion install from docker (#2239) by Jiarui Fang
- [example] fix benchmark.sh for gpt example (#2229) by HELSON
- [example] make palm + GeminiDPP work (#2227) by Jiarui Fang
- [example] Palm adding gemini, still has bugs (#2221) by ZijianYY
- [example] update gpt example (#2225) by HELSON
- [example] add benchmark.sh for gpt (#2226) by Jiarui Fang
- [example] update gpt benchmark (#2219) by HELSON
- [example] update GPT example benchmark results (#2212) by Jiarui Fang
- [example] update gpt example for larger model scale (#2211) by Jiarui Fang
- [example] update gpt readme with performance (#2206) by Jiarui Fang
- [example] polish doc (#2201) by ziyuhuang123
- [example] Change some training settings for diffusion (#2195) by BlueRum
- [example] support Dreamblooth (#2188) by Fazzie-Maqianli
- [example] gpt demo more accuracy tflops (#2178) by Jiarui Fang
- [example] add palm pytorch version (#2172) by Jiarui Fang
- [example] update vit readme (#2155) by Jiarui Fang
- [example] add zero1, zero2 example in GPT examples (#2146) by HELSON
Hotfix
- [hotfix] fix fp16 optimizer bug (#2273) by YuliangLiu0306
- [hotfix] fix error for torch 2.0 (#2243) by xcnick
- [hotfix] Fixing the bug related to ipv6 support by Tongping Liu
- [hotfix] correct cpu_optim runtime compilation (#2197) by Jiarui Fang
- [hotfix] add kwargs for colo_addmm (#2171) by Tongping Liu
- [hotfix] Jit type hint #2161 (#2164) by アマデウス
- [hotfix] fix auto policy of test_sharded_optim_v2 (#2157) by Jiarui Fang
- [hotfix] fix aten default bug (#2158) by YuliangLiu0306
Autoparallel
- [autoparallel] fix spelling error (#2270) by YuliangLiu0306
- [autoparallel] gpt2 autoparallel examples (#2267) by YuliangLiu0306
- [autoparallel] patch torch.flatten metainfo for autoparallel (#2247) by Boyuan Yao
- [autoparallel] autoparallel initialize (#2238) by YuliangLiu0306
- [autoparallel] fix construct meta info. (#2245) by Super Daniel
- [autoparallel] record parameter attribute in colotracer (#2217) by YuliangLiu0306
- [autoparallel] Attach input, buffer and output tensor to MetaInfo class (#2162) by Boyuan Yao
- [autoparallel] new metainfoprop based on metainfo class (#2179) by Boyuan Yao
- [autoparallel] update getitem handler (#2207) by YuliangLiu0306
- [autoparallel] update_getattr_handler (#2193) by YuliangLiu0306
- [autoparallel] add gpt2 performance test code (#2194) by YuliangLiu0306
- [autoparallel] integrate_gpt_related_tests (#2134) by YuliangLiu0306
- [autoparallel] memory estimation for shape consistency (#2144) by Boyuan Yao
- [autoparallel] use metainfo in handler (#2149) by YuliangLiu0306
Gemini
- [Gemini] fix the convert_to_torch_module bug (#2269) by Jiarui Fang
Pipeline middleware
- [Pipeline Middleware] Reduce comm redundancy by getting accurate output (#2232) by Ziyue Jiang
Builder
- [builder] builder for scaled_upper_triang_masked_softmax (#2234) by Jiarui Fang
- [builder] polish builder with better base class (#2216) by Jiarui Fang
- [builder] raise Error when CUDA_HOME is not set (#2213) by Jiarui Fang
- [builder] multihead attn runtime building (#2203) by Jiarui Fang
- [builder] unified cpu_optim fused_optim inferface (#2190) by Jiarui Fang
- [builder] use runtime builder for fused_optim (#2189) by Jiarui Fang
- [builder] runtime adam and fused_optim builder (#2184) by Jiarui Fang
- [builder] use builder() for cpu adam and fused optim in setup.py (#2187) by Jiarui Fang
Logger
- [logger] hotfix, missing _FORMAT (#2231) by Super Daniel
NFC
- [NFC] fix some typos (#2175) by ziyuhuang123
- [NFC] update news link (#2191) by binmakeswell
- [NFC] fix a typo 'stable-diffusion-typo-fine-tune' by Arsmart1
Example
- [example] diffuser, support quant inference for stable diffusion (#2186) by BlueRum
- [example] add vit missing functions (#2154) by Jiarui Fang
Pipeline middleware
- [Pipeline Middleware] Fix deadlock when num_microbatch=num_stage (#2156) by Ziyue Jiang
Full Changelog: v0.2.0...v0.1.13
Version v0.1.13 Release Today!
What's Changed
Version
- [version] 0.1.13 (#2152) by Jiarui Fang
- Revert "[version] version to v0.1.13 (#2139)" (#2153) by Jiarui Fang
- [version] version to v0.1.13 (#2139) by Jiarui Fang
Gemini
- [Gemini] GeminiDPP convert to PyTorch Module. (#2151) by Jiarui Fang
- [Gemini] Update coloinit_ctx to support meta_tensor (#2147) by BlueRum
- [Gemini] revert ZeROInitCtx related tracer (#2138) by Jiarui Fang
- [Gemini] update API of the chunkmemstatscollector. (#2129) by Jiarui Fang
- [Gemini] update the non model data record method in runtime memory tracer (#2128) by Jiarui Fang
- [Gemini] test step-tensor mapping using repeated_computed_layers.py (#2127) by Jiarui Fang
- [Gemini] update non model data calculation method (#2126) by Jiarui Fang
- [Gemini] hotfix the unittest bugs (#2125) by Jiarui Fang
- [Gemini] mapping of preop timestep and param (#2124) by Jiarui Fang
- [Gemini] chunk init using runtime visited param order (#2115) by Jiarui Fang
- [Gemini] chunk init use OrderedParamGenerator (#2110) by Jiarui Fang
Nfc
- [NFC] remove useless graph node code (#2150) by Jiarui Fang
- [NFC] update chunk manager API (#2119) by Jiarui Fang
- [NFC] polish comments for Chunk class (#2116) by Jiarui Fang
Autoparallel
- [autoparallel] process size nodes in runtime pass (#2130) by YuliangLiu0306
- [autoparallel] implement softmax handler (#2132) by YuliangLiu0306
- [autoparallel] gpt2lp runtime test (#2113) by YuliangLiu0306
Example
- Merge pull request #2120 from Fazziekey/example/stablediffusion-v2 by Fazzie-Maqianli
Pp middleware
- [PP Middleware] Add bwd and step for PP middleware (#2111) by Ziyue Jiang
Full Changelog: v0.1.13...v0.1.12
Version v0.1.12 Release Today!
What's Changed
Gemini
- [gemini] get the param visited order during runtime (#2108) by Jiarui Fang
- [Gemini] NFC, polish search_chunk_configuration (#2107) by Jiarui Fang
- [Gemini] gemini use the runtime memory tracer (RMT) (#2099) by Jiarui Fang
- [Gemini] make RuntimeMemTracer work correctly (#2096) by Jiarui Fang
- [Gemini] remove eval in gemini unittests! (#2092) by Jiarui Fang
- [Gemini] remove GLOBAL_MODEL_DATA_TRACER (#2091) by Jiarui Fang
- [Gemini] remove GLOBAL_CUDA_MEM_INFO (#2090) by Jiarui Fang
- [Gemini] use MemStats in Runtime Memory tracer (#2088) by Jiarui Fang
- [Gemini] use MemStats to store the tracing data. Separate it from Collector. (#2084) by Jiarui Fang
- [Gemini] remove static tracer (#2083) by Jiarui Fang
- [Gemini] ParamOpHook -> ColoParamOpHook (#2080) by Jiarui Fang
- [Gemini] polish runtime tracer tests (#2077) by Jiarui Fang
- [Gemini] rename hooks related to runtime mem tracer (#2076) by Jiarui Fang
- [Gemini] add albert in test models. (#2075) by Jiarui Fang
- [Gemini] rename ParamTracerWrapper -> RuntimeMemTracer (#2073) by Jiarui Fang
- [Gemini] remove not used MemtracerWrapper (#2072) by Jiarui Fang
- [Gemini] fix grad unreleased issue and param recovery issue (#2052) by Zihao
Hotfix
- [hotfix] fix a typo in ColoInitContext (#2106) by Jiarui Fang
- [hotfix] update test for latest version (#2060) by YuliangLiu0306
- [hotfix] skip gpt tracing test (#2064) by YuliangLiu0306
Colotensor
- [ColoTensor] throw error when ColoInitContext meets meta parameter. (#2105) by Jiarui Fang
Autoparallel
- [autoparallel] support linear function bias addition (#2104) by YuliangLiu0306
- [autoparallel] support addbmm computation (#2102) by YuliangLiu0306
- [autoparallel] add sum handler (#2101) by YuliangLiu0306
- [autoparallel] add bias addition function class (#2098) by YuliangLiu0306
- [autoparallel] complete gpt related module search (#2097) by YuliangLiu0306
- [autoparallel] add embedding handler (#2089) by YuliangLiu0306
- [autoparallel] add tensor constructor handler (#2082) by YuliangLiu0306
- [autoparallel] add non_split linear strategy (#2078) by YuliangLiu0306
- [autoparallel] Add F.conv metainfo (#2069) by Boyuan Yao
- [autoparallel] complete gpt block searching (#2065) by YuliangLiu0306
- [autoparallel] add binary elementwise metainfo for auto parallel (#2058) by Boyuan Yao
- [autoparallel] fix forward memory calculation (#2062) by Boyuan Yao
- [autoparallel] adapt solver with self attention (#2037) by YuliangLiu0306
Version
- [version] 0.1.11rc5 -> 0.1.12 (#2103) by Jiarui Fang
Pipeline middleware
- [Pipeline Middleware] fix data race in Pipeline Scheduler for DAG (#2087) by Ziyue Jiang
- [Pipeline Middleware] Adapt scheduler for Topo (#2066) by Ziyue Jiang
Fx
- [fx] An experimental version of ColoTracer. (#2002) by Super Daniel
Device
- [device] update flatten device mesh usage (#2079) by YuliangLiu0306
Test
- [test] bert test in non-distributed way (#2074) by Jiarui Fang
Pipeline
- [Pipeline] Add Topo Class (#2059) by Ziyue Jiang
Examples
- [examples] update autoparallel demo (#2061) by YuliangLiu0306
Full Changelog: v0.1.12...v0.1.11rc5
Version v0.1.11rc5 Release Today!
What's Changed
Gemini
- [gemini] fix init bugs for modules (#2047) by HELSON
- [gemini] add arguments (#2046) by HELSON
- [Gemini] free and allocate cuda memory by tensor.storage, add grad hook (#2040) by Zihao
- [Gemini] more tests for Gemini (#2038) by Jiarui Fang
- [Gemini] more rigorous unit tests for run_fwd_bwd (#2034) by Jiarui Fang
- [Gemini] paramWrapper paramTracerHook unit test (#2030) by Zihao
- [Gemini] patch for supporting torch.add_ function for ColoTensor (#2003) by Jiarui Fang
- [gemini] param_trace_hook (#2020) by Zihao
- [Gemini] add unit tests to check gemini correctness (#2015) by Jiarui Fang
- [Gemini] ParamMemHook (#2008) by Zihao
- [Gemini] param_tracer_wrapper and test case (#2009) by Zihao
Test
- [test] align model name with the file name. (#2045) by Jiarui Fang
Hotfix
- [hotfix] hotfix Gemini for no leaf modules bug (#2043) by Jiarui Fang
- [hotfix] add bert test for gemini fwd bwd (#2035) by Jiarui Fang
- [hotfix] revert bug PRs (#2016) by Jiarui Fang
Zero
- [zero] fix testing parameters (#2042) by HELSON
- [zero] fix unit-tests (#2039) by HELSON
- [zero] test gradient accumulation (#1964) by HELSON
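The gradient-accumulation entry above refers to a standard training pattern: sum gradients over several micro-batches, then apply a single optimizer step. A pure-Python toy of that pattern (names are illustrative, not the ColossalAI API):

```python
# Minimal sketch of gradient accumulation: accumulate micro-batch
# gradients, then take one SGD step. Illustrative only.

def accumulate_and_step(params, micro_batch_grads, lr=0.1):
    """SGD step using gradients summed over micro-batches."""
    acc = [0.0] * len(params)
    for grads in micro_batch_grads:        # one gradient list per micro-batch
        for i, g in enumerate(grads):
            acc[i] += g
    return [p - lr * a for p, a in zip(params, acc)]
```

Two micro-batches of gradient 0.5 behave exactly like one batch of gradient 1.0, which is the equivalence such a test typically asserts.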
Rpc
- [rpc] split with dag (#2028) by Ziyue Jiang
Autoparallel
- [autoparallel] add split handler (#2032) by YuliangLiu0306
- [autoparallel] add experimental permute handler (#2029) by YuliangLiu0306
- [autoparallel] add runtime pass and numerical test for view handler (#2018) by YuliangLiu0306
- [autoparallel] add experimental view handler (#2011) by YuliangLiu0306
- [autoparallel] mix gather (#1977) by Genghan Zhang
Fx
- [fx] Split partition with DAG information (#2025) by Ziyue Jiang
Github
- [GitHub] update issue template (#2023) by binmakeswell
Full Changelog: v0.1.11rc5...v0.1.11rc4