Hidet v0.2.4
What's Changed
- [Version] Bump version to v0.2.4.dev by @yaoyaoding in #188
- [Dynamo] module tests + operator support by @AndreSlavescu in #148 (see the usage sketch after this list)
- Refactor compilation workflow to support CPU without CUDA by @LDY1998 in #189
- [Stack] Allow the ulimit stack size to be less than expected by @yaoyaoding in #195
- [Readme] Add platform requirements by @yaoyaoding in #196
- [DataType] Add complex64 and complex128 data type by @yaoyaoding in #200
- [Example] Add an example of running GPT-2 model by @yaoyaoding in #203
- [Fusion] Use inline pass in fusion to allow template call functions with kernel params by @yaoyaoding in #197
- [Frontend][Operator] Add missing operators for dinov2 by @yaoyaoding in #206
- [Backend] Add openmp support by @yaoyaoding in #208
- [Operator] Update batch_matmul to use Hidet Script by @hjjq in #207
- [Cache] Add cache management command line interface by @yaoyaoding in #212
- [IR] Creation-time constant fold for constant expressions by @yaoyaoding in #209
- [Torch][Operator] Allow changing torch tensor device when possible by @yaoyaoding in #214
- [Torch][Operator] Add op mapping for torch.min/max/minimum/maximum by @yaoyaoding in #216
- [Typo] Fix a typo in resnext.py by @eltociear in #210
- [Operator] Adding missing operators for llama by @yaoyaoding in #219
- [IR] Adding more support for dynamic shape on Task and FlowGraph level by @yaoyaoding in #220
- [Torch] Add mapping for `torch.ops.aten.add` and `torch.ops.aten.cos` by @yaoyaoding in #223
- [Operator][Backend] Add nvcc flags for faster math and update Attention schedule by @hjjq in #221
- [CI] Always clear the cache before tests by @yaoyaoding in #224
- Fix batch_matmul for invalid mma config for sm < 80 by @xinli-git in #227
- [Dynamic Shape] Adding more dynamic shape support by @yaoyaoding in #228
- [CI] Add `importlib_metadata` to `requirements-dev.txt` by @yaoyaoding in #233
- [Script] Add list comprehension support in hidet script by @yaoyaoding in #235
- [Refactor][Dynamic Shape] Introduce SymbolVar to implement dynamic shape by @yaoyaoding in #236 (see the dynamic-shape sketch after this list)
- [Script] Add pointer arithmetic by @yaoyaoding in #237
- [Operator][Torch] Add causal fmha and torch sdpa mapping by @hjjq in #238
- [Fixbug][Pass] Fix a bug in the `inline_let_stmt` pass by @yaoyaoding in #240
- [Options] Add option for controlling parallel build with number of jobs or memory reserved for each job by @xinli-git in #230
- [Typo] Fix a typo by @BolinSNLHM in #245
- [Typo] Fix minor spelling mistake by @Aalanli in #246
- [Fixbug] Fix a bug in StmtRewriter which discard declare scope information by @yaoyaoding in #248
- [Refactor] Adding support for compiled model by @yaoyaoding in #247
- [Operator] batch_matmul: Remove duplicate smem declaration by @hjjq in #249
- [Operator] Adding CPU support for matrix multiplication by @BolinSNLHM in #251
- [Hidet Script] Allow `bind_tuple` argument in `mapping.on(...)` and `grid(...)` by @yaoyaoding in #254
- [Hidet Script] Add `in` and `not in` expressions in hidet script by @yaoyaoding in #255
- [Codegen] Include header files as needed by @yaoyaoding in #256
- [Operator] Add new operator "normalize" that makes a group of layers (layer norm, group norm and instance norm) faster using hidet script by @xinli-git in #257
- [Testing][Models] Add gpt2 module in testing models by @yaoyaoding in #252
- [Fixbug] Fix test warnings and the incompatibility of two recent PRs by @yaoyaoding in #258
- [Operator] Add sm75 support for attention by @hjjq in #259
- [Operator] batch_matmul: Remove unroll and reduce tuning space by @hjjq in #260
- [Fixbug] Fix a bug when fused operator has no input by @yaoyaoding in #263
- [Graph] Translate softmax and reduce to hidet script by @Aalanli in #242
- [Fixbug] batch_matmul: move cc checking inside schedule by @hjjq in #264
- [Refactor] Refactor building system and adding compiled products by @yaoyaoding in #261
- [Fixbug] Reduce the default unroll factor to 4 by @yaoyaoding in #266
- [Torch] Add some torch frontend mappings for roberta-base by @hjjq in #267
- [Refactor] Remove `schedules` submodule under `hidet.graph.ops` by @yaoyaoding in #269
- [Device] Add support for mixed cpu and cuda kernels in the same flow graph by @yaoyaoding in #270
- [Dynamic Shape] Adding dynamic shape support for reduce by @Aalanli in #268
- [Complex Type] Add more support for complex data type by @yaoyaoding in #271
- [Tools] Model translator by @Aalanli in #273
- [Model] Llama model implementation in hidet by @Aalanli in #243
- [Operator] Add support for cross attention by @hjjq in #275
- [Operator] Add dynamic shape support and tests for Operators. by @Aalanli in #274
- [Fusion] Enhance the prologue epilogue fusion by @yaoyaoding in #277
- [Drivers] Suppress OSError by @hjjq in #278
- [Dynamic Shape] More correctness guards by @Aalanli in #276
- [Operator] Make Convolution gemms fusible by resolving to batch_matmul by @hjjq in #279
- [External Tasks] Move task build into method call for external kernel support by @xinli-git in #282
- [Distributed] add nccl primitives by @soodoshll in #280
- [Operators] Conv2d fp16 implicit gemm kernel by @Aalanli in #283
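Several of the items above extend the PyTorch frontend behind the Dynamo backend (#148, #214, #216, #223, #267). A minimal usage sketch, assuming a CUDA build of hidet and a PyTorch version that provides `torch.compile`; the toy model and tensor sizes are illustrative only:

```python
import torch
import hidet  # importing hidet registers the 'hidet' backend with torch dynamo

# Illustrative toy model; any module traceable by dynamo works the same way.
model = torch.nn.Sequential(
    torch.nn.Linear(16, 16),
    torch.nn.ReLU(),
).cuda().eval()

x = torch.randn(1, 16, device='cuda')

# Compile through the hidet backend and call the optimized module as usual.
model_opt = torch.compile(model, backend='hidet')
y = model_opt(x)
```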
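The dynamic-shape work (#220, #228, #236, #268, #274) is built around symbolic tensors whose dimensions are `SymbolVar`s. A minimal sketch, assuming the post-#236 API where `hidet.symbol` accepts a named dimension; the names `x`, `y`, and the `'n'` dimension are illustrative:

```python
import hidet

# Symbolic input whose first dimension 'n' stays dynamic until run time.
x = hidet.symbol(['n', 8], dtype='float32', device='cuda')
y = hidet.ops.relu(x)

# Trace into a FlowGraph, then run it with a concrete input (here n = 4).
graph = hidet.trace_from(y, inputs=[x])
out = graph(hidet.randn([4, 8], device='cuda'))
```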
New Contributors
- @eltociear made their first contribution in #210
- @BolinSNLHM made their first contribution in #245
- @Aalanli made their first contribution in #246
Full Changelog: v0.2.3...v0.2.4