Benchmarking Suite for Loss Functions in T5 Training #36

Merged
merged 31 commits into from
Jan 9, 2025

Contributor

@s1k0ra s1k0ra commented Jan 8, 2025

This suite benchmarks different loss functions in neural network training across three scenarios:
1. Standalone loss computation
2. Model forward pass
3. Complete training step (including gradient updates)
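
As a rough illustration of how the three scenarios could be timed, here is a minimal sketch using only the Python standard library. The `benchmark` helper, its `warmup` and `repeats` parameters, and the three stand-in workloads are hypothetical names chosen for this example and are not taken from the suite itself.

```python
import statistics
import time


def benchmark(fn, warmup=2, repeats=5):
    """Time a zero-argument callable and return summary statistics in seconds."""
    # Warm-up runs are executed but not recorded.
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(times),
        "std_s": statistics.stdev(times) if len(times) > 1 else 0.0,
        "min_s": min(times),
        "repeats": repeats,
    }


# Hypothetical stand-ins for the three scenarios; a real run would call
# the loss function, the model forward pass, and a full training step.
def loss_only():
    return sum(i * i for i in range(1_000))


def forward_pass():
    return sum(i * i for i in range(5_000))


def training_step():
    return sum(i * i for i in range(10_000))


scenarios = {
    "loss_only": loss_only,
    "forward_pass": forward_pass,
    "training_step": training_step,
}
results = {name: benchmark(fn) for name, fn in scenarios.items()}
```

If the real workloads run on a GPU, timings of this kind usually need a device synchronization before each clock read, otherwise asynchronous kernel launches make the numbers look smaller than they are.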

Supported loss functions:
• Standard Cross Entropy (CE)
• CE with Number Token Loss (NTL) using MSE
• CE with NTL using Wasserstein distance
• CE with NTL using Absolute Difference

The suite includes utility functions for generating synthetic data, timing benchmarks, and logging results. It supports experimentation with different configurations and outputs CSV reports of benchmark statistics.
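
A minimal sketch of what such utilities might look like, using only the standard library. The function names `make_synthetic_batch` and `write_report`, the CSV column names, and the row values are all hypothetical; the row in particular is a placeholder, not a measurement.

```python
import csv
import io
import random


def make_synthetic_batch(batch_size, seq_len, vocab_size, seed=0):
    """Return batch_size sequences of random token ids in [0, vocab_size)."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    return [
        [rng.randrange(vocab_size) for _ in range(seq_len)]
        for _ in range(batch_size)
    ]


def write_report(rows, handle):
    """Write benchmark rows (a list of dicts) as CSV to an open text handle."""
    fieldnames = ["loss", "scenario", "mean_ms", "std_ms"]
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


batch = make_synthetic_batch(batch_size=2, seq_len=4, vocab_size=100, seed=42)

# Placeholder row only; real values would come from timed runs.
rows = [{"loss": "CE", "scenario": "loss_only", "mean_ms": 0.0, "std_ms": 0.0}]

buffer = io.StringIO()
write_report(rows, buffer)
report_lines = buffer.getvalue().splitlines()
```

Writing to a file instead of an in-memory buffer only requires passing a handle opened with `open(path, "w", newline="")`.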

Collaborator

@jannisborn jannisborn left a comment

LGTM 🚀

@zausin33 zausin33 merged commit dc4eabe into main Jan 9, 2025
2 checks passed
@zausin33 zausin33 deleted the loss-benchmarking branch January 9, 2025 13:10

3 participants