Agentic Evaluation Metric, Custom Evaluation LLMs, and Async for Synthetic Data Generation
In DeepEval v0.21.74, we have added:
- Agentic evaluation metric to evaluate tool-calling correctness for LLM agents (first sketch below): https://docs.confident-ai.com/docs/metrics-tool-correctness
- Pydantic schemas to enforce JSON outputs from custom, smaller LLMs (second sketch below): https://docs.confident-ai.com/docs/guides-using-custom-llms
- Asynchronous support for synthetic data generation (third sketch below): https://docs.confident-ai.com/docs/evaluation-datasets-synthetic-data
- Tracing integration for LlamaIndex and LangChain: https://docs.confident-ai.com/docs/confident-ai-tracing
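
Here is a minimal sketch of the new tool correctness metric, assuming the `ToolCorrectnessMetric` class and the `tools_called` / `expected_tools` test case fields described in the linked docs; the tool names and texts are placeholders:

```python
from deepeval.metrics import ToolCorrectnessMetric
from deepeval.test_case import LLMTestCase

# Compare the tools your agent actually called against the tools you expected it to call
test_case = LLMTestCase(
    input="What if these shoes don't fit?",
    actual_output="We offer a 30-day full refund at no extra cost.",
    tools_called=["WebSearch"],                        # placeholder tool names
    expected_tools=["WebSearch", "QueryRefundPolicy"],
)

metric = ToolCorrectnessMetric()
metric.measure(test_case)
print(metric.score)   # 0-1 score based on how tools_called matches expected_tools
print(metric.reason)
```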
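
Below is a rough sketch of a custom evaluation LLM that returns schema-enforced JSON, assuming the `DeepEvalBaseLLM` interface with a `schema` parameter on `generate` as covered in the custom LLM guide; the wrapped `model` callable and the class/model names are placeholders:

```python
from pydantic import BaseModel
from deepeval.models import DeepEvalBaseLLM

class CustomEvalLLM(DeepEvalBaseLLM):
    """Placeholder wrapper around a smaller, self-hosted model."""

    def __init__(self, model):
        # `model` is assumed to be a callable that takes a prompt string and
        # returns a JSON string, e.g. a transformers pipeline with constrained
        # decoding (lm-format-enforcer or similar)
        self.model = model

    def load_model(self):
        return self.model

    def generate(self, prompt: str, schema: BaseModel) -> BaseModel:
        raw_json = self.model(prompt)
        # Parsing into the Pydantic schema guarantees metrics receive valid,
        # structured JSON instead of free-form text
        return schema.model_validate_json(raw_json)

    async def a_generate(self, prompt: str, schema: BaseModel) -> BaseModel:
        return self.generate(prompt, schema)

    def get_model_name(self):
        return "Custom evaluation LLM"
```

A metric can then be constructed with `model=CustomEvalLLM(...)` so its intermediate judgments come back as valid JSON even from smaller models.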
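
And a minimal sketch of asynchronous synthetic data generation, assuming the `Synthesizer` accepts an `async_mode` flag and that parameter names follow the linked synthetic data docs; the document paths are placeholders:

```python
from deepeval.synthesizer import Synthesizer

# async_mode=True is assumed to run golden generation concurrently
synthesizer = Synthesizer(async_mode=True)

synthesizer.generate_goldens_from_docs(
    document_paths=["example.txt", "example.pdf"],  # placeholder paths
    max_goldens_per_document=5,
)

# Generated goldens are kept on the synthesizer (attribute name assumed)
print(synthesizer.synthetic_goldens)
```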