Releases: opea-project/GenAIStudio
Generative AI Studio v1.2 Release Notes
OPEA Release Notes v1.2
We are excited to announce the release of OPEA version 1.2, which includes significant contributions from the open-source community. This release addresses over 320 pull requests.
More information about how to get started with OPEA v1.2 can be found on the Getting Started page. All project source code is maintained in the repository. To pull Docker images, please access the Docker Hub. For instructions on deploying Helm Charts, please refer to the guide.
What's New in OPEA v1.2
This release focuses on code refactoring for GenAIComps, the epic efforts aimed at reducing redundancy, addressing technical debt, and enhancing overall maintainability and code quality. As a result, OPEA users can expect a more robust and reliable OPEA with clearer guidance and improved documentation.
OPEA v1.2 also introduces more scenarios with general availability, including:
- LlamaIndex and LangChain Integration: Enabled OPEA as a backend. The LlamaIndex integration currently supports ChatQnA only.
- Model Context Protocol (MCP) Support: Experimental support for MCP in the Retriever component.
- Cloud Service Providers (CSP) Support: Automated Terraform deployment using Intel® Optimized Cloud Modules for Terraform, available for major cloud platforms, including Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure.
- Enhanced Security: Istio mutual TLS (mTLS) and OpenID Connect (OIDC) based authentication with APISIX.
- Enhancements for GenAI Evaluation: Specialized evaluation benchmarks tailored for Chinese language models, focusing on their performance and accuracy on Chinese datasets.
- Helm Charts Deployment: Added support for the examples Text2Image and SearchQnA and their microservices.
Highlights
Code Refactoring for GenAIComps
This is an epic task in v1.2. We refactored the entire GenAIComps codebase. This comprehensive effort focused on reducing redundancy, addressing accumulated technical debt, and enhancing the overall maintainability and code quality. The refactoring not only streamlined the architecture but also laid a stronger foundation for future scalability and development.
At the architecture level, OPEA introduces `OpeaComponentRegistry` and `OpeaComponentLoader`. The `OpeaComponentRegistry` manages the lifecycle of component classes, including their registration and deregistration, while the `OpeaComponentLoader` instantiates components based on the classes in the registry and executes them as needed. Unlike previous implementations, this approach ensures that the lifecycle of a component class is transparent to the user and that components are instantiated only when actively used. This design enhances efficiency, clarity, and flexibility in the system.
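The sketch below illustrates this registry/loader pattern in minimal Python. Only the two class names come from the release notes; the method names, the decorator, and the example component are our assumptions, not the actual GenAIComps API:

```python
# Minimal sketch of the registry/loader pattern described above.
# Hypothetical method names and component; not the GenAIComps API.

class OpeaComponentRegistry:
    """Tracks component classes; no instances are created at registration."""
    _registry: dict[str, type] = {}

    @classmethod
    def register(cls, name: str):
        def decorator(component_cls: type) -> type:
            cls._registry[name] = component_cls  # store the class only
            return component_cls
        return decorator

    @classmethod
    def get(cls, name: str) -> type:
        return cls._registry[name]


class OpeaComponentLoader:
    """Instantiates a registered component lazily, at the point of use."""
    def __init__(self, component_name: str, **config):
        self._component_cls = OpeaComponentRegistry.get(component_name)
        self._config = config

    def invoke(self, *args, **kwargs):
        component = self._component_cls(**self._config)  # created only when used
        return component.invoke(*args, **kwargs)


@OpeaComponentRegistry.register("OPEA_EXAMPLE_EMBEDDING")
class ExampleEmbedding:
    def __init__(self, **config):
        self._config = config

    def invoke(self, text: str) -> str:
        return f"embedding for {text!r}"


loader = OpeaComponentLoader("OPEA_EXAMPLE_EMBEDDING")
print(loader.invoke("hello"))  # component instantiated here, not earlier
```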
At the component level, each OPEA component is structured into two layers: the service wrapper and the service provider (named `integrations` in the code). The service wrapper, which is optional, acts as a protocol hub and manages service access, while the service provider delivers the actual functionality. This architecture allows components to be seamlessly integrated or removed without requiring code changes, enabling a modular and adaptable system. All existing components have been ported to the new architecture.
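As a minimal illustration of this two-layer split (all names here are hypothetical, not the actual GenAIComps classes), a wrapper can delegate to any provider that implements the expected interface:

```python
# Illustrative sketch of the service wrapper / service provider split.
# Hypothetical names; not the actual GenAIComps classes.

class ExampleProvider:
    """Service provider ("integration"): delivers the actual functionality."""
    def embed(self, text: str) -> list[float]:
        return [0.0, 1.0]  # a real integration would call a model server here


class EmbeddingServiceWrapper:
    """Optional service wrapper: protocol hub that manages service access."""
    def __init__(self, provider) -> None:
        self._provider = provider  # providers swap in/out without code changes

    def handle_request(self, payload: dict) -> dict:
        # request validation and protocol handling belong to the wrapper
        vector = self._provider.embed(payload["text"])
        return {"embedding": vector}


service = EmbeddingServiceWrapper(ExampleProvider())
print(service.handle_request({"text": "hello"}))
```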
Additionally, we reduced code redundancy, merged overlapping modules, and implemented adjustments to align with the new architectural changes.
Note
We suggest that users and contributors review the documentation to understand the impact of the code refactoring.
Supporting Cloud Service Providers
OPEA offers automated Terraform deployment using Intel® Optimized Cloud Modules for Terraform, available for major cloud platforms, including AWS, GCP, and Azure. To explore this option, check out the Terraform deployment guide.
Additionally, OPEA supports manual deployment on virtual servers across AWS, GCP, IBM Cloud, Azure, and Oracle Cloud Infrastructure (OCI). For detailed instructions, refer to the manual deployment guide.
Enhanced GenAI Components
- vLLM support for embeddings and rerankings: Integrated vLLM as a serving framework to enhance the performance and scalability of embedding and reranking models (see the sketch after this list).
- Agent Microservice:
  - SQL agent strategy: Takes a user question, hints (optional), and history (when available), and reasons step by step to solve the problem by interacting with a SQL database. OPEA currently has two types of SQL agents: `sql_agent_llama` for use with open-source LLMs and `sql_agent` for use with OpenAI models.
  - Enabled user-customized tool subsets: Added support for user-defined subsets of tools for the ChatCompletion and Assistant APIs.
  - Enabled persistence: Introduced Redis to persist agent configurations and historical messages for agent recovery and multi-turn conversations.
- Long-context Summarization: Supported multiple modes: `auto`, `stuff`, `truncate`, `map_reduce`, and `refine`.
- Standalone Microservice Deployment: Enabled the deployment of OPEA components as independent services, allowing for greater flexibility, scalability, and modularity in various application scenarios.
- PDF Input Support: Added PDF input support for dataprep, embeddings, LVMs, and retrievers.
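As a minimal sketch of the vLLM-backed embeddings mentioned above: vLLM exposes an OpenAI-compatible API, so an embedding model it serves can be queried over plain HTTP. The host, port, and model name below are assumptions for illustration, and serving flags and model support vary by vLLM version:

```python
# Query a vLLM server's OpenAI-compatible embeddings endpoint.
# Assumes an embedding model is already being served locally, e.g.:
#   vllm serve BAAI/bge-base-en-v1.5 --port 8000   (flags vary by version)
import requests

resp = requests.post(
    "http://localhost:8000/v1/embeddings",
    json={"model": "BAAI/bge-base-en-v1.5", "input": ["What is OPEA?"]},
    timeout=30,
)
resp.raise_for_status()
embedding = resp.json()["data"][0]["embedding"]
print(len(embedding))  # dimensionality of the returned vector
```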
New GenAI Components
- Bedrock: OPEA LLM now supports Amazon Bedrock as the backend of the text generation microservice (a direct Bedrock call is sketched after this list). Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies through a single API, along with a broad set of capabilities needed to build generative AI applications with security, privacy, and responsible AI.
- OpenSearch Vector Database: OPEA vector stores now support AWS OpenSearch. OpenSearch is an open-source, enterprise-grade search and observability suite that brings order to unstructured data at scale.
- Elasticsearch Vector Database: OPEA vector stores now support the Elasticsearch vector database, Elasticsearch's open-source offering for efficiently creating, storing, and searching vector embeddings.
- Guardrail Hallucination Detection: Added the capability to detect hallucinations, which span a wide range of issues that can impact the reliability, trustworthiness, and utility of AI-generated content.
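For context on the Bedrock integration above, here is a minimal sketch of calling Bedrock directly with boto3, the kind of backend call the OPEA text generation microservice wraps. The model ID and request schema shown are for Amazon Titan Text (other Bedrock models use different schemas), and valid AWS credentials with Bedrock access are assumed:

```python
# Direct Amazon Bedrock call via boto3, using the Titan Text request schema.
import json

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")
response = client.invoke_model(
    modelId="amazon.titan-text-express-v1",
    body=json.dumps({
        "inputText": "Summarize what OPEA is in one sentence.",
        "textGenerationConfig": {"maxTokenCount": 128, "temperature": 0.2},
    }),
)
result = json.loads(response["body"].read())
print(result["results"][0]["outputText"])
```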
Enhanced GenAI Examples
- ChatQnA: Enabled embedding and reranking on vLLM, and added Jaeger UI and OpenTelemetry tracing for TGI serving on HPU.
- AgentQnA: Added SQL worker agent and introduced a Svelte-based GUI for ChatCompletion API for non-streaming interactions.
- MultimodalQnA: Added support for PDF ingestion and image/audio queries.
- EdgeCraftRAG: Supported image/URL data retrieval and display, display of LLM-used context sources in the UI, pipeline removal operations in the RESTful API and UI, and RAG pipeline performance benchmarking and display in the UI. (#GenAIExamples/1324)
- DocSum: Added a URL summary option to the Gradio-based UI.
- DocIndexRetriever: Added a pipeline variant without reranking.
Enhanced GenAIStudio
In this release, GenAI Studio enables Keycloak for multi-user management, supports a sandbox environment for multi-workflow execution, and adds Grafana-based visualization dashboards with built-in Prometheus performance metrics for model evaluation and functional-node performance.
Newly Supported Models
- bge-base-zh-v1.5
- Falcon2-40B/11B
- Falcon3
Newly Supported Hardware
- Intel® Gaudi® 3 AI Accelerator
- AMD® GPU using AMD® ROCm™ for AgentQnA, [AudioQnA](https://github.com/opea-project/GenA...
Generative AI Studio v1.1 Release Notes
OPEA Release Notes v1.1
We are pleased to announce the release of OPEA version 1.1, which includes significant contributions from the open-source community. This release addresses over 470 pull requests.
More information about how to get started with OPEA v1.1 can be found on the Getting Started page. All project source code is maintained in the repository. To pull Docker images, please access the Docker Hub. For instructions on deploying Helm Charts, please refer to the guide.
What's New in OPEA v1.1
This release introduces more scenarios with general availability, including:
- Newly supported Generative AI capabilities: Image-to-Video, Text-to-Image, Text-to-SQL and Avatar Animation.
- Generative AI Studio that offers a no-code alternative to create enterprise Generative AI applications.
- Expands the portfolio of supported hardware to include Intel® Arc™ GPUs and AMD® GPUs.
- Enhanced monitoring support, providing real-time insights into runtime status and system resource utilization for CPU and Intel® Gaudi® AI Accelerator, as well as Horizontal Pod Autoscaling (HPA).
- Helm Chart support for 7 new GenAIExamples and their microservices.
- Benchmark tools for long-context language models (LCLMs) such as LongBench and HELMET.
Highlights
New GenAI Examples
- AvatarChatbot: a chatbot combined with a virtual "avatar", which can run on either Intel Gaudi 2 AI Accelerators or Intel Xeon Scalable Processors.
- DBQnA: seamless translation of natural language queries into SQL, delivering real-time database results.
- EdgeCraftRAG: a customizable and tunable RAG example for edge solutions on Intel® Arc™ GPUs.
- GraphRAG: a Graph RAG-based approach to summarization.
- Text2Image: an application that generates images based on text prompts.
- WorkflowExecAgent: a workflow executor example that handles data/AI workflow operations via LangChain agents executing custom-defined, workflow-based tools.
Enhanced GenAI Examples
- Multi-media support: DocSum, MultimodalQnA
- Multi-language support: AudioQnA, DocSum
New GenAI Components
- Text-to-Image: add Stable Diffusion microservice (a sketch of the underlying generation follows this list)
- Image-to-Video: add Stable Video Diffusion microservice
- Text-to-SQL: add Text-to-SQL microservice
- Text-to-Speech: add GPT-SoVITS microservice
- Avatar Animation: add Animation microservice
- RAG: add GraphRAG with llama-index microservice
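As a minimal sketch of the generation that the Text-to-Image microservice wraps (referenced above), Stable Diffusion can be run directly with Hugging Face diffusers. The model ID and CUDA device here are assumptions for illustration, not the microservice's own configuration:

```python
# Generate an image from a text prompt with Stable Diffusion (diffusers).
# Assumes a CUDA-capable GPU; use torch.float32 on CPU instead.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

image = pipe("an astronaut riding a horse on the moon").images[0]
image.save("astronaut.png")
```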
Enhanced GenAI Components
- Asynchronous support for microservices (28672956, 9df4b3c0, f3746dc8); a sketch follows this list
- Add vLLM backends for summarization, FAQ generation, code generation, and Agents
- Multimedia support (29ef6426, baafa402)
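To illustrate what the asynchronous support above enables, here is a generic FastAPI/httpx sketch, not the exact GenAIComps implementation; the endpoint path and downstream URL are hypothetical:

```python
# An async microservice endpoint: awaiting downstream I/O frees the event
# loop to serve other requests instead of blocking a worker thread.
import httpx
from fastapi import FastAPI

app = FastAPI()

@app.post("/v1/example")
async def example(payload: dict) -> dict:
    async with httpx.AsyncClient() as client:
        # non-blocking call to a downstream microservice
        resp = await client.post("http://downstream:8080/v1/infer", json=payload)
        resp.raise_for_status()
    return resp.json()

# Run with: uvicorn <your_module>:app --port 9000
```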
GenAIStudio
GenAI Studio, a new OPEA project, streamlines the creation of enterprise Generative AI applications by providing an alternative, UI-based process for creating end-to-end solutions. It supports GenAI application definition, evaluation, performance benchmarking, and deployment. GenAI Studio empowers developers to effortlessly build, test, and optimize their LLM solutions and create a deployment package. Its intuitive no-code/low-code interface accelerates innovation, enabling rapid development and deployment of cutting-edge AI applications with unparalleled efficiency and precision.
Enhanced Observability
Observability offers real-time insights into component performance and system resource utilization. We enhanced this capability by monitoring key system metrics, including CPU, host memory, storage, network, and accelerators (such as Intel Gaudi), as well as tracking OPEA application scaling.
Helm Charts Support
OPEA examples and microservices support Helm Charts as the packaging format on Kubernetes (k8s). The newly supported examples include AgentQnA, AudioQnA, FaqGen, and VisualQnA. The newly supported microservices include chathistory, mongodb, prompt, and Milvus for dataprep and retriever. Helm Charts now have an option to expose Prometheus metrics from the applications.
Long-context Benchmark Support
We added the following two benchmark kits in response to the community's requirements for long-context language models.
- HELMET: a comprehensive benchmark for long-context language models covering seven diverse categories of tasks. The datasets are application-centric and are designed to evaluate models at different lengths and levels of complexity.
- LongBench: a benchmark tool for bilingual, multitask, and comprehensive assessment of long context understanding capabilities of large language models.
Newly Supported Models
- llama-3.2 (1B/3B/11B/90B)
- glm-4-9b-chat
- Qwen2/2.5 (7B/32B/72B)
Newly Supported Hardware
- Intel® Arc™ GPU: vLLM powered by OpenVINO can perform optimal model serving on Intel® Arc™ GPU.
- AMD® GPU: deploy GenAI examples on AMD® GPUs using AMD® ROCm™: CodeTrans, CodeGen, FaqGen, DocSum, ChatQnA.
Notable Changes
GenAIExamples
- Functionalities
- New GenAI Examples
- [AvatarChatbot] Initiate "AvatarChatbot" (audio) example (cfffb4c, 960805a)
- [DBQnA] Adding DBQnA example in GenAIExamples (c0643b7, 6b9a27d)
- [EdgeCraftRag] Add EdgeCraftRag as a GenAIExample (c9088eb, 7949045, 096a37a)
- [GraphRAG] Add GraphRAG example a65640b
- [Text2Image]: Add example for text2image 085d859
- [WorkflowExecAgent] Add Workflow Executor Example bf5c391
- Enhanced GenAI Examples
- [AudioQnA] Add multi-language AudioQnA on Xeon 658867f
- [AgentQnA] Update AgentQnA example for v1.1 release 5eb3d28
- [ChatQnA] Enable vLLM Profiling for ChatQnA ([00d9bb6](https://github.com/opea-project...