The repository you have cloned provides a skeleton project in the project directory. The skeleton project contains an already set-up CAP application with a fully configured `package.json` and the supporting files for the following exercises.
In this exercise you will learn:
- How to explore the `package.json` and its contents.
- How to define the database schema for the HANA database.
- How to build and deploy the schema to your HDI container.
The `package.json` file includes all Node.js project-specific configurations like project name, version, dependencies, and run scripts, as well as CDS-specific configurations like the HANA runtime information.
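If you want to see the effective CDS configuration that CAP derives from the `package.json` (merged with defaults and profiles), you can print it on the command line. A minimal sketch, assuming the `@sap/cds-dk` CLI is installed globally (exact syntax may vary slightly between cds-dk versions):

```bash
# Print the effective database configuration CAP computes from package.json
cds env requires.db

# The same, but resolved for the hybrid profile
cds env requires.db --profile hybrid
```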
👉 Open the `package.json` file.
👉 Examine the dependencies. Notice that there are dependencies for LangChain and the CAP-LLM-Plugin.
LangChain is a leading library for generative AI development and is compatible with SAP's Generative AI Hub Python SDK and the CAP-LLM-Plugin. LangChain is available in Python as well as in JavaScript.
👉 Examine the listed scripts.

- `build`: Runs the `cds build` command to build the CAP project.
- `build_sqlite`: Builds the CAP application for usage on a SQLite database.
- `start`: Starts the CAP application.
- `watch`: Deploys your service-specific changes to localhost using the hybrid profile, establishing a connection to your real HDI container instance.
- `sqlite`: Deploys your service-specific changes to localhost using the hybrid profile, establishing a connection to a real SQLite database.
You can add as many scripts as you want here, making it easier to run commands for your project.
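As a quick sketch of how such scripts are invoked (the script names are taken from the list above and assumed to match your `package.json`):

```bash
# npm run <script> executes the corresponding entry from package.json
npm run build   # runs "cds build"
npm run watch   # serves the application locally using the hybrid profile
```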
Within a CAP application, you can define a database schema that can be built into HANA database artifacts. The artifacts can be deployed to a bound HDI container, which creates the database tables, relationships, views, and any other HANA database artifacts.
For this project, the schema has exactly one entity: `DocumentChunk`.
The `DocumentChunk` entity contains the text chunks and embeddings for the provided context information. In the next exercise, 07 - Define the Embedding Service, you will define a service that chunks a PDF document, the CAP_Documentation_V8.pdf, and retrieves vector embeddings for the individual chunks. The chunks and vector embeddings will be saved in the `DocumentChunk` entity.
👉 Open the `schema.cds` file under the `db` directory.
👉 In the file, define the namespace `sap.codejam`:

```cds
namespace sap.codejam;
```
The namespace allows for better identification and provides uniqueness to the entities within it. It will also cause the database table to be named `SAP_CODEJAM_<ENTITY NAME>`, for example `SAP_CODEJAM_DOCUMENTCHUNK`.
👉 Right below, add the following line of code:

```cds
using { managed } from '@sap/cds/common';
```
The entity should be managed, meaning it automatically gets timestamps and user information for creation and modification (`createdAt`, `createdBy`, `modifiedAt`, `modifiedBy`).
👉 Lastly, add the definition for the `DocumentChunk` entity. The entity uses the `managed` aspect from the `@sap/cds/common` package.
```cds
entity DocumentChunk : managed {
    text_chunk      : LargeString;   // raw text of a single chunk
    metadata_column : LargeString;   // source information for the chunk
    embedding       : Vector(1536);  // vector embedding of the chunk's text
}
```
The entity defines three fields:

- `text_chunk`: Stores the individual text chunks created from the document.
- `metadata_column`: Stores the path to the information document. The document is a PDF that includes business contextual information.
- `embedding`: Stores the encoded vector embeddings created by an embedding model of your choice.
👉 Save the file.
Because you created a binding to your HDI container in exercise 02, you have all configurations in place to deploy HANA database artifacts to the instance. To do so, you need to build the artifacts first; for that, you can use the `hana` script from the `package.json` or manually enter the `cds deploy` command.
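As a sketch, the two options look like this (the `hana` script name is taken from the paragraph above and assumed to exist in your `package.json`; the guided steps follow below):

```bash
# Option 1: use the script from package.json (assuming it is named "hana")
npm run hana

# Option 2: build and deploy manually; cds deploy compiles the model and
# generates the HDI artifacts before deploying them to the container
cds deploy --to hana:<your-hdi-container-name>
```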
👉 Open a terminal or use an existing one.
👉 Make sure that you are still connected to the Cloud Foundry instance by checking the connection details:

```bash
cf target
```
If the reply from the CLI tells you to log in again, simply enter `cf login`. This time, you don't have to specify the API endpoint because it is stored from the previous login.

```bash
cf login
```
👉 If you want to manually build and deploy the artifacts, call the `cds deploy --to hana:<hdi-instance>` command (use the HDI container name from Exercise 04).
In case you forgot your HDI container name, you can simply call `cf services` to get a list of all available service instances, including your HDI container.
```bash
cds deploy --to hana:<your-hdi-container-name> --auto-undeploy
```
The `--auto-undeploy` argument causes the deployment to also remove database objects that are no longer part of your model, so the database adjusts to the new runtime definition of your database artifacts.

You will see extensive terminal output listing the different steps of the build and deployment process.
Great! The database is initialized, and the table with all necessary fields is created.
There are multiple ways of viewing your database artifacts on SAP HANA Cloud. One is to use the SAP HANA Database Explorer; another is, of course, the CLI if you don't want to use the UI. If you are interested in using the **SAP HANA Database Explorer**, you will find a tutorial in the Further Reading section. You should use the CLI today because it is quicker. #TheFutureIsTerminal

You have to install the `hana-cli` first.
👉 Open a new terminal or use an existing one.
👉 Run the install command:

```bash
npm install -g hana-cli
```
👉 Enter the `hana-cli help` command to get a list of all available commands:

```bash
hana-cli help
```
👉 To get a list of all available tables within your HDI container, you can execute the following command:

```bash
hana-cli tables
```
From the response, you can extract the schema name and the table name. You will use this information to inspect the table in detail.
👉 Enter the following command to list the table information:

```bash
hana-cli inspectTable <Your-schema-name> SAP_CODEJAM_DOCUMENTCHUNK
```
You can see all created fields as defined in the `schema.cds`. Notice that the table also contains all the fields contributed by the `managed` aspect from the `@sap/cds/common` package.
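Depending on your version of `hana-cli`, you can also render the inspected table in other notations. The `-o` flag below is an assumption based on hana-cli's documented output options, so check `hana-cli inspectTable --help` first:

```bash
# Render the table definition as CDS instead of the default tabular output
# (-o/--output is assumed to accept formats such as cds, sql, or json)
hana-cli inspectTable <Your-schema-name> SAP_CODEJAM_DOCUMENTCHUNK -o cds
```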
At this point, you have learned how to define a database schema using CDS, how to build and deploy database artifacts, and how to use the `hana-cli` to inspect your database tables.