Merge branch 'develop' v0.4.0

aws-samples · Nov 28, 2022 · aad0587
2 parents 044f80b + a5257a8
commit aad0587

Showing 25 changed files with 1,096 additions and 403 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,6 +4,12 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.4.0] - 2022-11-27
+### Added
+- Supports ingestion of post-call output transcripts from Transcribe Real-time Call Analytics. 
+- Supports integration with Live Call Analytics and Agent Assist (LCA) v0.6.0 or later. See [LCA Integration](./README.md#live-call-analytics-and-agent-assist-companion-solution)
+
+
 ## [0.3.4] - 2022-11-9
 ### Added
 - Additional processing for Genesys CTR telephony files. See [Integration with Telephony CTR Files](./README.md#integration-with-telephony-ctr-files)
@@ -90,7 +96,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Added
 - Initial release
 
-[Unreleased]: https://github.com/aws-samples/amazon-transcribe-post-call-analytics/compare/v0.3.4...develop
+[Unreleased]: https://github.com/aws-samples/amazon-transcribe-post-call-analytics/compare/v0.4.0...develop
+[0.4.0]: https://github.com/aws-samples/amazon-transcribe-post-call-analytics/releases/tag/v0.4.0
 [0.3.4]: https://github.com/aws-samples/amazon-transcribe-post-call-analytics/releases/tag/v0.3.4
 [0.3.3]: https://github.com/aws-samples/amazon-transcribe-post-call-analytics/releases/tag/v0.3.3
 [0.3.2]: https://github.com/aws-samples/amazon-transcribe-post-call-analytics/releases/tag/v0.3.2

diff --git a/README.md b/README.md
@@ -6,17 +6,26 @@
 
 Your contact center connects your business to your community, enabling customers to order products, callers to request support, clients to make appointments, and much more. Each conversation with a caller is an opportunity to learn more about that caller’s needs, and how well those needs were addressed during the call. You can uncover insights from these conversations that help you manage script compliance and find new opportunities to satisfy your customers, perhaps by expanding your services to address reported gaps, improving the quality of reported problem areas, or by elevating the customer experience delivered by your contact center agents.
 
-This sample solution, Post Call Analytics (PCA), does most of the heavy lifting associated with providing an end-to-end solution that can process call recordings from your existing contact center. PCA provides actionable insights to spot emerging trends, identify agent coaching opportunities, and assess the general sentiment of calls. 
+This sample solution, Post Call Analytics (PCA), does most of the heavy lifting associated with providing an end-to-end solution that can process call recordings from your existing contact center. PCA provides actionable insights to spot emerging trends, identify agent coaching opportunities, and assess the general sentiment of calls.
+
+If you already have Amazon Transcribe transcripts generated by the Streaming Call Analytics feature, those output transcripts can be delivered to PCA so that they are aggregated and analyzed in exactly the same way as any audio files that you process from your telephony system. You can use PCA with audio files, with transcript files, or with both.
+
+*(New!) The latest version (v0.6.0) of our companion solution, Live Call Analytics and Agent Assist (LCA), supports Amazon Transcribe Real-time Call Analytics and provides easy integration with PCA. See the [Live Call Analytics and Agent Assist: Companion Solution](#live-call-analytics-and-agent-assist-companion-solution) section below.*
 
 PCA currently supports the following features:
 
-* **Transcription**
+* **Source Input Data**
+    * Audio files can be delivered to the audio ingestion location in Amazon S3. The location is defined in AWS Systems Manager Parameter Store by the bucket `InputBucketName` and the folder `InputBucketRawAudio`.
+    * Transcript files from Transcribe's Streaming Analytics APIs can be delivered to the transcript ingestion location in Amazon S3. The location is defined in AWS Systems Manager Parameter Store by the bucket `InputBucketName` and the folder `InputBucketOrigTranscripts`.
+
+
+* **Transcription** *(audio files only)*
     * Batch turn-by-turn transcription with support for [Amazon Transcribe custom vocabulary](https://docs.aws.amazon.com/transcribe/latest/dg/custom-vocabulary.html) for accuracy of domain-specific terminology 
     * [Personally identifiable information (PII) redaction](https://docs.aws.amazon.com/transcribe/latest/dg/call-analytics-pii-redaction.html) from transcripts and audio files, and [vocabulary filtering](https://docs.aws.amazon.com/transcribe/latest/dg/create-filter.html) for masking custom words and phrases
     * Multiple languages and automatic language detection
     * Standard audio file formats
     * Caller and agent speaker labels using [channel identification](https://docs.aws.amazon.com/transcribe/latest/dg/channel-id.html) or [speaker diarization](https://docs.aws.amazon.com/transcribe/latest/dg/diarization.html)
-* **Analytics**
+* **Analytics** *(audio files only)*
     * Caller and agent sentiment details and trends
     * Talk and non-talk time for both caller and agent 
     * Configurable Transcribe Call Analytics categories based on the presence or absence of keywords or phrases, sentiment, and non-talk time
@@ -44,13 +53,15 @@ Call recording audio files are uploaded to the S3 bucket and folder, identified
 
 As each recording file is added to the input bucket, an S3 event notification triggers a Lambda function that initiates a workflow in Step Functions to process the file. The workflow orchestrates the steps to start an Amazon Transcribe batch job and process the results by doing entity detection and additional preparation of the call analytics results. Processed results are stored as JSON files in another S3 bucket and folder, identified in the main stack outputs as ``OutputBucket`` and ``OutputBucketPrefix``**.**
 
+If you deliver transcript files rather than audio files, most of the above is bypassed and the transcripts undergo the same post-Transcribe results processing, giving you a single store of call analytics data for calls from multiple sources.
+
 As the Step Functions workflow creates each JSON results file in the output bucket, an S3 event notification triggers a Lambda function, which loads selected call metadata into a DynamoDB table.
 
 The PCA UI web app queries the DynamoDB table to retrieve the list of processed calls to display on the home page. The call detail page reads additional detailed transcription and analytics from the JSON results file for the selected call.
 
 Amazon S3 lifecycle policies delete recordings and JSON files from both input and output buckets after a configurable retention period, defined by the deployment parameter `RetentionDays`. S3 event notifications and Lambda functions keep the DynamoDB table synchronized as files are both created and deleted.
 
-When the `EnableTranscriptKendraSearch` parameter** **is `true`, the Step Functions workflow also adds time markers and metadata attributes to the transcription, which are loaded into an Amazon Kendra index. The transcription search web application is used to search call transcriptions. For more information on how this works, see [Make your audio and video files searchable using Amazon Transcribe and Amazon Kendra](http://www.amazon.com/mediasearch).
+When the `EnableTranscriptKendraSearch` parameter is `true`, the Step Functions workflow also adds time markers and metadata attributes to the transcription, which are loaded into an Amazon Kendra index. The transcription search web application is used to search call transcriptions. For more information on how this works, see [Make your audio and video files searchable using Amazon Transcribe and Amazon Kendra](http://www.amazon.com/mediasearch).
 
 ## Integration with Telephony CTR Files
 
@@ -200,6 +211,10 @@ As before, your new password must have a length of at least 8 characters, and co
 
 You’re now logged in to the transcript search Finder application. The sample audio files are indexed already, and ready for search.
 
+## Live Call Analytics and Agent Assist: Companion solution
+
+Our companion solution, Live Call Analytics and Agent Assist (LCA), offers real-time transcription and analytics capabilities by using the Amazon Transcribe real-time APIs. Unlike PCA, which transcribes and analyzes recorded audio after the call has ended, LCA transcribes and analyzes your calls as they are happening and provides real-time updates to supervisors and agents. The new Amazon Transcribe Real-time Call Analytics service provides post-call analytics output from your streaming sessions just a few minutes after the call has ended. LCA (v0.6.0 or later) can now send this post-call analytics data to PCA to provide analytics visualizations for completed calls without needing to transcribe the audio a second time. Configure LCA (v0.6.0 or later) to integrate with PCA (v0.4.0 or later) and use the two solutions together to get the best of both worlds. See [Live call analytics and agent assist for your contact center with Amazon language AI services](https://www.amazon.com/live-call-analytics) for more information.
+
 ## Learn more
 
 Check out the AWS blog post: [Post call analytics for your contact center with Amazon language AI services](https://www.amazon.com/post-call-analytics)
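
To make the two ingestion paths described in this README change concrete, here is a minimal Python sketch of how an uploaded S3 object key could be mapped to an input type and then to the first processing state. The folder names are the template defaults for `InputBucketRawAudio` and `InputBucketOrigTranscripts`, and the state names come from `pca-definition.json` in this commit. The `classify_key` and `first_state` helpers, and the idea of deriving the input type from the key prefix, are assumptions for illustration and are not PCA's actual Lambda code. The value `"transcript"` is also illustrative: the workflow only checks for `"audio"` and sends everything else to the default branch.

```python
from typing import Optional

# Template defaults from this commit; a real deployment would read the
# configured values from AWS Systems Manager Parameter Store.
RAW_AUDIO_PREFIX = "originalAudio"
ORIG_TRANSCRIPTS_PREFIX = "originalTranscripts"


def classify_key(key: str) -> Optional[str]:
    """Return a hypothetical input type for an S3 object key, or None
    when the key is not a file inside one of the ingestion folders."""
    folder, sep, name = key.partition("/")
    if not sep or not name:
        return None  # no folder component, or a bare folder marker
    if folder == RAW_AUDIO_PREFIX:
        return "audio"
    if folder == ORIG_TRANSCRIPTS_PREFIX:
        return "transcript"  # illustrative value, see lead-in
    return None


def first_state(input_type: str) -> str:
    """Mirror the CheckFileType? Choice state: "audio" goes to
    TranscribeAudio, anything else takes the default branch."""
    if input_type == "audio":
        return "TranscribeAudio"
    return "ProcessTranscriptHeader"


if __name__ == "__main__":
    for key in ("originalAudio/call-001.wav", "originalTranscripts/call-001.json"):
        input_type = classify_key(key)
        print(key, "->", input_type, "->", first_state(input_type))
```

Keeping the routing decision in a single field means the rest of the workflow does not need to know which folder a file arrived in.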

diff --git a/VERSION b/VERSION
@@ -1 +1 @@
-0.3.4
+0.4.0
diff --git a/output_json_structure.md b/output_json_structure.md
@@ -253,13 +253,15 @@ Present when the source of the conversation is Amazon Transcribe.  A mixture of
     "TranscribeJobInfo": {
       "TranscriptionJobName": "string",
       "TranscribeApiType": "string",
+      "StreamingSession": "string",
       "CompletionTime": "string",
       "VocabularyName": "string",
       "VocabularyFilter": "string",
       "MediaFormat": "string",
       "MediaSampleRateHertz": "integer",
       "MediaFileUri": "string",
       "MediaOriginalUri": "string",
+      "RedactedTranscript": "boolean",
       "ChannelIdentification": "boolean",
       "AverageWordConfidence": "float",
       "CombinedAnalyticsGraph": "string"
@@ -270,18 +272,20 @@ Present when the source of the conversation is Amazon Transcribe.  A mixture of
 
 | Field                  | Type   | Description                                                  |
 | ---------------------- | ------ | ------------------------------------------------------------ |
-| TranscriptionJobName   | string | The name of the transcription job                            |
+| TranscriptionJobName   | string | The name of the transcription job (audio file input) or the name of the transcription file (transcription file input) |
 | TranscribeApiType      | string | The Transcribe API used, must be one of:  `standard`, `analytics` |
+| StreamingSession       | string | ID for any associated Transcribe Streaming session           |
 | CompletionTime         | string | A timestamp that shows when the job was completed            |
 | VocabularyName         | string | The name of the vocabulary used in the transcription job     |
 | VocabularyFilter       | string | The name and mask method of the vocabulary filter used in the transcription job |
 | MediaFormat            | string | The format of the input media file, as determined by Amazon Transcribe |
 | MediaSampleRateHertz   | Int    | The sample rate, in Hertz, of the audio track in the input audio |
 | MediaFileUri           | string | The S3 object location of the media file to use during playback, as we may playback an audio-redacted version or a version that has a format unplayable in all browsers with the HTML5 audio control |
 | MediaOriginalUri       | string | The S3 object location of the original input audio file      |
+| RedactedTranscript     | bool   | Indicates that the transcript has been redacted              |
 | ChannelIdentification  | bool   | Indicates whether the transcription job used channel- (true) or speaker-separation (false) |
 | AverageWordConfidence  | float  | Percentage value between 0.00 and 1.00 indicating overall word confidence score for this job |
-| CombinedAnalyticsGraph | string | S3 URL for the pre-generated combined Call Analytics chart |
+| CombinedAnalyticsGraph | string | S3 URL for the pre-generated combined Call Analytics chart   |
 
 ### SpeechSegments
 

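As a rough illustration of the two fields added to `TranscribeJobInfo` in this file, the following Python sketch reads them from an already-parsed `TranscribeJobInfo` block. The `describe_source` helper and the sample values are hypothetical. The sketch also assumes that `StreamingSession` may be absent or empty when the input was an audio file, which the field table does not state explicitly.

```python
def describe_source(job_info: dict) -> str:
    """Summarise where a transcript came from, using TranscribeJobInfo fields."""
    name = job_info.get("TranscriptionJobName", "unknown")
    # Assumption: the field is missing or empty when no streaming session applies.
    session = job_info.get("StreamingSession")
    redacted = bool(job_info.get("RedactedTranscript", False))
    origin = f"streaming session {session}" if session else "no streaming session"
    suffix = " (redacted)" if redacted else ""
    return f"{name}: {origin}{suffix}"


if __name__ == "__main__":
    # Hypothetical sample values, not taken from a real PCA results file.
    sample = {
        "TranscriptionJobName": "call-001.json",
        "TranscribeApiType": "analytics",
        "StreamingSession": "abc-123",
        "RedactedTranscript": True,
    }
    print(describe_source(sample))
```

Using `dict.get` with defaults keeps the helper tolerant of results files written before v0.4.0, which would not contain the new fields.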
diff --git a/pca-main-nokendra.template b/pca-main-nokendra.template
@@ -1,6 +1,6 @@
 AWSTemplateFormatVersion: "2010-09-09"
 
-Description: Amazon Transcribe Post Call Analytics - PCA (v0.3.4) (uksb-1sn29lk73)
+Description: Amazon Transcribe Post Call Analytics - PCA (v0.4.0) (uksb-1sn29lk73)
 
 Parameters:
 
@@ -73,8 +73,8 @@ Parameters:
 
   InputBucketAudioPlayback:
     Type: String
-    Default: mp3
-    Description: Folder that holds the audio files to playback in the browser when original audio cannot be used
+    Default: playbackAudio
+    Description: Folder that holds the audio files to playback in the browser.
 
   InputBucketFailedTranscriptions:
     Type: String
@@ -93,6 +93,13 @@ Parameters:
     Default: originalAudio
     Description: Prefix/Folder that holds the audio files to be ingested into the system
 
+  InputBucketOrigTranscripts:
+    Type: String
+    Default: originalTranscripts
+    Description: >
+      Folder that holds Transcripts from other applications (e.g. Live Call Analytics) that are to be
+      processed as if PCA had processed that audio
+
   MaxSpeakers:
     Type: String
     Default: "2"
@@ -353,6 +360,7 @@ Metadata:
               Parameters:
                   - InputBucketFailedTranscriptions
                   - InputBucketRawAudio
+                  - InputBucketOrigTranscripts
                   - InputBucketAudioPlayback
                   - OutputBucketParsedResults
                   - OutputBucketTranscribeResults
@@ -503,6 +511,7 @@ Resources:
           - !Ref InputBucket
           - !Ref InputBucketName        
         InputBucketRawAudio: !Ref InputBucketRawAudio
+        InputBucketOrigTranscripts: !Ref InputBucketOrigTranscripts
         MaxSpeakers: !Ref MaxSpeakers
         MinSentimentNegative: !Ref MinSentimentNegative
         MinSentimentPositive: !Ref MinSentimentPositive
@@ -597,6 +606,14 @@ Outputs:
     Description: S3 Bucket prefix/folder for uploading input audio files
     Value: !Ref InputBucketRawAudio
 
+  InputBucketTranscriptPrefix:
+    Description: S3 Bucket prefix/folder for uploading input transcripts
+    Value: !Ref InputBucketOrigTranscripts
+
+  InputBucketPlaybackAudioPrefix:
+    Description: S3 Bucket prefix/folder for audio used for playback from browser
+    Value: !Ref InputBucketAudioPlayback
+
   OutputBucket:
     Description: S3 Bucket where Transcribe output files are delivered
     Value:

diff --git a/pca-main.template b/pca-main.template
@@ -1,6 +1,6 @@
 AWSTemplateFormatVersion: "2010-09-09"
 
-Description: Amazon Transcribe Post Call Analytics - PCA (v0.3.4) (uksb-1sn29lk73)
+Description: Amazon Transcribe Post Call Analytics - PCA (v0.4.0) (uksb-1sn29lk73)
 
 Parameters:
 
@@ -37,7 +37,7 @@ Parameters:
 
   ComprehendLanguages:
     Type: String
-    Default: en | es | fr | de | it | pt | ar | hi | ja | ko | zh | zh-TW
+    Default: de | en | es | it | pt | fr | ja | ko | hi | ar | zh | zh-TW
     Description: Languages supported by Comprehend's standard calls, separated by " | "
 
   ContentRedactionLanguages:
@@ -73,8 +73,8 @@ Parameters:
 
   InputBucketAudioPlayback:
     Type: String
-    Default: mp3
-    Description: Folder that holds the audio files to playback in the browser when original audio cannot be used
+    Default: playbackAudio
+    Description: Folder that holds the audio files to playback in the browser
 
   InputBucketFailedTranscriptions:
     Type: String
@@ -93,6 +93,13 @@ Parameters:
     Default: originalAudio
     Description: Prefix/Folder that holds the audio files to be ingested into the system
 
+  InputBucketOrigTranscripts:
+    Type: String
+    Default: originalTranscripts
+    Description: >
+      Folder that holds Transcripts from other applications (e.g. Live Call Analytics) that are to be
+      processed as if PCA had processed that audio
+
   MaxSpeakers:
     Type: String
     Default: "2"
@@ -355,6 +362,7 @@ Metadata:
               Parameters:
                   - InputBucketFailedTranscriptions
                   - InputBucketRawAudio
+                  - InputBucketOrigTranscripts
                   - InputBucketAudioPlayback
                   - OutputBucketParsedResults
                   - OutputBucketTranscribeResults
@@ -635,6 +643,7 @@ Resources:
           - !Ref InputBucket
           - !Ref InputBucketName        
         InputBucketRawAudio: !Ref InputBucketRawAudio
+        InputBucketOrigTranscripts: !Ref InputBucketOrigTranscripts
         MaxSpeakers: !Ref MaxSpeakers
         MinSentimentNegative: !Ref MinSentimentNegative
         MinSentimentPositive: !Ref MinSentimentPositive
@@ -743,6 +752,14 @@ Outputs:
     Description: S3 Bucket prefix/folder for uploading input audio files
     Value: !Ref InputBucketRawAudio
 
+  InputBucketTranscriptPrefix:
+    Description: S3 Bucket prefix/folder for uploading input transcripts
+    Value: !Ref InputBucketOrigTranscripts
+
+  InputBucketPlaybackAudioPrefix:
+    Description: S3 Bucket prefix/folder for audio used for playback from browser
+    Value: !Ref InputBucketAudioPlayback
+
   OutputBucket:
     Description: S3 Bucket where Transcribe output files are delivered
     Value:

diff --git a/pca-server/cfn/lib/pca-definition.json b/pca-server/cfn/lib/pca-definition.json
@@ -1,7 +1,19 @@
 {
   "Comment": "Post-Call Analytics Workflow with Transcribe and Comprehend",
-  "StartAt": "TranscribeAudio",
+  "StartAt": "CheckFileType?",
   "States": {
+    "CheckFileType?": {
+      "Type": "Choice",
+      "Comment": "Picks the correct pathway for audio and json files",
+      "Choices": [
+        {
+          "Variable": "$.inputType",
+          "StringEquals": "audio",
+          "Next": "TranscribeAudio"
+        }
+      ],
+      "Default": "ProcessTranscriptHeader"
+    },
     "TranscribeAudio": {
       "Comment": "Sends the file in S3 for Transcription",
       "Type": "Task",
@@ -52,7 +64,7 @@
         {
           "Variable": "$.transcribeStatus",
           "StringEquals": "COMPLETED",
-          "Next": "ProcessTranscription"
+          "Next": "ProcessJobHeader"
         },
         {
           "Variable": "$.transcribeStatus",
@@ -62,6 +74,26 @@
       ],
       "Default": "TranscriptionFailed"
     },
+    "ProcessJobHeader": {
+      "Comment": "Creates header information based upon the Transcribe job",
+      "Type": "Task",
+      "Resource": "${SFExtractJobHeaderArn}",
+      "Retry": [{
+          "IntervalSeconds": 5,
+          "ErrorEquals": ["Lambda.Unknown"]
+      }],
+      "Next": "ProcessTranscription"
+    },
+    "ProcessTranscriptHeader": {
+      "Comment": "Creates header information based upon what's available in a transcript file",
+      "Type": "Task",
+      "Resource": "${SFExtractTranscriptHeaderArn}",
+      "Retry": [{
+          "IntervalSeconds": 5,
+          "ErrorEquals": ["Lambda.Unknown"]
+      }],
+      "Next": "ProcessTranscription"
+    },
     "ProcessTranscription": {
       "Comment": "Takes the output from Transcribe and creates the initial results processing",
       "Type": "Task",