Commit 2f346b9

WMS ID 9001: Fix typos & markdown errors in OCI Speech customizations.md (#418)
Fix typos in OCI Speech customizations.md
1 parent f92eb25 commit 2f346b9

1 file changed

+105
-90
lines changed

oci-artificial-intelligence/ai-speech/customizations/customizations.md

Lines changed: 105 additions & 90 deletions
@@ -1,11 +1,11 @@
1-
# Lab 3: Create and manage customizations using OCI Console
1+
# Lab 3: Create, enable and manage customizations using OCI Console
22

33
## Introduction
4-
When the Live Transcribe service as is, it may not provide perfect transcriptions for domain-specific words, acronyms and proper nouns.
4+
When the Live Transcribe service is used as is, it may not provide perfect transcriptions for domain-specific words, acronyms and proper nouns.
55
Speech Customizations can be enabled when using the Live Transcribe service to improve the transcription accuracy in such cases.
66
In this session, we will help users get familiar with customizations and how to create and manage them using the OCI Console.
77

8-
***Estimated Lab Time***: 5 minutes
8+
***Estimated Lab Time***: 30 minutes
99

1010
### Objectives
1111

@@ -20,12 +20,12 @@ In this lab, you will:
2020

2121
## Task 1: Navigate to Speech Overview Page
2222

23-
Log into OCI Cloud Console. Using the Burger Menu on the top left corner, navigate to Analytics and AI menu and click it, and then select Language item under AI services.
23+
Log into OCI Cloud Console. Using the Burger Menu in the top left corner, navigate to Analytics and AI menu and click it, and then select Speech item under AI services.
2424
![Navigate speech service menu](./images/navigate-to-ai-speech-menu.png " ")
2525

2626
This will navigate you to the Speech overview page.
27-
On the left you can toggle between overview and transcription jobs listing page.
28-
Under documentation, you can find helpful links relevant to OCI speech service
27+
On the left, you can access the features of the OCI Speech service: Transcription Jobs, Live Transcribe, Customizations, and Text to Speech.
28+
Under documentation, you can find helpful links relevant to OCI speech service.
2929
![Speech service overview page](./images/overview-page.png " ")
3030

3131

@@ -93,6 +93,7 @@ We see that the live transcribe produces the right output only once. In order to
9393

9494
Let's create an ObjectStorageDataset from the console. For EBITDA, let's create a JSON file like so:
9595
```
96+
<copy>
9697
{
9798
"datasetType": "ENTITY_LIST",
9899
"entityList": [
@@ -117,6 +118,7 @@ We see that the live transcribe produces the right output only once. In order to
117118
}
118119
]
119120
}
121+
</copy>
120122
```
121123
122124
2. Upload this JSON file to an Object Storage bucket
@@ -132,6 +134,7 @@ We see that the live transcribe produces the right output only once. In order to
132134
133135
5. If you want to provide an audio file as the pronunciation, first upload the audio file to object storage. You can provide multiple audio files for the same entity. The JSON file can look like:
134136
```
137+
<copy>
135138
{
136139
"datasetType": "ENTITY_LIST",
137140
"entityList": [
@@ -167,13 +170,15 @@ We see that the live transcribe produces the right output only once. In order to
167170
}
168171
]
169172
}
173+
</copy>
170174
```
171175
172176
173177
## Task 4: Create and enable a customization with custom pronunciations and reference examples
174178
Sometimes, even when enabling a customization with custom pronunciations, Live Transcribe may still provide some mis-transcription. For example, let's say you have an organization that's abbreviated as "DSCRD" and is pronounced as "discord". Even if you create and enable a customization for these entities, the transcription might still not have the right entity every time.
175179
For example, let's say you created a customization for DSCRD using the below dataset:
176180
```
181+
<copy>
177182
{
178183
"datasetType": "ENTITY_LIST",
179184
"entityList": [
@@ -192,6 +197,7 @@ For example, let's say you created a customization for DSCRD using the below dat
192197
}
193198
]
194199
}
200+
</copy>
195201
```
196202
When enabling this customization with the Live Transcribe service, there might still be instances where the transcription does not have DSCRD. For example, in the screenshot below the transcript says "What is the return on equity for discard".
197203
![RT without DSCRD customization](./images/DSCRD-without-ref-examples.png " ")
@@ -202,13 +208,14 @@ Here is where Reference Examples come in.
202208
203209
You can provide simple examples of sentences where you expect to see entities of a given type. For example, since DSCRD has an entity type of "organizations" here are some sample reference examples for the "organizations" entity type:
204210
205-
* organization called \<organizations\>
206-
* return on equity for \<organizations>
207-
* work for \<organizations\>
208-
* statement for \<organizations\>
211+
* organization called `<organizations>`
212+
* return on equity for `<organizations>`
213+
* work for `<organizations>`
214+
* statement for `<organizations>`
209215
210216
You can add these examples to the JSON file from before and upload the JSON file to Object Storage.
211217
```
218+
<copy>
212219
{
213220
"datasetType": "ENTITY_LIST",
214221
"entityList": [
@@ -233,12 +240,13 @@ Here is where Reference Examples come in.
233240
"statement for <organizations>"
234241
]
235242
}
243+
</copy>
236244
```
237245

238246
2. Use this dataset when creating the customization. The Customizations service will provide two customizations, one for the reference examples (we call this the Main Customization) and one for the entity list that you provide (we call this the Slot Customization). The Slot Customization i.e. the customization created for the entity list will have "--<entity-type>" in the display name.
239247
![RT without DSCRD customization](./images/DSCRD-main-and-slot-customizations.png " ")
240248

241-
If you open the details of the Main Customization, which in this case has the display name "dsrcd-customization", you will see that it has an entities section that shows which Slot Customization it refers to, for the "organizations" entity type (it is also possible to override this Slot Customization with another one when calling Live Transcribe).
249+
If you open the details of the Main Customization, which in this case has the display name "dsrcd-customization", you will see that it has an entities section that shows which Slot Customization it refers to, for the "organizations" entity type.
242250
![DSCRD main customization details](./images/DSCRD-main-customization-details.png " ")
243251

244252
If you open the details of the Slot Customization, which in this case has the display name "dsrcd-customization--organizations", you will see that it does not have an entities section.
@@ -247,8 +255,10 @@ Here is where Reference Examples come in.
247255
3. It is important to enable the Main Customization when making the Live Transcribe call. Think of the main customization as encapsulating both the reference examples and the entity lists. Enabling the Slot Customization means that Live Transcribe will use just the entity list and not the reference examples. The Live Transcribe output with the main customization enabled looks like
248256
![RT with DSCRD main customization enabled](./images/DSCRD-with-reference-examples.png " ")
249257

258+
We see that "DSCRD" appears as expected for the first 5 sentences. The goal of reference examples is to add more context for the custom entities that you have defined.
259+
Note that the transcript may not have the custom entity for utterances that are NOT included in any of the reference examples. For example, in the above screenshot, the last line "The discord app is amazing" could very well have been "The DSCRD app is amazing". If you want to increase the chances of the model transcribing it as "The DSCRD app is amazing", you can add the reference example "`<organizations>` app" into the dataset.
250260

251-
## Task 5: Create and enable a customization with multiple entity lists, soundsLike and reference examples
261+
## Task 5: Create and enable a customization with multiple entity lists, custom pronunciations and reference examples
252262
One of the best use-cases for Live Transcribe is in the healthcare domain for doctor-patient conversations. Let's say you are a hospital using OCI Live Transcribe and you have the following requirements.
253263

254264
- You have 2 patients with the names - Daniel and Sorabh, and you want Live Transcribe to accurately transcribe both these names. As a bonus requirement, let's say that if the doctor utters "Dan" or "Danny", the Live Transcribe should still transcribe that as Daniel
@@ -259,91 +269,96 @@ When used as is, the Live Transcribe produces an output like so:
259269
![RT without hospital customization](./images/RT-without-hospital-customization.png " ")
260270

261271
Let's use Speech Customizations to make this transcription better.
262-
1. Creating and enabling the customization
272+
1. Create a customization using the dataset below. This dataset has two entity lists: one for the "names" entity type and one for the "medical" entity type.
273+
Reference examples have been added for both the entity types. Note that you can have multiple entity types in a single reference example.
274+
For example, you can have a reference example like "hi `<names>`, have you been taking `<medical>`".
275+
263276
```
277+
<copy>
264278
{
265-
"datasetType": "ENTITY_LIST",
266-
"entityList": [
267-
{
268-
"entityType": "names",
269-
"entities": [
270-
{
271-
"entityValue": "Daniel",
272-
"pronunciations": [
273-
{
274-
"soundsLike": "Danny"
275-
},
276-
{
277-
"soundsLike": "Dan"
278-
}
279-
]
280-
},
281-
{
282-
"entityValue": "Sorabh",
283-
"pronunciations": [
284-
{
285-
"soundsLike": "Saurabh"
286-
},
287-
{
288-
"soundsLike": "so rub"
289-
},
290-
{
291-
"soundsLike": "so raab"
292-
}
293-
]
294-
}
295-
]
296-
},
297-
{
298-
"entityType": "medical",
299-
"entities": [
300-
{
301-
"entityValue": "PeriCare",
302-
"pronunciations": [
303-
{
304-
"soundsLike": "Perry care"
305-
},
306-
{
307-
"soundsLike": "Paris care"
308-
}
309-
]
310-
},
279+
"datasetType": "ENTITY_LIST",
280+
"entityList": [
311281
{
312-
"entityValue": "procapil",
313-
"pronunciations": [
314-
{
315-
"soundsLike": "pro sepil"
316-
},
317-
{
318-
"soundsLike": "pro capill"
319-
}
320-
]
282+
"entityType": "names",
283+
"entities": [
284+
{
285+
"entityValue": "Daniel",
286+
"pronunciations": [
287+
{
288+
"soundsLike": "Danny"
289+
},
290+
{
291+
"soundsLike": "Dan"
292+
}
293+
]
294+
},
295+
{
296+
"entityValue": "Sorabh",
297+
"pronunciations": [
298+
{
299+
"soundsLike": "Saurabh"
300+
},
301+
{
302+
"soundsLike": "so rub"
303+
},
304+
{
305+
"soundsLike": "so raab"
306+
}
307+
]
308+
}
309+
]
321310
},
322311
{
323-
"entityValue": "EpiCeram",
324-
"pronunciations": [
325-
{
326-
"soundsLike": "epic serum"
327-
},
328-
{
329-
"soundsLike": "a PC rum"
330-
}
331-
]
312+
"entityType": "medical",
313+
"entities": [
314+
{
315+
"entityValue": "PeriCare",
316+
"pronunciations": [
317+
{
318+
"soundsLike": "Perry care"
319+
},
320+
{
321+
"soundsLike": "Paris care"
322+
}
323+
]
324+
},
325+
{
326+
"entityValue": "procapil",
327+
"pronunciations": [
328+
{
329+
"soundsLike": "pro sepil"
330+
},
331+
{
332+
"soundsLike": "pro capill"
333+
}
334+
]
335+
},
336+
{
337+
"entityValue": "EpiCeram",
338+
"pronunciations": [
339+
{
340+
"soundsLike": "epic serum"
341+
},
342+
{
343+
"soundsLike": "a PC rum"
344+
}
345+
]
346+
}
347+
]
332348
}
333-
]
334-
}
335-
],
336-
"referenceExamples": [
337-
"hi <names>",
338-
"hello <names>",
339-
"good morning <names>",
340-
"good afternoon <names>",
341-
"tell me <names>",
342-
"prescribe <medical>",
343-
"take <medical>",
344-
"apply <medical>"
345-
]
349+
],
350+
"referenceExamples": [
351+
"hi <names>",
352+
"hello <names>",
353+
"good morning <names>",
354+
"good afternoon <names>",
355+
"tell me <names>",
356+
"prescribe <medical>",
357+
"take <medical>",
358+
"apply <medical>"
359+
]
346360
}
361+
</copy>
347362
```
348363
2. This dataset would create 3 customizations - one main customization and two slot customizations.
349364
![hospital customizations](./images/hostpital-main-and-slot-customizations.png " ")
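
The dataset shape used throughout this diff (`datasetType`, `entityList`, `entityType`, `entities`, `entityValue`, `pronunciations`, `soundsLike`, `referenceExamples`) can be sketched in Python as a quick local check before uploading the JSON file. This is a minimal sketch, not part of the lab or of any OCI SDK: the helper names (`build_dataset`, `undefined_placeholders`, `expected_customization_names`) and the display name `hospital-customization` are hypothetical, and the rule that every `<type>` placeholder should match a defined entity type is an assumption drawn from the examples above rather than a documented service constraint.

```python
import json
import re

# Placeholder syntax used in the reference examples, e.g. "<names>".
PLACEHOLDER = re.compile(r"<([^<>]+)>")


def build_dataset(entity_lists, reference_examples):
    """Assemble a dict in the ENTITY_LIST shape shown in the lab.

    entity_lists maps entityType -> {entityValue: [soundsLike, ...]}.
    """
    return {
        "datasetType": "ENTITY_LIST",
        "entityList": [
            {
                "entityType": entity_type,
                "entities": [
                    {
                        "entityValue": value,
                        "pronunciations": [{"soundsLike": s} for s in sounds],
                    }
                    for value, sounds in entities.items()
                ],
            }
            for entity_type, entities in entity_lists.items()
        ],
        "referenceExamples": list(reference_examples),
    }


def undefined_placeholders(dataset):
    """Return placeholder names in referenceExamples that have no matching entityType.

    Assumption: a placeholder without an entity list is a mistake worth flagging.
    """
    defined = {entry["entityType"] for entry in dataset["entityList"]}
    used = set()
    for example in dataset.get("referenceExamples", []):
        used.update(PLACEHOLDER.findall(example))
    return sorted(used - defined)


def expected_customization_names(display_name, dataset):
    """Main name plus one "--<entity-type>" slot name per entity list,
    following the naming described in the lab text."""
    slots = [f"{display_name}--{entry['entityType']}" for entry in dataset["entityList"]]
    return [display_name] + slots


hospital = build_dataset(
    {
        "names": {
            "Daniel": ["Danny", "Dan"],
            "Sorabh": ["Saurabh", "so rub", "so raab"],
        },
        "medical": {
            "PeriCare": ["Perry care", "Paris care"],
            "procapil": ["pro sepil", "pro capill"],
            "EpiCeram": ["epic serum", "a PC rum"],
        },
    },
    [
        "hi <names>",
        "hello <names>",
        "good morning <names>",
        "good afternoon <names>",
        "tell me <names>",
        "prescribe <medical>",
        "take <medical>",
        "apply <medical>",
    ],
)

# "hospital-customization" is a hypothetical display name chosen for illustration.
print(json.dumps(hospital, indent=2))
print(undefined_placeholders(hospital))
print(expected_customization_names("hospital-customization", hospital))
```

With two entity lists, the last call returns three names, which lines up with the one main customization and two slot customizations described in step 2. A reference example that mentions a type with no entity list, such as a hypothetical `"visit <clinics>"`, would show up in the result of `undefined_placeholders`.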
