|
4 | 4 | "cell_type": "markdown", |
5 | 5 | "metadata": {}, |
6 | 6 | "source": [ |
7 | | - "# Extract Custom Fields from Your Pretranscribed File" |
| 7 | + "# Extract Custom Fields from Your Pre-transcribed File" |
8 | 8 | ] |
9 | 9 | }, |
10 | 10 | { |
11 | 11 | "cell_type": "markdown", |
12 | 12 | "metadata": {}, |
13 | 13 | "source": [ |
14 | | - "This notebook demonstrates how to use analyzers to extract custom fields from your transcription input files." |
| 14 | + "This notebook demonstrates how to use analyzers to extract custom fields from your pre-transcribed input files." |
15 | 15 | ] |
16 | 16 | }, |
17 | 17 | { |
18 | 18 | "cell_type": "markdown", |
19 | 19 | "metadata": {}, |
20 | 20 | "source": [ |
21 | 21 | "## Prerequisites\n", |
22 | | - "1. Ensure Azure AI service is configured following [steps](../README.md#configure-azure-ai-service-resource)\n", |
| 22 | + "1. Ensure your Azure AI service is configured by following the [configuration steps](../README.md#configure-azure-ai-service-resource).\n", |
23 | 23 | "2. Install the required packages to run the sample." |
24 | 24 | ] |
25 | 25 | }, |
|
45 | 45 | "source": [ |
46 | 46 | "Below is a collection of analyzer templates designed to extract fields from various input file types.\n", |
47 | 47 | "\n", |
48 | | - "These templates are highly customizable, allowing you to modify them to suit your specific needs. For additional verified templates from Microsoft, please visit [here](../analyzer_templates/README.md)." |
| 48 | + "These templates are highly customizable, allowing you to adapt them to your specific requirements. For additional verified templates provided by Microsoft, please visit [here](../analyzer_templates/)." |
49 | 49 | ] |
50 | 50 | }, |
51 | 51 | { |
|
65 | 65 | "cell_type": "markdown", |
66 | 66 | "metadata": {}, |
67 | 67 | "source": [ |
68 | | - "Specify the analyzer template you want to use and provide a name for the analyzer to be created based on the template." |
| 68 | + "Specify the analyzer template to use and assign a unique name for the analyzer that will be created from the template." |
69 | 69 | ] |
70 | 70 | }, |
71 | 71 | { |
|
88 | 88 | "source": [ |
89 | 89 | "## Create Azure AI Content Understanding Client\n", |
90 | 90 | "\n", |
91 | | - "> The [AzureContentUnderstandingClient](../python/content_understanding_client.py) is a utility class containing functions to interact with the Content Understanding API. Before the official release of the Content Understanding SDK, it can be regarded as a lightweight SDK. Fill the constant **AZURE_AI_ENDPOINT**, **AZURE_AI_API_VERSION**, **AZURE_AI_API_KEY** with the information from your Azure AI Service.\n", |
| 91 | + "> The [AzureContentUnderstandingClient](../python/content_understanding_client.py) is a utility class providing functions to interact with the Content Understanding API. Before the official release of the Content Understanding SDK, this class can be considered a lightweight SDK.\n", |
| 92 | + "\n", |
| 93 | + "> Fill in the constants **AZURE_AI_ENDPOINT**, **AZURE_AI_API_VERSION**, and **AZURE_AI_API_KEY** with the values for your Azure AI Service resource.\n", |
92 | 94 | "\n", |
93 | 95 | "> ⚠️ Important:\n", |
94 | | - "You must update the code below to match your Azure authentication method.\n", |
| 96 | + "Make sure to update the code below to match your chosen Azure authentication method.\n", |
95 | 97 | "Look for the `# IMPORTANT` comments and modify those sections accordingly.\n", |
96 | | - "If you skip this step, the sample may not run correctly.\n", |
| 98 | + "Skipping this step may prevent the sample from running correctly.\n", |
97 | 99 | "\n", |
98 | | - "> ⚠️ Note: Using a subscription key works, but using a token provider with Azure Active Directory (AAD) is much safer and is highly recommended for production environments." |
| 100 | + "> ⚠️ Note: While subscription key authentication works, it is strongly recommended to use a token provider with Azure Active Directory (AAD) for improved security in production environments." |
99 | 101 | ] |
100 | 102 | }, |
101 | 103 | { |
|
115 | 117 | "load_dotenv(find_dotenv())\n", |
116 | 118 | "logging.basicConfig(level=logging.INFO)\n", |
117 | 119 | "\n", |
118 | | - "# For authentication, you can use either token-based auth or subscription key, and only one of them is required\n", |
| 120 | + "# For authentication, you may use either token-based auth or a subscription key; only one is required.\n", |
119 | 121 | "AZURE_AI_ENDPOINT = os.getenv(\"AZURE_AI_ENDPOINT\")\n", |
120 | | - "# IMPORTANT: Replace with your actual subscription key or set up in \".env\" file if not using token auth\n", |
| 122 | + "# IMPORTANT: Replace with your actual subscription key or configure it in the \".env\" file if not using token authentication.\n", |
121 | 123 | "AZURE_AI_API_KEY = os.getenv(\"AZURE_AI_API_KEY\")\n", |
122 | 124 | "AZURE_AI_API_VERSION = os.getenv(\"AZURE_AI_API_VERSION\", \"2025-05-01-preview\")\n", |
123 | 125 | "\n", |
124 | | - "# Add the parent directory to the path to use shared modules\n", |
| 126 | + "# Add the parent directory to the system path to access shared modules\n", |
125 | 127 | "parent_dir = Path(Path.cwd()).parent\n", |
126 | 128 | "sys.path.append(str(parent_dir))\n", |
127 | 129 | "from python.content_understanding_client import AzureContentUnderstandingClient\n", |
|
134 | 136 | " api_version=AZURE_AI_API_VERSION,\n", |
135 | 137 | " # IMPORTANT: Comment out token_provider if using subscription key\n", |
136 | 138 | " token_provider=token_provider,\n", |
137 | | - " # IMPORTANT: Uncomment this if using subscription key\n", |
| 139 | + " # IMPORTANT: Uncomment the following line if using a subscription key\n", |
138 | 140 | " # subscription_key=AZURE_AI_API_KEY,\n", |
139 | | - " # x_ms_useragent=\"azure-ai-content-understanding-python/field_extraction\", # This header is used for sample usage telemetry, please comment out this line if you want to opt out.\n", |
| 141 | + " # x_ms_useragent=\"azure-ai-content-understanding-python/field_extraction\", # This header is used for sample usage telemetry. Please comment out this line if you want to opt out.\n", |
140 | 142 | ")" |
141 | 143 | ] |
142 | 144 | }, |
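The client cell above asks the reader to pass either `token_provider` or `subscription_key`, but not both. The following is a minimal sketch of one way to make that choice explicit. `build_auth_kwargs` is a hypothetical helper that is not part of the repository, and the keyword names are assumed from the constructor call shown in the cell.

```python
from typing import Callable, Mapping, Optional


def build_auth_kwargs(
    env: Mapping[str, str],
    token_provider: Optional[Callable[[], str]] = None,
) -> dict:
    """Return exactly one auth keyword argument for the client constructor.

    Hypothetical helper: prefers the token provider when one is supplied and
    falls back to the AZURE_AI_API_KEY value otherwise.
    """
    if token_provider is not None:
        return {"token_provider": token_provider}
    key = env.get("AZURE_AI_API_KEY")
    if key:
        return {"subscription_key": key}
    raise ValueError("Provide a token provider or set AZURE_AI_API_KEY.")


# Example with a stand-in mapping; in the notebook this would be os.environ.
auth_kwargs = build_auth_kwargs({"AZURE_AI_API_KEY": "<your-key>"})
print(sorted(auth_kwargs))
```

If the constructor accepts these keywords as the cell suggests, the result could be unpacked as `AzureContentUnderstandingClient(endpoint=AZURE_AI_ENDPOINT, api_version=AZURE_AI_API_VERSION, **auth_kwargs)`, so that only one authentication method is ever passed.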
|
170 | 172 | "cell_type": "markdown", |
171 | 173 | "metadata": {}, |
172 | 174 | "source": [ |
173 | | - "After the analyzer is successfully created, we can use it to analyze our input files." |
| 175 | + "Once the analyzer is successfully created, you can use it to analyze your input files." |
174 | 176 | ] |
175 | 177 | }, |
176 | 178 | { |
|
181 | 183 | "source": [ |
182 | 184 | "from python.extension.transcripts_processor import TranscriptsProcessor\n", |
183 | 185 | "\n", |
184 | | - "test_file_path=analyzer_sample_file_path\n", |
| 186 | + "test_file_path = analyzer_sample_file_path\n", |
185 | 187 | "\n", |
186 | 188 | "transcripts_processor = TranscriptsProcessor()\n", |
187 | 189 | "webvtt_output, webvtt_output_file_path = transcripts_processor.convert_file(test_file_path)\n", |
188 | 190 | "\n", |
189 | 191 | "if \"WEBVTT\" not in webvtt_output:\n", |
190 | 192 | " print(\"Error: The output is not in WebVTT format.\")\n", |
191 | | - "else: \n", |
| 193 | + "else:\n", |
192 | 194 | " response = client.begin_analyze(CUSTOM_ANALYZER_ID, file_location=webvtt_output_file_path)\n", |
193 | 195 | " print(\"Response:\", response)\n", |
194 | 196 | " result_json = client.poll_result(response)\n", |
|
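The analysis cell above checks for the substring `"WEBVTT"` anywhere in the converted output. The WebVTT format places that signature at the very start of the file, optionally preceded by a byte order mark, so a stricter check is possible. The following is a sketch only; `looks_like_webvtt` is a hypothetical helper and not part of `TranscriptsProcessor`.

```python
def looks_like_webvtt(text: str) -> bool:
    """Return True if `text` begins with a WebVTT header line."""
    # Drop one optional byte order mark, which the format allows before the header.
    if text.startswith("\ufeff"):
        text = text[1:]
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    # The header is "WEBVTT" alone, or "WEBVTT" followed by a space or tab and free text.
    return first_line == "WEBVTT" or first_line.startswith(("WEBVTT ", "WEBVTT\t"))


print(looks_like_webvtt("WEBVTT\n\n00:00.000 --> 00:02.000\nHello"))
print(looks_like_webvtt("Transcript mentioning WEBVTT in passing"))
```

Unlike the substring test, this rejects text that merely mentions the word somewhere in its body.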
201 | 203 | "metadata": {}, |
202 | 204 | "source": [ |
203 | 205 | "## Clean Up\n", |
204 | | - "Optionally, delete the sample analyzer from your resource. In typical usage scenarios, you would analyze multiple files using the same analyzer." |
| 206 | + "Optionally, delete the sample analyzer from your Azure resource. In typical usage scenarios, you would analyze multiple files using the same analyzer." |
205 | 207 | ] |
206 | 208 | }, |
207 | 209 | { |
|