Commit 4be100e
AbdullahMubeenAnwar authored and prabod committed
tasks-docs-integration (#14428)
* Update navigation.yml
  Adding "- Tasks" page to SparkNLP Docs
* adding task files
1 parent 317abf9 commit 4be100e

16 files changed, +2486 -0 lines changed

docs/_data/navigation.yml

Lines changed: 2 additions & 0 deletions
@@ -44,6 +44,8 @@ sparknlp:
   url: /docs/en/pipelines
 - title: General Concepts
   url: /docs/en/concepts
+- title: Tasks
+  url: /docs/en/tasks/landing_page
 - title: Annotators
   url: /docs/en/annotators
 - title: Transformers
Lines changed: 149 additions & 0 deletions
@@ -0,0 +1,149 @@
---
layout: docs
header: true
seotitle:
title: Automatic Speech Recognition
permalink: docs/en/tasks/automatic_speech_recognition
key: docs-tasks-automatic-speech-recognition
modify_date: "2024-09-26"
show_nav: true
sidebar:
  nav: sparknlp
---

**Automatic Speech Recognition (ASR)** is the technology that enables computers to convert human speech into text. ASR plays a vital role in numerous applications, from voice-activated assistants to transcription services, making it an essential part of modern natural language processing (NLP) solutions. Spark NLP provides annotators and pretrained models, such as `Wav2Vec2ForCTC`, for building ASR pipelines.

In this context, ASR involves converting spoken language into text by analyzing audio signals. Common use cases include:

- **Voice Assistants:** Enabling devices like smartphones and smart speakers to understand and respond to user commands.
- **Transcription Services:** Automatically converting audio recordings from meetings, interviews, or lectures into written text.
- **Accessibility:** Helping individuals with disabilities interact with technology through voice commands.

By leveraging ASR, organizations can enhance user experience, improve accessibility, and streamline workflows that involve audio data.

<div style="text-align: center;">
<iframe width="560" height="315" src="https://www.youtube.com/embed/hscJK-4kA_A?si=xYgz6ejc-2XvXcoR" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div>

## Picking a Model

When selecting a model for Automatic Speech Recognition, it’s essential to evaluate several factors to ensure optimal performance for your specific use case. Begin by analyzing the **nature of your audio data**, considering the accent, language, and quality of the recordings. Determine if your task requires **real-time transcription** or if batch processing is sufficient, as some models excel in specific scenarios.

Next, assess the **model complexity**; simpler models may suffice for straightforward tasks, while more sophisticated models are better suited for nuanced speech recognition. Consider the **availability of diverse audio data** for training, as larger datasets can significantly enhance model performance. Define key **performance metrics** (e.g., word error rate, accuracy) to guide your choice, and ensure the model's interpretability meets your requirements. Finally, account for **resource constraints**, as advanced models typically demand more memory and processing power.

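As a concrete illustration of the word error rate (WER) metric mentioned above, the sketch below computes WER as the word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the sample strings are illustrative assumptions and not part of Spark NLP; a real evaluation would usually also normalize casing and punctuation before comparing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words (Levenshtein DP, one row at a time).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative strings only (not taken from a real evaluation set)
reference = "THE APOSTLE OF THE MIDDLE CLASSES"
hypothesis = "THE APOSTLE OF THE MIDLE CLASES"
print(word_error_rate(reference, hypothesis))
```

Comparing candidate models on the same held-out recordings with a metric like this gives a like-for-like basis for the choice; lower values are better.
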
To explore and select from a variety of models, visit [Spark NLP Models](https://sparknlp.org/models), where you can find models tailored for different ASR tasks and languages.

#### Recommended Models for Automatic Speech Recognition Tasks
- **General Speech Recognition:** Use models like [`asr_wav2vec2_large_xlsr_53_english_by_jonatasgrosman`](https://sparknlp.org/2022/09/24/asr_wav2vec2_large_xlsr_53_english_by_jonatasgrosman_en.html){:target="_blank"} for general-purpose English transcription.
- **Multilingual Support:** For applications that need languages other than English, consider language-specific models like [`asr_wav2vec2_large_xlsr_53_portuguese_by_jonatasgrosman`](https://sparknlp.org/2021/12/15/wav2vec2.html){:target="_blank"} (Portuguese) from the [`Wav2Vec2ForCTC`](https://sparknlp.org/docs/en/transformers#wav2vec2forctc){:target="_blank"} transformer.

By weighing these factors and choosing a model that matches your language and audio conditions, you can improve the accuracy of your ASR applications.

## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPython.html %}
```python
import sparknlp
from sparknlp.base import *
from sparknlp.annotator import *
from pyspark.ml import Pipeline

# Start a Spark session with Spark NLP
spark = sparknlp.start()

# Step 1: Assemble the raw audio content into a suitable format
audioAssembler = AudioAssembler() \
    .setInputCol("audio_content") \
    .setOutputCol("audio_assembler")

# Step 2: Load a pre-trained Wav2Vec2 model for automatic speech recognition (ASR)
speechToText = Wav2Vec2ForCTC \
    .pretrained() \
    .setInputCols(["audio_assembler"]) \
    .setOutputCol("text")

# Step 3: Define the pipeline with audio assembler and speech-to-text model
pipeline = Pipeline().setStages([audioAssembler, speechToText])

# Step 4: Load the raw audio floats from the first CSV column
# (assumption: the same file the Scala example reads)
with open("src/test/resources/audio/csv/audio_floats.csv") as f:
    rawFloats = [float(line.split(",")[0]) for line in f if line.strip()]

# Step 5: Create a DataFrame containing the raw audio content (as floats)
processedAudioFloats = spark.createDataFrame([[rawFloats]]).toDF("audio_content")

# Step 6: Fit the pipeline and transform the audio data
result = pipeline.fit(processedAudioFloats).transform(processedAudioFloats)

# Step 7: Display the transcribed text from the audio
result.select("text.result").show(truncate=False)

# +------------------------------------------------------------------------------------------+
# |result                                                                                    |
# +------------------------------------------------------------------------------------------+
# |[MISTER QUILTER IS THE APOSTLE OF THE MIDLE CLASES AND WE ARE GLAD TO WELCOME HIS GOSPEL ]|
# +------------------------------------------------------------------------------------------+
```
```scala
import spark.implicits._
import com.johnsnowlabs.nlp.base._
import com.johnsnowlabs.nlp.annotators._
import com.johnsnowlabs.nlp.annotators.audio.Wav2Vec2ForCTC
import org.apache.spark.ml.Pipeline

// Step 1: Assemble the raw audio content into a suitable format
val audioAssembler: AudioAssembler = new AudioAssembler()
  .setInputCol("audio_content")
  .setOutputCol("audio_assembler")

// Step 2: Load a pre-trained Wav2Vec2 model for automatic speech recognition (ASR)
val speechToText: Wav2Vec2ForCTC = Wav2Vec2ForCTC
  .pretrained()
  .setInputCols("audio_assembler")
  .setOutputCol("text")

// Step 3: Define the pipeline with audio assembler and speech-to-text model
val pipeline: Pipeline = new Pipeline().setStages(Array(audioAssembler, speechToText))

// Step 4: Load raw audio floats from a CSV file
val bufferedSource =
  scala.io.Source.fromFile("src/test/resources/audio/csv/audio_floats.csv")

// Step 5: Extract raw audio floats from CSV and convert to an array of floats
val rawFloats = bufferedSource
  .getLines()
  .map(_.split(",").head.trim.toFloat)
  .toArray
bufferedSource.close()

// Step 6: Create a DataFrame with raw audio content (as floats)
val processedAudioFloats = Seq(rawFloats).toDF("audio_content")

// Step 7: Fit the pipeline and transform the audio data
val result = pipeline.fit(processedAudioFloats).transform(processedAudioFloats)

// Step 8: Display the transcribed text from the audio
result.select("text.result").show(truncate = false)

// +------------------------------------------------------------------------------------------+
// |result                                                                                    |
// +------------------------------------------------------------------------------------------+
// |[MISTER QUILTER IS THE APOSTLE OF THE MIDLE CLASES AND WE ARE GLAD TO WELCOME HIS GOSPEL ]|
// +------------------------------------------------------------------------------------------+
```
</div>

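The examples above feed the pipeline an array of floats. As a hedged sketch of one way to obtain such an array from an audio file, the snippet below reads a 16-bit PCM WAV file with Python's standard `wave` module and scales the samples to the range [-1, 1). The helper name `wav_to_floats` is hypothetical and not part of Spark NLP, and the expected sample rate and channel layout are assumptions to check against the model card of the model you load.

```python
import array
import sys
import wave


def wav_to_floats(path: str) -> list:
    """Read a mono, 16-bit PCM WAV file and return samples scaled to [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return [s / 32768.0 for s in samples]
```

The returned list can then be wrapped as in the Python example, e.g. `spark.createDataFrame([[wav_to_floats("sample.wav")]]).toDF("audio_content")`, where `sample.wav` is a placeholder path.
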
## Try Real-Time Demos!

If you want to see the outputs of ASR models in real time, visit our interactive demos:

- **[Wav2Vec2ForCTC](https://huggingface.co/spaces/abdullahmubeen10/sparknlp-Wav2Vec2ForCTC){:target="_blank"}** – Try this powerful model for real-time speech-to-text from raw audio.
- **[WhisperForCTC](https://huggingface.co/spaces/abdullahmubeen10/sparknlp-WhisperForCTC){:target="_blank"}** – Test speech recognition in multiple languages and noisy environments.
- **[HubertForCTC](https://huggingface.co/spaces/abdullahmubeen10/sparknlp-HubertForCTC){:target="_blank"}** – Experience quick and accurate voice command recognition.

## Useful Resources

Want to dive deeper into Automatic Speech Recognition with Spark NLP? Here are some curated resources to help you get started and explore further:

**Articles and Guides**
- *[Converting Speech to Text with Spark NLP and Python](https://www.johnsnowlabs.com/converting-speech-to-text-with-spark-nlp-and-python/){:target="_blank"}*
- *[Simplify Your Speech Recognition Workflow with SparkNLP](https://medium.com/spark-nlp/simplify-your-speech-recognition-workflow-with-sparknlp-e381606e4e82){:target="_blank"}*
- *[Vision Transformers and Automatic Speech Recognition in Spark NLP](https://www.nlpsummit.org/vision-transformers-and-automatic-speech-recognition-in-spark-nlp/){:target="_blank"}*

**Notebooks**
- *[Automatic Speech Recognition in Spark NLP](https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings/Public/17.Speech_Recognition.ipynb){:target="_blank"}*
- *[Speech Recognition Transformers in Spark NLP](https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples/python/annotation/audio){:target="_blank"}*
Lines changed: 178 additions & 0 deletions
@@ -0,0 +1,178 @@
---
layout: docs
header: true
seotitle:
title: Dependency Parsing
permalink: docs/en/tasks/dependency_parsing
key: docs-tasks-dependency-parsing
modify_date: "2024-09-28"
show_nav: true
sidebar:
  nav: sparknlp
---

**Dependency Parsing** is a syntactic analysis task that focuses on the grammatical structure of sentences. It identifies the dependencies between words, showing how each word relates grammatically to its head (for example, which noun is the subject of a verb). Spark NLP provides pretrained dependency parsing models that analyze sentence structure, enabling various applications in natural language processing.

Dependency parsing models process input sentences and generate a structured representation of word relationships. Common use cases include:

- **Grammatical Analysis:** Understanding the grammatical structure of sentences for better comprehension.
- **Information Extraction:** Identifying key relationships and entities in sentences for tasks like knowledge graph construction.

By using Spark NLP dependency parsing models, you can build efficient systems to analyze and understand sentence structures accurately.

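To make the "structured representation" concrete, the sketch below stores a parse as (dependent, head, relation) edges and extracts a subject-verb-object triple from it, the kind of step a knowledge-graph builder might start from. The parse is written by hand for illustration using the Universal Dependencies labels `nsubj` and `obj`; it is not output from a Spark NLP model, and `extract_svo` is a hypothetical helper.

```python
from collections import defaultdict

# Hand-written parse of "Marie discovered radium": (dependent, head, relation)
edges = [
    ("Marie", "discovered", "nsubj"),
    ("discovered", "ROOT", "root"),
    ("radium", "discovered", "obj"),
]


def extract_svo(edges):
    """Return (subject, verb, object) triples for heads that have both nsubj and obj."""
    by_head = defaultdict(dict)
    for dependent, head, relation in edges:
        by_head[head][relation] = dependent
    return [
        (deps["nsubj"], head, deps["obj"])
        for head, deps in by_head.items()
        if "nsubj" in deps and "obj" in deps
    ]


triples = extract_svo(edges)
print(triples)  # [('Marie', 'discovered', 'radium')]
```

This sketch keys heads by word form for brevity, so a sentence that repeats a word would need token indices instead.
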
## Picking a Model

When selecting a dependency parsing model, consider factors such as the **language of the text** and the **complexity of sentence structures**. Some models may be optimized for specific languages or types of text. Evaluate whether you need **detailed syntactic parsing** or a more **general analysis** based on your application.

Explore the available dependency parsing models at [Spark NLP Models](https://sparknlp.org/models) to find the one that best fits your requirements.

#### Recommended Models for Dependency Parsing Tasks

- **General Dependency Parsing:** Consider models such as [`dependency_conllu_en_3_0`](https://sparknlp.org/2022/06/29/dependency_conllu_en_3_0.html){:target="_blank"} for analyzing English sentences. You can also explore language-specific models tailored for non-English languages.

Choosing the appropriate model ensures you produce accurate syntactic structures that suit your specific language and use case.

## How to use

<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPython.html %}
```python
import sparknlp
from sparknlp.annotator import *
from sparknlp.base import *
from pyspark.ml import Pipeline

# Start a Spark session with Spark NLP
spark = sparknlp.start()

# Document Assembler: Converts raw text into a document format suitable for processing
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Sentence Detector: Splits text into individual sentences
sentenceDetector = SentenceDetector() \
    .setInputCols(["document"]) \
    .setOutputCol("sentence")

# Tokenizer: Breaks sentences into tokens (words)
tokenizer = Tokenizer() \
    .setInputCols(["sentence"]) \
    .setOutputCol("token")

# Part-of-Speech Tagger: Tags each token with its respective POS (pretrained model)
posTagger = PerceptronModel.pretrained() \
    .setInputCols(["token", "sentence"]) \
    .setOutputCol("pos")

# Dependency Parser: Analyzes the grammatical structure of a sentence
dependencyParser = DependencyParserModel.pretrained() \
    .setInputCols(["sentence", "pos", "token"]) \
    .setOutputCol("dependency")

# Typed Dependency Parser: Assigns typed labels to the dependencies
typedDependencyParser = TypedDependencyParserModel.pretrained() \
    .setInputCols(["token", "pos", "dependency"]) \
    .setOutputCol("labdep")

# Create a pipeline that includes all the stages
pipeline = Pipeline(stages=[
    documentAssembler,
    sentenceDetector,
    tokenizer,
    posTagger,
    dependencyParser,
    typedDependencyParser
])

# Sample input data (a DataFrame with one row and a single "text" column)
df = spark.createDataFrame(
    [["Dependencies represent relationships between words in a sentence."]]
).toDF("text")

# Run the pipeline on the input data
result = pipeline.fit(df).transform(df)

# Show the dependency parsing results
result.select("dependency.result").show(truncate=False)

# +---------------------------------------------------------------------------------+
# |result                                                                           |
# +---------------------------------------------------------------------------------+
# |[ROOT, Dependencies, represents, words, relationships, Sentence, Sentence, words]|
# +---------------------------------------------------------------------------------+
```
```scala
import com.johnsnowlabs.nlp.DocumentAssembler
import com.johnsnowlabs.nlp.annotator._
import org.apache.spark.ml.Pipeline
import spark.implicits._

// Document Assembler: Converts raw text into a document format for NLP processing
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Sentence Detector: Splits the input text into individual sentences
val sentenceDetector = new SentenceDetector()
  .setInputCols(Array("document"))
  .setOutputCol("sentence")

// Tokenizer: Breaks sentences into individual tokens (words)
val tokenizer = new Tokenizer()
  .setInputCols(Array("sentence"))
  .setOutputCol("token")

// Part-of-Speech Tagger: Tags each token with its respective part of speech (pretrained model)
val posTagger = PerceptronModel.pretrained()
  .setInputCols(Array("token", "sentence"))
  .setOutputCol("pos")

// Dependency Parser: Analyzes the grammatical structure of the sentence
val dependencyParser = DependencyParserModel.pretrained()
  .setInputCols(Array("sentence", "pos", "token"))
  .setOutputCol("dependency")

// Typed Dependency Parser: Assigns typed labels to the dependencies
val typedDependencyParser = TypedDependencyParserModel.pretrained()
  .setInputCols(Array("token", "pos", "dependency"))
  .setOutputCol("labdep")

// Create a pipeline that includes all stages
val pipeline = new Pipeline().setStages(Array(
  documentAssembler,
  sentenceDetector,
  tokenizer,
  posTagger,
  dependencyParser,
  typedDependencyParser
))

// Sample input data (a DataFrame with one text example)
val df = Seq("Dependencies represent relationships between words in a Sentence").toDF("text")

// Run the pipeline on the input data
val result = pipeline.fit(df).transform(df)

// Show the dependency parsing results
result.select("dependency.result").show(truncate = false)

// +---------------------------------------------------------------------------------+
// |result                                                                           |
// +---------------------------------------------------------------------------------+
// |[ROOT, Dependencies, represents, words, relationships, Sentence, Sentence, words]|
// +---------------------------------------------------------------------------------+
```
</div>

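The `dependency.result` array shown above appears to hold one head word per token, in token order, which is easier to read when paired with the tokens themselves. The sketch below does that pairing in plain Python on lists. `pair_with_heads` is a hypothetical helper; the token list is an assumption about how the tokenizer splits the Scala example sentence, and the head list is copied verbatim from the output shown above.

```python
def pair_with_heads(tokens, heads):
    """Pair each token with its head word; both lists must have the same length."""
    if len(tokens) != len(heads):
        raise ValueError("tokens and heads must have the same length")
    return list(zip(tokens, heads))


# Assumed tokenization of the Scala example sentence
tokens = ["Dependencies", "represent", "relationships", "between",
          "words", "in", "a", "Sentence"]
# Copied from the dependency.result output above
heads = ["ROOT", "Dependencies", "represents", "words",
         "relationships", "Sentence", "Sentence", "words"]

for token, head in pair_with_heads(tokens, heads):
    print(f"{token:<15} <- {head}")
```

In a running pipeline, the two lists could be taken from a row selected with `result.select("token.result", "dependency.result")`; the typed labels in `labdep.result` can be zipped in the same way.
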
## Try Real-Time Demos!

If you want to see the outputs of dependency parsing models in real time, visit our interactive demos:

- **[Grammar Analysis & Dependency Parsing](https://huggingface.co/spaces/abdullahmubeen10/sparknlp-grammar-analysis-and-dependency-parsing){:target="_blank"}** – An interactive demo to visualize dependencies in sentences.

## Useful Resources

Want to dive deeper into dependency parsing with Spark NLP? Here are some curated resources to help you get started and explore further:

**Articles and Guides**
- *[Mastering Dependency Parsing with Spark NLP and Python](https://www.johnsnowlabs.com/supercharge-your-nlp-skills-mastering-dependency-parsing-with-spark-nlp-and-python/){:target="_blank"}*

**Notebooks**
- *[Extract Part of speech tags and perform dependency parsing on a text](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/GRAMMAR_EN.ipynb#scrollTo=syePZ-1gYyj3){:target="_blank"}*
- *[Typed Dependency Parsing with NLU.](https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/nlu/colab/component_examples/dependency_parsing/NLU_typed_dependency_parsing_example.ipynb){:target="_blank"}*