Commit 956e2cd

Merge pull request #12613 from JohnSnowLabs/release/410-release-candidate
Release/410 release candidate
2 parents a85972f + 6498da0 commit 956e2cd

1,361 files changed, with 61,175 additions and 37,702 deletions

CHANGELOG

Lines changed: 13 additions & 0 deletions
@@ -1,3 +1,16 @@
+========
+4.1.0
+========
+----------------
+New Features & Enhancements
+----------------
+* **NEW:** Introducing **ViTForImageClassification** annotator in Spark NLP 🚀. `ViTForImageClassification` can load Vision Transformer `ViT` Models with an image classification head on top (a linear layer on top of the final hidden state of the [CLS] token) e.g. for ImageNet. This annotator is compatible with all the models trained/fine-tuned by using `ViTForImageClassification` for **PyTorch** or `TFViTForImageClassification` for **TensorFlow** models in HuggingFace 🤗
+* Provide support for AWS Graviton processors and ARM64 processors with architecture greater than ARMv8
+* Introducing **TFNerDLGraphBuilder** annotator. `TFNerDLGraphBuilder` can be used to automatically detect the parameters of a needed NerDL graph and generate the graph within a pipeline when the default NER graphs are not suitable for your training datasets.
+* Allow passing confidence scores from all XXXForTokenClassification annotators to NerConverter. From this release it is possible to access the confidence scores coming from the following annotators via NerConverter: AlbertForTokenClassification, BertForTokenClassification, DeBertaForTokenClassification, DistilBertForTokenClassification, LongformerForTokenClassification, RoBertaForTokenClassification, XlmRoBertaForTokenClassification, XlnetForTokenClassification, and DeBertaForTokenClassification
+* Introducing PushToHub Python class to easily push public models/pipelines to Models Hub
+* Introducing fullAnnotateImage to existing LightPipeline to support ImageAssembler and ViTForImageClassification annotators in a Spark NLP pipeline.
+
 ========
 4.0.2
 ========
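
The changelog entry above describes the ViT classification head as a linear layer applied to the final hidden state of the [CLS] token. As a rough illustration of that idea only, here is a minimal pure-Python sketch. It does not use Spark NLP or any real model; the function name `classify_from_cls`, the toy weights, and the labels are all hypothetical values invented for this example.

```python
from typing import Sequence


def classify_from_cls(
    cls_hidden: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
    labels: Sequence[str],
) -> str:
    """Apply a linear layer to a [CLS] hidden state and return the top label.

    Hypothetical helper: one weight row and one bias value per class.
    """
    # logits[k] = dot(weights[k], cls_hidden) + bias[k]
    logits = [
        sum(w * h for w, h in zip(row, cls_hidden)) + b
        for row, b in zip(weights, bias)
    ]
    # The predicted class is the index of the largest logit.
    best = max(range(len(logits)), key=logits.__getitem__)
    return labels[best]


# Toy values, invented purely for illustration (not taken from any model).
cls_hidden = [0.5, -1.0, 2.0]
weights = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
bias = [0.0, 0.1]
labels = ["cat", "dog"]

predicted = classify_from_cls(cls_hidden, weights, bias, labels)
print(predicted)
```

In a real ViT model the hidden state has hundreds of dimensions and the head has one row per ImageNet class; the arithmetic is the same.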

README.md

Lines changed: 88 additions & 51 deletions
Large diffs are not rendered by default.

build.sbt

Lines changed: 4 additions & 2 deletions
@@ -2,11 +2,11 @@ import Dependencies._
 import Resolvers.m2Resolvers
 import sbtassembly.MergeStrategy
 
-name := getPackageName(is_m1, is_gpu)
+name := getPackageName(is_m1, is_gpu, is_aarch64)
 
 organization := "com.johnsnowlabs.nlp"
 
-version := "4.0.2"
+version := "4.1.0"
 
 (ThisBuild / scalaVersion) := scalaVer
 
@@ -148,6 +148,8 @@ val tensorflowDependencies: Seq[sbt.ModuleID] =
     Seq(tensorflowGPU)
   else if (is_m1.equals("true"))
     Seq(tensorflowM1)
+  else if (is_aarch64.equals("true"))
+    Seq(tensorflowLinuxAarch64)
   else
     Seq(tensorflowCPU)
 
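
The second build.sbt hunk above adds an aarch64 branch to the chain that picks the TensorFlow dependency. The precedence it implies can be sketched in Python as below. This is a sketch under assumptions: the function name `select_tensorflow_dependency` is hypothetical, and the first branch (the GPU check) is not visible in the hunk, so its condition is assumed to follow the same `equals("true")` pattern as the others.

```python
def select_tensorflow_dependency(is_gpu: str, is_m1: str, is_aarch64: str) -> str:
    """Mirror the branch order of the tensorflowDependencies chain in build.sbt.

    Flags are strings, as in the build file, and only the exact value "true"
    enables a branch.
    """
    if is_gpu == "true":  # assumed condition: this line is outside the hunk
        return "tensorflowGPU"
    if is_m1 == "true":
        return "tensorflowM1"
    if is_aarch64 == "true":  # branch added in this commit
        return "tensorflowLinuxAarch64"
    return "tensorflowCPU"  # default when no flag is set


print(select_tensorflow_dependency("false", "false", "true"))
```

Because the checks run in order, a build that sets both the M1 and aarch64 flags would still resolve to the M1 artifact under this reading.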

conda/meta.yaml

Lines changed: 4 additions & 4 deletions
@@ -1,15 +1,15 @@
 package:
   name: "spark-nlp"
-  version: 4.0.2
+  version: 4.1.0
 
 app:
   entry: spark-nlp
   summary: Natural Language Understanding Library for Apache Spark.
 
 source:
-  fn: spark-nlp-4.0.2.tar.gz
-  url: https://files.pythonhosted.org/packages/b2/b7/f07cdf02cf91a8ad5925f2d9b4af926ba51c9cb4b8230ecd0ec0663baba9/spark-nlp-4.0.2.tar.gz
-  sha256: 5a017576c18e1963ef5ae09d179334de2ea4b195cfc3f2644f360cbc2338509e
+  fn: spark-nlp-4.1.0.tar.gz
+  url: https://files.pythonhosted.org/packages/4f/3d/b00ee9426bea493e4c754f59ecd219fe8f04e664991b1be7b570c3714cdb/spark-nlp-4.1.0.tar.gz
+  sha256: 5afc3a7e9e8381fd2f8a5d6a47bcf2fdd4c6e247b82cfab411b9aee0e9a6f69d
 build:
   noarch: generic
   number: 0
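
The recipe above pins the source tarball with a `sha256` value, so each release has to update the hash together with the URL. One way to check a downloaded file against such a value is sketched below with Python's standard `hashlib`. The helper names `sha256_of_stream` and `matches_recipe` are hypothetical, and the demo hashes an in-memory byte string rather than the real tarball.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_recipe(stream: BinaryIO, expected_sha256: str) -> bool:
    """Compare a stream's digest with the sha256 field of a recipe."""
    return sha256_of_stream(stream) == expected_sha256.strip().lower()


# Demo on in-memory bytes; a real check would open the tarball in "rb" mode.
demo_digest = sha256_of_stream(io.BytesIO(b"abc"))
print(demo_digest)
```

For an actual release, the stream would be `open("spark-nlp-4.1.0.tar.gz", "rb")` and the expected value would be the `sha256` line from the recipe.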

docs/_layouts/landing.html

Lines changed: 10 additions & 10 deletions
@@ -201,22 +201,22 @@ <h3 class="grey h3_title">{{ _section.title }}</h3>
 <div class="highlight-box">
 {% highlight bash %}
 # Install Spark NLP from PyPI
-$ pip install spark-nlp==4.0.2 pyspark==3.2.1
+$ pip install spark-nlp==4.1.0 pyspark==3.2.1
 
 # Install Spark NLP from Anaconda/Conda
 $ conda install -c johnsnowlabs spark-nlp
 
 # Load Spark NLP with Spark Shell
-$ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.0.2
+$ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.1.0
 
 # Load Spark NLP with PySpark
-$ pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.0.2
+$ pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.1.0
 
 # Load Spark NLP with Spark Submit
-$ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.0.2
+$ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.12:4.1.0
 
 # Load Spark NLP as an external Fat JAR
-$ spark-shell --jars spark-nlp-assembly-4.0.2.jar
+$ spark-shell --jars spark-nlp-assembly-4.1.0.jar
 {% endhighlight %}
 </div>
 </div>
@@ -322,6 +322,7 @@ <h4 class="blue h4_title">NLP Features</h4>
 <li>Neural <strong>Machine Translation</strong> (MarianMT)</li>
 <li><strong>Text-To-Text</strong> Transfer Transformer <strong>(Google T5)</strong></li>
 <li><strong>Generative Pre-trained</strong> Transformer 2 <strong>(OpenAI GPT-2)</strong></li>
+<li>Vision Transformer (ViT) <strong>Image Classification</strong></li>
 <li>Unsupervised <strong>keywords extraction</strong></li>
 <li>Language <strong>Detection</strong> & <strong>Identification</strong> (up to 375 languages)</li>
 <li>Multi-class Text <strong>Classification</strong> (DL model)</li>
@@ -340,13 +341,11 @@ <h4 class="blue h4_title">NLP Features</h4>
 <li>Easy <strong>TensorFlow</strong> integration</li>
 <li><strong>GPU</strong> Support</li>
 <li>Full integration with <strong>Spark ML</strong> functions</li>
-<li><strong>4400+</strong> pre-trained <strong>models </strong> in <strong>200+ languages! </strong>
-<li><strong>1800+</strong> pre-trained <strong>pipelines </strong> in <strong>200+ languages! </strong>
+<li><strong>6150+</strong> pre-trained <strong>models </strong> in <strong>200+ languages! </strong>
+<li><strong>1840+</strong> pre-trained <strong>pipelines </strong> in <strong>200+ languages! </strong>
 </ul>
 </div>
 {% highlight python %}
-from sparknlp.base import *
-from sparknlp.annotator import *
 from sparknlp.pretrained import PretrainedPipeline
 import sparknlp
 
@@ -357,7 +356,8 @@ <h4 class="blue h4_title">NLP Features</h4>
 pipeline = PretrainedPipeline('explain_document_dl', lang='en')
 
 # Annotate your testing dataset
-result = pipeline.annotate("The Mona Lisa is a 16th century oil painting created by Leonardo. It's held at the Louvre in Paris.")
+text = "The Mona Lisa is a 16th century oil painting created by Leonardo. It's held at the Louvre in Paris."
+result = pipeline.annotate(text)
 
 # What's in the pipeline
 list(result.keys())

docs/api/com/index.html

Lines changed: 4 additions & 4 deletions
@@ -3,9 +3,9 @@
 <head>
 <meta http-equiv="X-UA-Compatible" content="IE=edge" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
-<title>Spark NLP 4.0.2 ScalaDoc - com</title>
-<meta name="description" content="Spark NLP 4.0.2 ScalaDoc - com" />
-<meta name="keywords" content="Spark NLP 4.0.2 ScalaDoc com" />
+<title>Spark NLP 4.1.0 ScalaDoc - com</title>
+<meta name="description" content="Spark NLP 4.1.0 ScalaDoc - com" />
+<meta name="keywords" content="Spark NLP 4.1.0 ScalaDoc com" />
 <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
 
 
@@ -28,7 +28,7 @@
 </head>
 <body>
 <div id="search">
-<span id="doc-title">Spark NLP 4.0.2 ScalaDoc<span id="doc-version"></span></span>
+<span id="doc-title">Spark NLP 4.1.0 ScalaDoc<span id="doc-version"></span></span>
 <span class="close-results"><span class="left">&lt;</span> Back</span>
 <div id="textfilter">
 <span class="input">

docs/api/com/johnsnowlabs/client/CredentialParams.html

Lines changed: 4 additions & 4 deletions
@@ -3,9 +3,9 @@
 <head>
 <meta http-equiv="X-UA-Compatible" content="IE=edge" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
-<title>Spark NLP 4.0.2 ScalaDoc - com.johnsnowlabs.client.CredentialParams</title>
-<meta name="description" content="Spark NLP 4.0.2 ScalaDoc - com.johnsnowlabs.client.CredentialParams" />
-<meta name="keywords" content="Spark NLP 4.0.2 ScalaDoc com.johnsnowlabs.client.CredentialParams" />
+<title>Spark NLP 4.1.0 ScalaDoc - com.johnsnowlabs.client.CredentialParams</title>
+<meta name="description" content="Spark NLP 4.1.0 ScalaDoc - com.johnsnowlabs.client.CredentialParams" />
+<meta name="keywords" content="Spark NLP 4.1.0 ScalaDoc com.johnsnowlabs.client.CredentialParams" />
 <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
 
 
@@ -28,7 +28,7 @@
 </head>
 <body>
 <div id="search">
-<span id="doc-title">Spark NLP 4.0.2 ScalaDoc<span id="doc-version"></span></span>
+<span id="doc-title">Spark NLP 4.1.0 ScalaDoc<span id="doc-version"></span></span>
 <span class="close-results"><span class="left">&lt;</span> Back</span>
 <div id="textfilter">
 <span class="input">

docs/api/com/johnsnowlabs/client/aws/AWSAnonymousCredentials.html

Lines changed: 4 additions & 4 deletions
@@ -3,9 +3,9 @@
 <head>
 <meta http-equiv="X-UA-Compatible" content="IE=edge" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
-<title>Spark NLP 4.0.2 ScalaDoc - com.johnsnowlabs.client.aws.AWSAnonymousCredentials</title>
-<meta name="description" content="Spark NLP 4.0.2 ScalaDoc - com.johnsnowlabs.client.aws.AWSAnonymousCredentials" />
-<meta name="keywords" content="Spark NLP 4.0.2 ScalaDoc com.johnsnowlabs.client.aws.AWSAnonymousCredentials" />
+<title>Spark NLP 4.1.0 ScalaDoc - com.johnsnowlabs.client.aws.AWSAnonymousCredentials</title>
+<meta name="description" content="Spark NLP 4.1.0 ScalaDoc - com.johnsnowlabs.client.aws.AWSAnonymousCredentials" />
+<meta name="keywords" content="Spark NLP 4.1.0 ScalaDoc com.johnsnowlabs.client.aws.AWSAnonymousCredentials" />
 <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
 
 
@@ -28,7 +28,7 @@
 </head>
 <body>
 <div id="search">
-<span id="doc-title">Spark NLP 4.0.2 ScalaDoc<span id="doc-version"></span></span>
+<span id="doc-title">Spark NLP 4.1.0 ScalaDoc<span id="doc-version"></span></span>
 <span class="close-results"><span class="left">&lt;</span> Back</span>
 <div id="textfilter">
 <span class="input">

docs/api/com/johnsnowlabs/client/aws/AWSBasicCredentials.html

Lines changed: 4 additions & 4 deletions
@@ -3,9 +3,9 @@
 <head>
 <meta http-equiv="X-UA-Compatible" content="IE=edge" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
-<title>Spark NLP 4.0.2 ScalaDoc - com.johnsnowlabs.client.aws.AWSBasicCredentials</title>
-<meta name="description" content="Spark NLP 4.0.2 ScalaDoc - com.johnsnowlabs.client.aws.AWSBasicCredentials" />
-<meta name="keywords" content="Spark NLP 4.0.2 ScalaDoc com.johnsnowlabs.client.aws.AWSBasicCredentials" />
+<title>Spark NLP 4.1.0 ScalaDoc - com.johnsnowlabs.client.aws.AWSBasicCredentials</title>
+<meta name="description" content="Spark NLP 4.1.0 ScalaDoc - com.johnsnowlabs.client.aws.AWSBasicCredentials" />
+<meta name="keywords" content="Spark NLP 4.1.0 ScalaDoc com.johnsnowlabs.client.aws.AWSBasicCredentials" />
 <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
 
 
@@ -28,7 +28,7 @@
 </head>
 <body>
 <div id="search">
-<span id="doc-title">Spark NLP 4.0.2 ScalaDoc<span id="doc-version"></span></span>
+<span id="doc-title">Spark NLP 4.1.0 ScalaDoc<span id="doc-version"></span></span>
 <span class="close-results"><span class="left">&lt;</span> Back</span>
 <div id="textfilter">
 <span class="input">

docs/api/com/johnsnowlabs/client/aws/AWSCredentialsProvider.html

Lines changed: 4 additions & 4 deletions
@@ -3,9 +3,9 @@
 <head>
 <meta http-equiv="X-UA-Compatible" content="IE=edge" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
-<title>Spark NLP 4.0.2 ScalaDoc - com.johnsnowlabs.client.aws.AWSCredentialsProvider</title>
-<meta name="description" content="Spark NLP 4.0.2 ScalaDoc - com.johnsnowlabs.client.aws.AWSCredentialsProvider" />
-<meta name="keywords" content="Spark NLP 4.0.2 ScalaDoc com.johnsnowlabs.client.aws.AWSCredentialsProvider" />
+<title>Spark NLP 4.1.0 ScalaDoc - com.johnsnowlabs.client.aws.AWSCredentialsProvider</title>
+<meta name="description" content="Spark NLP 4.1.0 ScalaDoc - com.johnsnowlabs.client.aws.AWSCredentialsProvider" />
+<meta name="keywords" content="Spark NLP 4.1.0 ScalaDoc com.johnsnowlabs.client.aws.AWSCredentialsProvider" />
 <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
 
 
@@ -28,7 +28,7 @@
 </head>
 <body>
 <div id="search">
-<span id="doc-title">Spark NLP 4.0.2 ScalaDoc<span id="doc-version"></span></span>
+<span id="doc-title">Spark NLP 4.1.0 ScalaDoc<span id="doc-version"></span></span>
 <span class="close-results"><span class="left">&lt;</span> Back</span>
 <div id="textfilter">
 <span class="input">
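
Most of the files in this commit, including the ScalaDoc pages above, change only by the substitution of 4.0.2 with 4.1.0. How such a bump could be scripted is sketched below. This is not the project's actual release tooling; the function name `bump_version` is hypothetical, and the guard against partial matches (so that strings such as 14.0.2 or 4.0.20 are left alone) is a design choice of this sketch.

```python
import re


def bump_version(text: str, old: str, new: str) -> str:
    """Replace standalone occurrences of a dotted version string.

    A match is skipped when it is part of a longer dotted number,
    for example "14.0.2", "4.0.20", or "4.0.2.1".
    """
    pattern = re.compile(
        r"(?<!\d)(?<!\d\.)"  # not preceded by a digit or by "digit."
        + re.escape(old)     # the literal version, with dots escaped
        + r"(?!\.?\d)"       # not followed by a digit or by ".digit"
    )
    # A callable replacement keeps `new` literal (no backslash processing).
    return pattern.sub(lambda _match: new, text)


line = "$ spark-shell --jars spark-nlp-assembly-4.0.2.jar"
print(bump_version(line, "4.0.2", "4.1.0"))
```

A plain `str.replace` would also handle the lines shown in this diff; the lookarounds only matter when a file contains other numbers that happen to include the old version as a substring.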
