Commit 180a3de

maziyarpanahi, jsl-models, ahmedlone127, and DevinTDHa authored
Models hub (#14470)
Co-authored-by: ahmedlone127 <[email protected]> Co-authored-by: Maziyar Panahi <[email protected]> * 2024-11-01-distilbart_xsum_12_6_en (#14447) * Add model 2024-11-01-distilbart_xsum_12_6_en * Add model 2024-11-03-gpt2_en * Add model 2024-11-08-hubert_ukrainian_uk * Add model 2024-11-08-hubert_ukrainian_pipeline_uk * Add model 2024-11-08-unitku_hubert_japanese_asr_ja * Add model 2024-11-08-unitku_hubert_japanese_asr_pipeline_ja * Add model 2024-11-08-hubert_large_japanese_asr_ja * Add model 2024-11-08-hubert_large_japanese_asr_pipeline_ja --------- Co-authored-by: ahmedlone127 <[email protected]> * 2024-11-10-rubert_address_elements_ru (#14452) * Add model 2024-11-11-sent_bowdpr_wiki_en * Add model 2024-11-11-cc_uffs_ppc_ft_test_multiqa_pipeline_en * Add model 2024-11-11-unified_skill_ner_echo_en * Add model 2024-11-11-mountain_ner_model_en * Add model 2024-11-11-mountain_ner_model_pipeline_en * Add model 2024-11-11-msu_wiki_ner_ru * Add model 2024-11-11-bert_xomlac_ner_pipeline_zh * Add model 2024-11-11-bert_base_cased_finetuned_ner_pipeline_en * Add model 2024-11-11-bert_base_cased_finetuned_ner_en * Add model 2024-11-11-ner_tokenclassification_persian_pipeline_en * Add model 2024-11-11-persian_text_ner_bert_v1_fa * Add model 2024-11-11-sent_flang_spanbert_pipeline_en * Add model 2024-11-11-sent_gww_pipeline_en * Add model 2024-11-11-software_ner_prod_en * Add model 2024-11-11-quote_model_bertm_v1_pipeline_en * Add model 2024-11-11-classify_bluesky_1000_v2_pipeline_en * Add model 2024-11-11-msu_wiki_ner_pipeline_ru * Add model 2024-11-11-hardware_ner_prod_en * Add model 2024-11-11-auto_adver_pipeline_en * Add model 2024-11-11-bert_finetuned_ner_viktoryes_pipeline_en * Add model 2024-11-11-bert_finetuned_ner_viktoryes_en * Add model 2024-11-11-quote_model_bertm_v1_en * Add model 2024-11-11-software_ner_prod_pipeline_en * Add model 2024-11-11-sent_tiny_mlm_glue_qnli_en * Add model 2024-11-11-sent_cocodr_large_pipeline_en * Add model 
2024-11-11-ner_tokenclassification_persian_en * Add model 2024-11-11-hardware_ner_prod_pipeline_en * Add model 2024-11-11-embedded_e5_base_50_pipeline_en * Add model 2024-11-11-bert_finetuned_tmvar_corpus_pipeline_en * Add model 2024-11-11-e5_base_pipeline_en * Add model 2024-11-11-e5_large_en * Add model 2024-11-11-rupunct_small_ru * Add model 2024-11-11-spanish_medical_ner_pipeline_es * Add model 2024-11-11-nepal_bhasa_biored_model_pipeline_en * Add model 2024-11-11-unified_skill_ner_echo_pipeline_en * Add model 2024-11-11-e5_large_pipeline_en * Add model 2024-11-11-e5_small_en * Add model 2024-11-11-cleaned_e5_base_unsupervised_pipeline_en * Add model 2024-11-11-keybert_bulgarian_pipeline_bg * Add model 2024-11-11-bert_xomlac_ner_zh * Add model 2024-11-11-bert_finetuned_tmvar_corpus_en * Add model 2024-11-11-cleaned_e5_large_unsupervised_en * Add model 2024-11-11-sent_tiny_mlm_snli_en * Add model 2024-11-11-embedded_e5_base_50_en * Add model 2024-11-11-cleaned_e5_base_unsupervised_en * Add model 2024-11-11-results_pipeline_en * Add model 2024-11-11-xlm_cebinary_vmo2_large_3_en * Add model 2024-11-11-xlm_cebinary_vmo2_large_3_pipeline_en * Add model 2024-11-11-southern_sotho_mpnet_base_normal_en * Add model 2024-11-11-persian_text_ner_bert_v1_pipeline_fa * Add model 2024-11-11-results_en * Add model 2024-11-11-autotrain_nzog3_ca819_pipeline_en * Add model 2024-11-11-sentence_similarity_finetuned_mpnet_adrta_pipeline_en * Add model 2024-11-11-sentence_similarity_finetuned_mpnet_adrta_en * Add model 2024-11-11-sentencetransformer_mpnet_base_on_chemical_dataset_en * Add model 2024-11-11-keybert_bulgarian_bg * Add model 2024-11-11-southern_sotho_mpnet_base10_en * Add model 2024-11-11-sentencetransformer_mpnet_base_on_chemical_dataset_pipeline_en * Add model 2024-11-11-e5_base_en * Add model 2024-11-11-southern_sotho_mpnet_base_normal_pipeline_en * Add model 2024-11-11-finetuned_sentence_similarity_en * Add model 2024-11-11-nepal_bhasa_biored_model_en * Add model 
2024-11-11-whisper_tiny_amharic_en * Add model 2024-11-11-cleaned_e5_large_unsupervised_pipeline_en * Add model 2024-11-11-sent_bert_base_english_french_arabic_cased_pipeline_en * Add model 2024-11-11-fund_embedder_en * Add model 2024-11-11-whisper_tiny_v2_2_romanian_pipeline_en * Add model 2024-11-11-southern_sotho_mpnet_base20_pipeline_en * Add model 2024-11-11-auto_adver_en * Add model 2024-11-11-whisper_small_arabic_augmentation_en * Add model 2024-11-11-linshoufanfork_whisper_small_nan_twi_pinyin_pipeline_en * Add model 2024-11-11-whisper_small_arabic_augmentation_pipeline_en * Add model 2024-11-11-whisper_tiny_amharic_pipeline_en * Add model 2024-11-11-whisper_tiny_arabic_pipeline_ar * Add model 2024-11-11-linshoufanfork_whisper_small_nan_twi_pinyin_en * Add model 2024-11-11-checkpoints_almino_pipeline_en * Add model 2024-11-11-whisper_tiny_v2_2_romanian_en * Add model 2024-11-11-autotrain_nzog3_ca819_en * Add model 2024-11-11-whisper_omg_hi * Add model 2024-11-11-whisper_omg_pipeline_hi * Add model 2024-11-11-checkpoints_almino_en * Add model 2024-11-11-whisper_small_western_frisian_dutch_transfer_from_english_fy * Add model 2024-11-11-whisper_tiny_nob_en * Add model 2024-11-11-whisper_tiny_nob_pipeline_en * Add model 2024-11-11-whisper_small_western_frisian_dutch_transfer_from_english_pipeline_fy * Add model 2024-11-11-whisper_tiny_arabic_ar * Add model 2024-11-11-e5_small_pipeline_en * Add model 2024-11-11-whisper_small_english_crossdelenna_en * Add model 2024-11-11-finetuned_sentence_similarity_pipeline_en * Add model 2024-11-11-whisper_small_malay_pipeline_my * Add model 2024-11-11-whisper_small_malay_my * Add model 2024-11-11-rupunct_small_pipeline_ru * Add model 2024-11-11-southern_sotho_mpnet_base20_en * Add model 2024-11-11-whisper_small_english_crossdelenna_pipeline_en * Add model 2024-11-11-whisper_small_russian_f_ru * Add model 2024-11-11-whisper_small_yt_en * Add model 2024-11-11-whisper_small_russian_f_pipeline_ru * Add model 
2024-11-11-whisper_small_yt_pipeline_en * Add model 2024-11-11-whisper_base_common_voice_arabic11_0_en * Add model 2024-11-11-southern_sotho_mpnet_base10_pipeline_en * Add model 2024-11-11-spanish_medical_ner_es * Add model 2024-11-11-whisper_base_common_voice_arabic11_0_pipeline_en * Add model 2024-11-11-whisper_base_hungarian_v1_hu * Add model 2024-11-11-whisper_base_hungarian_v1_pipeline_hu * Add model 2024-11-11-whisper_finetuned_atcosim_en * Add model 2024-11-11-whisper_finetuned_atcosim_pipeline_en * Add model 2024-11-11-whisper_medium_latvian_ver2_lv * Add model 2024-11-11-whisper_medium_latvian_ver2_pipeline_lv * Add model 2024-11-11-whisper_small_french_uncased_fr * Add model 2024-11-11-whisper_small_french_uncased_pipeline_fr * Add model 2024-11-11-whisper_tiny_chinese_antares28_en * Add model 2024-11-11-whisper_tiny_chinese_antares28_pipeline_en * Add model 2024-11-11-malaysian_whisper_tiny_ms * Add model 2024-11-11-malaysian_whisper_tiny_pipeline_ms * Add model 2024-11-11-whisper_medium_luluw_en * Add model 2024-11-11-whisper_small_dutch_en * Add model 2024-11-11-whisper_small_greek_modern_finetune_el * Add model 2024-11-11-whisper_small_dutch_pipeline_en * Add model 2024-11-11-whisper_small_greek_modern_finetune_pipeline_el * Add model 2024-11-11-deberta_v3_large_lemon_spell_5k_en * Add model 2024-11-11-deberta_v3_large_lemon_spell_5k_pipeline_en * Add model 2024-11-11-bert_finetuned_squad_dokyoungkim_en * Add model 2024-11-11-bert_finetuned_squad_dokyoungkim_pipeline_en * Add model 2024-11-11-bert_large_uncased_whole_word_masking_finetuned_squad_dev_i_en * Add model 2024-11-11-bert_large_uncased_whole_word_masking_finetuned_squad_dev_i_pipeline_en * Add model 2024-11-11-banglabert_qa_en * Add model 2024-11-11-mi_chatbotv3_en * Add model 2024-11-11-mi_chatbotv3_pipeline_en * Add model 2024-11-11-bert_sliding_window_epoch_3_en * Add model 2024-11-11-hebert_finetuned_precedents_he * Add model 2024-11-11-bert_sliding_window_epoch_3_pipeline_en * Add model 
2024-11-11-bert_base_uncased_finetuned_triviaqa_en * Add model 2024-11-11-mbert_finetuned_mlqa_dev_spanish_chinese_hindi_en * Add model 2024-11-11-bert_base_uncased_figurative_language_en * Add model 2024-11-11-bert_base_uncased_finetuned_triviaqa_pipeline_en * Add model 2024-11-11-bert_finetuned_squad_accelerate_3_en * Add model 2024-11-11-banglabert_qa_pipeline_en * Add model 2024-11-11-bert_base_uncased_figurative_language_pipeline_en * Add model 2024-11-11-mbert_finetuned_mlqa_dev_spanish_chinese_hindi_pipeline_en * Add model 2024-11-11-hebert_finetuned_precedents_pipeline_he * Add model 2024-11-11-bert_finetuned_squad_accelerate_3_pipeline_en * Add model 2024-11-11-beto_sentiment_analysis_finetuned_en * Add model 2024-11-11-beto_sentiment_analysis_finetuned_pipeline_en * Add model 2024-11-11-personalinfoclassifier_en * Add model 2024-11-11-fine_tuned_metaphor_detection_en * Add model 2024-11-11-personalinfoclassifier_pipeline_en * Add model 2024-11-11-hs_arabic_translate_syn_4class_for_tool_en * Add model 2024-11-11-fine_tuned_metaphor_detection_pipeline_en * Add model 2024-11-11-clinical_trial_termination_en * Add model 2024-11-11-factuality_model_pipeline_en * Add model 2024-11-11-factuality_model_en * Add model 2024-11-11-bert_classifier_spanish_news_classification_headlines_pipeline_es * Add model 2024-11-11-kaggle_detect_generated_text_pipeline_en * Add model 2024-11-11-bert_base_uncased_sba_clf_pipeline_en * Add model 2024-11-11-e5_small_lora_ai_generated_detector_en * Add model 2024-11-11-bert_340m_ft_first_1000_pref_en * Add model 2024-11-11-kaggle_detect_generated_text_en * Add model 2024-11-11-bert_news_class_en * Add model 2024-11-11-politeness_model_pipeline_en * Add model 2024-11-11-politeness_model_en * Add model 2024-11-11-biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_pubmedqa_pipeline_en * Add model 2024-11-11-scenario_nepal_bhasa_pipeline_en * Add model 2024-11-11-bio_clinicalbert_medical_en * Add model 
2024-11-11-bert_classifier_spanish_news_classification_headlines_es * Add model 2024-11-11-bert_base_cased_mnli_en * Add model 2024-11-11-bert_large_finetuned_phishing_junginkim_en * Add model 2024-11-11-popbert_pipeline_de * Add model 2024-11-11-aspect_based_sentiment_analyzer_using_bert_en * Add model 2024-11-11-bert_base_cased_mnli_pipeline_en * Add model 2024-11-11-workprocess_24_10_01_en * Add model 2024-11-11-bert_model_news_aggregator_pipeline_en * Add model 2024-11-11-bert_base_uncased_emotion_prikshit7766_en * Add model 2024-11-11-clinical_trial_termination_pipeline_en * Add model 2024-11-11-nasa_smd_ibm_v0_1_uat_labeler_en * Add model 2024-11-11-hs_arabic_translate_syn_4class_for_tool_pipeline_en * Add model 2024-11-11-flash_italian_ns_classifier_fpt_en * Add model 2024-11-11-bert_large_finetuned_phishing_junginkim_pipeline_en * Add model 2024-11-11-e5_small_lora_ai_generated_detector_pipeline_en * Add model 2024-11-11-biomednlp_pubmedbert_base_uncased_abstract_fulltext_finetuned_pubmedqa_en * Add model 2024-11-11-climateattention_ctw_pipeline_en * Add model 2024-11-11-climateattention_ctw_en * Add model 2024-11-11-bio_clinicalbert_medical_pipeline_en * Add model 2024-11-11-bert_340m_ft_first_1000_pref_pipeline_en * Add model 2024-11-11-sst2_benign_bert_uncased_pipeline_en * Add model 2024-11-11-roberta_base_finetuned_ner_cadec_pipeline_en * Add model 2024-11-11-roberta_combined_generated_v1_1_epoch_7_en * Add model 2024-11-11-roberta_base_ainu_sayula_popoluca_en * Add model 2024-11-11-roberta_large_lemon_spell_5k_pipeline_en * Add model 2024-11-11-roberta_test_training_pipeline_en * Add model 2024-11-11-roberta_test_training_en * Add model 2024-11-11-securebert_finetuned_ner_pipeline_en * Add model 2024-11-11-bert_base_uncased_sba_clf_en * Add model 2024-11-11-sst2_benign_bert_uncased_en * Add model 2024-11-11-biomed_roberta_all_deep_en * Add model 2024-11-11-bert_model_news_aggregator_en * Add model 
2024-11-11-indonesian_roberta_base_nerp_tagger_pipeline_en * Add model 2024-11-11-indonesian_roberta_base_nerp_tagger_en * Add model 2024-11-11-flash_italian_ns_classifier_fpt_pipeline_en * Add model 2024-11-11-popbert_de * Add model 2024-11-11-roberta_base_ainu_sayula_popoluca_pipeline_en * Add model 2024-11-11-roberta_base_finetuned_ner_cadec_en * Add model 2024-11-11-nasa_smd_ibm_v0_1_uat_labeler_pipeline_en * Add model 2024-11-11-scenario_nepal_bhasa_en * Add model 2024-11-11-affilgood_ner_en * Add model 2024-11-11-bge_large_zhtw_v1_5_en * Add model 2024-11-11-bge_small_english_v1_5_ft_orc_0930_dates_en * Add model 2024-11-11-bge_base_legal_matryoshka_v1_pipeline_en * Add model 2024-11-11-bsc_bio_ehr_spanish_distemist_es * Add model 2024-11-11-finetuned_baai_bge_base_english_pipeline_en * Add model 2024-11-11-bge_micro_smiles_pipeline_en * Add model 2024-11-11-bge_micro_smiles_en * Add model 2024-11-11-securebert_finetuned_ner_en * Add model 2024-11-11-bsc_bio_ehr_spanish_distemist_pipeline_es * Add model 2024-11-11-bge_tuned_en * Add model 2024-11-11-bge_base_english_v1_5_course_recommender_v2_en * Add model 2024-11-11-bge_base_legal_matryoshka_v1_en * Add model 2024-11-11-roberta_combined_generated_v1_1_epoch_8_en * Add model 2024-11-11-bge_small_english_v1_5_ft_orc_0930_dates_pipeline_en * Add model 2024-11-11-roberta_base_bne_capitel_ner_bsc_lt_pipeline_es * Add model 2024-11-11-fine_tuned_bge_large_en * Add model 2024-11-11-bge_99gpt_v1_en * Add model 2024-11-11-affilgood_ner_pipeline_en * Add model 2024-11-11-roberta_large_finetuned_abbr_filtered_plod_en * Add model 2024-11-11-roberta_base_bne_capitel_ner_plantl_gob_es_pipeline_es * Add model 2024-11-11-bge_tuned_pipeline_en * Add model 2024-11-11-roberta_base_absa_ate_sentiment_en * Add model 2024-11-11-bsc_bio_ehr_spanish_medprocner_pipeline_es * Add model 2024-11-11-lettuce_sayula_popoluca_dutch_mono_en * Add model 2024-11-11-ruroberta_large_ner_pipeline_en * Add model 
2024-11-11-bge_base_english_v1_5_course_recommender_v2_pipeline_en * Add model 2024-11-11-roberta_combined_generated_epoch_7_pipeline_en * Add model 2024-11-11-roberta_combined_generated_epoch_7_en * Add model 2024-11-11-bge_small_english_v1_5_rirag_obliqa_en * Add model 2024-11-11-bge_99gpt_v1_pipeline_en * Add model 2024-11-11-bert_base_uncased_emotion_prikshit7766_pipeline_en * Add model 2024-11-11-roberta_large_finetuned_ner_finetuned_ner_en * Add model 2024-11-11-lettuce_sayula_popoluca_dutch_mono_pipeline_en * Add model 2024-11-11-roberta_large_finetuned_ner_finetuned_ner_pipeline_en * Add model 2024-11-11-bge_base_english_v1_5_finetuned_osllmai_v1_pipeline_en * Add model 2024-11-11-bert_finetuned_semantic_augmentation_ner_en * Add model 2024-11-11-bge_large_zhtw_v1_5_pipeline_en * Add model 2024-11-11-roberta_combined_generated_v1_1_epoch_8_pipeline_en * Add model 2024-11-11-ruroberta_large_ner_en * Add model 2024-11-11-roberta_spanish_clinical_trials_neg_spec_ner_en * Add model 2024-11-11-bert_news_class_pipeline_en * Add model 2024-11-11-roberta_base_absa_ate_sentiment_pipeline_en * Add model 2024-11-11-finetuned_bge_base_english_pipeline_en * Add model 2024-11-11-roberta_combined_generated_v1_1_epoch_7_pipeline_en * Add model 2024-11-11-fine_tuned_bge_large_pipeline_en * Add model 2024-11-11-workprocess_24_10_01_pipeline_en --------- Co-authored-by: ahmedlone127 <[email protected]> * Add model 2024-11-13-roberta_embeddings_legal_roberta_base_en (#14456) Co-authored-by: gadde5300 <[email protected]> * Add model 2024-11-20-bert_embeddings_sec_bert_base_en (#14460) Co-authored-by: gadde5300 <[email protected]> * 2024-11-26-mini_cpm_2b_8bit_xx (#14466) * Add model 2024-11-26-mini_cpm_2b_8bit_xx * Add model 2024-11-26-mini_cpm_2b_8bit_xx * Add model 2024-11-27-nllb_distilled_600M_8int_xx * Add model 2024-11-27-nomic_embed_v1_en * Add model 2024-11-29-nomic_embed_v1_en * Delete docs/_posts/ahmedlone127/2024-11-29-nomic_embed_v1_en.md * Update 
2024-11-27-nllb_distilled_600M_8int_xx.md * Update 2024-11-27-nllb_distilled_600M_8int_xx.md * Add model 2024-11-29-phi_3_mini_128k_instruct_en * Update 2024-11-29-phi_3_mini_128k_instruct_en.md * Add model 2024-11-29-qwen_7.5b_chat_en --------- Co-authored-by: ahmedlone127 <[email protected]> --------- Co-authored-by: jsl-models <[email protected]> Co-authored-by: ahmedlone127 <[email protected]> Co-authored-by: DevinTDHa <[email protected]> Co-authored-by: Devin Ha <[email protected]>
1 parent 6653546 commit 180a3de

6 files changed, with 535 additions and 0 deletions
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
1+
---
2+
layout: model
3+
title: mini_cpm_2b_8bit model from openbmb
4+
author: John Snow Labs
5+
name: mini_cpm_2b_8bit
6+
date: 2024-11-26
7+
tags: [en, open_source, pipeline, openvino, xx]
8+
task: Text Generation
9+
language: xx
10+
edition: Spark NLP 5.5.1
11+
spark_version: 3.0
12+
supported: true
13+
engine: openvino
14+
annotator: CPMTransformer
15+
article_header:
16+
type: cover
17+
use_language_switcher: "Python-Scala-Java"
18+
---
19+
20+
## Description
21+
22+
Pretrained CPMTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `mini_cpm_2b_8bit` is a multilingual model originally trained by openbmb.
23+
24+
{:.btn-box}
25+
<button class="button button-orange" disabled>Live Demo</button>
26+
<button class="button button-orange" disabled>Open in Colab</button>
27+
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/mini_cpm_2b_8bit_xx_5.5.1_3.0_1732658809236.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
28+
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/mini_cpm_2b_8bit_xx_5.5.1_3.0_1732658809236.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
29+
30+
## How to use
31+
32+
33+
34+
<div class="tabs-box" markdown="1">
35+
{% include programmingLanguageSelectScalaPythonNLU.html %}
36+
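```python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import CPMTransformer

# Assumes an active Spark NLP session, e.g. spark = sparknlp.start()

# Turn the raw "text" column into "document" annotations
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# The input column must match the DocumentAssembler output column
seq2seq = CPMTransformer.pretrained("mini_cpm_2b_8bit", "xx") \
    .setInputCols(["document"]) \
    .setOutputCol("generation")

pipeline = Pipeline().setStages([documentAssembler, seq2seq])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)
```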
52+
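```scala
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.seq2seq.CPMTransformer
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for toDF; assumes an active SparkSession named spark

// Turn the raw "text" column into "document" annotations
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// The input column must match the DocumentAssembler output column
val seq2seq = CPMTransformer.pretrained("mini_cpm_2b_8bit", "xx")
  .setInputCols(Array("document"))
  .setOutputCol("generation")

val pipeline = new Pipeline().setStages(Array(documentAssembler, seq2seq))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)
```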
68+
</div>
69+
70+
{:.model-param}
71+
## Model Information
72+
73+
{:.table-model}
74+
|---|---|
75+
|Model Name:|mini_cpm_2b_8bit|
76+
|Compatibility:|Spark NLP 5.5.1+|
77+
|License:|Open Source|
78+
|Edition:|Official|
79+
|Input Labels:|[document]|
80+
|Output Labels:|[generation]|
81+
|Language:|xx|
82+
|Size:|3.0 GB|
83+
84+
## References
85+
86+
https://huggingface.co/openbmb/MiniCPM-2B-dpo-bf16
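
The download links in this card appear to follow a `<name>_<language>_<Spark NLP version>_<Spark version>_<timestamp>.zip` pattern. That pattern is inferred only from the URLs in this commit, not from documented behavior, and the helper name `parse_model_archive` is hypothetical. A minimal Python sketch that splits an archive name into those fields, assuming the trailing number is a Unix timestamp in milliseconds:

```python
from datetime import datetime, timezone
from urllib.parse import urlparse
import posixpath


def parse_model_archive(url):
    """Split a model archive URL into its (assumed) naming fields.

    Hypothetical helper: the field layout is inferred from the URLs
    in this commit and is not a documented contract.
    """
    filename = posixpath.basename(urlparse(url).path)
    stem, ext = posixpath.splitext(filename)
    if ext != ".zip":
        raise ValueError(f"expected a .zip archive, got {filename!r}")

    # The model name itself contains underscores, so split from the right.
    parts = stem.rsplit("_", 4)
    if len(parts) != 5:
        raise ValueError(f"unexpected archive name: {filename!r}")
    name, language, sparknlp_version, spark_version, millis = parts

    # Assumption: the last field is milliseconds since the Unix epoch.
    uploaded = datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc)
    return {
        "name": name,
        "language": language,
        "sparknlp_version": sparknlp_version,
        "spark_version": spark_version,
        "uploaded": uploaded,
    }


info = parse_model_archive(
    "s3://auxdata.johnsnowlabs.com/public/models/"
    "mini_cpm_2b_8bit_xx_5.5.1_3.0_1732658809236.zip"
)
print(info["name"], info["language"], info["uploaded"].date())
```

Under that assumption the timestamp decodes to 2024-11-26 (UTC), which agrees with the `date` field in the front matter, so the same helper could be used to cross-check a card against its archive.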
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
1+
---
2+
layout: model
3+
title: nllb_distilled_600M_8int model from Facebook
4+
author: John Snow Labs
5+
name: nllb_distilled_600M_8int
6+
date: 2024-11-27
7+
tags: [en, open_source, pipeline, openvino, xx]
8+
task: Text Generation
9+
language: xx
10+
edition: Spark NLP 5.5.1
11+
spark_version: 3.0
12+
supported: true
13+
engine: openvino
14+
annotator: NLLBTransformer
15+
article_header:
16+
type: cover
17+
use_language_switcher: "Python-Scala-Java"
18+
---
19+
20+
## Description
21+
22+
Pretrained NLLBTransformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nllb_distilled_600M_8int` is a multilingual model originally trained by facebook.
23+
24+
{:.btn-box}
25+
<button class="button button-orange" disabled>Live Demo</button>
26+
<button class="button button-orange" disabled>Open in Colab</button>
27+
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nllb_distilled_600M_8int_xx_5.5.1_3.0_1732741416718.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
28+
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nllb_distilled_600M_8int_xx_5.5.1_3.0_1732741416718.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
29+
30+
## How to use
31+
32+
33+
34+
<div class="tabs-box" markdown="1">
35+
{% include programmingLanguageSelectScalaPythonNLU.html %}
36+
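```python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import NLLBTransformer

# Assumes an active Spark NLP session, e.g. spark = sparknlp.start()

# Turn the raw "text" column into "document" annotations
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Load this card's model; the input column must match the DocumentAssembler output
seq2seq = NLLBTransformer.pretrained("nllb_distilled_600M_8int", "xx") \
    .setInputCols(["document"]) \
    .setOutputCol("generation")

pipeline = Pipeline().setStages([documentAssembler, seq2seq])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)
```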
52+
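```scala
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.seq2seq.NLLBTransformer
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for toDF; assumes an active SparkSession named spark

// Turn the raw "text" column into "document" annotations
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Load this card's model; the input column must match the DocumentAssembler output
val seq2seq = NLLBTransformer.pretrained("nllb_distilled_600M_8int", "xx")
  .setInputCols(Array("document"))
  .setOutputCol("generation")

val pipeline = new Pipeline().setStages(Array(documentAssembler, seq2seq))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)
```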
68+
</div>
69+
70+
{:.model-param}
71+
## Model Information
72+
73+
{:.table-model}
74+
|---|---|
75+
|Model Name:|nllb_distilled_600M_8int|
76+
|Compatibility:|Spark NLP 5.5.1+|
77+
|License:|Open Source|
78+
|Edition:|Official|
79+
|Input Labels:|[document]|
80+
|Output Labels:|[generation]|
81+
|Language:|xx|
82+
|Size:|842.9 MB|
83+
84+
## References
85+
86+
https://huggingface.co/facebook/nllb-200-distilled-600M
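
A card's `name` and `language` front-matter fields are presumably meant to match the arguments of the `pretrained(...)` call in its usage snippets, since that call is what selects the model to download. The following is a rough Python sketch of a check for that, assuming flat `key: value` front matter between two `---` lines; the function names `read_front_matter` and `find_pretrained_mismatches` are hypothetical and not part of Spark NLP or the models hub tooling.

```python
import re

# Matches .pretrained("name", "lang") with optional whitespace around the arguments.
PRETRAINED_CALL = re.compile(r'\.pretrained\(\s*"([^"]+)"\s*,\s*"([^"]+)"\s*\)')


def read_front_matter(card_text):
    """Return the top-level `key: value` pairs between the first two `---` lines."""
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        # Skip indented (nested) entries and lines without a colon.
        if sep and not key.startswith(" "):
            fields[key.strip()] = value.strip()
    return fields


def find_pretrained_mismatches(card_text):
    """List (name, language) pairs from pretrained() calls that differ from the front matter."""
    fields = read_front_matter(card_text)
    expected = (fields.get("name"), fields.get("language"))
    return [found for found in PRETRAINED_CALL.findall(card_text) if found != expected]


card = """---
name: nllb_distilled_600M_8int
language: xx
---
seq2seq = NLLBTransformer.pretrained("nllb_distilled_600M_8int","xx")
"""
print(find_pretrained_mismatches(card))
```

An empty list means every `pretrained(...)` call agrees with the front matter. A check like this is purely textual, so it would flag a copy-pasted model name but says nothing about whether the model actually loads.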
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
1+
---
2+
layout: model
3+
title: nomic_embed_v1 model from nomic-ai
4+
author: John Snow Labs
5+
name: nomic_embed_v1
6+
date: 2024-11-27
7+
tags: [en, open_source, openvino]
8+
task: Embeddings
9+
language: en
10+
edition: Spark NLP 5.5.1
11+
spark_version: 3.0
12+
supported: true
13+
engine: openvino
14+
annotator: NomicEmbeddings
15+
article_header:
16+
type: cover
17+
use_language_switcher: "Python-Scala-Java"
18+
---
19+
20+
## Description
21+
22+
Pretrained NomicEmbeddings model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `nomic_embed_v1` is an English model originally trained by nomic-ai.
23+
24+
{:.btn-box}
25+
<button class="button button-orange" disabled>Live Demo</button>
26+
<button class="button button-orange" disabled>Open in Colab</button>
27+
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/nomic_embed_v1_en_5.5.1_3.0_1732743647389.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
28+
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/nomic_embed_v1_en_5.5.1_3.0_1732743647389.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
29+
30+
## How to use
31+
32+
33+
34+
<div class="tabs-box" markdown="1">
35+
{% include programmingLanguageSelectScalaPythonNLU.html %}
36+
```python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import NomicEmbeddings

# Assumes an active Spark NLP session, e.g. spark = sparknlp.start()

# Turn the raw "text" column into "document" annotations
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Compute embeddings for each document
embeddings = NomicEmbeddings.pretrained("nomic_embed_v1", "en") \
    .setInputCols(["document"]) \
    .setOutputCol("embeddings")

pipeline = Pipeline().setStages([documentAssembler, embeddings])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)
```
52+
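```scala
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings.NomicEmbeddings
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for toDF; assumes an active SparkSession named spark

// Turn the raw "text" column into "document" annotations
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Compute embeddings for each document
val embeddings = NomicEmbeddings.pretrained("nomic_embed_v1", "en")
  .setInputCols(Array("document"))
  .setOutputCol("embeddings")

val pipeline = new Pipeline().setStages(Array(documentAssembler, embeddings))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)
```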
68+
</div>
69+
70+
{:.model-param}
71+
## Model Information
72+
73+
{:.table-model}
74+
|---|---|
75+
|Model Name:|nomic_embed_v1|
76+
|Compatibility:|Spark NLP 5.5.1+|
77+
|License:|Open Source|
78+
|Edition:|Official|
79+
|Input Labels:|[document]|
80+
|Output Labels:|[embeddings]|
81+
|Language:|en|
82+
|Size:|255.0 MB|
83+
84+
## References
85+
86+
https://huggingface.co/nomic-ai/nomic-embed-text-v1
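
In these usage snippets each stage reads columns that an earlier stage wrote: `DocumentAssembler` writes `document`, and `NomicEmbeddings` reads it through `setInputCols`. The sketch below is a rough, text-only Python check that every input column in a snippet is provided either by the source DataFrame or by an earlier `setOutputCol`. The function name `find_unwired_inputs` and the default source column `text` are assumptions made for illustration; this is not how Spark NLP itself validates pipelines.

```python
import re

# Matches setInputCol("x"), setInputCols(["x"]), setInputCols(Array("x")) and setOutputCol("x").
SETTER = re.compile(r'\.(setInputCols?|setOutputCol)\(([^)]*)\)')
QUOTED = re.compile(r'"([^"]+)"')


def find_unwired_inputs(snippet, source_columns=("text",)):
    """Return input columns that neither the source data nor an earlier stage provides.

    Setters are processed in the order they appear in the snippet, so a stage
    can only consume columns written above it.
    """
    available = set(source_columns)
    missing = []
    for setter, args in SETTER.findall(snippet):
        columns = QUOTED.findall(args)
        if setter == "setOutputCol":
            available.update(columns)
        else:
            missing.extend(c for c in columns if c not in available)
    return missing


snippet = '''
documentAssembler = DocumentAssembler() \\
    .setInputCol("text") \\
    .setOutputCol("document")

embeddings = NomicEmbeddings.pretrained("nomic_embed_v1","en") \\
    .setInputCols(["document"]) \\
    .setOutputCol("embeddings")
'''
print(find_unwired_inputs(snippet))
```

An empty result means the columns line up. Reading `documents` where the upstream stage wrote `document` would be reported as missing, and the same regular expressions also cover the Scala `Array("...")` form.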
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
1+
---
2+
layout: model
3+
title: phi_3_mini_128k_instruct model from microsoft
4+
author: John Snow Labs
5+
name: phi_3_mini_128k_instruct
6+
date: 2024-11-29
7+
tags: [en, open_source, openvino]
8+
task: Text Generation
9+
language: en
10+
edition: Spark NLP 5.5.1
11+
spark_version: 3.0
12+
supported: true
13+
engine: openvino
14+
annotator: Phi3Transformer
15+
article_header:
16+
type: cover
17+
use_language_switcher: "Python-Scala-Java"
18+
---
19+
20+
## Description
21+
22+
Pretrained Phi3Transformer model, adapted from Hugging Face and curated to provide scalability and production-readiness using Spark NLP. `phi_3_mini_128k_instruct` is an English model originally trained by microsoft.
23+
24+
{:.btn-box}
25+
<button class="button button-orange" disabled>Live Demo</button>
26+
<button class="button button-orange" disabled>Open in Colab</button>
27+
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/phi_3_mini_128k_instruct_en_5.5.1_3.0_1732897700551.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
28+
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/phi_3_mini_128k_instruct_en_5.5.1_3.0_1732897700551.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}
29+
30+
## How to use
31+
32+
33+
34+
<div class="tabs-box" markdown="1">
35+
{% include programmingLanguageSelectScalaPythonNLU.html %}
36+
```python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import Phi3Transformer

# Assumes an active Spark NLP session, e.g. spark = sparknlp.start()

# Turn the raw "text" column into "document" annotations
documentAssembler = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# Generate text from each document
seq2seq = Phi3Transformer.pretrained("phi_3_mini_128k_instruct", "en") \
    .setInputCols(["document"]) \
    .setOutputCol("generation")

pipeline = Pipeline().setStages([documentAssembler, seq2seq])
data = spark.createDataFrame([["I love spark-nlp"]]).toDF("text")
pipelineModel = pipeline.fit(data)
pipelineDF = pipelineModel.transform(data)
```
52+
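```scala
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.annotators.seq2seq.Phi3Transformer
import org.apache.spark.ml.Pipeline
import spark.implicits._ // needed for toDF; assumes an active SparkSession named spark

// Turn the raw "text" column into "document" annotations
val documentAssembler = new DocumentAssembler()
  .setInputCol("text")
  .setOutputCol("document")

// Generate text from each document
val seq2seq = Phi3Transformer.pretrained("phi_3_mini_128k_instruct", "en")
  .setInputCols(Array("document"))
  .setOutputCol("generation")

val pipeline = new Pipeline().setStages(Array(documentAssembler, seq2seq))
val data = Seq("I love spark-nlp").toDF("text")
val pipelineModel = pipeline.fit(data)
val pipelineDF = pipelineModel.transform(data)
```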
68+
</div>
69+
70+
{:.model-param}
71+
## Model Information
72+
73+
{:.table-model}
74+
|---|---|
75+
|Model Name:|phi_3_mini_128k_instruct|
76+
|Compatibility:|Spark NLP 5.5.1+|
77+
|License:|Open Source|
78+
|Edition:|Official|
79+
|Input Labels:|[document]|
80+
|Output Labels:|[generation]|
81+
|Language:|en|
82+
|Size:|3.5 GB|
83+
84+
## References
85+
86+
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct
