Commit b827818

Models Hub Update
1 parent b193c44 commit b827818

2 files changed

+263
-0
lines changed

Lines changed: 193 additions & 0 deletions
@@ -0,0 +1,193 @@
---
layout: model
title: Qwen2.5-VL-7B-Instruct (Q16 GGUF Quantized)
author: John Snow Labs
name: qwen2.5_vl_7b_instruct_q16_gguf
date: 2025-09-15
tags: [qwen2_5_vl, image_to_text, multimodal, conversational, instruct, q16, 7b, en, open_source, llamacpp]
task: Image Captioning
language: en
edition: Spark NLP 6.1.1
spark_version: 3.0
supported: true
engine: llamacpp
annotator: AutoGGUFVisionModel
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

**Qwen2.5-VL-7B-Instruct (Q16 GGUF Quantized)** is a 7-billion-parameter multimodal instruction-tuned model supporting **text, image, and video understanding**. Compared to Qwen2-VL, it introduces major enhancements in **fine-grained visual analysis (objects, text, charts, layouts), structured outputs (tables, invoices, forms), visual localization (bounding boxes, points with JSON), and long-video comprehension (over 1 hour with temporal reasoning)**.

It also adds **agentic capabilities**, enabling tool use such as computer and phone control. This version is provided in **GGUF Q16 format** for efficient inference in Spark NLP pipelines and lightweight runtimes, balancing speed and accuracy.

Originally from [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct).

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/qwen2.5_vl_7b_instruct_q16_gguf_en_6.1.1_3.0_1757966755156.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/qwen2.5_vl_7b_instruct_q16_gguf_en_6.1.1_3.0_1757966755156.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
from sparknlp.base import DocumentAssembler, ImageAssembler
from sparknlp.annotator import AutoGGUFVisionModel
from pyspark.sql.functions import lit
from pyspark.ml import Pipeline

# Assumes an active Spark NLP session is available as `spark`.
images_path = "path/to/images/folder"
prompt = "Caption this image."

# Load the images as raw bytes and attach the same prompt to every row.
data = ImageAssembler.loadImagesAsBytes(spark, images_path)
data = data.withColumn("caption", lit(prompt))

document_assembler = (
    DocumentAssembler()
    .setInputCol("caption")
    .setOutputCol("caption_document")
)

image_assembler = (
    ImageAssembler()
    .setInputCol("image")
    .setOutputCol("image_assembler")
)

qwen_chat_template = """<|im_start|>user
prompt<|im_end|>
<|im_start|>assistant
"""

# Vision model with its context, sampling, and stop-string settings.
autoGGUFVisionModel = (
    AutoGGUFVisionModel.pretrained("qwen2.5_vl_7b_instruct_q16_gguf")
    .setInputCols(["caption_document", "image_assembler"])
    .setOutputCol("completions")
    .setChatTemplate(qwen_chat_template)
    .setBatchSize(4)
    .setNGpuLayers(32)
    .setNCtx(4096)
    .setMinKeep(0)
    .setMinP(0.05)
    .setNPredict(64)
    .setNProbs(0)
    .setPenalizeNl(False)
    .setRepeatLastN(256)
    .setRepeatPenalty(1.1)
    .setStopStrings(["</s>", "<|im_end|>", "User:"])
    .setTemperature(0.2)
    .setTfsZ(1)
    .setTypicalP(1)
    .setTopK(40)
    .setTopP(0.95)
)

pipeline = Pipeline().setStages([
    document_assembler,
    image_assembler,
    autoGGUFVisionModel
])

model = pipeline.fit(data)
result = model.transform(data)

# Show the file name (last path segment of the origin) next to the caption.
result.selectExpr(
    "reverse(split(image.origin, '/'))[0] as image_name",
    "completions.result"
).show(truncate=False)
```
```scala
import com.johnsnowlabs.nlp.ImageAssembler
import com.johnsnowlabs.nlp.annotator._
import com.johnsnowlabs.nlp.base._
import org.apache.spark.sql.functions.lit
import org.apache.spark.ml.Pipeline

// Assumes an active Spark NLP session is available as `spark`.
val images_path = "path/to/images/folder"
val prompt = "Caption this image."

// Load the images as raw bytes and attach the same prompt to every row.
var data = ImageAssembler.loadImagesAsBytes(spark, images_path)
data = data.withColumn("caption", lit(prompt))

val document_assembler = new DocumentAssembler()
  .setInputCol("caption")
  .setOutputCol("caption_document")

val image_assembler = new ImageAssembler()
  .setInputCol("image")
  .setOutputCol("image_assembler")

val qwen_chat_template = """<|im_start|>user
prompt<|im_end|>
<|im_start|>assistant
"""

// Fractional values carry an `f` suffix, since these setters are assumed to take Float.
val autoGGUFVisionModel = AutoGGUFVisionModel.pretrained("qwen2.5_vl_7b_instruct_q16_gguf")
  .setInputCols(Array("caption_document", "image_assembler"))
  .setOutputCol("completions")
  .setChatTemplate(qwen_chat_template)
  .setBatchSize(4)
  .setNGpuLayers(32)
  .setNCtx(4096)
  .setMinKeep(0)
  .setMinP(0.05f)
  .setNPredict(64)
  .setNProbs(0)
  .setPenalizeNl(false)
  .setRepeatLastN(256)
  .setRepeatPenalty(1.1f)
  .setStopStrings(Array("</s>", "<|im_end|>", "User:"))
  .setTemperature(0.2f)
  .setTfsZ(1)
  .setTypicalP(1)
  .setTopK(40)
  .setTopP(0.95f)

val pipeline = new Pipeline().setStages(Array(
  document_assembler,
  image_assembler,
  autoGGUFVisionModel
))

val model = pipeline.fit(data)
val result = model.transform(data)

// Show the file name (last path segment of the origin) next to the caption.
result.selectExpr(
  "reverse(split(image.origin, '/'))[0] as image_name",
  "completions.result"
).show(false)
```
</div>

## Results

```bash

+-------------------+-----------------------------------------------------------------------------------------------------------------------------------+
|image_name         |result                                                                                                                             |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------+
|prescription_02.png|["Medical prescription for systemic lupus erythematosus and scleroderma overlap with interstitial lung disease, dated 02/07/2021."]|
|prescription_01.png|["Prescription for malaria treatment, dated 30-Aug-2023, from SMS Hospital."]                                                      |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------+

```
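
The `image_name` column above comes from the Spark SQL expression `reverse(split(image.origin, '/'))[0]`, which keeps only the last path segment of each image's origin URI. As a rough illustration, the same logic in plain Python might look like the sketch below. The function name `image_name_from_origin` and the sample URI are hypothetical and not part of the original example.

```python
def image_name_from_origin(origin: str) -> str:
    """Return the last '/'-separated segment of an image origin URI.

    Mirrors reverse(split(origin, '/'))[0] from the Spark SQL expression.
    """
    segments = origin.split("/")          # split(image.origin, '/')
    return list(reversed(segments))[0]    # reverse(...)[0]


# Hypothetical origin URI, used only for illustration.
print(image_name_from_origin("file:///path/to/images/folder/prescription_01.png"))
```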

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|qwen2.5_vl_7b_instruct_q16_gguf|
|Compatibility:|Spark NLP 6.1.1+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[caption_document, image_assembler]|
|Output Labels:|[completions]|
|Language:|en|
|Size:|13.3 GB|
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
---
layout: model
title: BGE Reranker V2 M3 Q4_K_M GGUF
author: John Snow Labs
name: bge_reranker_v2_m3_Q4_K_M
date: 2025-09-01
tags: [llamacpp, gguf, reranker, bge, en, open_source]
task: Reranking
language: en
edition: Spark NLP 6.1.2
spark_version: 3.0
supported: true
engine: llamacpp
annotator: AutoGGUFReranker
article_header:
  type: cover
use_language_switcher: "Python-Scala-Java"
---

## Description

A lightweight reranker model with strong multilingual capabilities. It is easy to deploy and offers fast inference.

{:.btn-box}
<button class="button button-orange" disabled>Live Demo</button>
<button class="button button-orange" disabled>Open in Colab</button>
[Download](https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/models/bge_reranker_v2_m3_Q4_K_M_en_6.1.2_3.0_1756718229635.zip){:.button.button-orange.button-orange-trans.arr.button-icon}
[Copy S3 URI](s3://auxdata.johnsnowlabs.com/public/models/bge_reranker_v2_m3_Q4_K_M_en_6.1.2_3.0_1756718229635.zip){:.button.button-orange.button-orange-trans.button-icon.button-copy-s3}

## How to use



<div class="tabs-box" markdown="1">
{% include programmingLanguageSelectScalaPythonNLU.html %}
```python
import sparknlp
from sparknlp.base import *
from sparknlp.annotator import *
from pyspark.ml import Pipeline

# Start (or reuse) a Spark session with Spark NLP loaded.
spark = sparknlp.start()

document = DocumentAssembler() \
    .setInputCol("text") \
    .setOutputCol("document")

# The query is fixed on the annotator; every input document is scored against it.
reranker = AutoGGUFReranker.pretrained("bge_reranker_v2_m3_Q4_K_M") \
    .setInputCols(["document"]) \
    .setOutputCol("reranked_documents") \
    .setBatchSize(4) \
    .setQuery("A man is eating pasta.")

pipeline = Pipeline().setStages([document, reranker])

data = spark.createDataFrame([
    ["A man is eating food."],
    ["A man is eating a piece of bread."],
    ["The girl is carrying a baby."],
    ["A man is riding a horse."]
]).toDF("text")

result = pipeline.fit(data).transform(data)
result.select("reranked_documents").show(truncate=False)
# Each document carries a relevance_score in its metadata, indicating how relevant it is to the query.

```

</div>
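
The comment in the example above notes that each reranked document carries a `relevance_score` in its metadata. One way to use it, sketched here in plain Python, is to sort collected annotations by that score. The dictionaries below stand in for rows collected from the `reranked_documents` column: their shape and the score values are made-up placeholders rather than real model output, and the helper name `rank_by_relevance` is hypothetical. Spark NLP annotation metadata values are strings, so the score is parsed with `float` before sorting.

```python
from typing import Dict, List


def rank_by_relevance(annotations: List[Dict]) -> List[Dict]:
    """Sort annotation-like dicts by metadata['relevance_score'], highest first."""
    return sorted(
        annotations,
        key=lambda a: float(a["metadata"]["relevance_score"]),
        reverse=True,
    )


# Placeholder data: the scores are invented for illustration only.
collected = [
    {"result": "The girl is carrying a baby.", "metadata": {"relevance_score": "0.01"}},
    {"result": "A man is eating food.", "metadata": {"relevance_score": "0.90"}},
    {"result": "A man is riding a horse.", "metadata": {"relevance_score": "0.05"}},
]

for annotation in rank_by_relevance(collected):
    print(annotation["metadata"]["relevance_score"], annotation["result"])
```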

{:.model-param}
## Model Information

{:.table-model}
|---|---|
|Model Name:|bge_reranker_v2_m3_Q4_K_M|
|Compatibility:|Spark NLP 6.1.2+|
|License:|Open Source|
|Edition:|Official|
|Input Labels:|[document]|
|Output Labels:|[reranked_documents]|
|Language:|en|
|Size:|416.0 MB|
