Commit ed77ac6

Add GeminiLanguageModel (#17)
* Initial implementation of Gemini support
* Improve ergonomics of thinking and server tools
* Rename GeminiServerTool to ServerTool
* Update README
* SSE as URL query parameter, not stream body parameter
* Add test coverage for GeminiLanguageModel
* Add custom user info key to control omission of additionalProperties
* Fix encoding of server tools
* Serialize Gemini tests to avoid API rate limiting
* Disable thinking for withGenerationOptions test
* Fix various coding issues
* Bump maximum token count for withGenerationOptions test
* Update README
1 parent 8dc8946 commit ed77ac6

4 files changed, +827 -6 lines changed

README.md

Lines changed: 63 additions & 0 deletions
@@ -47,6 +47,7 @@ print(response.content)
 - [x] [llama.cpp](https://github.com/ggml-org/llama.cpp) (GGUF models)
 - [x] Ollama [HTTP API](https://github.com/ollama/ollama/blob/main/docs/api.md)
 - [x] Anthropic [Messages API](https://docs.claude.com/en/api/messages)
+- [x] Google [Gemini API](https://ai.google.dev/api/generate-content)
 - [x] OpenAI [Chat Completions API](https://platform.openai.com/docs/api-reference/chat)
 - [x] OpenAI [Responses API](https://platform.openai.com/docs/api-reference/responses)
 
@@ -228,6 +229,68 @@ let response = try await session.respond {
 }
 ```
 
+### Google Gemini
+
+Uses the [Gemini API](https://ai.google.dev/api/generate-content) with Gemini models:
+
+```swift
+let model = GeminiLanguageModel(
+    apiKey: ProcessInfo.processInfo.environment["GEMINI_API_KEY"]!,
+    model: "gemini-2.5-flash"
+)
+
+let session = LanguageModelSession(model: model, tools: [WeatherTool()])
+let response = try await session.respond {
+    Prompt("What's the weather like in Tokyo?")
+}
+```
+
+Gemini models use an internal ["thinking process"](https://ai.google.dev/gemini-api/docs/thinking)
+that improves reasoning and multi-step planning.
+You can configure how much Gemini should "think" using the `thinking` parameter:
+
+```swift
+// Enable thinking
+var model = GeminiLanguageModel(
+    apiKey: apiKey,
+    model: "gemini-2.5-flash",
+    thinking: true /* or `.dynamic` */,
+)
+
+// Set an explicit number of tokens for its thinking budget
+model.thinking = .budget(1024)
+
+// Revert to default configuration without thinking
+model.thinking = false /* or `.disabled` */
+```
+
+Gemini supports [server-side tools](https://ai.google.dev/gemini-api/docs/google-search)
+that execute transparently on Google's infrastructure:
+
+```swift
+let model = GeminiLanguageModel(
+    apiKey: apiKey,
+    model: "gemini-2.5-flash",
+    serverTools: [
+        .googleMaps(latitude: 35.6580, longitude: 139.7016) // Optional location
+    ]
+)
+```
+
+**Available server tools**:
+
+- `.googleSearch`
+  Grounds responses with real-time web information
+- `.googleMaps`
+  Provides location-aware responses
+- `.codeExecution`
+  Generates and runs Python code to solve problems
+- `.urlContext`
+  Fetches and analyzes content from URLs mentioned in prompts
+
+> [!TIP]
+> Gemini server tools are not available as client tools (`Tool`) for other models.
+
 ### Ollama
 
 Run models locally via Ollama's [HTTP API](https://github.com/ollama/ollama/blob/main/docs/api.md):
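
The README additions above configure `thinking` and `serverTools` in separate snippets. As a rough sketch only — assuming the `GeminiLanguageModel` initializer accepts both labels together (the diff does not show this combination) and that `.googleSearch` can be passed directly as a server tool — the two options might combine like this:

```swift
import Foundation
import AnyLanguageModel

// Sketch, not from the commit: combines the `thinking` and `serverTools`
// parameters that the README introduces in separate examples above.
let model = GeminiLanguageModel(
    apiKey: ProcessInfo.processInfo.environment["GEMINI_API_KEY"]!,
    model: "gemini-2.5-flash",
    thinking: .budget(1024),      // explicit thinking budget, per the README
    serverTools: [.googleSearch]  // ground answers with web search
)

let session = LanguageModelSession(model: model)
let response = try await session.respond {
    Prompt("What changed in the latest Gemini API release?")
}
print(response.content)
```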

Sources/AnyLanguageModel/GenerationSchema.swift

Lines changed: 24 additions & 6 deletions
@@ -9,9 +9,6 @@ import struct Foundation.Decimal
 /// Generation schemas guide the output of a ``SystemLanguageModel`` to deterministically
 /// ensure the output is in the desired format.
 public struct GenerationSchema: Sendable, Codable, CustomDebugStringConvertible {
-
-    // MARK: - Structure
-
     indirect enum Node: Sendable, Codable {
         case object(ObjectNode)
         case array(ArrayNode)
@@ -44,7 +41,12 @@ public struct GenerationSchema: Sendable, Codable, CustomDebugStringConvertible
                 try propsContainer.encode(node, forKey: DynamicCodingKey(stringValue: name)!)
             }
             try container.encode(Array(obj.required), forKey: .required)
-            try container.encode(false, forKey: .additionalProperties)
+
+            // Check userInfo to see if additionalProperties should be omitted
+            let shouldOmit = encoder.userInfo[GenerationSchema.omitAdditionalPropertiesKey] as? Bool ?? false
+            if !shouldOmit {
+                try container.encode(false, forKey: .additionalProperties)
+            }
 
         case .array(let arr):
             try container.encode("array", forKey: .type)
@@ -201,8 +203,6 @@ public struct GenerationSchema: Sendable, Codable, CustomDebugStringConvertible
         }
     }
 
-    // MARK: - Properties
-
     let root: Node
     private var defs: [String: Node]
 
@@ -774,3 +774,21 @@ extension GenerationSchema {
         }
     }
 }
+
+// MARK: - CodingUserInfoKey
+
+extension GenerationSchema {
+    /// A key used in the encoder's `userInfo` dictionary to control whether
+    /// the `additionalProperties` field should be omitted from the encoded output.
+    ///
+    /// Set this to `true` to omit `additionalProperties` from object schemas.
+    /// Defaults to `false` (includes `additionalProperties`) if not specified.
+    ///
+    /// Example:
+    /// ```swift
+    /// let encoder = JSONEncoder()
+    /// encoder.userInfo[GenerationSchema.omitAdditionalPropertiesKey] = true
+    /// let data = try encoder.encode(schema)
+    /// ```
+    static let omitAdditionalPropertiesKey = CodingUserInfoKey(rawValue: "GenerationSchema.omitAdditionalProperties")!
+}
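
To illustrate the key added above, here is a minimal usage sketch. `encodeSchema` is a hypothetical helper, not part of this commit, and it assumes module-internal access since the key is not declared `public`; only `GenerationSchema.omitAdditionalPropertiesKey` and its effect on `encode(to:)` come from the diff:

```swift
import Foundation

// Hypothetical helper (not in the commit): encodes a schema to JSON,
// optionally omitting `"additionalProperties": false` from object nodes
// via the userInfo key introduced above.
func encodeSchema(
    _ schema: GenerationSchema,
    omittingAdditionalProperties: Bool = false
) throws -> Data {
    let encoder = JSONEncoder()
    encoder.userInfo[GenerationSchema.omitAdditionalPropertiesKey] = omittingAdditionalProperties
    return try encoder.encode(schema)
}
```

This mirrors the doc-comment example in the hunk: when the key is `true`, object schemas are encoded without the `additionalProperties` field; otherwise `false` is encoded as before.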
