Commit 8b035ca
feat: add support for pyspark equivalent
### TL;DR
Add outer explode support that preserves null/empty arrays, plus position-aware variants. No breaking changes; default explode behavior unchanged.
### What's included
- Explode "outer" semantics:
  - DataFrame APIs: `explode_outer(col)`, `posexplode_outer(col)`
  - New `keep_null_and_empty: bool` on `Explode` and `ExplodeWithIndex` logical plans
  - Physical exec honors `keep_null_and_empty`:
    - Preserves rows with empty/null arrays
    - Emits null for the value, and also for the position in pos-explode variants
- Position-aware explode:
  - `explode_with_index(col, index_name="index", value_name=None, keep_null_and_empty=False)`
  - PySpark-style aliases: `posexplode(col)` and `posexplode_outer(col)`
- Serde + Protobuf:
  - New `ExplodeWithIndex` message
  - `Explode` and `ExplodeWithIndex` carry `keep_null_and_empty`
  - Optional `value_name` handled correctly (None when omitted)
- Tests:
  - Outer behavior for explode/posexplode
  - Optional naming coverage (only value_name; only index_name; both; expression input)
  - Plan serde round-trip for `ExplodeWithIndex` (default/custom names, outer)
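The serde requirement above (an omitted `value_name` must come back as None, not as an empty string) can be illustrated with a small round-trip sketch. This is an assumption-laden stand-in: `ExplodeWithIndexPlan`, `to_dict`, and `from_dict` are hypothetical names, and a plain dict takes the place of the Protobuf message; it is not the serde code from this commit.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ExplodeWithIndexPlan:
    """Hypothetical stand-in for the logical plan node (not the real class)."""

    column: str
    index_name: str = "index"
    value_name: Optional[str] = None
    keep_null_and_empty: bool = False

    def to_dict(self) -> Dict[str, Any]:
        data: Dict[str, Any] = {
            "column": self.column,
            "index_name": self.index_name,
            "keep_null_and_empty": self.keep_null_and_empty,
        }
        # Mirror an optional field: leave it unset rather than writing "".
        if self.value_name is not None:
            data["value_name"] = self.value_name
        return data

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ExplodeWithIndexPlan":
        return cls(
            column=data["column"],
            index_name=data["index_name"],
            # An absent key maps back to None.
            value_name=data.get("value_name"),
            keep_null_and_empty=data["keep_null_and_empty"],
        )


def round_trips(plan: ExplodeWithIndexPlan) -> bool:
    """True when serializing and deserializing yields an equal plan."""
    return ExplodeWithIndexPlan.from_dict(plan.to_dict()) == plan


default_plan = ExplodeWithIndexPlan("tags")
custom_outer_plan = ExplodeWithIndexPlan(
    "letters", index_name="pos", value_name="val", keep_null_and_empty=True
)
```

Leaving the key out when `value_name` is None is what lets the reader distinguish "not provided" from "provided but empty", which is the case the round-trip tests listed above are meant to cover.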
### Behavior details
- Regular explode: filters null/empty arrays (unchanged)
- Outer explode: preserves all rows, emits nulls for missing elements
- Position column:
  - Non-outer: 0-based indices per original row
  - Outer: null position for empty/null arrays
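The rules above can be restated as a row-level reference sketch in plain Python. It is only a model of the described semantics, assuming rows are dictionaries; `explode_with_index_rows` is a hypothetical helper and not the physical operator added in this commit.

```python
from typing import Any, Dict, Iterator, List, Optional

Row = Dict[str, Any]


def explode_with_index_rows(
    rows: List[Row],
    column: str,
    index_name: str = "index",
    value_name: Optional[str] = None,
    keep_null_and_empty: bool = False,
) -> Iterator[Row]:
    """Yield one output row per array element, with a 0-based position."""
    # The value column keeps the input column's name unless one is given.
    out_value = value_name or column
    for row in rows:
        # Every column except the array column is carried through unchanged.
        rest = {k: v for k, v in row.items() if k != column}
        values = row[column]
        if not values:  # None or an empty list
            if keep_null_and_empty:
                # Outer: keep the row with a null position and a null value.
                yield {**rest, index_name: None, out_value: None}
            continue  # Non-outer: the row is dropped.
        for pos, value in enumerate(values):
            yield {**rest, index_name: pos, out_value: value}


rows = [
    {"id": 1, "tags": ["red", "blue"]},
    {"id": 2, "tags": []},
    {"id": 3, "tags": None},
]
inner = list(explode_with_index_rows(rows, "tags"))
outer = list(explode_with_index_rows(rows, "tags", keep_null_and_empty=True))
```

Here `inner` holds only the two rows for id 1, while `outer` additionally holds one row each for ids 2 and 3 with null position and value.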
### Compatibility
- Matches PySpark semantics for `posexplode`/`posexplode_outer` (0-based positions; null position for outer rows with empty/null arrays)
- Backward compatible defaults
- Protobuf/serde extended; existing fields untouched
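Since the PySpark-style names are described as going through `explode_with_index`, the delegation can be sketched as two thin wrappers. This is a guess at the shape, not the actual wiring: `make_pyspark_aliases` and `fake_explode_with_index` are hypothetical, and the output names `pos` and `col` are taken from the result tables in the examples.

```python
from typing import Any, Callable, Dict, List, Tuple


def make_pyspark_aliases(
    explode_with_index: Callable[..., Any],
) -> Tuple[Callable[[Any], Any], Callable[[Any], Any]]:
    """Build posexplode / posexplode_outer on top of explode_with_index."""

    def posexplode(col: Any) -> Any:
        # Drops rows whose array is null or empty.
        return explode_with_index(
            col, index_name="pos", value_name="col", keep_null_and_empty=False
        )

    def posexplode_outer(col: Any) -> Any:
        # Keeps those rows, with null position and null value.
        return explode_with_index(
            col, index_name="pos", value_name="col", keep_null_and_empty=True
        )

    return posexplode, posexplode_outer


# A recording stand-in, so the delegation can be inspected without a session.
calls: List[Tuple[Any, Dict[str, Any]]] = []


def fake_explode_with_index(col: Any, **kwargs: Any) -> Dict[str, Any]:
    calls.append((col, kwargs))
    return kwargs


posexplode, posexplode_outer = make_pyspark_aliases(fake_explode_with_index)
posexplode("tags")
posexplode_outer("tags")
```

The only difference between the two wrappers is the `keep_null_and_empty` flag, which is consistent with the defaults staying backward compatible.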
### Examples (with results)
```
# Assumes `session` is an existing fenic session.
df = session.create_dataframe({
    "id": [1, 2, 3],
    "tags": [["red", "blue"], [], None],
})
```
- Regular explode (filters empty/null)
```
df.explode("tags").to_polars()
```
| id | tags |
| --- | --- |
| 1 | red |
| 1 | blue |
- Outer explode (preserves rows, yields nulls)
```
df.explode_outer("tags").to_polars()
```
| id | tags |
| --- | --- |
| 1 | red |
| 1 | blue |
| 2 | null |
| 3 | null |
- posexplode (position + value; filters empty/null)
```
df.posexplode("tags").to_polars()
```
| id | pos | col |
| --- | --- | --- |
| 1 | 0 | red |
| 1 | 1 | blue |
- posexplode_outer (position + value; preserves rows; nulls for empty/null)
```
df.posexplode_outer("tags").to_polars()
```
| id | pos | col |
| --- | --- | --- |
| 1 | 0 | red |
| 1 | 1 | blue |
| 2 | null | null |
| 3 | null | null |
- explode_with_index (default names; filters empty/null)
```
df2 = session.create_dataframe({
    "id": [1, 2],
    "tags": [["x", "y"], ["z"]],
})
df2.explode_with_index("tags").to_polars()
```
| id | index | tags |
| --- | --- | --- |
| 1 | 0 | x |
| 1 | 1 | y |
| 2 | 0 | z |
- explode_with_index (custom names; outer behavior)
```
df3 = session.create_dataframe({
    "id": [1, 2, 3],
    "letters": [["a", "b"], [], None],
})
df3.explode_with_index(
    "letters",
    index_name="pos",
    value_name="val",
    keep_null_and_empty=True,
).to_polars()
```
| id | pos | val |
| --- | --- | --- |
| 1 | 0 | a |
| 1 | 1 | b |
| 2 | null | null |
| 3 | null | null |
### Testing and quality
- Lints clean
- Full suite passing (1198 passed, 23 skipped)
- Added serde and API tests covering optional parameters and outer variants

explode_outer, posexplode/posexplode_outer (via explode_with_index) (#249)
1 parent 7c7454a commit 8b035ca
File tree
13 files changed (+864, -57 lines)
- protos/logical_plan/v1
- src/fenic
  - _backends/local
    - physical_plan
    - transpiler
  - _gen/protos/logical_plan/v1
  - api/dataframe
  - core
    - _logical_plan/plans
    - _serde/proto
      - plans
- tests
  - _backends/local/dataframe
  - _logical_plan/serde