7 changes: 7 additions & 0 deletions docs/concepts/parallel-task.rst
Original file line number Diff line number Diff line change
@@ -266,3 +266,10 @@ to set what should be determined to be collected at DAG construction time:

# Then in the driver building pass in the configuration:
.with_config(_config)

Parallelizable Subclassing
==========================

When a function is annotated with `Parallelizable`, the annotation cannot express the concrete type the function actually returns, so linters and static type checkers cannot identify it. This is especially problematic for functions that are meant to be used both with and without Hamilton.

To solve this problem, you can create subclasses of `Parallelizable`. The `"Parallelizable Subclass" example <https://github.com/dagworks-inc/hamilton/blob/main/examples/parallelism/parallelizable_subclass>`_ shows how to do that.
11 changes: 11 additions & 0 deletions examples/parallelism/parallelizable_subclass/README.md
@@ -0,0 +1,11 @@
# Parallelizable Subclass

## Overview

When a function is annotated with `Parallelizable`, the annotation cannot express the concrete type the function actually returns, so linters and static type checkers cannot identify it. This is especially problematic for functions that are meant to be used both with and without Hamilton.

To solve this problem, you can create subclasses of `Parallelizable`, as demonstrated in this example.
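
The idea can be sketched without Hamilton installed. In the snippet below, `Marker` and `MarkerList` are hypothetical stand-ins (they are not Hamilton classes) that play the roles of `Parallelizable` and `ParallelizableList`. The point is only that a marker subclass which also inherits from `list` is visible to tooling as a list while still carrying the marker.

```python
from typing import Generic, List, TypeVar, get_origin, get_type_hints

T = TypeVar("T")


class Marker(Generic[T]):
    """Hypothetical stand-in for a marker type such as `Parallelizable`."""


class MarkerList(List[T], Marker[T]):
    """Hypothetical marker subclass that is also a real `list` subclass."""


def hello_list() -> MarkerList[str]:
    # Build an actual MarkerList so the returned value matches the annotation.
    return MarkerList("hello list")


# Resolve the return annotation, then strip the type parameter to get the class.
return_hint = get_type_hints(hello_list)["return"]
origin = get_origin(return_hint)

is_marker = issubclass(origin, Marker)  # what a framework could dispatch on
is_list = issubclass(origin, list)      # what linters and type checkers rely on
letters = hello_list()                  # usable as a plain list outside any framework
```

Both `is_marker` and `is_list` are true here, which is the property the example in this directory relies on.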

## Running

The `notebook.ipynb` notebook shows how to use a `Parallelizable` subclass.
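
Because the node functions are plain Python, the flow the notebook runs can also be approximated without Hamilton. The sketch below is an illustration only, not how Hamilton schedules work: it redefines the three functions with standard `List[str]` annotations and uses a thread pool (an assumption made for this sketch) to mimic the expand, map, and collect steps. `run_flow` is a hypothetical helper name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def hello_list() -> List[str]:
    # Expand step: produce the items to process independently.
    return ["h", "e", "l", "l", "o", " ", "l", "i", "s", "t"]


def uppercase(hello_list: str) -> str:
    # Map step: receives one element of the expanded list.
    return hello_list.upper()


def hello_uppercase(uppercase: List[str]) -> str:
    # Collect step: joins the processed elements back together.
    return "".join(uppercase)


def run_flow() -> str:
    with ThreadPoolExecutor() as pool:
        # Executor.map yields results in input order, so the letters stay in sequence.
        mapped = list(pool.map(uppercase, hello_list()))
    return hello_uppercase(mapped)


result = run_flow()
print(result)  # prints "HELLO LIST"
```

This matches the `{'hello_uppercase': 'HELLO LIST'}` result shown in the notebook.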
15 changes: 15 additions & 0 deletions examples/parallelism/parallelizable_subclass/functions.py
@@ -0,0 +1,15 @@
from parallelizable_list import ParallelizableList

from hamilton.htypes import Collect


def hello_list() -> ParallelizableList[str]:
    return ["h", "e", "l", "l", "o", " ", "l", "i", "s", "t"]


def uppercase(hello_list: str) -> str:
    return hello_list.upper()


def hello_uppercase(uppercase: Collect[str]) -> str:
    return "".join(uppercase)
299 changes: 299 additions & 0 deletions examples/parallelism/parallelizable_subclass/notebook.ipynb
@@ -0,0 +1,299 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# Install Hamilton if not available\n",
"\n",
"try:\n",
" import hamilton\n",
"except ModuleNotFoundError:\n",
" %pip install sf-hamilton"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Parallelism: Parallelizable Subclass [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dagworks-inc/hamilton/blob/main/examples/parallelism/parallelizable_subclass/notebook.ipynb) [![GitHub badge](https://img.shields.io/badge/github-view_source-2b3137?logo=github)](https://github.com/dagworks-inc/hamilton/blob/main/examples/parallelism/parallelizable_subclass/notebook.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When a function is annotated with `Parallelizable`, the annotation cannot express the concrete type the function actually returns, so linters and static type checkers cannot identify it. This is especially problematic for functions that are meant to be used both with and without Hamilton.\n",
"\n",
"To solve this problem, you can create subclasses of `Parallelizable`, as demonstrated in this example."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We start by importing Hamilton and the example functions:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from hamilton import driver\n",
"\n",
"import functions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After creating a driver and displaying all the module's functions, we can see the `hello_list` function, which returns a `ParallelizableList`. This is an example `Parallelizable` subclass created to annotate functions that return a `list`. Note that a function annotated with a `Parallelizable` subclass must return an `Iterable`, such as a `list`.\n",
"\n",
"The `ParallelizableList` implementation can be found in the [\"parallelizable_list.py\" file](https://github.com/dagworks-inc/hamilton/blob/main/examples/parallelism/parallelizable_subclass/parallelizable_list.py)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<!-- Generated by graphviz version 12.2.1 (20241206.2353)\n",
" -->\n",
"<!-- Pages: 1 -->\n",
"<svg width=\"422pt\" height=\"300pt\"\n",
" viewBox=\"0.00 0.00 422.05 299.80\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 295.8)\">\n",
"<polygon fill=\"white\" stroke=\"none\" points=\"-4,4 -4,-295.8 418.05,-295.8 418.05,4 -4,4\"/>\n",
"<g id=\"clust1\" class=\"cluster\">\n",
"<title>cluster__legend</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" points=\"20.5,-80.8 20.5,-283.8 110.35,-283.8 110.35,-80.8 20.5,-80.8\"/>\n",
"<text text-anchor=\"middle\" x=\"65.43\" y=\"-266.5\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Legend</text>\n",
"</g>\n",
"<!-- hello_uppercase -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>hello_uppercase</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#ea5556\" d=\"M398.05,-67.6C398.05,-67.6 293.95,-67.6 293.95,-67.6 287.95,-67.6 281.95,-61.6 281.95,-55.6 281.95,-55.6 281.95,-16 281.95,-16 281.95,-10 287.95,-4 293.95,-4 293.95,-4 398.05,-4 398.05,-4 404.05,-4 410.05,-10 410.05,-16 410.05,-16 410.05,-55.6 410.05,-55.6 410.05,-61.6 404.05,-67.6 398.05,-67.6\"/>\n",
"<path fill=\"none\" stroke=\"#ea5556\" d=\"M402.05,-71.6C402.05,-71.6 289.95,-71.6 289.95,-71.6 283.95,-71.6 277.95,-65.6 277.95,-59.6 277.95,-59.6 277.95,-12 277.95,-12 277.95,-6 283.95,0 289.95,0 289.95,0 402.05,0 402.05,0 408.05,0 414.05,-6 414.05,-12 414.05,-12 414.05,-59.6 414.05,-59.6 414.05,-65.6 408.05,-71.6 402.05,-71.6\"/>\n",
"<text text-anchor=\"start\" x=\"292.75\" y=\"-44.5\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">hello_uppercase</text>\n",
"<text text-anchor=\"start\" x=\"338.5\" y=\"-16.5\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- uppercase -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>uppercase</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M236.95,-67.6C236.95,-67.6 171.85,-67.6 171.85,-67.6 165.85,-67.6 159.85,-61.6 159.85,-55.6 159.85,-55.6 159.85,-16 159.85,-16 159.85,-10 165.85,-4 171.85,-4 171.85,-4 236.95,-4 236.95,-4 242.95,-4 248.95,-10 248.95,-16 248.95,-16 248.95,-55.6 248.95,-55.6 248.95,-61.6 242.95,-67.6 236.95,-67.6\"/>\n",
"<text text-anchor=\"start\" x=\"170.65\" y=\"-44.5\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">uppercase</text>\n",
"<text text-anchor=\"start\" x=\"196.9\" y=\"-16.5\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- uppercase&#45;&gt;hello_uppercase -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>uppercase&#45;&gt;hello_uppercase</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M259.6,-35.8C261.81,-35.8 264.05,-35.8 266.29,-35.8\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"259.52,-35.8 249.52,-31.3 254.19,-35.8 249.86,-35.8 249.86,-35.8 249.86,-35.8 254.19,-35.8 249.52,-40.3 259.52,-35.8\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"266.08,-39.3 276.08,-35.8 266.08,-32.3 266.08,-39.3\"/>\n",
"</g>\n",
"<!-- hello_list -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>hello_list</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#56e39f\" d=\"M114.85,-67.6C114.85,-67.6 16,-67.6 16,-67.6 10,-67.6 4,-61.6 4,-55.6 4,-55.6 4,-16 4,-16 4,-10 10,-4 16,-4 16,-4 114.85,-4 114.85,-4 120.85,-4 126.85,-10 126.85,-16 126.85,-16 126.85,-55.6 126.85,-55.6 126.85,-61.6 120.85,-67.6 114.85,-67.6\"/>\n",
"<path fill=\"none\" stroke=\"#56e39f\" d=\"M118.85,-71.6C118.85,-71.6 12,-71.6 12,-71.6 6,-71.6 0,-65.6 0,-59.6 0,-59.6 0,-12 0,-12 0,-6 6,0 12,0 12,0 118.85,0 118.85,0 124.85,0 130.85,-6 130.85,-12 130.85,-12 130.85,-59.6 130.85,-59.6 130.85,-65.6 124.85,-71.6 118.85,-71.6\"/>\n",
"<text text-anchor=\"start\" x=\"36.18\" y=\"-44.5\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">hello_list</text>\n",
"<text text-anchor=\"start\" x=\"14.8\" y=\"-16.5\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">ParallelizableList</text>\n",
"</g>\n",
"<!-- hello_list&#45;&gt;uppercase -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>hello_list&#45;&gt;uppercase</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M131.3,-35.8C137.27,-35.8 143.27,-35.8 149.12,-35.8\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"149.13,-35.8 159.13,-40.3 154.47,-35.8 158.8,-35.8 158.8,-35.8 158.8,-35.8 154.47,-35.8 159.13,-31.3 149.13,-35.8\"/>\n",
"</g>\n",
"<!-- function -->\n",
"<g id=\"node4\" class=\"node\">\n",
"<title>function</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M87.85,-126.47C87.85,-126.47 43,-126.47 43,-126.47 37,-126.47 31,-120.47 31,-114.47 31,-114.47 31,-101.12 31,-101.12 31,-95.12 37,-89.12 43,-89.12 43,-89.12 87.85,-89.12 87.85,-89.12 93.85,-89.12 99.85,-95.12 99.85,-101.12 99.85,-101.12 99.85,-114.47 99.85,-114.47 99.85,-120.47 93.85,-126.47 87.85,-126.47\"/>\n",
"<text text-anchor=\"middle\" x=\"65.43\" y=\"-102.38\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">function</text>\n",
"</g>\n",
"<!-- expand -->\n",
"<g id=\"node5\" class=\"node\">\n",
"<title>expand</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#56e39f\" d=\"M86.35,-185.48C86.35,-185.48 44.5,-185.48 44.5,-185.48 38.5,-185.48 32.5,-179.48 32.5,-173.48 32.5,-173.48 32.5,-160.12 32.5,-160.12 32.5,-154.12 38.5,-148.12 44.5,-148.12 44.5,-148.12 86.35,-148.12 86.35,-148.12 92.35,-148.12 98.35,-154.12 98.35,-160.12 98.35,-160.12 98.35,-173.48 98.35,-173.48 98.35,-179.48 92.35,-185.48 86.35,-185.48\"/>\n",
"<path fill=\"none\" stroke=\"#56e39f\" d=\"M90.35,-189.48C90.35,-189.48 40.5,-189.48 40.5,-189.48 34.5,-189.48 28.5,-183.48 28.5,-177.48 28.5,-177.48 28.5,-156.12 28.5,-156.12 28.5,-150.12 34.5,-144.12 40.5,-144.12 40.5,-144.12 90.35,-144.12 90.35,-144.12 96.35,-144.12 102.35,-150.12 102.35,-156.12 102.35,-156.12 102.35,-177.48 102.35,-177.48 102.35,-183.48 96.35,-189.48 90.35,-189.48\"/>\n",
"<text text-anchor=\"middle\" x=\"65.43\" y=\"-161.38\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">expand</text>\n",
"</g>\n",
"<!-- collect -->\n",
"<g id=\"node6\" class=\"node\">\n",
"<title>collect</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#ea5556\" d=\"M83.35,-248.48C83.35,-248.48 47.5,-248.48 47.5,-248.48 41.5,-248.48 35.5,-242.48 35.5,-236.48 35.5,-236.48 35.5,-223.12 35.5,-223.12 35.5,-217.12 41.5,-211.12 47.5,-211.12 47.5,-211.12 83.35,-211.12 83.35,-211.12 89.35,-211.12 95.35,-217.12 95.35,-223.12 95.35,-223.12 95.35,-236.48 95.35,-236.48 95.35,-242.48 89.35,-248.48 83.35,-248.48\"/>\n",
"<path fill=\"none\" stroke=\"#ea5556\" d=\"M87.35,-252.48C87.35,-252.48 43.5,-252.48 43.5,-252.48 37.5,-252.48 31.5,-246.48 31.5,-240.48 31.5,-240.48 31.5,-219.12 31.5,-219.12 31.5,-213.12 37.5,-207.12 43.5,-207.12 43.5,-207.12 87.35,-207.12 87.35,-207.12 93.35,-207.12 99.35,-213.12 99.35,-219.12 99.35,-219.12 99.35,-240.48 99.35,-240.48 99.35,-246.48 93.35,-252.48 87.35,-252.48\"/>\n",
"<text text-anchor=\"middle\" x=\"65.43\" y=\"-224.38\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">collect</text>\n",
"</g>\n",
"</g>\n",
"</svg>\n"
],
"text/plain": [
"<graphviz.graphs.Digraph at 0x2c62fe9fc50>"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dr = (\n",
" driver.Builder()\n",
" .with_modules(functions)\n",
" .enable_dynamic_execution(allow_experimental_mode=True)\n",
" .build()\n",
" )\n",
"\n",
"dr.display_all_functions()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this simple example, the flow generates a list of the letters in \"hello list\", converts each letter to uppercase in parallel, and then joins the letters back together:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'hello_uppercase': 'HELLO LIST'}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dr.execute([\"hello_uppercase\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Running the function annotated with `ParallelizableList` directly, we can see that it actually returns a list:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['h', 'e', 'l', 'l', 'o', ' ', 'l', 'i', 's', 't']"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"functions.hello_list()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Inspecting the function's annotations, we can see that the return annotation is `ParallelizableList[str]`:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'return': parallelizable_list.ParallelizableList[str]}"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"functions.hello_list.__annotations__"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And here is the key point of using a `Parallelizable` subclass: the return annotation is also a `list` subclass:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"issubclass(functions.hello_list.__annotations__[\"return\"], list)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This means that a linter or static type checker will correctly identify the return type as a list."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
from typing import Generic, List

from hamilton.htypes import Parallelizable, ParallelizableElement


class ParallelizableList(
    List[ParallelizableElement], Parallelizable, Generic[ParallelizableElement]
):
    """
    Marks the output of a function node as parallelizable and also as a list.

    It has the same usage as "Parallelizable", but for returns that are specifically
    lists, for correct functioning of linters and other tools.
    """

    pass
4 changes: 2 additions & 2 deletions examples/validate_examples.py
Original file line number Diff line number Diff line change
@@ -15,13 +15,13 @@


 def _create_github_badge(path: pathlib.Path) -> str:
-    github_url = f"https://github.com/dagworks-inc/hamilton/blob/main/{path}"
+    github_url = f"https://github.com/dagworks-inc/hamilton/blob/main/{path.as_posix()}"
     github_badge = f"[![GitHub badge](https://img.shields.io/badge/github-view_source-2b3137?logo=github)]({github_url})"
     return github_badge


 def _create_colab_badge(path: pathlib.Path) -> str:
-    colab_url = f"https://colab.research.google.com/github/dagworks-inc/hamilton/blob/main/{path}"
+    colab_url = f"https://colab.research.google.com/github/dagworks-inc/hamilton/blob/main/{path.as_posix()}"
     colab_badge = (
         f"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)]({colab_url})"
     )