Draft
Changes from 3 commits
29 changes: 29 additions & 0 deletions example_workflows/curl_external/workflow_template.yaml
@@ -0,0 +1,29 @@
metadata:
  name: example-curl-external
  namespace: argo-workflows
spec:
  serviceAccountName: argo-workflow
  entrypoint: curl
  arguments:
    parameters:
      - name: url
        value: https://ipinfo.io
  templates:
    - name: curl
      inputs:
        parameters:
          - name: url
            value: "{{workflow.parameters.url}}"
      container:
        name: main
        image: alpine/curl
        command:
          - curl
        args:
          - "-s"
          - "{{inputs.parameters.url}}"
  ttlStrategy:
    secondsAfterCompletion: 300
  podGC:
    strategy: OnPodCompletion
    deleteDelayDuration: 300
@@ -0,0 +1,13 @@
library(tidyverse)
setwd("/tmp/test/")
head(mpg)
tmp <-
  ggplot(mpg, aes(x = hwy, y = cty)) +
  geom_point() +
  geom_smooth()
tmp_boxplot <-
  ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot() +
  theme_classic()
ggsave("/tmp/routput/test.png", plot = tmp, device = "png")
ggsave("/tmp/routput/test_boxplot.png", plot = tmp_boxplot, device = "png")
@@ -0,0 +1,27 @@
metadata:
  name: input-and-output-artifact-s3
  namespace: argo-workflows
spec:
  templates:
    - name: s3-input-and-output-example
      inputs:
        artifacts:
          - name: rscript
            path: /tmp/test/generate_plots_no_txt.R
            s3:
              key: generate_plots_no_txt.R
      outputs:
        artifacts:
          - name: routput
            path: /tmp/routput
            s3:
              key: /plot_gen.tgz
Comment on lines +15 to +18
Member Author

What would be a sensible way to generalise this kind of thing?
It would be good not to put the responsibility on users to make or find writable directories.

Could we provide templates that ensure the job runs in a temporary directory and that use loops like:

outputs:
  artifacts:
{% for output in outputs %}
    - name: {{ output.name }}
      path: /tmp/{{ job_id }}/{{ output.filename }}
      s3:
        key: {{ output.filename }}
{% endfor %}
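
The loop proposed above can be sketched in plain Python to show the structure it would render. This is only an illustration: the `Output` class, the `build_output_artifacts` helper, and the `/tmp/<job_id>/` convention are hypothetical names taken from or invented for the template snippet, not part of this PR or of Argo Workflows.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Output:
    """Hypothetical description of one file a job is expected to produce."""

    name: str
    filename: str


def build_output_artifacts(job_id: str, outputs: list[Output]) -> list[dict]:
    """Build one output-artifact entry per declared output file."""
    # Assumed convention: each job writes into its own directory under /tmp.
    work_dir = PurePosixPath("/tmp") / job_id
    return [
        {
            "name": output.name,
            "path": str(work_dir / output.filename),
            "s3": {"key": output.filename},
        }
        for output in outputs
    ]


artifacts = build_output_artifacts(
    "job-123",
    [Output("scatter", "test.png"), Output("boxplot", "test_boxplot.png")],
)
for artifact in artifacts:
    print(artifact)
```

Generating the entries from a list keeps the writable directory under the template's control, so users would only declare the file names they expect.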

Contributor

Do you want these to also be user-facing examples? Is this thinking ahead to what we've discussed about potentially locking down templates? I didn't realise these were intended to exemplify that.

This was only intended to exemplify getting one artifact in and putting one out, not to generalise to many output artifacts. This script just tars an entire directory and puts it in a single file in the default S3 bucket, so there is only one output and it doesn't need a loop.
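
As a rough local analogue of the packaging described above, the following sketch tars a directory into a single gzip-compressed file using Python's standard `tarfile` module. It is not how the workflow engine performs the step; the `tar_directory` helper and the placeholder files are assumptions made for illustration, with names borrowed from the example workflow.

```python
import tarfile
import tempfile
from pathlib import Path


def tar_directory(source_dir: Path, archive_path: Path) -> list[str]:
    """Pack source_dir into one .tgz file and return the archived member names."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname stores paths relative to the directory name, not the full path.
        archive.add(source_dir, arcname=source_dir.name)
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(archive.getnames())


with tempfile.TemporaryDirectory() as scratch:
    # Stand-ins for the plots the R script would write to /tmp/routput.
    output_dir = Path(scratch) / "routput"
    output_dir.mkdir()
    (output_dir / "test.png").write_bytes(b"placeholder")
    (output_dir / "test_boxplot.png").write_bytes(b"placeholder")

    members = tar_directory(output_dir, Path(scratch) / "plot_gen.tgz")
    print(members)
```

Because the whole directory becomes one archive, a single output artifact entry is enough regardless of how many plots the script writes.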

      container:
        image: rocker/tidyverse:latest
        command:
          - sh
          - -c
        args:
          - mkdir /tmp/routput; Rscript /tmp/test/generate_plots_no_txt.R
  entrypoint: s3-input-and-output-example
  serviceAccountName: argo-workflow
17 changes: 17 additions & 0 deletions example_workflows/python_population_analysis/generate_csv.py
@@ -0,0 +1,17 @@
#!/usr/bin/env python3
from csv import DictWriter

from faker import Faker

Faker.seed(36903)
fake = Faker(["en_GB", "fr_FR", "de_DE"])

population = [fake.profile() for _ in range(5000)]

field_names = population[0].keys()
with open("./population.csv", "w", newline="") as csvfile:
    writer = DictWriter(csvfile, field_names)

    writer.writeheader()
    for profile in population:
        writer.writerow(profile)
@@ -0,0 +1 @@
pandas[performance, plot]
@@ -0,0 +1 @@
faker
59 changes: 59 additions & 0 deletions example_workflows/python_population_analysis/script.py
@@ -0,0 +1,59 @@
#!/usr/bin/env python3
import re
from decimal import Decimal
from enum import Enum, unique

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


@unique
class BloodGroup(Enum):
    a_negative = "A-"
    a_positive = "A+"
    ab_negative = "AB-"
    ab_positive = "AB+"
    b_negative = "B-"
    b_positive = "B+"
    o_negative = "O-"
    o_positive = "O+"


re_location = re.compile(r"\(Decimal\('(.*)'\), Decimal\('(.*)'\)\)")


def convert_location(location: str) -> tuple[Decimal, Decimal]:
    match = re_location.match(location)
    return (Decimal(match.group(1)), Decimal(match.group(2)))


df = pd.read_csv(
    "./population.csv",
    converters={
        "blood_group": BloodGroup,
        "current_location": convert_location,
        "website": eval,
    },
    dtype={
        "job": str,
        "company": str,
        "ssn": str,
        "residence": str,
        "username": str,
        "address": str,
        "mail": str,
    },
    parse_dates=["birthdate"],
    header=0,
)

ax = df["blood_group"].value_counts().plot.pie()
ax.get_figure().savefig("./blood_group.png")

plt.clf()

now = np.datetime64("now")
df["age"] = df["birthdate"].apply(lambda x: int((now - x).days / 365.25))
ax = df["age"].plot.hist()
ax.get_figure().savefig("./age.png")