Conversation

@schmelczer

Motivation

I identified six new potential best practices while designing GreatAI, through two case studies conducted on a production ML pipeline. I believe they are generalisable enough to be useful for other practitioners.

I created this PR following the advice of @jstvssr.

Changes

The added best practices are:

  • Parallelise Feature Extraction
  • Allow Experimentation with the Inference Function
  • Implement Standard Schemas for Common Prediction Tasks
  • Cache Production Predictions
  • Allow Robustly Composing Inference Functions
  • Keep the Model and its Documentation Together

I also added some smaller fixes:

  • Updated the Gemfile to include missing dependencies
  • Updated the practices' unique_id values because two of them shared the same one (24)
  • Reformatted a template expression inside a JS section which stopped Disqus from loading
  • Resolved the build warnings
  • Reformatted the README
  • Removed the extra space above the title of /practices
  • Automated the sitemap generation so that it includes the new practices

Thank you for taking a look at my PR!

@kvdblom requested a review from xserban on October 24, 2022 08:17
@xserban (Member) commented Dec 9, 2022

Hi,

Thanks for the PR. Here is some initial feedback regarding the practices:

  1. Parallelise Feature Extraction -> I think this can be merged with the practice on testing feature extraction code into a broader practice on feature engineering. I reckon you are targeting the production environment here, where scaling feature extraction is of utmost importance (some inspiration can be found here: http://proceedings.mlr.press/v67/li17a/li17a.pdf). If you agree, we should then debate whether the practice belongs to deployment or to training. It is currently positioned in training, but I would be careful about suggesting that engineers focus on parallelising feature extraction while they are still experimenting with features (see the first sketch after this list). Let me know your thoughts.

  2. Allow Experimentation with the Inference Function and Allow Robustly Composing Inference Functions -> I will treat these two practices together, as they both refer to engineering the inference function. It would probably be better to merge them into one practice and to speak of the inference API rather than the inference function, since it will most likely serve as an interface to a larger inference service.

Moreover, judging by the nature of these practices, I would say they fit better in the Coding/Deployment group than among the training practices. The way we engineer inference APIs is similar to implementing continuous integration or shadow deployment. I would call the practice something along the lines of "Design a flexible inference API that can be used for experimentation". Flexibility implies both composition and fast experimentation (see the second sketch after this list).

  3. Implement Standard Schemas for Common Prediction Tasks -> this is a good practice. Restructuring it will depend on how we approach point 2 above. I would also drop "common" and emphasise the scenarios to which this practice applies (most likely tabular data); one possible shape is sketched below the list.

  4. Cache Production Predictions -> I think the description should say more about the scenarios where this practice applies; in particular, repetitive predictions that do not involve personalisation or any individual attributes of the users (see the last sketch after this list). If you agree, I can take a first pass at editing the practice.

  5. Keep the Model and its Documentation Together -> we already have a practice for versioning all model-related artefacts, so I suggest merging the content of this practice into that one. Otherwise, a stronger motivation for this practice is needed.
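
To make the points above more concrete, a few purely illustrative sketches follow. For point 1, this is roughly what parallelising feature extraction in production could look like; `extract_features` is a hypothetical placeholder for any CPU-bound, per-sample function, not code from the PR:

```python
from concurrent.futures import ProcessPoolExecutor

def extract_features(document: str) -> list[float]:
    # Placeholder: tokenise, embed, compute statistics, etc.
    return [float(len(document))]

def extract_all(documents: list[str]) -> list[list[float]]:
    # A process pool sidesteps the GIL for CPU-bound extraction;
    # map preserves the order of the input documents.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(extract_features, documents, chunksize=64))

if __name__ == "__main__":
    print(extract_all(["first text", "a second, longer text"]))
```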
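
For point 2, one reading of a "flexible inference API": inference steps as plain callables that compose robustly and can be swapped during experimentation. All names here (`compose`, `normalise`, `classify`) are hypothetical:

```python
from typing import Callable

Inference = Callable[[str], str]

def compose(*steps: Inference) -> Inference:
    # Robust composition: each step's output becomes the next step's input.
    def composed(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return composed

def normalise(text: str) -> str:
    return text.strip().lower()

def classify(text: str) -> str:
    # Stand-in for a real model call.
    return "positive" if "good" in text else "negative"

# The same interface serves production and experimentation: swap `classify`
# for a candidate model without touching the rest of the pipeline.
predict = compose(normalise, classify)
print(predict("  This library is GOOD  "))  # -> positive
```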
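
For point 3, one possible standard schema for a classification task, assuming a dataclass-based design (the field names are illustrative, not prescribed):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ClassificationOutput:
    label: str         # the predicted class
    confidence: float  # calibrated probability in [0, 1]

def predict(text: str) -> ClassificationOutput:
    # Placeholder model: the fixed return shape is the point, not the logic.
    return ClassificationOutput(label="spam", confidence=0.93)

print(predict("win a free prize now"))
```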
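
For point 4, a minimal caching sketch; note that this is only safe for deterministic models and inputs free of per-user personalisation:

```python
from functools import lru_cache

@lru_cache(maxsize=10_000)
def predict(text: str) -> str:
    # Stand-in for an expensive model call; lru_cache returns the stored
    # result whenever the exact same input is seen again.
    return "positive" if "good" in text else "negative"

predict("a good product")   # computed once
predict("a good product")   # served from the cache
print(predict.cache_info())  # hits=1, misses=1
```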

My suggestion is to first debate the large structural changes (e.g., whether some practices should be merged) and to modify them one by one afterwards.
Let me know your thoughts.
