diff --git a/Gemfile b/Gemfile index 8e842b1..ce14151 100644 --- a/Gemfile +++ b/Gemfile @@ -7,4 +7,7 @@ group :jekyll_plugins do gem 'jekyll-scholar' gem 'jemoji' gem 'unicode_utils' + gem 'jekyll-sitemap' end + +gem "webrick" diff --git a/README.md b/README.md index ff0303b..4ca9ac3 100644 --- a/README.md +++ b/README.md @@ -1,15 +1,16 @@ # se-ml.github.io -Software Engineering for Machine Learning Homepage +> Software Engineering for Machine Learning Homepage +### Prerequisites -### Prereq Install [jekyll](https://jekyllrb.com/docs/). -### Development: +### Development -#### Install dependencies: -``` +#### Install dependencies + +```sh $ git clone https://github.com/SE-ML/se-ml.github.io $ cd se-ml.github.io $ git fetch --all @@ -17,18 +18,21 @@ $ git checkout -b source origin/source $ bundle install ``` -#### Run dev server: -``` +#### Run development server + +```sh $ bundle exec jekyll serve ``` +> The website will be available at [localhost:4000](http://localhost:4000) -### Deploy to production: -``` +### Deploy to production + +```sh $ ./bin/deploy -u se-ml ``` +#### Kudos -#### Kudos: - this webpage uses the [al-folio](https://alshedivat.github.io/al-folio/) theme. - some icons are downloaded from [Freepik](https://www.freepik.com/). diff --git a/_best_practices/01-parallel_feature_extraction.md b/_best_practices/01-parallel_feature_extraction.md new file mode 100644 index 0000000..002f7c4 --- /dev/null +++ b/_best_practices/01-parallel_feature_extraction.md @@ -0,0 +1,31 @@ +--- +layout: practice +author: András Schmelczer +name: Parallelise Feature Extraction +title: Parallelise Feature Extraction +category: Data +unique_id: data_parallel +index: 9 +difficulty: na +comments: True +references: [RAY, PPP, GREATAI] +description: +image: +photocredit: + +intent: Iterate quicker during feature engineering and be more predictable in production. +motivation: Before processing any data, features have to be extracted. 
Sometimes, this step can be computationally expensive, so speeding it up, for example, by parallelising the workload, can result in less time wasted during experimentation and a more predictable production deployment. +applicability: Parallelisation should be considered in machine learning applications where feature extraction is resource-intensive and efficiently parallelisable. +related: [exp_parallel] # +dependencies: # +survey_question: + +labels: effectiveness + +--- + +Depending on the maturity of the data engineering processes, researchers and engineers might end up processing data solely with simple scripts. Correctness might reasonably be prioritised over performance in such a setup, leading to long waits while the scripts run. This can slow down development and, if the same functions are deployed, even production inference. + +Using a database with good support for online analytical processing (OLAP) use cases (such as Apache Cassandra) and building a proper, distributed processing cluster (using, for example, Apache Spark) are reasonable first steps for mitigating the issue. Of course, when the dataset is much smaller, single-machine processing can also be appropriate. But even in that case, keeping parallelisation in mind (achievable with, for instance, Joblib) during experimentation is vital for rapid prototyping and improves the developer experience. + +In deployments with unused resources available, it can also make sense to parallelise the feature extraction of individual production inputs (provided this can be done efficiently). This can result in more predictable response times that depend less on the input length.
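As a minimal sketch of the single-machine case, the snippet below maps a feature extractor over documents in parallel using only the standard library; Joblib's `Parallel`/`delayed` API, mentioned above, serves the same purpose. The `extract_features` function and the sample documents are illustrative placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

def extract_features(document):
    # Stand-in for an expensive feature-extraction step.
    return {"length": len(document), "words": len(document.split())}

documents = ["first example", "a slightly longer second example"]

# Map the extractor over all documents in parallel. For CPU-bound,
# pure-Python extractors, prefer a ProcessPoolExecutor (or Joblib's
# Parallel/delayed) to side-step the GIL; threads suffice for I/O-bound work.
with ThreadPoolExecutor(max_workers=4) as pool:
    features = list(pool.map(extract_features, documents))
```

Because the same `extract_features` function can later be deployed unchanged, the parallel structure carries over from experimentation to production.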
diff --git a/_best_practices/02-archive_old_feature.md b/_best_practices/02-archive_old_feature.md index c39acc9..486d941 100644 --- a/_best_practices/02-archive_old_feature.md +++ b/_best_practices/02-archive_old_feature.md @@ -5,7 +5,7 @@ name: Actively Remove or Archive Features That are Not Used title: Actively Remove or Archive Features That are Not Used category: Training unique_id: exp_archive -index: 13 +index: 14 difficulty: "medium" comments: True references: [Rs4ML, MLTD] diff --git a/_best_practices/02-auto_feat.md b/_best_practices/02-auto_feat.md index 6770dcc..a065530 100644 --- a/_best_practices/02-auto_feat.md +++ b/_best_practices/02-auto_feat.md @@ -4,7 +4,7 @@ author: Koen van der Blom, Alex Serban, Joost Visser name: Automate Feature Generation and Selection category: Training # unique_id: exp_auto_feat -index: 17 +index: 18 difficulty: "advanced" references: # comments: True diff --git a/_best_practices/02-auto_hyperparams.md b/_best_practices/02-auto_hyperparams.md index a6cc15b..10c1cbe 100644 --- a/_best_practices/02-auto_hyperparams.md +++ b/_best_practices/02-auto_hyperparams.md @@ -5,7 +5,7 @@ name: Automate Hyper-Parameter Optimisation title: Automate Hyper-Parameter Optimisation category: Training unique_id: exp_hyperparam -index: 18 +index: 19 difficulty: "medium" comments: True description: diff --git a/_best_practices/02-auto_nas.md b/_best_practices/02-auto_nas.md index 57b688c..109affd 100644 --- a/_best_practices/02-auto_nas.md +++ b/_best_practices/02-auto_nas.md @@ -4,7 +4,7 @@ author: Koen van der Blom, Alex Serban, Joost Visser name: Automate Configuration of Algorithms or Model Structure category: Training unique_id: exp_auto_nas -index: 19 +index: 20 difficulty: "advanced" references: # comments: True diff --git a/_best_practices/02-data_version.md b/_best_practices/02-data_version.md index 389ffd6..40d8fea 100644 --- a/_best_practices/02-data_version.md +++ b/_best_practices/02-data_version.md @@ -5,7 +5,7 @@ name: Use 
Versioning for Data, Model, Configurations and Training Scripts title: Use Versioning for Data, Model, Configurations and Training Scripts category: Training unique_id: exp_versioning -index: 22 +index: 23 difficulty: "basic" comments: True references: [VML, MLPROD, MMLP, BPDL, MDLOPS, PMLPP] diff --git a/_best_practices/02-doc_features.md b/_best_practices/02-doc_features.md index b2e01c6..9132844 100644 --- a/_best_practices/02-doc_features.md +++ b/_best_practices/02-doc_features.md @@ -5,7 +5,7 @@ name: Assign an Owner to Each Feature and Document its Rationale title: Assign an Owner to Each Feature and Document its Rationale category: Training unique_id: exp_owner -index: 12 +index: 13 difficulty: "medium" comments: True references: [Rs4ML] diff --git a/_best_practices/02-efficient-models.md b/_best_practices/02-efficient-models.md index c660828..669c74d 100644 --- a/_best_practices/02-efficient-models.md +++ b/_best_practices/02-efficient-models.md @@ -3,7 +3,7 @@ layout: practice author: Alex Serban, Koen van der Blom, Joost Visser name: Use The Most Efficient Models category: Training -index: 24 +index: 25 unique_id: efficient_compression difficulty: na references: [DISTSV, ] # diff --git a/_best_practices/02-experimentation.md b/_best_practices/02-experimentation.md new file mode 100644 index 0000000..f2e4c23 --- /dev/null +++ b/_best_practices/02-experimentation.md @@ -0,0 +1,31 @@ +--- +layout: practice +author: András Schmelczer +name: Allow Experimentation with the Inference Function +title: Allow Experimentation with the Inference Function +category: Training +unique_id: training_experimentation +index: 26 +difficulty: na +references: [GRADIO, STREAMLIT, GREATAI] +comments: True +description: +image: +photocredit: + +intent: Gather feedback on the ML service before its production-ready deployment and allow efficient iteration on it.
+motivation: Involving more colleagues, specifically ones with different (non-technical) perspectives, early in development can help catch issues more quickly. +applicability: Allowing experimentation with the inference function early on should be a part of any mature ML lifecycle. +related: [interpretable, concerns] +dependencies: +survey_question: + +labels: [agility, effectiveness] + +--- + +Beyond proper documentation, models can also be evaluated qualitatively by interacting with them. Streamlining this process, especially by making it accessible to all stakeholders, can help build a shared understanding of the actual results and tighten the feedback loop. + +These early deployments can be created easily with solutions such as Gradio and Streamlit. + +Additionally, the gathered insights should be easily fed back into development. Thus, supporting rapid iterations should be part of implementing this best practice. A key component of good developer experience (DX) is _progressive evaluation_, through which development can become a highly iterative, experimental process. This is well supported by popular data science tools such as Jupyter notebooks. Progressive evaluation can be implemented by making the inference function's code locally runnable and providing auto-reloading.
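A sketch of such an early deployment with Gradio; the `classify` function and its hard-coded scores are placeholders, and the import is guarded so the snippet also runs where Gradio is not installed.

```python
def classify(text):
    # Stand-in for the real inference function; returns label confidences.
    return {"positive": 0.7, "negative": 0.3}

try:
    import gradio as gr

    # One line turns the inference function into an interactive web UI;
    # demo.launch() would serve it, by default at localhost:7860, so that
    # non-technical stakeholders can probe the model themselves.
    demo = gr.Interface(fn=classify, inputs="text", outputs="label")
except ImportError:
    demo = None  # gradio not installed; the function can still be called directly
```

Because the interface wraps the plain inference function, the same code path can be exercised from a notebook, the demo UI, or the eventual production service.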
diff --git a/_best_practices/02-interpretable.md b/_best_practices/02-interpretable.md index e4a5b5e..876b24f 100644 --- a/_best_practices/02-interpretable.md +++ b/_best_practices/02-interpretable.md @@ -3,7 +3,7 @@ layout: practice author: Alex Serban, Koen van der Blom, Joost Visser name: Employ Interpretable Models When Possible category: Training -index: 14 +index: 15 unique_id: interpretable difficulty: "advanced" references: [IMLG,TTAID,AIHLEG] # diff --git a/_best_practices/02-measure_mdl_quality.md b/_best_practices/02-measure_mdl_quality.md index 30b539a..383cb7a 100644 --- a/_best_practices/02-measure_mdl_quality.md +++ b/_best_practices/02-measure_mdl_quality.md @@ -5,7 +5,7 @@ name: Continuously Measure Model Quality and Performance title: Continuously Measure Model Quality and Performance category: Training unique_id: exp_quality -index: 20 +index: 21 difficulty: "basic" references: [Rs4ML, TDBML] comments: True diff --git a/_best_practices/02-parallel_training.md b/_best_practices/02-parallel_training.md index 66d29ed..6f8a9f9 100644 --- a/_best_practices/02-parallel_training.md +++ b/_best_practices/02-parallel_training.md @@ -5,7 +5,7 @@ name: Enable Parallel Training Experiments title: Enable Parallel Training Experiments category: Training unique_id: exp_parallel -index: 16 +index: 17 difficulty: "basic" comments: True references: [CD4ML, MLPROD] diff --git a/_best_practices/02-peer_review_mdl.md b/_best_practices/02-peer_review_mdl.md index d47f85c..d316652 100644 --- a/_best_practices/02-peer_review_mdl.md +++ b/_best_practices/02-peer_review_mdl.md @@ -5,7 +5,7 @@ name: Peer Review Training Scripts title: Peer Review Training Scripts category: Training unique_id: exp_peer -index: 15 +index: 16 difficulty: "medium" comments: True references: [MLTS] diff --git a/_best_practices/02-share_exp_status.md b/_best_practices/02-share_exp_status.md index ba71bbc..e208bbc 100644 --- a/_best_practices/02-share_exp_status.md +++ 
b/_best_practices/02-share_exp_status.md @@ -5,7 +5,7 @@ name: Share Status and Outcomes of Experiments Within the Team title: Share Status and Outcomes of Experiments Within the Team category: Training unique_id: exp_status -index: 23 +index: 24 difficulty: "basic" references: [BPDL, PMLPP] comments: True diff --git a/_best_practices/02-subgroup_bias.md b/_best_practices/02-subgroup_bias.md index 095c83b..1f7fb94 100644 --- a/_best_practices/02-subgroup_bias.md +++ b/_best_practices/02-subgroup_bias.md @@ -3,10 +3,10 @@ layout: practice author: Alex Serban, Koen van der Blom, Joost Visser name: Assess and Manage Subgroup Bias category: Training -index: 21 +index: 22 unique_id: subgroup_bias difficulty: "advanced" -references: [PFG, MCCIM]# +references: [PFG, MCCIM] comments: True description: image: # diff --git a/_best_practices/02-test_feature_extractor.md b/_best_practices/02-test_feature_extractor.md index 351d04a..f9d52f4 100644 --- a/_best_practices/02-test_feature_extractor.md +++ b/_best_practices/02-test_feature_extractor.md @@ -5,7 +5,7 @@ name: Test all Feature Extraction Code title: Test all Feature Extraction Code category: Training unique_id: exp_tstfeature -index: 11 +index: 12 difficulty: "medium" references: [CD4ML] comments: True diff --git a/_best_practices/02-train_metric.md b/_best_practices/02-train_metric.md index 4c7f945..16f6a50 100644 --- a/_best_practices/02-train_metric.md +++ b/_best_practices/02-train_metric.md @@ -5,7 +5,7 @@ name: Capture the Training Objective in a Metric that is Easy to Measure and Und title: Capture the Training Objective in a Metric that is Easy to Measure and Understand category: Training unique_id: exp_trainingmetric -index: 10 +index: 11 difficulty: "basic" references: [Rs4ML, OPML, DSTEAM, MLTEAM] comments: True diff --git a/_best_practices/02-train_objective.md b/_best_practices/02-train_objective.md index 51e20e2..8e070ef 100644 --- a/_best_practices/02-train_objective.md +++ 
b/_best_practices/02-train_objective.md @@ -5,7 +5,7 @@ name: Share a Clearly Defined Training Objective within the Team title: Share a Clearly Defined Training Objective within the Team category: Training unique_id: exp_trainingobjective -index: 9 +index: 10 difficulty: "basic" references: [Rs4ML, MMLP, MLTEAM] comments: True diff --git a/_best_practices/03-cont-int.md b/_best_practices/03-cont-int.md index e646c19..4021e6b 100644 --- a/_best_practices/03-cont-int.md +++ b/_best_practices/03-cont-int.md @@ -5,7 +5,7 @@ name: Use Continuous Integration title: Use Continuous Integration category: Coding unique_id: coding_build -index: 25 +index: 28 difficulty: "advanced" references: [CD4ML] comments: True diff --git a/_best_practices/03-regr_test.md b/_best_practices/03-regr_test.md index 26854dd..c82418e 100644 --- a/_best_practices/03-regr_test.md +++ b/_best_practices/03-regr_test.md @@ -5,7 +5,7 @@ name: Run Automated Regression Tests title: Run Automated Regression Tests category: Coding unique_id: coding_regr -index: 24 +index: 27 difficulty: "medium" references: [MLPROD] comments: True diff --git a/_best_practices/03-security.md b/_best_practices/03-security.md index 92b1cd0..0bf3c6e 100644 --- a/_best_practices/03-security.md +++ b/_best_practices/03-security.md @@ -3,10 +3,10 @@ layout: practice author: Alex Serban, Koen van der Blom, Joost Visser name: Assure Application Security category: Coding -index: 27 +index: 30 unique_id: security difficulty: "advanced" -references: [TTAID]# +references: [TTAID] comments: True description: image: # diff --git a/_best_practices/03-shemas.md b/_best_practices/03-shemas.md new file mode 100644 index 0000000..4eefd47 --- /dev/null +++ b/_best_practices/03-shemas.md @@ -0,0 +1,31 @@ +--- +layout: practice +author: András Schmelczer +name: Implement Standard Schemas for Common Prediction Tasks +title: Implement Standard Schemas for Common Prediction Tasks +category: Coding +unique_id: coding_schemas +index: 31 
+difficulty: na +references: [PYDANTIC, GREATAI] +comments: True +description: +image: +photocredit: + +intent: Increase interoperability within the organisation and facilitate code reuse. +motivation: Developing tools, dashboards, and APIs for ML models can be made more efficient by making them easy to reuse across different models. However, this is only possible if these models share a similar interface. +applicability: Machine learning teams should share a repository of standard schemas within every organisation. +related: [data_reusable, team_communication, deployment_log] +dependencies: +survey_question: + +labels: [agility, quality] + +--- + +Standard prediction response schemas allow interchanging models without friction. They also enable agile teams to reuse their existing software support (such as dashboards and tools) across many different models implementing similar tasks. For example, multiclass classification tasks always result in a prediction, a probability, and, optionally, an explanation. A multi-label classification task's result is often essentially a list of multiclass classification results. + +Beyond promoting code reuse, this can also improve quality. For instance, having a required `explanation` field can nudge colleagues towards considering more explainable approaches. + +Additionally, the refined, mostly stable schemas can prevent the unnecessary overhead of deciding on an appropriate interface for each new ML service. They can also help avoid mistakes arising from slight differences between inference function outputs, such as mistaking `log_probability` for `probability` or `odds`.
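A sketch of what such shared schemas might look like, using standard-library dataclasses for brevity (the referenced Pydantic additionally validates field types at runtime); the field names mirror the multiclass example above and are otherwise illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ClassificationResult:
    # Shared schema for any multiclass classification model.
    prediction: str
    probability: float  # a probability, never a log-probability or odds
    explanation: Optional[str] = None  # nudges teams towards explainability

@dataclass
class MultiLabelResult:
    # A multi-label result is essentially a list of multiclass results.
    results: List[ClassificationResult] = field(default_factory=list)

result = ClassificationResult(prediction="spam", probability=0.93)
```

Any dashboard or tool written against `ClassificationResult` then works unchanged for every model in the organisation that implements the task.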
diff --git a/_best_practices/03-use_static_analysis.md b/_best_practices/03-use_static_analysis.md index 5ecd86d..bb18e06 100644 --- a/_best_practices/03-use_static_analysis.md +++ b/_best_practices/03-use_static_analysis.md @@ -5,7 +5,7 @@ name: Use Static Analysis to Check Code Quality title: Use Static Analysis to Check Code Quality category: Coding unique_id: coding_static -index: 26 +index: 29 difficulty: "medium" references: [BMS] comments: True diff --git a/_best_practices/04-audit_trails.md b/_best_practices/04-audit_trails.md index 1a7064b..34451d6 100644 --- a/_best_practices/04-audit_trails.md +++ b/_best_practices/04-audit_trails.md @@ -3,7 +3,7 @@ layout: practice author: Alex Serban, Koen van der Blom, Joost Visser name: Provide Audit Trails category: Deployment -index: 34 +index: 38 unique_id: audit_trails difficulty: "advanced" references: [CAIAG, AIHLEG, TTAID] # diff --git a/_best_practices/04-auto_model_packaging.md b/_best_practices/04-auto_model_packaging.md index ed36a08..671b9e2 100644 --- a/_best_practices/04-auto_model_packaging.md +++ b/_best_practices/04-auto_model_packaging.md @@ -5,7 +5,7 @@ name: Automate Model Deployment title: Automate Model Deployment category: Deployment unique_id: deployment_automate -index: 28 +index: 32 difficulty: "medium" references: [VML, MLArch, MLLG, OPML] comments: True diff --git a/_best_practices/04-cache.md b/_best_practices/04-cache.md new file mode 100644 index 0000000..70d6fef --- /dev/null +++ b/_best_practices/04-cache.md @@ -0,0 +1,31 @@ +--- +layout: practice +author: András Schmelczer +name: Cache Production Predictions +title: Cache Production Predictions +category: Deployment +unique_id: deployment_cache +index: 39 +difficulty: na +references: [SAASSA, GREEN, CMMD, GREATAI] +comments: True +description: +image: +photocredit: + +intent: Improve performance, allow more flexibility for the clients, and reduce the deployment's carbon footprint. 
+motivation: Avoiding the expensive recomputation of results (especially costly for deep learning models) can lead to lower latency, lower costs, and an overall more socially conscious deployment. +applicability: Caching should be implemented in production-level ML applications where repeated input values may occur. +related: [deployment_log] +dependencies: +survey_question: + +labels: [quality] + +--- + +Sustainability is an increasingly crucial concern of ethical AI, and not wasting computing resources is part of it. To this end, caching the results of expensive operations should be considered in any ML deployment. + +Caching is a well-known technique for improving the latency of repeated responses: it avoids recomputing results that have already been calculated. However, extra care must be taken to avoid exposing private data to third parties; therefore, access control must be thoroughly considered. + +If the ML service's clients can rely on virtually instant responses to repeated queries, this opens up opportunities for new access patterns. This freedom can also result in a more developer-friendly API and a better developer experience. diff --git a/_best_practices/04-composing.md b/_best_practices/04-composing.md new file mode 100644 index 0000000..cdfcd6d --- /dev/null +++ b/_best_practices/04-composing.md @@ -0,0 +1,31 @@ +--- +layout: practice +author: András Schmelczer +name: Allow Robustly Composing Inference Functions +title: Allow Robustly Composing Inference Functions +category: Deployment +unique_id: deployment_composing +index: 40 +difficulty: na +references: [MSCS, VE2E, GREATAI] +comments: True +description: +image: +photocredit: + +intent: Create more focused services and promote reuse by letting inference functions call each other. +motivation: Just as with conventional software, composability and modularity can lead to simpler and more flexible architectures.
However, extra care needs to be taken to avoid non-monotonic error propagation. +applicability: Robust composition of inference functions should be supported in any production-level ML application. +related: [exp_versioning, governance_model_card] +dependencies: +survey_question: + +labels: [quality, agility] + +--- + +Letting inference functions call each other can lead to more flexibility and, in many cases, better performance, because each piece of core functionality has to be implemented only once. More resources can therefore be committed to improving its quality, and the knowledge does not need to be scattered across a multitude of models. + +However, extra attention must be given to the proper documentation of reusable models and their versions. The intertwined nature of many ML services may lead to non-monotonic error propagation, meaning that improvements in one part of the system might decrease the overall system quality. To mitigate this, it should be possible to pin model versions across the entire call tree, and robust testing should be in place. + +Additionally, even though most inference functions are CPU-bound (or GPU-bound), they can end up being IO-bound if they rely on the results of other remote services. This presents an opportunity for a significant throughput improvement, which can be achieved by implementing inference functions asynchronously.
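The asynchronous composition described above can be sketched as follows; both inference functions, their version strings, and the `asyncio.sleep` calls standing in for network latency are invented placeholders.

```python
import asyncio

async def detect_language(text):
    await asyncio.sleep(0.01)  # stands in for a remote model call
    return {"language": "en", "version": "lang-detect:1.2.0"}

async def summarise(text):
    await asyncio.sleep(0.01)
    return {"summary": text[:40], "version": "summariser:0.9.1"}

async def analyse(text):
    # Await both remote inference functions concurrently, and record the
    # exact model versions used so the whole call tree can be pinned.
    lang, summary = await asyncio.gather(detect_language(text), summarise(text))
    return {
        "language": lang["language"],
        "summary": summary["summary"],
        "versions": [lang["version"], summary["version"]],
    }

result = asyncio.run(analyse("An example input document."))
```

Returning the collected version strings alongside the prediction is one way to make the composed call tree auditable and pinnable, as the practice recommends.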
diff --git a/_best_practices/04-dist_skew.md b/_best_practices/04-dist_skew.md index 33a3ef9..997b291 100644 --- a/_best_practices/04-dist_skew.md +++ b/_best_practices/04-dist_skew.md @@ -5,7 +5,7 @@ name: Perform Checks to Detect Skew between Models title: Perform Checks to Detect Skew between Models category: Deployment unique_id: deployment_distskew -index: 31 +index: 35 difficulty: "medium" references: [Rs4ML, CD4ML, TDBML, TFX] comments: True diff --git a/_best_practices/04-log_production.md b/_best_practices/04-log_production.md index 111eff1..7997d1b 100644 --- a/_best_practices/04-log_production.md +++ b/_best_practices/04-log_production.md @@ -5,7 +5,7 @@ name: Log Production Predictions with the Model's Version and Input Data title: Log Production Predictions with the Model's Version and Input Data category: Deployment unique_id: deployment_log -index: 33 +index: 37 difficulty: "medium" references: [MMLP, MLGov, MDLOPS] comments: True diff --git a/_best_practices/04-monitor_models_prod.md b/_best_practices/04-monitor_models_prod.md index b650c60..4bdd87b 100644 --- a/_best_practices/04-monitor_models_prod.md +++ b/_best_practices/04-monitor_models_prod.md @@ -5,7 +5,7 @@ name: Continuously Monitor the Behaviour of Deployed Models title: Continuously Monitor the Behaviour of Deployed Models category: Deployment unique_id: deployment_monitor -index: 30 +index: 34 difficulty: "medium" references: [CD4ML, MLPROD, MLLG, TFX, TDBML] comments: True diff --git a/_best_practices/04-rollback_models_prod.md b/_best_practices/04-rollback_models_prod.md index ab90394..871b124 100644 --- a/_best_practices/04-rollback_models_prod.md +++ b/_best_practices/04-rollback_models_prod.md @@ -5,7 +5,7 @@ name: Enable Automatic Roll Backs for Production Models title: Enable Automatic Roll Backs for Production Models category: Deployment unique_id: deployment_rollback -index: 32 +index: 36 difficulty: "medium" references: [MLLG, CD4ML] comments: True diff --git 
a/_best_practices/04-shadow_models_prod.md b/_best_practices/04-shadow_models_prod.md index 79b7c4d..6dfe642 100644 --- a/_best_practices/04-shadow_models_prod.md +++ b/_best_practices/04-shadow_models_prod.md @@ -5,7 +5,7 @@ name: Enable Shadow Deployment title: Enable Shadow Deployment category: Deployment unique_id: deployment_shadow -index: 29 +index: 33 difficulty: "medium" references: [VML, MLLG, TFX] comments: True diff --git a/_best_practices/05-collaborative_platform.md b/_best_practices/05-collaborative_platform.md index 9710953..76843be 100644 --- a/_best_practices/05-collaborative_platform.md +++ b/_best_practices/05-collaborative_platform.md @@ -5,7 +5,7 @@ name: Use A Collaborative Development Platform title: Use A Collaborative Development Platform category: Team unique_id: team_collab -index: 35 +index: 41 difficulty: "basic" comments: True description: diff --git a/_best_practices/05-communication_collab.md b/_best_practices/05-communication_collab.md index f1db6a8..f34ff34 100644 --- a/_best_practices/05-communication_collab.md +++ b/_best_practices/05-communication_collab.md @@ -5,7 +5,7 @@ name: Communicate, Align, and Collaborate With Others title: Communicate, Align, and Collaborate With Others category: Team unique_id: team_communication -index: 37 +index: 43 difficulty: "basic" comments: True description: @@ -13,7 +13,7 @@ image: # photocredit: # intent: Ensure alignment with other (development) teams, management, and external stakeholders. # -motivation: The system that your team develops is meant to integrate with other systems within the context of a wider organization. this requires communication, alignment, and collaboration with others outside the team. +motivation: The system that your team develops is meant to integrate with other systems within the context of a wider organization. This requires communication, alignment, and collaboration with others outside the team. 
applicability: # related: [team_collab] dependencies: # diff --git a/_best_practices/05-tradeoff.md b/_best_practices/05-tradeoff.md index 69e8a45..fc9c89f 100644 --- a/_best_practices/05-tradeoff.md +++ b/_best_practices/05-tradeoff.md @@ -3,7 +3,7 @@ layout: practice author: Alex Serban, Koen van der Blom, Joost Visser name: Decide Trade-Offs through Defined Team Process category: Team -index: 38 +index: 44 unique_id: tradeoff difficulty: "medium" references: [SEDS, GDMMA, MOIEA] # diff --git a/_best_practices/05-use_backlog.md b/_best_practices/05-use_backlog.md index 31476da..baacfa6 100644 --- a/_best_practices/05-use_backlog.md +++ b/_best_practices/05-use_backlog.md @@ -5,7 +5,7 @@ name: Work Against a Shared Backlog title: Work Against a Shared Backlog category: Team unique_id: team_backlog -index: 36 +index: 42 difficulty: "medium" comments: True description: diff --git a/_best_practices/06-alert.md b/_best_practices/06-alert.md index d801ebe..81690f3 100644 --- a/_best_practices/06-alert.md +++ b/_best_practices/06-alert.md @@ -3,7 +3,7 @@ layout: practice author: Alex Serban, Koen van der Blom, Joost Visser name: Inform Users on Machine Learning Usage category: Governance -index: 42 +index: 48 unique_id: alert difficulty: "advanced" references: [AIHLEG,MCARD] # @@ -15,7 +15,7 @@ photocredit: # intent: Make users aware that machine learning is used by the application, what it is used for, and what its limitations are. This allows users to understand better how to use, or not use the application. # motivation: Machine learning systems should not represent themselves as humans to users. Humans have the right to know that they are interacting with a machine learning system. # applicability: User communication should be applied to any machine learning application. 
# -related: [concerns] +related: [concerns, governance_model_card] dependencies: # survey_question: Q100 # survey_item: Our application informs users that it makes use of machine learning and describes its intended use and limitations. diff --git a/_best_practices/06-audits.md b/_best_practices/06-audits.md index 4edaa91..50a6afd 100644 --- a/_best_practices/06-audits.md +++ b/_best_practices/06-audits.md @@ -3,7 +3,7 @@ layout: practice author: name: Have Your Application Audited category: Governance -index: 45 +index: 51 unique_id: audits difficulty: "advanced" references: [TTAID, AIHLEG] # diff --git a/_best_practices/06-code_conduct.md b/_best_practices/06-code_conduct.md index eb0bd42..aa59a43 100644 --- a/_best_practices/06-code_conduct.md +++ b/_best_practices/06-code_conduct.md @@ -3,7 +3,7 @@ layout: practice author: Alex Serban, Koen van der Blom, Joost Visser name: Establish Responsible AI Values category: Governance -index: 39 +index: 45 difficulty: "advanced" references: [TTAID, AIHLEG] # comments: True diff --git a/_best_practices/06-concerns.md b/_best_practices/06-concerns.md index 60b4c8b..ea7cb61 100644 --- a/_best_practices/06-concerns.md +++ b/_best_practices/06-concerns.md @@ -3,7 +3,7 @@ layout: practice author: Alex Serban, Koen van der Blom, Joost Visser name: Provide Safe Channels to Raise Concerns category: Governance -index: 44 +index: 50 unique_id: concerns difficulty: "advanced" references: [TTAID,AIHLEG] # diff --git a/_best_practices/06-explainable.md b/_best_practices/06-explainable.md index e2a4745..1adff5b 100644 --- a/_best_practices/06-explainable.md +++ b/_best_practices/06-explainable.md @@ -3,7 +3,7 @@ layout: practice author: name: Explain Results and Decisions to Users category: Governance -index: 43 +index: 49 unique_id: explainable difficulty: "advanced" references: [AIHLEG] # diff --git a/_best_practices/06-model_card.md b/_best_practices/06-model_card.md new file mode 100644 index 0000000..2a67d9e --- /dev/null +++ 
b/_best_practices/06-model_card.md @@ -0,0 +1,31 @@ +--- +layout: practice +author: András Schmelczer +name: Keep the Model and its Documentation Together +title: Keep the Model and its Documentation Together +category: Governance +unique_id: governance_model_card +index: 52 +difficulty: na +references: [MCARD, MCMR2, MCT, GREATAI] +comments: True +description: +image: +photocredit: + +intent: Prevent misuse by ensuring that models are applied with a nuanced understanding of their capabilities and limits. +motivation: Providing prospective users and/or the public with a clear description of the models' strengths, biases, and shortcomings must be an integral part of responsible open-sourcing. This way, both misguided applications and public distrust can be averted. +applicability: Clear documentation must be distributed together with the models in all cases where models are made accessible to third parties. +related: [data_lbl, team_communication, alert] +dependencies: +survey_question: + +labels: [transparency] + +--- + +ML models are not perfect, and their imperfections can be hard to notice and interpret. Therefore, their creators are expected to ensure that the model's users have a precise understanding of its boundaries. This includes measured performance, how it was measured (for instance, the evaluation metrics and data used), applicability, and known edge cases. + +Documentation can come in many shapes, but Model Cards, pioneered by Google, are a safe choice for starting out. They allow a large degree of freedom while still providing a clear structure. + +Model cards and, more generally, the documentation of machine learning models should be aimed at both technical and non-technical audiences: not only the engineers looking to apply the model, but also the end-users whose data it might be applied to, and the legal or other governance professionals who oversee the safety, legality, and ethics of its usage.
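A minimal model card skeleton might look like the sketch below; every name, version number, metric, and dataset mentioned is an invented placeholder, and real cards typically add sections on training data, ethical considerations, and caveats.

```markdown
# Model Card: Toxicity Classifier (v2.1.0)

## Intended use
Flagging potentially toxic comments for human review;
not intended for fully automated moderation decisions.

## Metrics and evaluation data
F1 = 0.87 on a held-out split of a hypothetical internal
comments dataset, evaluated separately per language subgroup.

## Known limitations
- Performance degrades on code-switched text.
- Not evaluated on slang-heavy or very short comments.
```

Shipping a file like this alongside the model weights keeps the documentation versioned together with the artefact it describes.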
diff --git a/_best_practices/06-responsible_ml_ai.md b/_best_practices/06-responsible_ml_ai.md index 0233176..6f2b6d2 100644 --- a/_best_practices/06-responsible_ml_ai.md +++ b/_best_practices/06-responsible_ml_ai.md @@ -5,7 +5,7 @@ name: Enforce Fairness and Privacy title: Enforce Fairness and Privacy category: Governance unique_id: gov_responsible -index: 41 +index: 47 difficulty: "medium" references: [MLFAIR, MLRES] comments: True diff --git a/_best_practices/06-risk.md b/_best_practices/06-risk.md index ae37896..148e4a2 100644 --- a/_best_practices/06-risk.md +++ b/_best_practices/06-risk.md @@ -3,7 +3,7 @@ layout: practice author: Joost Visser, Alex Serban, Koen van der Blom name: Perform Risk Assessments category: Governance -index: 40 +index: 46 unique_id: risk difficulty: "advanced" references: [TTAID,AIHLEG,CAIAG] # diff --git a/_config.yml b/_config.yml index 33277bb..f550f93 100644 --- a/_config.yml +++ b/_config.yml @@ -2,12 +2,13 @@ # Site settings # ----------------------------------------------------------------------------- name: SE-ML +title: SE-ML email: j.m.w.visser@liacs.leidenuniv.nl description: > # this means to ignore newlines until "url:" Webpage for the Software Engineering for Machine Learning footer_text: # -url: # the base hostname & protocol for your site +url: "https://se-ml.github.io" # the base hostname & protocol for your site baseurl: # the subpath of your site, e.g. /blog/ last_updated: # leave blank if you don't want to display last updated diff --git a/_layouts/post.html b/_layouts/post.html index c203111..01a50b6 100644 --- a/_layouts/post.html +++ b/_layouts/post.html @@ -27,11 +27,7 @@

{{ page.title }}