Proposal: Species as a Protected Attribute Category
Summary
Add species as a protected attribute category in AIF360, with metrics for evaluating speciesist bias in NLP models. This extends AIF360's fairness framework to cover a documented, measurable form of bias that has been established across 5+ peer-reviewed publications.
Academic Foundation
Multiple peer-reviewed papers have documented speciesist bias as a measurable phenomenon in AI systems:
- Hagendorff, Bossert, Tse & Singer (2023). "Speciesist bias in AI: how AI applications perpetuate the exploitation of animals." AI and Ethics. DOI: 10.1007/s43681-023-00380-w
  - Found that GPT-3 associates farmed animals with violence
  - The Delphi moral AI judged killing a factory-farmed pig as "okay" but killing a dog as "wrong"
  - Explicitly calls for fairness frameworks to "widen their scope and include mitigation measures for speciesist biases"
- Takeshita, Rzepka & Araki (2022). "Speciesist language and nonhuman animal bias in English masked language models." Information Processing & Management.
  - Demonstrated that BERT, DistilBERT, RoBERTa, and ALBERT all associate harmful words with nonhuman animals
- Hagendorff et al. (2025). "SpeciesismBench." arXiv:2508.11534.
  - A 1,003-item benchmark showing that LLMs "frequently normalized harm toward farmed animals while refusing to do so for non-farmed animals"
- AI-for-Animals (2025). "AHA Benchmark." arXiv:2503.04804.
  - A 4,350-item dataset finding species-dependent risks of harm in LLM outputs
- Takeshita et al. (2024). AI and Ethics. Found speciesist bias among NLP researchers themselves and in annotated NLP data.
What This Would Look Like
- A new `SpeciesistBiasMetric` class that evaluates differential treatment of animal species in model outputs
- Species as a recognized protected attribute alongside race, gender, age, etc.
- Integration with existing AIF360 metrics infrastructure
- Test datasets based on SpeciesismBench and AHA Benchmark
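To make the proposed metric class concrete, here is a minimal, self-contained sketch of the kind of disparate-impact-style comparison it could compute. This is plain Python, not the actual AIF360 API; the class name `SpeciesistBiasMetric` comes from the proposal above, but the `scores` interface, method names, and example numbers are all hypothetical placeholders for benchmark-derived rates.

```python
from dataclasses import dataclass


@dataclass
class SpeciesistBiasMetric:
    """Disparate-impact-style ratio between two species groups.

    `scores` maps each species to the fraction of model outputs that
    normalized harm toward that species (0.0-1.0), as would be measured
    on a benchmark such as SpeciesismBench or the AHA Benchmark.
    """

    scores: dict  # species -> harm-normalization rate

    def group_rate(self, species_group):
        # Mean harm-normalization rate across a group of species.
        vals = [self.scores[s] for s in species_group]
        return sum(vals) / len(vals)

    def disparate_impact(self, privileged, unprivileged):
        # Ratio of group rates; 1.0 means parity, and values far above
        # 1.0 indicate bias against the unprivileged group.
        return self.group_rate(unprivileged) / self.group_rate(privileged)


# Illustrative (made-up) numbers echoing the companion-vs-farmed-animal
# gap the cited papers report:
metric = SpeciesistBiasMetric(scores={
    "dog": 0.05, "cat": 0.04,        # companion animals
    "pig": 0.60, "chicken": 0.70,    # farmed animals
})
ratio = metric.disparate_impact(privileged=["dog", "cat"],
                                unprivileged=["pig", "chicken"])
```

In AIF360 terms, the species groups would play the role that privileged/unprivileged groups play for attributes like race or gender, so the metric could plug into the same reporting conventions (e.g. a disparate-impact ratio near 1.0 indicating parity).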
Why AIF360
Hagendorff et al. (2023) specifically identified fairness toolkits as the right venue for speciesist bias metrics. AIF360 is the most widely used open-source AI fairness toolkit, and adopting species as a protected attribute here would set the standard for other frameworks.
Willingness to Contribute
We (Open Paws) are willing to implement this module. We've already built speciesist bias evaluation tools and have experience with the relevant benchmarks.