Conversation

@ZagrosLLMModel (Author)

What does this PR do?
This PR introduces two new model architectures, Zagros and ZagrosNext, to the Transformers library. The implementation includes:

- Model architecture files (modeling_zagros.py and modeling_zagros_next.py) in src/transformers/models/zagros/ and src/transformers/models/zagros_next/.
- Configuration files (configuration_zagros.py and configuration_zagros_next.py) for both models.
- Comprehensive tests for both models in tests/models/zagros/ and tests/models/zagros_next/.
- Updated documentation in docs/source/en/ with usage examples and details for Zagros and ZagrosNext.

Both models follow the standard structure and conventions of existing Transformers models, ensuring compatibility with the library's pipelines and utilities.
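As a quick illustration of the intended API surface, here is a minimal smoke-test sketch. The class names ZagrosConfig and ZagrosForCausalLM and the configuration arguments are assumptions based on the library's usual naming conventions, not confirmed details of this PR:

```python
# Minimal smoke-test sketch. ZagrosConfig / ZagrosForCausalLM and the config
# arguments below are assumed from standard Transformers naming conventions.
import torch
from transformers import ZagrosConfig, ZagrosForCausalLM  # hypothetical imports

# A deliberately tiny configuration so the model builds quickly on CPU.
config = ZagrosConfig(
    vocab_size=1000,
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
)
model = ZagrosForCausalLM(config)

input_ids = torch.randint(0, config.vocab_size, (1, 16))
with torch.no_grad():
    outputs = model(input_ids)
print(outputs.logits.shape)  # expected: torch.Size([1, 16, 1000])
```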
Motivation and Context

- The Zagros and ZagrosNext models are designed to [briefly describe the purpose, e.g., "enhance performance on specific NLP tasks with novel architectural improvements"].
- These models leverage standard Transformer conventions, making them easy to integrate into existing workflows.
- The implementation is fully compatible with the Transformers library and supports all standard functionalities (e.g., training, inference, and pipeline integration); a short pipeline sketch follows this list.
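Once the model type is registered with the Auto classes, pipeline usage should look roughly like the sketch below; the model path is a placeholder, not a released checkpoint:

```python
# Hedged sketch of pipeline integration; the model path below is a placeholder.
from transformers import pipeline

generator = pipeline("text-generation", model="path/to/a/zagros/checkpoint")
print(generator("The Zagros mountains are", max_new_tokens=20)[0]["generated_text"])
```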

Dependencies

No additional dependencies are required beyond the standard Transformers setup.
Tested with Python 3.9+ and PyTorch 2.0+.

Before submitting

- This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- Did you read the contributor guideline?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- Did you write any new necessary tests?

Who can review?
@ArthurZucker (for text models), @Rocketknight1 (for pipelines and library compatibility), and @stevhliu (for documentation)
Thank you for reviewing!


[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, zagros, zagros_next

@Rocketknight1 (Member)

hi @ZagrosLLMModel, we generally don't accept PRs for new architectures without significant pre-trained checkpoints! Are there any pre-trained Zagros models available?

@ZagrosLLMModel (Author)

> hi @ZagrosLLMModel, we generally don't accept PRs for new architectures without significant pre-trained checkpoints! Are there any pre-trained Zagros models available?

Hi, yes, we have a pre-trained checkpoint available:
https://huggingface.co/darsadilab/zagros-1.0-quick
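For reference, here is a hedged loading sketch. It assumes the repo hosts a config, tokenizer, and weights compatible with the classes in this PR; before the PR is merged, a source install of this branch (or trust_remote_code=True) would likely be needed:

```python
# Hedged sketch: assumes darsadilab/zagros-1.0-quick ships a compatible config,
# tokenizer, and weights. Before this PR is merged, loading would likely require
# installing transformers from this branch (or passing trust_remote_code=True).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "darsadilab/zagros-1.0-quick"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Hello from the Zagros model!", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```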
