spacy-transformers

spacy-transformers: Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy

This package provides spaCy components and architectures to use transformer models via Hugging Face's transformers in spaCy. The result is convenient access to state-of-the-art transformer architectures, such as BERT, GPT-2, XLNet, etc.

This release requires spaCy v3. For the previous version of this library, see the v0.6.x branch.

Features

  • Use pretrained transformer models like BERT, RoBERTa and XLNet to power your spaCy pipeline.
  • Easy multi-task learning: backprop to one transformer model from several pipeline components.
  • Train using spaCy v3's powerful and extensible config system.
  • Automatic alignment of transformer output to spaCy's tokenization.
  • Easily customize what transformer data is saved in the Doc object.
  • Easily customize how long documents are processed.
  • Out-of-the-box serialization and model packaging.
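
As a rough illustration of how the transformer component and its long-document handling are configured, the sketch below builds the settings for a `transformer` pipeline component as a plain Python dict. The architecture and span-getter names are the registered names as we understand them from spaCy's documentation. The model name `bert-base-uncased`, the window and stride values, and the helpers `make_transformer_config` and `build_pipeline` are assumptions chosen for illustration, not something this README prescribes.

```python
def make_transformer_config(name="bert-base-uncased", window=128, stride=96):
    """Return a config dict for nlp.add_pipe("transformer", config=...).

    Hypothetical helper. `window` is the number of tokens per span and
    `stride` is the step between span starts, so long documents are
    processed as overlapping slices.
    """
    if stride > window:
        # A stride larger than the window would leave tokens uncovered.
        raise ValueError("stride must not exceed window")
    return {
        "model": {
            "@architectures": "spacy-transformers.TransformerModel.v3",
            "name": name,
            "tokenizer_config": {"use_fast": True},
            "get_spans": {
                "@span_getters": "spacy-transformers.strided_spans.v1",
                "window": window,
                "stride": stride,
            },
        }
    }


def build_pipeline(name="bert-base-uncased"):
    """Create a blank English pipeline with a transformer component.

    The import is deferred so this module loads without spaCy installed.
    Calling this function downloads pretrained weights from the
    Hugging Face Hub, so it needs network access the first time.
    """
    import spacy

    nlp = spacy.blank("en")
    nlp.add_pipe("transformer", config=make_transformer_config(name))
    nlp.initialize()
    return nlp
```

With spaCy and spacy-transformers installed, `nlp = build_pipeline()` followed by `doc = nlp("Some text")` should leave the transformer output on `doc._.trf_data`. In a real project the same settings would more typically live in the training config file rather than in Python code.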

🚀 Installation

Installing the package from pip will automatically install all dependencies, including PyTorch and spaCy. Make sure you install this package before you install the models. Also note that this package requires Python 3.6+, PyTorch v1.5+ and spaCy v3.0+.

pip install 'spacy[transformers]'

For GPU installation, find your CUDA version using nvcc --version and add the matching extra in brackets, e.g. spacy[transformers,cuda92] for CUDA 9.2 or spacy[transformers,cuda100] for CUDA 10.0.
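
The naming pattern for the CUDA extra can be sketched in a few lines of Python. The helper `spacy_extra_for_cuda` below is hypothetical: it assumes the output of nvcc --version contains a field of the form "release X.Y" and simply joins the digits, which reproduces the cuda92 and cuda100 examples. spaCy only publishes extras for certain CUDA versions, so treat the result as a candidate to check against spaCy's installation docs rather than a guaranteed match.

```python
import re


def spacy_extra_for_cuda(nvcc_output):
    """Build a pip requirement such as spacy[transformers,cuda92].

    Hypothetical helper: finds the "release X.Y" field in the text
    printed by `nvcc --version` and joins the major and minor digits.
    """
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release number found in the nvcc output")
    major, minor = match.groups()
    return f"spacy[transformers,cuda{major}{minor}]"


# Sample line in the style of nvcc's version banner (illustrative input).
sample = "Cuda compilation tools, release 10.0, V10.0.130"
print(spacy_extra_for_cuda(sample))  # spacy[transformers,cuda100]
```

The printed string can then be passed to pip install in quotes, in the same way as the CPU command above.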

If you are having trouble installing PyTorch, follow the instructions on the official website for your specific operating system and requirements.

📖 Documentation

⚠️ Important note: This package has been extensively refactored to take advantage of spaCy v3.0. Previous versions that were built for spaCy v2.x worked considerably differently. Please see previous tagged versions of this README for documentation on prior versions.

Applying pretrained text and token classification models

Note that the transformer component from spacy-transformers does not support task-specific heads like token or text classification. A task-specific transformer model can be used as a source of features to train spaCy components like ner or textcat, but the transformer component does not provide access to task-specific heads for training or inference.

Alternatively, if you only want to use the predictions from an existing Hugging Face text or token classification model, you can use the wrappers from spacy-huggingface-pipelines to incorporate task-specific transformer models into your spaCy pipelines.
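
A minimal sketch of that alternative route follows, assuming the spacy-huggingface-pipelines package is installed and registers a `hf_token_pipe` factory that takes a `model` setting, as its own README describes. The model name `dslim/bert-base-NER` is only an example of a publicly hosted token classification model, and the helper `build_hf_ner_pipeline` is hypothetical.

```python
# Example Hugging Face token classification model (an assumption for
# illustration; any compatible model name could be used instead).
HF_NER_MODEL = "dslim/bert-base-NER"


def build_hf_ner_pipeline(model=HF_NER_MODEL):
    """Return a blank English pipeline that wraps a Hugging Face model.

    The import is deferred so this module loads without spaCy installed.
    Calling this function requires spacy-huggingface-pipelines and
    downloads the model from the Hugging Face Hub on first use.
    """
    import spacy

    nlp = spacy.blank("en")
    # The wrapper runs the model's own classification head and writes
    # its predictions onto the Doc (entities, by default).
    nlp.add_pipe("hf_token_pipe", config={"model": model})
    return nlp
```

After `nlp = build_hf_ner_pipeline()`, calling `doc = nlp("My name is Sarah and I live in London")` should populate `doc.ents` with the model's predictions. Unlike the transformer component, nothing is trained here: the predictions come straight from the pretrained head.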

Bug reports and other issues

Please use spaCy's issue tracker to report a bug, or open a new thread on the discussion board for any other issue.