

hmBench: A Benchmark for Historical Language Models on NER Datasets

hmBench

This repository presents a benchmark for Historical Language Models, with a main focus on NER datasets such as HIPE-2022.

Models

The following Historical Language Models are currently used in benchmarks:

| Model   | Hugging Face Model Hub Org                                            |
| ------- | --------------------------------------------------------------------- |
| hmBERT  | Historical Multilingual Language Models for Named Entity Recognition  |
| hmTEAMS | Historical Multilingual TEAMS Models                                  |
| hmByT5  | Historical Multilingual and Monolingual ByT5 Models                   |

Datasets

We benchmark pretrained language models on various datasets from HIPE-2020, HIPE-2022 and Europeana. The following table gives an overview of the datasets used:

| Language | Datasets                                                 |
| -------- | -------------------------------------------------------- |
| English  | AjMC - TopRes19th                                         |
| German   | AjMC - NewsEye - HIPE-2020                                |
| French   | AjMC - ICDAR-Europeana - LeTemps - NewsEye - HIPE-2020    |
| Finnish  | NewsEye                                                   |
| Swedish  | NewsEye                                                   |
| Dutch    | ICDAR-Europeana                                           |

Results

The hmLeaderboard space on the Hugging Face Model Hub shows all results and can be accessed here.

Best Models

A collection of the best-performing models can be found here (grouped by backbone LM).

Fine-Tuning

We use Flair for fine-tuning NER models on datasets from the HIPE-2022 Shared Task. Additionally, the ICDAR-Europeana dataset is used for benchmarks on Dutch and French.

We use a tagged version of Flair to ensure reproducibility. The following command installs all necessary dependencies:

$ pip3 install -r requirements.txt

In order to use the hmTEAMS models you need to authenticate with your Hugging Face Model Hub account. This can be done via the CLI:

# Use access token from https://huggingface.co/settings/tokens
$ huggingface-cli login

We use a config-driven hyper-parameter search. The script flair-fine-tuner.py can be used to fine-tune NER models from our Model Zoo.
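As a rough illustration of how a config-driven hyper-parameter search works, the sketch below expands list-valued hyper-parameters from a JSON config into one run per combination. The config keys and values here are hypothetical, not the actual schema used by flair-fine-tuner.py:

```python
import itertools
import json

# Hypothetical config in the style of ./configs/<dataset>/<language>/<model>.json;
# list-valued keys are treated as search dimensions (assumed format).
config = json.loads("""
{
  "hf_model": "dbmdz/bert-tiny-historic-multilingual-cased",
  "batch_sizes": [4, 8],
  "learning_rates": [3e-5, 5e-5],
  "epochs": [10]
}
""")

def expand_grid(cfg):
    """Expand list-valued hyper-parameters into one dict per combination."""
    keys = ["batch_sizes", "learning_rates", "epochs"]
    for bs, lr, ep in itertools.product(*(cfg[k] for k in keys)):
        yield {"hf_model": cfg["hf_model"], "batch_size": bs,
               "learning_rate": lr, "epochs": ep}

runs = list(expand_grid(config))
print(len(runs))  # 2 batch sizes x 2 learning rates x 1 epoch count = 4 runs
```

Each resulting dict would correspond to one fine-tuning run; the actual search space is defined per model and dataset in the config files.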

Additionally, we provide a script that uses Hugging Face AutoTrain Advanced (Space Runner) to fine-tune models. The following snippet shows an example:

$ pip3 install git+https://github.com/huggingface/autotrain-advanced.git
$ export HF_TOKEN="" # Get token from: https://huggingface.co/settings/tokens
$ autotrain spacerunner --project-name "flair-hmbench-hmbyt5-ajmc-de" \
  --script-path $(pwd) \
  --username stefan-it \
  --token $HF_TOKEN \
  --backend spaces-t4s \
  --env "CONFIG=configs/ajmc/de/hmbyt5.json;HF_TOKEN=$HF_TOKEN;HUB_ORG_NAME=stefan-it"
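The `--env` argument above packs several variables into one semicolon-separated string. A minimal sketch of how such a string can be parsed back into key/value pairs (this helper is illustrative, not part of AutoTrain itself):

```python
def parse_env(env_string):
    """Split an AutoTrain-style --env string into a dict of variables."""
    return dict(pair.split("=", 1) for pair in env_string.split(";") if pair)

env = parse_env(
    "CONFIG=configs/ajmc/de/hmbyt5.json;HF_TOKEN=hf_xxx;HUB_ORG_NAME=stefan-it"
)
print(env["CONFIG"])  # configs/ajmc/de/hmbyt5.json
```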

The concrete implementation can be found in script.py.

Notice: the AutoTrain implementation is currently under development!

All configurations for fine-tuning are located in the ./configs folder with the following naming convention: ./configs/<dataset-name>/<language>/<model-name>.json.
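The naming convention above can be sketched as a small path builder; the helper name is hypothetical and just mirrors the `./configs/<dataset-name>/<language>/<model-name>.json` scheme:

```python
from pathlib import Path

def config_path(dataset, language, model, root="configs"):
    """Build a config file path following the repo's naming convention."""
    return (Path(root) / dataset / language / f"{model}.json").as_posix()

print(config_path("ajmc", "de", "hmbyt5"))  # configs/ajmc/de/hmbyt5.json
```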

Changelog

  • 17.10.2023: Over 1,200 models from the hyper-parameter search are now available on the Model Hub.
  • 05.10.2023: Initial version of this repository.

Acknowledgements

We thank Luisa März, Katharina Schmid and Erion Çano for their fruitful discussions about Historical Language Models.

Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC). Many thanks for providing access to the TPUs ❤️