GLiNER: Generalist and Lightweight Model for Named Entity Recognition

GLiNER is a Named Entity Recognition (NER) model that can identify any entity type using a bidirectional transformer encoder (BERT-like). It provides a practical alternative to traditional NER models, which are limited to predefined entity types, and to Large Language Models (LLMs), which, despite their flexibility, are too costly and too large for resource-constrained scenarios.

Demo: 🤗 Hugging Face

🌟 Available Models on Hugging Face

🇬🇧 For English

  • GLiNER Base: urchade/gliner_base (CC BY NC 4.0)

  • GLiNER Small: urchade/gliner_small (CC BY NC 4.0)

  • GLiNER Small v2: urchade/gliner_small-v2 (Apache 2.0)

  • GLiNER Small v2.1: urchade/gliner_small-v2.1 (Apache 2.0)

  • GLiNER Medium: urchade/gliner_medium (CC BY NC 4.0)

  • GLiNER Medium v2: urchade/gliner_medium-v2 (Apache 2.0)

  • GLiNER Medium v2.1: urchade/gliner_medium-v2.1 (Apache 2.0)

  • GLiNER Large: urchade/gliner_large (CC BY NC 4.0)

  • GLiNER Large v2: urchade/gliner_large-v2 (Apache 2.0)
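The English checkpoints above ship under two different licenses, and CC BY NC 4.0 restricts commercial use. As a small sketch (the dictionary and the helper name `checkpoints_with_license` are illustrative, not part of the `gliner` package), the list can be filtered by license before choosing a model:

```python
# Checkpoint names and licenses copied from the English list above.
ENGLISH_CHECKPOINTS = {
    "urchade/gliner_base": "CC BY NC 4.0",
    "urchade/gliner_small": "CC BY NC 4.0",
    "urchade/gliner_small-v2": "Apache 2.0",
    "urchade/gliner_small-v2.1": "Apache 2.0",
    "urchade/gliner_medium": "CC BY NC 4.0",
    "urchade/gliner_medium-v2": "Apache 2.0",
    "urchade/gliner_medium-v2.1": "Apache 2.0",
    "urchade/gliner_large": "CC BY NC 4.0",
    "urchade/gliner_large-v2": "Apache 2.0",
}


def checkpoints_with_license(checkpoints, license_name):
    """Hypothetical helper: return the checkpoint names released under `license_name`."""
    return sorted(name for name, lic in checkpoints.items() if lic == license_name)


apache = checkpoints_with_license(ENGLISH_CHECKPOINTS, "Apache 2.0")
for name in apache:
    print(name)
```

Any name printed here can be passed to `GLiNER.from_pretrained`, as in the basic use case further down.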

🌍 For Other Languages

  • Korean: 🇰🇷 taeminlee/gliner_ko

  • Italian: 🇮🇹 DeepMount00/universal_ner_ita

  • Multilingual: 🌐 urchade/gliner_multi (CC BY NC 4.0) and urchade/gliner_multi-v2.1 (Apache 2.0)

🔬 Domain Specific Models

  • Biomedical: 🧬 urchade/gliner_large_bio-v0.1 (Apache 2.0)

!pip install gliner

Basic Use Case

from gliner import GLiNER

# Initialize GLiNER with the medium v2.1 model
model = GLiNER.from_pretrained("urchade/gliner_medium-v2.1")
# Sample text for entity prediction
text = """
Cristiano Ronaldo dos Santos Aveiro (Portuguese pronunciation: [kɾiʃˈtjɐnu ʁɔˈnaldu]; born 5 February 1985) is a Portuguese professional footballer who plays as a forward for and captains both Saudi Pro League club Al Nassr and the Portugal national team. Widely regarded as one of the greatest players of all time, Ronaldo has won five Ballon d'Or awards,[note 3] a record three UEFA Men's Player of the Year Awards, and four European Golden Shoes, the most by a European player. He has won 33 trophies in his career, including seven league titles, five UEFA Champions Leagues, the UEFA European Championship and the UEFA Nations League. Ronaldo holds the records for most appearances (183), goals (140) and assists (42) in the Champions League, goals in the European Championship (14), international goals (128) and international appearances (205). He is one of the few players to have made over 1,200 professional career appearances, the most by an outfield player, and has scored over 850 official senior career goals for club and country, making him the top goalscorer of all time.
"""
# Labels for entity prediction
labels = ["Person", "Award", "Date", "Competitions", "Teams"] # capitalize labels for better performance
# Perform entity prediction
entities = model.predict_entities(text, labels, threshold=0.5)
# Display predicted entities and their labels
for entity in entities:
    print(entity["text"], "=>", entity["label"])
Cristiano Ronaldo dos Santos Aveiro => Person
5 February 1985 => Date
Portugal national team => Teams
Ballon d'Or => Award
UEFA Men's Player of the Year Awards => Award
European Golden Shoes => Award
UEFA Champions Leagues => Competitions
UEFA European Championship => Competitions
UEFA Nations League => Competitions
European Championship => Competitions
import pandas as pd

df = pd.DataFrame(entities)

df
    start  end  text                                  label         score
0   1      36   Cristiano Ronaldo dos Santos Aveiro   Person        0.864556
1   92     107  5 February 1985                       Date          0.985105
2   233    255  Portugal national team                Teams         0.540601
3   338    349  Ballon d'Or                           Award         0.604587
4   381    417  UEFA Men's Player of the Year Awards  Award         0.817369
5   428    449  European Golden Shoes                 Award         0.809395
6   556    578  UEFA Champions Leagues                Competitions  0.836124
7   584    610  UEFA European Championship            Competitions  0.869951
8   619    638  UEFA Nations League                   Competitions  0.924063
9   761    782  European Championship                 Competitions  0.731239
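The `start` and `end` columns appear to be character offsets into the input string, with `end` exclusive: the first row spans 1 to 36, which is exactly the 35 characters of the name after the leading newline. Assuming that reading holds, the offsets can be used to mark up the original text. The sketch below uses a truncated copy of the sample text and two rows from the table; `annotate` is a hypothetical helper and assumes the spans do not overlap.

```python
# Truncated copy of the sample text (it starts with a newline, as in the triple-quoted original).
text = "\nCristiano Ronaldo dos Santos Aveiro (Portuguese pronunciation: [kɾiʃˈtjɐnu ʁɔˈnaldu]; born 5 February 1985) is a Portuguese professional footballer"

# Two rows copied from the results table.
entities = [
    {"start": 1, "end": 36, "text": "Cristiano Ronaldo dos Santos Aveiro", "label": "Person"},
    {"start": 92, "end": 107, "text": "5 February 1985", "label": "Date"},
]


def annotate(text, entities):
    """Wrap each entity span as [span|Label]. Assumes non-overlapping spans."""
    out = text
    # Work from the last span to the first so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        span = out[ent["start"]:ent["end"]]
        out = out[:ent["start"]] + f"[{span}|{ent['label']}]" + out[ent["end"]:]
    return out


# Slicing with the offsets recovers the entity text.
for ent in entities:
    print(repr(text[ent["start"]:ent["end"]]))

print(annotate(text, entities))
```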
import numpy as np
from random import randint

df = pd.DataFrame(entities)

# Assign each label a random, translucent background color (alpha between 0.1 and 0.4)
unique_labels = df['label'].unique()
colors = {label: f'background-color: rgba({randint(0,255)},{randint(0,255)},{randint(0,255)},{np.round(np.random.uniform(0.1,0.4), 2)})'
          for label in unique_labels}

def color_rows(row):
    # Return one CSS string per cell so the whole row takes its label's color
    color = colors.get(row['label'], '')
    return [color for _ in row]

df.style.apply(color_rows, axis=1)
    start  end  text                                  label         score
0   1      36   Cristiano Ronaldo dos Santos Aveiro   Person        0.864556
1   92     107  5 February 1985                       Date          0.985105
2   233    255  Portugal national team                Teams         0.540601
3   338    349  Ballon d'Or                           Award         0.604587
4   381    417  UEFA Men's Player of the Year Awards  Award         0.817369
5   428    449  European Golden Shoes                 Award         0.809395
6   556    578  UEFA Champions Leagues                Competitions  0.836124
7   584    610  UEFA European Championship            Competitions  0.869951
8   619    638  UEFA Nations League                   Competitions  0.924063
9   761    782  European Championship                 Competitions  0.731239

What happens if we reduce the threshold?
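Lowering `threshold` in `predict_entities` should let lower-scoring spans through, so more (and likely noisier) entities are expected; the exact result depends on the model and is not reproduced here. The opposite direction can be sketched offline, assuming `threshold` acts as a plain cutoff on the `score` column: raising it removes the least confident rows from the table above. The helper `kept_at` is illustrative only.

```python
# (text, score) pairs copied from the results table above (obtained with threshold=0.5).
scored = [
    ("Cristiano Ronaldo dos Santos Aveiro", 0.864556),
    ("5 February 1985", 0.985105),
    ("Portugal national team", 0.540601),
    ("Ballon d'Or", 0.604587),
    ("UEFA Men's Player of the Year Awards", 0.817369),
    ("European Golden Shoes", 0.809395),
    ("UEFA Champions Leagues", 0.836124),
    ("UEFA European Championship", 0.869951),
    ("UEFA Nations League", 0.924063),
    ("European Championship", 0.731239),
]


def kept_at(scored, threshold):
    """Return the texts whose score is at or above `threshold` (assumed cutoff rule)."""
    return [text for text, score in scored if score >= threshold]


for threshold in (0.5, 0.7, 0.9):
    kept = kept_at(scored, threshold)
    print(f"threshold={threshold}: {len(kept)} entities")
```

To see the effect of a lower value, re-run `model.predict_entities(text, labels, threshold=0.3)` (or any value below 0.5) and compare the resulting table with the one above.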

Example: Job Offer Analysis

text = """
* Data Scientist, Data Analyst, or Data Engineer with 1+ years of experience.
* Experience with technologies such as Docker, Kubernetes, or Kubeflow
* Machine Learning experience preferred
* Experience with programming languages such as Python, C++, or SQL preferred
* Experience with technologies such as Databricks, Qlik, TensorFlow, PyTorch, Python, Dash, Pandas, or NumPy preferred
* BA or BS degree
* Active Secret OR Active Top Secret or Active TS/SCI clearance
"""
labels = ["programing language", "software tool", "degree", "job title"]
entities = model.predict_entities(text, labels, threshold=0.5)

df = pd.DataFrame(entities)

unique_labels = df['label'].unique()
colors = {label: f'background-color: rgba({randint(0,255)},{randint(0,255)},{randint(0,255)},{np.round(np.random.uniform(0.1,0.4), 2)})' 
          for label in unique_labels}

def color_rows(row):
    color = colors.get(row['label'], '')
    return [f'{color}' for _ in row]

df.style.apply(color_rows, axis=1)
    start  end  text            label                score
0   3      17   Data Scientist  job title            0.877562
1   19     31   Data Analyst    job title            0.829869
2   36     49   Data Engineer   job title            0.811564
3   141    149  Kubeflow        software tool        0.668544
4   238    244  Python          programing language  0.988815
5   246    249  C++             programing language  0.920541
6   254    257  SQL             programing language  0.783584
7   307    317  Databricks      software tool        0.850532
8   319    323  Qlik            software tool        0.674751
9   325    335  TensorFlow      software tool        0.929136
10  337    344  PyTorch         programing language  0.620552
11  346    352  Python          programing language  0.985091
12  354    358  Dash            software tool        0.748522
13  389    391  BA              degree               0.914484
14  395    397  BS              degree               0.966724
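For a job offer, a per-label summary is often more useful than the raw rows, and the same mention can appear more than once ("Python" is found twice above). A minimal sketch of that aggregation, using the `text` and `label` values from the table (labels spelled exactly as passed to the model); `unique_mentions_by_label` is a hypothetical helper, not part of `gliner`:

```python
from collections import defaultdict

# (text, label) pairs copied from the results table above.
rows = [
    ("Data Scientist", "job title"), ("Data Analyst", "job title"),
    ("Data Engineer", "job title"), ("Kubeflow", "software tool"),
    ("Python", "programing language"), ("C++", "programing language"),
    ("SQL", "programing language"), ("Databricks", "software tool"),
    ("Qlik", "software tool"), ("TensorFlow", "software tool"),
    ("PyTorch", "programing language"), ("Python", "programing language"),
    ("Dash", "software tool"), ("BA", "degree"), ("BS", "degree"),
]
entities = [{"text": text, "label": label} for text, label in rows]


def unique_mentions_by_label(entities):
    """Group entity texts by label, dropping repeats and keeping first-seen order."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["text"] not in grouped[ent["label"]]:
            grouped[ent["label"]].append(ent["text"])
    return dict(grouped)


summary = unique_mentions_by_label(entities)
for label, mentions in summary.items():
    print(f"{label}: {', '.join(mentions)}")
```

The summary also makes the model's misses and mislabels easier to spot: at this threshold "Docker" and "Kubernetes" are absent, and "PyTorch" lands under the programming-language label.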

Example: Literature Research

text = """Libretto by Marius Petipa, based on the 1822 novella ``Trilby, ou Le Lutin d'Argail`` by Charles Nodier, first presented by the Ballet of the Moscow Imperial Bolshoi Theatre on January 25/February 6 (Julian/Gregorian calendar dates), 1870, in Moscow with Polina Karpakova as Trilby and Ludiia Geiten as Miranda and restaged by Petipa for the Imperial Ballet at the Imperial Bolshoi Kamenny Theatre on January 17–29, 1871 in St. Petersburg with Adèle Grantzow as Trilby and Lev Ivanov as Count Leopold."""

labels = ["person", "book", "location", "date", "actor", "character"]
entities = model.predict_entities(text, labels, threshold=0.5)

df = pd.DataFrame(entities)

unique_labels = df['label'].unique()
colors = {label: f'background-color: rgba({randint(0,255)},{randint(0,255)},{randint(0,255)},{np.round(np.random.uniform(0.1,0.4), 2)})' 
          for label in unique_labels}

def color_rows(row):
    color = colors.get(row['label'], '')
    return [f'{color}' for _ in row]

df.style.apply(color_rows, axis=1)
    start  end  text                   label      score
0   55     61   Trilby                 character  0.974547
1   142    148  Moscow                 location   0.911368
2   177    198  January 25/February 6  date       0.739128
3   234    238  1870                   date       0.565082
4   243    249  Moscow                 location   0.924236
5   255    271  Polina Karpakova       actor      0.926855
6   275    281  Trilby                 character  0.986574
7   286    299  Ludiia Geiten          actor      0.897300
8   303    310  Miranda                character  0.798976
9   401    420  January 17–29, 1871    date       0.906363
10  424    438  St. Petersburg         location   0.915059
11  444    458  Adèle Grantzow         actor      0.933230
12  462    468  Trilby                 character  0.990543
13  473    483  Lev Ivanov             actor      0.934771
14  487    500  Count Leopold          character  0.854538
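The offsets in this table make simple post-processing possible. As an illustration only (this heuristic is not part of GLiNER), the sketch below pairs each `actor` with the `character` that immediately follows it when the text between the two spans is exactly " as ". The rows are copied from the table above; `cast_pairs` is a hypothetical helper.

```python
text = """Libretto by Marius Petipa, based on the 1822 novella ``Trilby, ou Le Lutin d'Argail`` by Charles Nodier, first presented by the Ballet of the Moscow Imperial Bolshoi Theatre on January 25/February 6 (Julian/Gregorian calendar dates), 1870, in Moscow with Polina Karpakova as Trilby and Ludiia Geiten as Miranda and restaged by Petipa for the Imperial Ballet at the Imperial Bolshoi Kamenny Theatre on January 17–29, 1871 in St. Petersburg with Adèle Grantzow as Trilby and Lev Ivanov as Count Leopold."""

# (start, end, label) triples copied from the results table above.
rows = [
    (55, 61, "character"), (142, 148, "location"), (177, 198, "date"),
    (234, 238, "date"), (243, 249, "location"), (255, 271, "actor"),
    (275, 281, "character"), (286, 299, "actor"), (303, 310, "character"),
    (401, 420, "date"), (424, 438, "location"), (444, 458, "actor"),
    (462, 468, "character"), (473, 483, "actor"), (487, 500, "character"),
]
entities = [{"start": s, "end": e, "label": label} for s, e, label in rows]


def cast_pairs(text, entities, joiner=" as "):
    """Pair an actor with the next entity if it is a character joined by `joiner`."""
    ordered = sorted(entities, key=lambda ent: ent["start"])
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        if (left["label"] == "actor" and right["label"] == "character"
                and text[left["end"]:right["start"]] == joiner):
            pairs.append((text[left["start"]:left["end"]],
                          text[right["start"]:right["end"]]))
    return pairs


pairs = cast_pairs(text, entities)
for actor, character in pairs:
    print(f"{actor} -> {character}")
```

A rule like this is brittle (it only covers the "X as Y" pattern), but it shows how entity offsets can feed a lightweight relation-extraction step without another model call.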