Intent Classification in Banking 🏦

The BANKING77 dataset provides a very fine-grained set of intents in the banking domain: it comprises 13,083 customer service queries, each labeled with one of 77 intents, and focuses on single-domain intent detection.

from datasets import load_dataset
from transformers import AutoTokenizer, DataCollatorWithPadding
import pandas as pd

raw_dataset_train = load_dataset("banking77", split="train")
raw_dataset_val = load_dataset("banking77", split="test")
raw_dataset_train_df = raw_dataset_train.to_pandas()
pd.set_option('display.max_colwidth', None)

raw_dataset_train_df.head()
|   | text | label |
|---|------|-------|
| 0 | I am still waiting on my card? | 11 |
| 1 | What can I do if my card still hasn't arrived after 2 weeks? | 11 |
| 2 | I have been waiting over a week. Is the card still coming? | 11 |
| 3 | Can I track my card while it is in the process of delivery? | 11 |
| 4 | How do I know if I will get my card, or if it is lost? | 11 |
raw_dataset_train_df.sample(10)
|      | text | label |
|------|------|-------|
| 6883 | Is it possible for me to change my PIN number? | 21 |
| 5836 | I'm not sure why my card didn't work | 25 |
| 8601 | I don't think my top up worked | 59 |
| 2545 | Can you explain why my payment was charged a fee? | 15 |
| 8697 | How long does a transfer from a UK account take? I just made one and it doesn't seem to be working, wondering if everything is okay | 5 |
| 5573 | Why am I getting declines when trying to make a purchase online? | 27 |
| 576  | What is the $1 transaction on my account? | 34 |
| 6832 | It looks like my card payment was sent back. | 53 |
| 7111 | Why am I unable to transfer money when I was able to before? | 7 |
| 439  | What if there is an error on the exchange rate? | 17 |

It seems we have a lot of classes. Let's look at how the labels are distributed:

raw_dataset_train_df.label.value_counts().plot(kind='bar', figsize=(15, 10))
[Bar chart: number of training examples per label]

Here is the list of classes with their intent names (see the sketch after the table for how to obtain this mapping programmatically):

| label | intent (category) |
|-------|-------------------|
| 0 | activate_my_card |
| 1 | age_limit |
| 2 | apple_pay_or_google_pay |
| 3 | atm_support |
| 4 | automatic_top_up |
| 5 | balance_not_updated_after_bank_transfer |
| 6 | balance_not_updated_after_cheque_or_cash_deposit |
| 7 | beneficiary_not_allowed |
| 8 | cancel_transfer |
| 9 | card_about_to_expire |
| 10 | card_acceptance |
| 11 | card_arrival |
| 12 | card_delivery_estimate |
| 13 | card_linking |
| 14 | card_not_working |
| 15 | card_payment_fee_charged |
| 16 | card_payment_not_recognised |
| 17 | card_payment_wrong_exchange_rate |
| 18 | card_swallowed |
| 19 | cash_withdrawal_charge |
| 20 | cash_withdrawal_not_recognised |
| 21 | change_pin |
| 22 | compromised_card |
| 23 | contactless_not_working |
| 24 | country_support |
| 25 | declined_card_payment |
| 26 | declined_cash_withdrawal |
| 27 | declined_transfer |
| 28 | direct_debit_payment_not_recognised |
| 29 | disposable_card_limits |
| 30 | edit_personal_details |
| 31 | exchange_charge |
| 32 | exchange_rate |
| 33 | exchange_via_app |
| 34 | extra_charge_on_statement |
| 35 | failed_transfer |
| 36 | fiat_currency_support |
| 37 | get_disposable_virtual_card |
| 38 | get_physical_card |
| 39 | getting_spare_card |
| 40 | getting_virtual_card |
| 41 | lost_or_stolen_card |
| 42 | lost_or_stolen_phone |
| 43 | order_physical_card |
| 44 | passcode_forgotten |
| 45 | pending_card_payment |
| 46 | pending_cash_withdrawal |
| 47 | pending_top_up |
| 48 | pending_transfer |
| 49 | pin_blocked |
| 50 | receiving_money |
| 51 | Refund_not_showing_up |
| 52 | request_refund |
| 53 | reverted_card_payment? |
| 54 | supported_cards_and_currencies |
| 55 | terminate_account |
| 56 | top_up_by_bank_transfer_charge |
| 57 | top_up_by_card_charge |
| 58 | top_up_by_cash_or_cheque |
| 59 | top_up_failed |
| 60 | top_up_limits |
| 61 | top_up_reverted |
| 62 | topping_up_by_card |
| 63 | transaction_charged_twice |
| 64 | transfer_fee_charged |
| 65 | transfer_into_account |
| 66 | transfer_not_received_by_recipient |
| 67 | transfer_timing |
| 68 | unable_to_verify_identity |
| 69 | verify_my_identity |
| 70 | verify_source_of_funds |
| 71 | verify_top_up |
| 72 | virtual_card_not_working |
| 73 | visa_or_mastercard |
| 74 | why_verify_identity |
| 75 | wrong_amount_of_cash_received |
| 76 | wrong_exchange_rate_for_cash_withdrawal |
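This mapping does not need to be typed by hand. As a minimal sketch (assuming the label column of banking77 is stored as a 🤗 Datasets ClassLabel feature, which exposes the intent names), it can be recovered from the loaded split:

label_names = raw_dataset_train.features["label"].names  # 77 intent names
id2label = {i: name for i, name in enumerate(label_names)}
print(len(label_names))   # 77
print(id2label[11])       # card_arrival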

And here is a summary of the dataset:

| Dataset statistics | Train | Test |
|--------------------|-------|------|
| Number of examples | 10 003 | 3 080 |
| Average character length | 59.5 | 54.2 |
| Number of intents | 77 | 77 |
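These figures can be roughly reproduced from the loaded splits. A small sketch (average length is computed over raw characters, so it may differ slightly from the published numbers):

for name, split in [("Train", raw_dataset_train), ("Test", raw_dataset_val)]:
    lengths = [len(text) for text in split["text"]]
    print(name, len(split), round(sum(lengths) / len(lengths), 1), len(set(split["label"])))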

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)


def tokenize_function(example):
    # Truncate long queries; padding is applied per batch by the data collator below
    return tokenizer(example["text"], truncation=True)


tokenized_datasets_train = raw_dataset_train.map(tokenize_function, batched=True)
tokenized_datasets_val = raw_dataset_val.map(tokenize_function, batched=True)
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
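As a quick sanity check of the preprocessing, here is a small sketch that encodes the first training query and prints the resulting WordPiece tokens:

example = raw_dataset_train[0]["text"]          # "I am still waiting on my card?"
encoded = tokenize_function({"text": example})
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))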
from transformers import TrainingArguments

training_args = TrainingArguments("intent-banking", per_device_train_batch_size=16)
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=77) # watch out for the number of labels!
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
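As an optional variant, from_pretrained also accepts id2label and label2id mappings, so the model config carries readable intent names instead of LABEL_0 … LABEL_76. A sketch reusing the id2label dictionary built earlier (the resulting model is equivalent for training):

label2id = {name: idx for idx, name in id2label.items()}
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=77,
    id2label=id2label,
    label2id=label2id,
)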
from transformers import Trainer

trainer = Trainer(
    model,
    training_args,
    train_dataset=tokenized_datasets_train,
    eval_dataset=tokenized_datasets_val,
    data_collator=data_collator,
    tokenizer=tokenizer,
)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
trainer.train()
{'loss': 1.4679, 'grad_norm': 9.389148712158203, 'learning_rate': 3.668796592119276e-05, 'epoch': 0.8}
{'loss': 0.4518, 'grad_norm': 7.393991947174072, 'learning_rate': 2.3375931842385517e-05, 'epoch': 1.6}
{'loss': 0.2117, 'grad_norm': 5.330203056335449, 'learning_rate': 1.0063897763578276e-05, 'epoch': 2.4}
{'train_runtime': 582.5736, 'train_samples_per_second': 51.511, 'train_steps_per_second': 3.224, 'train_loss': 0.5940346174473706, 'epoch': 3.0}
TrainOutput(global_step=1878, training_loss=0.5940346174473706, metrics={'train_runtime': 582.5736, 'train_samples_per_second': 51.511, 'train_steps_per_second': 3.224, 'train_loss': 0.5940346174473706, 'epoch': 3.0})
predictions = trainer.predict(tokenized_datasets_val)
print(predictions.predictions.shape, predictions.label_ids.shape)
(3080, 77) (3080,)
predictions.predictions.argmax(axis=1)
array([41, 11, 11, ..., 24, 24, 24])
raw_dataset_val_df = raw_dataset_val.to_pandas()
raw_dataset_val_df['prediction'] = predictions.predictions.argmax(axis=1)
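To eyeball a few predictions with readable intent names, a small sketch reusing the id2label mapping from earlier:

for _, row in raw_dataset_val_df.sample(3, random_state=0).iterrows():
    print(row["text"], "->", id2label[int(row["prediction"])])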
from sklearn.metrics import classification_report

print(classification_report(raw_dataset_val_df['label'], raw_dataset_val_df['prediction']))
              precision    recall  f1-score   support

           0       1.00      0.97      0.99        40
           1       0.98      1.00      0.99        40
           2       1.00      1.00      1.00        40
           3       1.00      0.97      0.99        40
           4       0.95      0.93      0.94        40
           5       0.84      0.80      0.82        40
           6       1.00      0.93      0.96        40
           7       0.97      0.93      0.95        40
           8       1.00      0.95      0.97        40
           9       0.95      1.00      0.98        40
          10       0.93      0.93      0.93        40
          11       0.88      0.88      0.88        40
          12       0.90      0.88      0.89        40
          13       0.97      0.95      0.96        40
          14       0.87      0.97      0.92        40
          15       0.84      0.93      0.88        40
          16       0.90      0.95      0.93        40
          17       0.88      0.95      0.92        40
          18       1.00      0.95      0.97        40
          19       0.95      0.95      0.95        40
          20       0.83      0.97      0.90        40
          21       0.89      1.00      0.94        40
          22       0.94      0.78      0.85        40
          23       1.00      0.88      0.93        40
          24       0.93      0.95      0.94        40
          25       0.85      0.97      0.91        40
          26       0.78      1.00      0.88        40
          27       1.00      0.75      0.86        40
          28       0.90      0.88      0.89        40
          29       0.97      0.90      0.94        40
          30       1.00      1.00      1.00        40
          31       0.97      0.93      0.95        40
          32       0.93      0.97      0.95        40
          33       0.88      0.95      0.92        40
          34       1.00      0.95      0.97        40
          35       0.87      0.97      0.92        40
          36       0.95      0.90      0.92        40
          37       0.94      0.85      0.89        40
          38       0.97      0.97      0.97        40
          39       0.93      0.97      0.95        40
          40       0.87      0.97      0.92        40
          41       0.86      0.95      0.90        40
          42       1.00      0.97      0.99        40
          43       0.90      0.95      0.93        40
          44       1.00      1.00      1.00        40
          45       0.95      0.95      0.95        40
          46       1.00      0.97      0.99        40
          47       0.91      0.97      0.94        40
          48       0.89      0.80      0.84        40
          49       0.97      0.85      0.91        40
          50       0.95      0.93      0.94        40
          51       1.00      1.00      1.00        40
          52       1.00      0.97      0.99        40
          53       0.93      0.93      0.93        40
          54       0.89      0.97      0.93        40
          55       0.98      1.00      0.99        40
          56       0.92      0.90      0.91        40
          57       0.93      0.95      0.94        40
          58       0.95      0.95      0.95        40
          59       0.92      0.90      0.91        40
          60       1.00      0.97      0.99        40
          61       0.92      0.85      0.88        40
          62       0.86      0.80      0.83        40
          63       0.93      1.00      0.96        40
          64       0.95      0.93      0.94        40
          65       0.95      0.88      0.91        40
          66       0.92      0.90      0.91        40
          67       0.78      0.95      0.85        40
          68       0.95      0.93      0.94        40
          69       0.81      0.95      0.87        40
          70       1.00      1.00      1.00        40
          71       1.00      1.00      1.00        40
          72       1.00      0.93      0.96        40
          73       1.00      0.93      0.96        40
          74       0.91      0.80      0.85        40
          75       1.00      0.90      0.95        40
          76       0.95      0.88      0.91        40

    accuracy                           0.93      3080
   macro avg       0.94      0.93      0.93      3080
weighted avg       0.94      0.93      0.93      3080

We achieve around 93% accuracy and a 93% macro F1-score. Not bad for a classification problem with 77 classes!
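The headline numbers can also be computed directly with scikit-learn, for example:

from sklearn.metrics import accuracy_score, f1_score

acc = accuracy_score(raw_dataset_val_df["label"], raw_dataset_val_df["prediction"])
macro_f1 = f1_score(raw_dataset_val_df["label"], raw_dataset_val_df["prediction"], average="macro")
print(f"accuracy={acc:.3f}  macro F1={macro_f1:.3f}")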

Comparing to scikit-learn

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

model = Pipeline([
    ("vectorizer", TfidfVectorizer(stop_words="english")),
    ("classifier", LogisticRegression())
])

model.fit(raw_dataset_train['text'], raw_dataset_train['label'])

y_pred = model.predict(raw_dataset_val['text'])

print(classification_report(raw_dataset_val['label'], y_pred))
              precision    recall  f1-score   support

           0       1.00      0.90      0.95        40
           1       0.93      0.97      0.95        40
           2       0.97      0.97      0.97        40
           3       0.90      0.93      0.91        40
           4       0.97      0.90      0.94        40
           5       0.64      0.75      0.69        40
           6       0.85      0.88      0.86        40
           7       0.89      0.82      0.86        40
           8       0.89      0.97      0.93        40
           9       1.00      0.97      0.99        40
          10       0.84      0.53      0.65        40
          11       0.75      0.90      0.82        40
          12       0.75      0.82      0.79        40
          13       0.86      0.93      0.89        40
          14       0.63      0.82      0.72        40
          15       0.78      0.90      0.84        40
          16       0.64      0.72      0.68        40
          17       0.90      0.90      0.90        40
          18       1.00      0.88      0.93        40
          19       0.90      0.88      0.89        40
          20       0.66      0.78      0.71        40
          21       0.97      0.93      0.95        40
          22       0.82      0.80      0.81        40
          23       1.00      0.82      0.90        40
          24       0.90      0.90      0.90        40
          25       0.76      0.80      0.78        40
          26       0.82      0.80      0.81        40
          27       0.94      0.75      0.83        40
          28       0.89      0.78      0.83        40
          29       0.81      0.72      0.76        40
          30       0.93      0.95      0.94        40
          31       0.90      0.88      0.89        40
          32       0.89      1.00      0.94        40
          33       0.78      0.90      0.84        40
          34       0.79      0.85      0.82        40
          35       0.72      0.85      0.78        40
          36       0.97      0.70      0.81        40
          37       0.60      0.75      0.67        40
          38       0.87      1.00      0.93        40
          39       0.77      0.68      0.72        40
          40       0.72      0.97      0.83        40
          41       0.89      0.80      0.84        40
          42       0.97      0.93      0.95        40
          43       0.65      0.80      0.72        40
          44       1.00      1.00      1.00        40
          45       0.85      0.82      0.84        40
          46       0.94      0.78      0.85        40
          47       0.62      0.90      0.73        40
          48       0.85      0.57      0.69        40
          49       1.00      0.85      0.92        40
          50       0.89      0.80      0.84        40
          51       0.82      0.93      0.87        40
          52       0.74      0.78      0.76        40
          53       0.71      0.90      0.79        40
          54       0.71      0.90      0.79        40
          55       0.93      0.95      0.94        40
          56       0.96      0.62      0.76        40
          57       0.92      0.88      0.90        40
          58       0.88      0.72      0.79        40
          59       0.69      0.78      0.73        40
          60       0.94      0.80      0.86        40
          61       0.81      0.75      0.78        40
          62       0.89      0.60      0.72        40
          63       0.88      0.93      0.90        40
          64       0.74      0.93      0.82        40
          65       0.78      0.88      0.82        40
          66       0.72      0.78      0.75        40
          67       0.77      0.75      0.76        40
          68       0.90      0.68      0.77        40
          69       0.64      0.62      0.63        40
          70       0.83      1.00      0.91        40
          71       0.95      1.00      0.98        40
          72       0.92      0.28      0.42        40
          73       1.00      0.93      0.96        40
          74       0.67      0.72      0.70        40
          75       0.88      0.90      0.89        40
          76       0.94      0.75      0.83        40

    accuracy                           0.83      3080
   macro avg       0.84      0.83      0.83      3080
weighted avg       0.84      0.83      0.83      3080
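The TF-IDF + logistic regression baseline reaches about 83% accuracy and macro F1, roughly ten points below the fine-tuned BERT model on the same 77-way task.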