Update SDG classifier authored by Ioanna Mandilara
......@@ -11,9 +11,6 @@ _class_ SDGDetector.SDG_classifier(pretrained_model_path,pretrained_model_name,t
|------------|-------------|------------|
| `pretrained_model_name` | The name of the pre-trained model | {'RoBERTa','XLNet'} |
| `pretrained_model_path` | The path of the fine-tuned model | str |
| `top_keywords` | The number of the top keywords of texts | int |
| `diversity` | The diversity (\[0,1\]) of the top produced keywords. A high diversity value produces very diverse keyphrases/keywords, whereas a low value produces keywords very similar to those obtained with the plain cosine similarity method | float |
| `n_gram_range` | The different lengths of keyphrases | tuple of shape (range_x,range_y) |
| `sentence_model_name` | The name of the pre-trained sentence model | {'all-mpnet-base-v2','distilbert-base-nli-mean-tokens','all-MiniLM-L6-v2'} |
### Methods
......@@ -36,6 +33,9 @@ The above formula has two branches, because the OSDG community dataset, which ha
| Parameters | Description | Type |
|------------|-------------|------|
| `text` | Input texts | `list` |
| `top_keywords` | The number of the top keywords of texts | `int` |
| `diversity` | The diversity (\[0,1\]) of the top produced keywords. A high diversity value produces very diverse keyphrases/keywords, whereas a low value produces keywords very similar to those obtained with the plain cosine similarity method | `float` |
| `n_gram_range` | The different lengths of keyphrases | tuple of shape (range_x,range_y) |
| `return_association` | If set to True then returns the association of each text with the SDGs | `Boolean` |
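The effect of `diversity` can be illustrated with a minimal sketch. It assumes the parameter acts as a maximal-marginal-relevance style trade-off between relevance to the text and similarity to already selected keywords; the source does not confirm that this is how `SDG_classifier` works internally. The function `select_keywords`, the helper `cosine`, and the toy two-dimensional embeddings are hypothetical and not part of SDGDetector.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_keywords(doc_vec, candidates, top_keywords=2, diversity=0.3):
    """Greedily pick keywords (hypothetical helper, not the SDGDetector API).

    candidates maps each candidate keyword to its embedding vector.
    Each step picks the candidate with the best balance of relevance to
    the document and dissimilarity to the keywords already selected.
    """
    remaining = dict(candidates)
    selected = []
    while remaining and len(selected) < top_keywords:

        def score(word):
            relevance = cosine(remaining[word], doc_vec)
            # Highest similarity to any keyword chosen so far (0 if none yet).
            redundancy = max(
                (cosine(remaining[word], candidates[s]) for s in selected),
                default=0.0,
            )
            return (1 - diversity) * relevance - diversity * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


# Toy embeddings (assumed values, for illustration only).
doc_vec = [1.0, 0.0]
candidates = {
    "industry": [1.0, 0.1],
    "industries": [1.0, 0.15],
    "climate": [0.6, 0.8],
}

low = select_keywords(doc_vec, candidates, top_keywords=2, diversity=0.0)
high = select_keywords(doc_vec, candidates, top_keywords=2, diversity=0.7)
print(low)   # ['industry', 'industries']
print(high)  # ['industry', 'climate']
```

With `diversity=0.0` the selection reduces to ranking by cosine similarity to the text, so near-duplicate keywords are returned; with a higher value the second pick is pushed away from the first.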
Returns:
......@@ -68,9 +68,9 @@ text = ['Europe has always been the home of industry. For centuries, it has been
from SDGDetector import SDGDetector
combo = SDGDetector.SDG_classifier(pretrained_model_name='XLNet',pretrained_model_path=<the path of downloaded model>,
sentence_model_name='all-mpnet-base-v2',top_keywords=10,diversity=0.3,n_gram_range=(1,2))
sentence_model_name='all-mpnet-base-v2')
sdg,sdg_name,association = combo.predict(text,return_association=True)
sdg,sdg_name,association = combo.predict(text,top_keywords=10,diversity=0.3,n_gram_range=(1,2),return_association=True)
for i in range(len(text)):
    print('The most relevant SDG of the provided text {}: {}'.format(i+1,sdg_name[i]))
    print('======== The association of the text with the SDGs: ========')
......
......