| `pretrained_model_name` | The name of the pre-trained model | `{'RoBERTa','XLNet'}` |
| `pretrained_model_path` | The path of the fine-tuned model | `str` |
| `top_keywords` | The number of top keywords to extract per text | `int` |
| `diversity` | The diversity (\[0,1\]) of the top produced keywords. Redundancy between keywords is measured with cosine similarity: a high diversity score yields keyphrases/keywords that differ strongly from one another, while a low score yields very similar ones | `float` |
| `n_gram_range` | The lower and upper bounds on keyphrase length, in words | `tuple` of shape `(range_x, range_y)` |
| `sentence_model_name` | The name of the pre-trained sentence model | `{'all-mpnet-base-v2','distilbert-base-nli-mean-tokens','all-MiniLM-L6-v2'}` |
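
The `n_gram_range` parameter bounds the word length of candidate keyphrases. As a rough sketch of what such candidate generation looks like (illustrative only, not this library's actual implementation):

```python
def ngram_candidates(text, n_gram_range=(1, 2)):
    """Generate candidate keyphrases whose length in words lies in n_gram_range."""
    words = text.lower().split()
    lo, hi = n_gram_range
    # For each allowed length n, slide a window of n words over the text.
    return [" ".join(words[i:i + n])
            for n in range(lo, hi + 1)
            for i in range(len(words) - n + 1)]

# With (1, 2), both single words and two-word phrases become candidates:
ngram_candidates("clean energy policy", (1, 2))
# → ['clean', 'energy', 'policy', 'clean energy', 'energy policy']
```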
### Methods
...
...
@@ -36,6 +33,9 @@ The above formula has two branches, because the OSDG community dataset, which ha
| Parameters | Description | Type |
|------------|-------------|------|
| `text` | Input texts | `list` |
| `top_keywords` | The number of the top keywords of texts | `int` |
| `diversity` | The diversity (\[0,1\]) of the top produced keywords. Redundancy between keywords is measured with cosine similarity: a high diversity score yields keyphrases/keywords that differ strongly from one another, while a low score yields very similar ones | `float` |
| `n_gram_range` | The lower and upper bounds on keyphrase length, in words | `tuple` of shape `(range_x, range_y)` |
| `return_association` | If set to `True`, the association of each text with the SDGs is also returned | `bool` |
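
The `diversity` parameter behaves like the trade-off weight in Maximal Marginal Relevance (MMR): each new keyword is scored by its similarity to the document minus a penalty for similarity to keywords already chosen. A minimal pure-Python sketch of this mechanism (the function, embeddings, and scoring below are illustrative assumptions, not the library's API):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

def mmr_keywords(doc_vec, cand_vecs, candidates, top_keywords=2, diversity=0.5):
    """Select keywords relevant to the document but dissimilar to each other."""
    # Start from the candidate most similar to the document.
    selected = [max(range(len(candidates)),
                    key=lambda i: cosine(doc_vec, cand_vecs[i]))]
    while len(selected) < min(top_keywords, len(candidates)):
        remaining = [i for i in range(len(candidates)) if i not in selected]

        def score(i):
            relevance = cosine(doc_vec, cand_vecs[i])
            redundancy = max(cosine(cand_vecs[i], cand_vecs[j]) for j in selected)
            # diversity=0 → pure relevance; diversity=1 → pure dissimilarity.
            return (1 - diversity) * relevance - diversity * redundancy

        selected.append(max(remaining, key=score))
    return [candidates[i] for i in selected]
```

With toy 2-D embeddings, raising `diversity` swaps a near-duplicate keyword for a dissimilar one, which is the behaviour described in the table above.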
Returns:
...
...
@@ -68,9 +68,9 @@ text = ['Europe has always been the home of industry. For centuries, it has been