Recently I was doing text-summarization research and trained a simple model. I wanted to use ROUGE to evaluate the model, and I got the following results:
1 ROUGE-1 Average_R: 0.41775
1 ROUGE-1 Average_P: 0.39336
1 ROUGE-1 Average_F: 0.39289
1 ROUGE-2 Average_R: 0.18253
1 ROUGE-2 Average_P: 0.17314
1 ROUGE-2 Average_F: 0.17203
1 ROUGE-3 Average_R: 0.10546
1 ROUGE-3 Average_P: 0.10178
1 ROUGE-3 Average_F: 0.10011
1 ROUGE-4 Average_R: 0.07039
1 ROUGE-4 Average_P: 0.06904
1 ROUGE-4 Average_F: 0.06724
...
It shows that the ROUGE F score is smaller than both the ROUGE P and ROUGE R scores. Does anyone know why?
Is this result normal?
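For what it's worth, the harmonic mean of the averaged ROUGE-1 P and R above (2 * 0.39336 * 0.41775 / (0.39336 + 0.41775) ≈ 0.405) is higher than the reported Average_F of 0.39289, which suggests the toolkit averages the per-document F scores rather than computing F from the averaged P and R. A minimal Python sketch, using made-up per-document scores (not my actual data), showing how the average of per-document F values can fall below both the average P and the average R:

```python
# Hypothetical per-document (precision, recall) pairs -- invented for
# illustration only, not taken from the actual evaluation.
docs = [(0.10, 0.70), (0.70, 0.10), (0.40, 0.40)]

def f1(p, r):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * p * r / (p + r) if p + r else 0.0

avg_p = sum(p for p, _ in docs) / len(docs)   # 0.40
avg_r = sum(r for _, r in docs) / len(docs)   # 0.40
# Averaging the per-document F scores, as the ROUGE script appears to do:
avg_f = sum(f1(p, r) for p, r in docs) / len(docs)  # 0.25

# avg_f is below both avg_p and avg_r, even though each per-document
# F1 lies between that document's own P and R.
print(avg_p, avg_r, avg_f)
```

So if this is how the averaging works, an Average_F below both Average_P and Average_R would not indicate a bug by itself.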