CrediBench Leaderboard

Benchmark results for domain credibility prediction on the CrediBench dataset.

Evaluation Date: May 2025
Scale: 0–1 (higher is better)

Benchmarks


Binary Classification

Domain credibility prediction

Accuracy and F1 scores (%) on the test set. Higher is better.

Predict whether a domain is credible (1) or not (0). The main evaluation metric is accuracy.

| Rank | Method | Acc. (%) | F1 (%) | Team | Year |
|------|--------|----------|--------|------|------|
| 1 | MLP (Graph + Text) (Ours) | 83.6 ± 0.03 | 83.2 | CrediNet team | 2025 |
| 2 | GAT w/ Text Embeddings (Ours) | 76.1 ± 0.36 | 75.2 | CrediNet team | 2025 |
| 3 | GAT w/ Random Initialization (Ours) | 68.9 ± 0.18 | 69.7 | CrediNet team | 2025 |
| 4 | Text Embedding-based (Ours) | 63.2 ± 0.02 | 60.5 | CrediNet team | 2025 |
| 5 | LightGBM | 56.0 | 70.7 | Kadkhoda et al. | 2025 |
| 6 | LLM-URL + Web Search | 54.35 | 63.03 | Yang et al. (enhanced) | 2025 |
| 7 | Constant | 53.0 | 69.1 | N/A | N/A |
| 8 | LLM-URL | 52.84 | 63.9 | Yang et al. | 2025 |
| 9 | SEO-based GNN | 50.0 | 66.7 | Carragher et al. | 2025 |
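
The two classification metrics can be sketched in a few lines of plain Python. This is only an illustration, not the benchmark's evaluation code: it assumes F1 is computed for the positive (credible) class, and the toy labels and the helper names `accuracy` and `f1_score` are hypothetical. It also shows why a constant predictor can reach a fairly high F1 while its accuracy stays near the class prior.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy test set: 53 credible domains and 47 non-credible ones.
y_true = [1] * 53 + [0] * 47

# A constant predictor that labels every domain as credible (assumed behaviour).
y_const = [1] * len(y_true)

acc = accuracy(y_true, y_const)
f1 = f1_score(y_true, y_const)
print(f"accuracy={acc:.3f} f1={f1:.3f}")
```

With perfect recall and precision equal to the positive-class share, the constant predictor's F1 lands well above its accuracy, which is one plausible reason to treat accuracy as the main metric here.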

Regression

Domain credibility scoring

Mean and maximum absolute error on the test set. Lower is better.

Predict a continuous credibility score in [0, 1]. The main evaluation metric is Mean Absolute Error (MAE).

| Rank | Method | MAE | Max(AE) | Team | Year |
|------|--------|-----|---------|------|------|
| 1 | MLP (Graph + Text) (Ours) | 0.112 ± 0.001 | 0.544 | CrediNet team | 2025 |
| 2 | GAT w/ Text Embeddings (Ours) | 0.114 ± 0.001 | 0.477 | CrediNet team | 2025 |
| 3 | GCN w/ Text Embeddings (Ours) | 0.114 ± 0.002 | 0.849 | CrediNet team | 2025 |
| 4 | LightGBM | 0.145 | 0.630 | Kadkhoda et al. | 2025 |
| 5 | LLM-URL + Web Search | 0.158 | 0.719 | Yang et al. (enhanced) | 2025 |
| 6 | LLM-URL | 0.162 | 0.765 | Yang et al. | 2025 |
| 7 | Mean | 0.167 | 0.546 | N/A | N/A |
| 8 | SEO-based GNN | 0.428 | 0.956 | Carragher et al. | 2025 |
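
The regression metrics can be sketched the same way. The snippet below is a minimal illustration and not the benchmark's own evaluation script: the scores are made up, the helper names are hypothetical, and it assumes the "Mean" baseline predicts the average training score for every test domain.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |target - prediction| over all examples."""
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)


def max_absolute_error(y_true, y_pred):
    """Largest single |target - prediction|, i.e. the worst-case error."""
    return max(abs(t - p) for t, p in zip(y_true, y_pred))


# Hypothetical credibility scores in [0, 1].
train_scores = [0.2, 0.4, 0.6, 0.8]
test_scores = [0.1, 0.5, 0.9]

# Assumed "Mean" baseline: always predict the mean of the training scores.
mean_pred = sum(train_scores) / len(train_scores)
y_mean = [mean_pred] * len(test_scores)

mae = mean_absolute_error(test_scores, y_mean)
max_ae = max_absolute_error(test_scores, y_mean)
print(f"MAE={mae:.3f} Max(AE)={max_ae:.3f}")
```

MAE summarizes typical error, while Max(AE) exposes the single worst prediction, so two methods with nearly identical MAE can still differ widely in Max(AE).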

Submissions: Run your method on CrediBench and open an issue with your results, code, and contact information.