Effectiveness of Legal-BERT Domain-Specific Pretraining for Legal Natural Language Processing: Replication and Extension of the CaseHOLD Study
DOI: https://doi.org/10.47065/jieee.v5i1.2610
Keywords: Legal NLP; Domain-Specific Pretraining; Legal-BERT; Transformer; CaseHOLD
Abstract
The emergence of domain-specific language models has demonstrated significant potential across various specialized fields. However, their effectiveness in legal natural language processing (NLP) remains underexplored, particularly given the unique challenges posed by the complexity and specialized terminology of legal text. Legal NLP has practical applications, such as automated legal precedent search and court decision analysis, that can shorten legal research from weeks to hours. This study evaluates the CaseHOLD dataset to provide comprehensive empirical validation of the benefits of domain-specific pretraining for legal NLP tasks, with a focus on data efficiency and context complexity analysis. We conducted systematic experiments on the CaseHOLD dataset of 53,000 legal multiple-choice questions, comparing four models (BiLSTM, BERT-base, Legal-BERT, and RoBERTa) across varying data volumes (1%, 10%, 50%, 100%) and context complexity levels. Paired t-tests with 10-fold cross-validation and Bonferroni correction were used to ensure the reliability of the findings. Legal-BERT achieved the highest macro-F1 score of 69.5% (95% CI: [68.0, 71.0]), a statistically significant improvement of 7.2 percentage points over BERT-base (62.3%; p < 0.001, Cohen's d = 1.23). RoBERTa showed competitive performance at 68.9%, nearly matching Legal-BERT. The most substantial gains occurred under limited-data conditions, with a 16.6% improvement at 1% of the training data. Context complexity analysis revealed an inverted-U pattern, with optimal performance on texts of 41-60 words. The proposed Domain Specificity Score (DS-score) showed a strong positive correlation (r = 0.73, p < 0.001) with pretraining effectiveness, explaining 53.3% of the variance in performance improvement. These findings provide empirical evidence that domain-specific pretraining offers significant advantages for legal NLP tasks, particularly under data-constrained conditions and moderate-to-high context complexity. The key contribution of this research is a predictive DS-score framework that enables benefit estimation before implementation, unlike previous studies that only evaluated performance post hoc. The results have practical implications for developing legal NLP systems in resource-limited environments and provide guidance for optimal implementation of Legal-BERT.
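As a rough illustration of the statistical protocol described in the abstract, the sketch below shows how fold-level macro-F1 scores from two models could be compared with a paired t-test, a Bonferroni adjustment, and a paired Cohen's d. The fold scores, variable names, and number of pairwise comparisons are illustrative assumptions, not values taken from the study.

# Minimal sketch (assumed setup, not the authors' code): comparing two models'
# macro-F1 scores across 10 cross-validation folds.
import numpy as np
from scipy import stats

# Macro-F1 per fold for two models -- placeholder values for illustration only.
legal_bert_f1 = np.array([0.70, 0.69, 0.71, 0.68, 0.70, 0.69, 0.70, 0.71, 0.68, 0.69])
bert_base_f1  = np.array([0.63, 0.62, 0.64, 0.61, 0.62, 0.63, 0.62, 0.63, 0.61, 0.62])

# Paired t-test on fold-level scores.
t_stat, p_value = stats.ttest_rel(legal_bert_f1, bert_base_f1)

# Bonferroni correction, assuming 6 pairwise comparisons among 4 models.
n_comparisons = 6
p_adjusted = min(p_value * n_comparisons, 1.0)

# Cohen's d for paired samples: mean difference over the SD of the differences.
diff = legal_bert_f1 - bert_base_f1
cohens_d = diff.mean() / diff.std(ddof=1)

print(f"t = {t_stat:.2f}, adjusted p = {p_adjusted:.4f}, d = {cohens_d:.2f}")

Note also that the reported 53.3% of variance explained is consistent with the square of the reported correlation (0.73^2 ≈ 0.533).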
Copyright (c) 2025 Hasani Zakiri, Alva Hendi Muhammad, Asro Nasiri

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).