Derivative Words Scraping of Every Quranic Root Word from the Quran Corpus Web using Python to Support the Quranpedia Project


Authors

  • Idzhari Syaeful Ma'mun - Telkom University, Bandung, Indonesia
  • Eko Darwiyanto - Telkom University, Bandung, Indonesia
  • Moch. Arif Bijaksana - Telkom University, Bandung, Indonesia

DOI:

https://doi.org/10.30865/klik.v4i3.1547

Keywords:

Scraping; Quranpedia; Quran; Derivative Words

Abstract

The Qur'an, as a guide to life for Muslims, has given rise to various disciplines such as the sciences of tafsir, fiqh, hadith, nahwu, and balaghah. However, the limited number of websites for learning and understanding the Qur'an can hinder Muslims from exploring its contents. The Quranpedia project was initiated to address this problem. Quranpedia is a web-based application, modeled on Wikipedia, that provides in-depth explanations of derivative words in the Qur'an. Using scraping, Quranpedia collects data from various sources to provide a comprehensive explanation of nouns in the Qur'an and hadith. One of the main challenges in the project was finding the common root of nouns in the Qur'an and hadith. To overcome this challenge, a method was used to transform the words in a sentence into their root words, which gives Quranpedia the ability to look up the root word of a noun. This helps users better understand derivative words in the Qur'an and how they are used in different contexts. The objective of this research is to create a program that accurately scrapes all derivative words in the Quran from the Corpus Quran website. The research addresses two questions: how the derivative words of each root word in the Quran can be scraped from the Corpus Quran website, and whether the scraped data is complete and accurate. To answer them, the program was written in the Python programming language and then tested. The interim result concerns whether the scraped data is complete.
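
As an illustration of the scraping and completeness check described in the abstract, the following is a minimal sketch and not the authors' program. It assumes a hypothetical page layout in which each derivative word appears in a table row with a location cell and a word-form cell; the class names `loc` and `form`, the sample markup, the placeholder word forms, and the helper names `scrape_derived_words` and `is_complete` are all invented for this example, and the real Corpus Quran pages may be structured differently. It uses only Python's standard `html.parser` module and parses an inline sample instead of fetching a live page.

```python
from html.parser import HTMLParser

# Hypothetical markup for one root's page; the real site may differ.
SAMPLE_HTML = """
<table>
  <tr><td class="loc">(1:1:3)</td><td class="form">example-form-1</td></tr>
  <tr><td class="loc">(1:1:4)</td><td class="form">example-form-2</td></tr>
</table>
"""


class DerivedWordParser(HTMLParser):
    """Collects (location, word form) pairs from the assumed table layout."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished (location, form) tuples
        self._field = None   # cell currently being read: "loc", "form", or None
        self._current = {}   # cells gathered for the row in progress

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            css_class = dict(attrs).get("class")
            if css_class in ("loc", "form"):
                self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside a recognised cell.
        if self._field:
            previous = self._current.get(self._field, "")
            self._current[self._field] = previous + data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and self._current:
            self.rows.append(
                (self._current.get("loc", ""), self._current.get("form", ""))
            )
            self._current = {}


def scrape_derived_words(html):
    """Return all (location, form) pairs found in the given HTML text."""
    parser = DerivedWordParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def is_complete(rows, expected_count):
    """Check that the row count matches and that no cell is empty."""
    return len(rows) == expected_count and all(loc and form for loc, form in rows)


if __name__ == "__main__":
    words = scrape_derived_words(SAMPLE_HTML)
    print(words)
    print("complete:", is_complete(words, expected_count=2))
```

In this sketch, completeness is judged by comparing the number of scraped rows against an expected count supplied by the caller, for example an occurrence count stated on the page itself; that choice is an assumption made for the example, not a detail taken from the paper.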


ARTICLE HISTORY


Published: 2023-12-25