Implementation of Web Scraping on OJS Using the CSS Selector Method
DOI: https://doi.org/10.30865/resolusi.v3i2.423
Keywords: OJS; Web Scraping; CSS Selector; BeautifulSoup; Python
Abstract
Journals are the main reference sources for research and scientific article writing. For reputable international journals, much software has been developed to retrieve metadata for research mapping. For national journals, especially those listed in Sinta, obtaining metadata to visualize research results still has to be done manually: researchers must copy and paste the articles they want one by one. To make it easier to retrieve the article metadata of Sinta journals that run on the OJS system, an OJS journal web scraping application was developed. The chosen web scraping method is the CSS Selector. The application is written in Python with the BeautifulSoup, Flask, and pandas libraries. In testing, the OJS scraping application was able to retrieve metadata consisting of title, abstract, keywords, author, and link. The weakness found is that it cannot retrieve data from OJS journals whose web interface has been modified away from the OJS standard.
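As a rough illustration of the CSS Selector approach described in the abstract, the sketch below parses one article page with BeautifulSoup and collects the title, abstract, keywords, authors, and link. It is not the application from the paper. The HTML fragment, the selector strings, and the names SAMPLE_HTML, SELECTORS, text_or_none, and scrape_article are assumptions made for this example; the markup is only modeled on what a default OJS theme might emit, and a real journal may need different selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment shaped like an article page of a default OJS theme.
# Class names are assumptions for this sketch, not taken from the paper.
SAMPLE_HTML = """
<article class="obj_article_details">
  <h1 class="page_title">Sample Article Title</h1>
  <ul class="authors">
    <li><span class="name">First Author</span></li>
    <li><span class="name">Second Author</span></li>
  </ul>
  <section class="item keywords">
    <span class="value">OJS; Web Scraping</span>
  </section>
  <section class="item abstract">
    <p>Sample abstract text.</p>
  </section>
  <a class="obj_galley_link pdf" href="https://example.org/article/view/1/1">PDF</a>
</article>
"""

# Assumed CSS selectors for the single-valued metadata fields.
SELECTORS = {
    "title": "h1.page_title",
    "abstract": "section.abstract p",
    "keywords": "section.keywords span.value",
}


def text_or_none(soup, selector):
    """Return the text of the first element matching the selector, or None."""
    node = soup.select_one(selector)
    return node.get_text(" ", strip=True) if node else None


def scrape_article(html):
    """Extract article metadata from one page using CSS selectors."""
    soup = BeautifulSoup(html, "html.parser")
    record = {field: text_or_none(soup, css) for field, css in SELECTORS.items()}
    # Authors can repeat, so collect every match instead of only the first.
    record["authors"] = [
        node.get_text(strip=True) for node in soup.select("ul.authors span.name")
    ]
    link = soup.select_one("a.obj_galley_link")
    record["link"] = link["href"] if link else None
    return record


record = scrape_article(SAMPLE_HTML)
print(record)
```

On a page whose markup does not match the assumed selectors, scrape_article returns None for the missing fields and an empty author list, which mirrors the weakness the abstract reports for journals with a customized interface. The resulting dictionaries can be collected into a list and passed to pandas.DataFrame for export.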
Copyright (c) 2022 Agus Purnomo

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).














