IntelliPaper
Abstract
This article develops a method for estimating the importance of web pages without using web browser data or invading users' privacy. Instead, the method relies on the structure of the website. We propose a novel method that takes web page content as input and automatically produces a score for each page. First, we extract content from a web page in real time. We then consider two factors based on the website structure: (1) “What is the minimum number of clicks needed to reach a web page within the website?” and (2) “How is a web page linked to the other web pages in the website?” We train our model with a learning method, using the “web page views” results reported by “Google Analytics” and “Similar Web”. Experiments and case studies on the world’s most popular websites show that our method can produce effective results in real time.
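The two structural factors named above can be illustrated with a short sketch. This is not the authors' implementation: the `site` link map, the function names `click_depths` and `inlink_counts`, and the choice of the home page as the starting point are assumptions made for illustration, and the learning step that combines these features with page-view data is not shown. The sketch assumes the minimum number of clicks is measured as the breadth-first-search distance from the home page, and that link structure is summarized by counting distinct internal pages that link to each page.

```python
from collections import deque


def click_depths(links, home):
    """Minimum number of clicks from `home` to each reachable page (BFS)."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                # First time we see `target`, so this is its shortest path.
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def inlink_counts(links):
    """Number of distinct other pages that link to each page."""
    counts = {page: 0 for page in links}
    for page, targets in links.items():
        for target in set(targets):  # count each source page once
            if target != page:       # ignore self-links
                counts[target] = counts.get(target, 0) + 1
    return counts


# Hypothetical site: each key is a page, each value lists the pages it links to.
site = {
    "/": ["/products", "/about"],
    "/products": ["/products/widget", "/"],
    "/about": ["/"],
    "/products/widget": ["/products", "/"],
}

depths = click_depths(site, "/")
inlinks = inlink_counts(site)

for page in site:
    print(page, "clicks:", depths.get(page), "in-links:", inlinks[page])
```

In this toy site, the widget page is two clicks from the home page and has a single in-link, so a structure-based score would plausibly rank it below the home page. How the two features are weighted is exactly what the learning step described above would have to determine.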
Conflict of Interest
The authors declare no conflict of interest.
Ethical Approval
Not applicable.
Data Availability
The datasets used in this study are openly available at [repository link] and the source code is available on GitHub at [GitHub link].
Funding
This work did not receive any external funding.