Compute ‘Internal Page Rank’
⚠️ THIS IS A WORK IN PROGRESS
It is very much an adaptation of Paul Shapiro's awesome script, but instead of using a Screaming Frog export file, we will use the data from an Rcrawler crawl.
Let's crawl the website with the link data enabled:
library(Rcrawler)
Rcrawler(Website = "https://www.rforseo.com", NetworkData = TRUE)
When it's done, the links will be stored in the NetwEdges variable.
View(NetwEdges)
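If you don't have a crawl at hand, the small hand-made stand-in below gives an idea of what to expect. It is only a sketch: it assumes the column layout described in the Rcrawler documentation (From, To, Weight, Type), where From and To are page ids, and every value in `toy_edges` is made up for illustration.

```r
# Toy stand-in for NetwEdges (made-up values, not real crawl output).
# Assumed layout: From/To are page ids, Weight and Type describe the link.
toy_edges <- data.frame(
  From   = c(1, 1, 2, 3, 1),
  To     = c(2, 3, 1, 1, 2),
  Weight = c(1, 1, 2, 2, 1),
  Type   = c(1, 1, 1, 1, 1)
)

# Inspect the structure
str(toy_edges)

# Count links (From -> To pairs) that appear more than once
n_duplicates <- sum(duplicated(toy_edges[, c("From", "To")]))
print(n_duplicates)
```

The same link can show up several times in the edge list, which is why duplicates are worth removing before building the graph.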
We only want the first two columns:
library(dplyr)

# grabbing the first two columns (source and target of each link)
# and dropping duplicated links
links <- NetwEdges[, 1:2] %>%
  distinct()

# loading the igraph package
library(igraph)

# loading the website's internal links inside a graph object
g <- graph_from_data_frame(links)

# this is the main function, it computes a PageRank score for every page
# (page_rank() is the current igraph name for page.rank())
pr <- page_rank(g, algo = "prpack", vids = V(g), directed = TRUE, damping = 0.85)

# grabbing the result inside a dedicated data frame
values <- data.frame(pr$vector)
values$names <- rownames(values)

# deleting row names
row.names(values) <- NULL

# reordering columns
values <- values[c(2, 1)]

# renaming columns
names(values)[1] <- "url"
names(values)[2] <- "pr"
View(values)
Internal Page Rank calculation
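For the curious, PageRank can be approximated with a simple power iteration: every page starts with the same score, then repeatedly passes its score along its outgoing links, with a damping factor of 0.85. The sketch below is a base R illustration on a made-up three-page graph, not igraph's actual implementation (igraph relies on the PRPACK library here). `toy_links` and `pagerank_sketch` are hypothetical names introduced for this example.

```r
# Made-up link graph: page 1 links to 2 and 3, pages 2 and 3 link back to 1
toy_links <- data.frame(from = c(1, 1, 2, 3), to = c(2, 3, 1, 1))

pagerank_sketch <- function(links, damping = 0.85, iterations = 100) {
  pages <- sort(unique(c(links$from, links$to)))
  n <- length(pages)

  # m[i, j] is 1 when page j links to page i
  m <- matrix(0, n, n, dimnames = list(pages, pages))
  for (k in seq_len(nrow(links))) {
    m[as.character(links$to[k]), as.character(links$from[k])] <- 1
  }

  # Pages without outgoing links spread their score over all pages
  dangling <- colSums(m) == 0
  m[, dangling] <- 1

  # Normalise each column so a page splits its score evenly among its links
  m <- sweep(m, 2, colSums(m), "/")

  # Start from a uniform score and iterate
  pr <- rep(1 / n, n)
  for (i in seq_len(iterations)) {
    pr <- (1 - damping) / n + damping * as.vector(m %*% pr)
  }
  setNames(pr, pages)
}

scores <- pagerank_sketch(toy_links)
print(round(scores, 3))
```

Page 1 ends up with the highest score because both other pages link to it, and the scores always sum to 1.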
Let's make it more readable: we're going to rescale the scores to a 0 to 10 scale, just like back when the public PageRank score was a thing.
# replacing ids with urls (each id is a position in NetwIndex)
values$url <- NetwIndex[as.integer(values$url)]
# out of 10
values$pr <- round(values$pr / max(values$pr) * 10)
# display
View(values)
On a 15-page website, it's not very impressive, but I encourage you to try it on a bigger website.
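To see what the rescaling does to the numbers, here is a tiny sketch on made-up scores. `toy_index` stands in for NetwIndex, and the URLs and scores are invented for illustration. Note that the lookup goes through the page id rather than assuming the rows are already in the same order as the index.

```r
# Hypothetical stand-in for NetwIndex: position = page id
toy_index <- c("https://example.com/",
               "https://example.com/about",
               "https://example.com/blog")

# Made-up PageRank scores; rows are deliberately not in id order
toy_values <- data.frame(url = c("3", "1", "2"),
                         pr  = c(0.12, 0.60, 0.28),
                         stringsAsFactors = FALSE)

# Replace each id with the URL stored at that position in the index
toy_values$url <- toy_index[as.integer(toy_values$url)]

# Rescale so that the strongest page gets 10
toy_values$pr <- round(toy_values$pr / max(toy_values$pr) * 10)

# Strongest pages first
toy_values <- toy_values[order(-toy_values$pr), ]
print(toy_values)
```

Because of the rounding, pages with close scores collapse onto the same value, which is fine for a quick overview but hides small differences.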