Compute ‘Internal Page Rank’

⚠️ THIS IS A WORK IN PROGRESS

It is very much an adaptation of Paul Shapiro awesome Script but Instead of using ScreamingFrog export file, we will use the data from a Rcrawler crawl.

Lets crawl with the link data enabled

Rcrawler(Website = "https://www.rforseo.com", NetworkData = TRUE)

When it's done, The links will be stored in the NetwEdges variable.

View(NetwEdges)

We only want to first 2 column:

library(dplyr)
links <- NetwEdges[,1:2] %>%
#grabing the first two columns
distinct()
# loading igraph package
library(igraph)
# Loading website internal links inside a graph object
g <- graph.data.frame(links)
# this is the main function, don't ask how it works
pr <- page.rank(g, algo = "prpack", vids = V(g), directed = TRUE, damping = 0.85)
# grabing result inside a dedicated data frame
values <- data.frame(pr$vector)
values$names <- rownames(values)
# delating row names
row.names(values) <- NULL
# reordering column
values <- values[c(2,1)]
# renaming columns
names(values)[1] <- "url"
names(values)[2] <- "pr"
View(values)
Internal Page Rank calculation

Let make it more readable, we’re going to put the number on a ten basis, just like when the PageRank was a thing.

#replacing id with url
values$url <- NetwIndex
# out of 10
values$pr <- round(values$pr / max(values$pr) * 10)
#display
View(values)

On 15 webpages website, it’s not very impressive but I encourage you to try on a bigger website.