Compute ‘Internal Page Rank’

⚠️ THIS IS A WORK IN PROGRESS


It is very much an adaptation of Paul Shapiro's awesome Internal Page Rank calculation script, but instead of using a ScreamingFrog export file, we will use the data from an Rcrawler crawl.

Let's crawl with link data collection enabled:

library(Rcrawler)

Rcrawler(Website = "https://www.rforseo.com",  NetworkData = TRUE)

When it's done, the links will be stored in the NetwEdges variable.

View(NetwEdges)
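If you just want a quick peek in the console, head() works as well. In my crawls, the first two columns hold numeric page ids (From and To), and the NetwIndex vector maps those ids back to URLs, which we'll use later; check your own output if the column names differ.

# quick look at the edge list produced by Rcrawler
head(NetwEdges)

# NetwIndex holds the crawled URLs, one per page id
head(NetwIndex)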

We only want the first two columns:

library(dplyr)

# grabbing the first two columns (from / to) and dropping duplicate links
links <- NetwEdges[, 1:2] %>%
  distinct()

# loading the igraph package
library(igraph)

# loading website internal links inside a graph object
g <- graph_from_data_frame(links)

# computing the PageRank score of every page in the graph
pr <- page_rank(g, algo = "prpack", vids = V(g), directed = TRUE, damping = 0.85)

# grabbing the results inside a dedicated data frame
values <- data.frame(pr$vector)
values$names <- rownames(values)

# deleting row names
row.names(values) <- NULL

# reordering columns
values <- values[c(2, 1)]

# renaming columns
names(values)[1] <- "url"
names(values)[2] <- "pr"
View(values)

Let's make it more readable: we're going to rescale the scores out of ten, just like when Google's toolbar PageRank was a thing.

# replacing node ids with their URLs, using NetwIndex as the lookup
values$url <- NetwIndex[as.numeric(values$url)]

# rescaling scores out of 10
values$pr <- round(values$pr / max(values$pr) * 10)

# display
View(values)

On a 15-page website it's not very impressive, but I encourage you to try it on a bigger website.
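If you want to reuse these scores elsewhere, for example to join them with crawl or analytics data as in the other chapters, a quick sort and a CSV export do the job. This is just a sketch; the file name is an arbitrary example.

# pages with the highest internal PageRank first
values %>%
  arrange(desc(pr)) %>%
  head(10)

# save the full table for later use
write.csv(values, "internal_pagerank.csv", row.names = FALSE)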

Links:
  • Internal Page Rank calculation script by Paul Shapiro
  • Rcrawler