Join Crawl data with Google Analytics Data

SEO data often comes from several different sources, which is why it helps to know how to connect or merge them. Say we have crawled a website: it would be useful to check which of the crawled pages received some SEO traffic.

To do that, we'll need to merge (join) the two datasets.

1. Crawl data

Using Rcrawler, we've collected our pages (see the "How to use Rcrawler" article).

library(Rcrawler)
Rcrawler(Website = "https://www.rforseo.com/")

We now have a data frame called INDEX that lists each URL along with its crawl depth.

View(INDEX)
The second column, Url, holds the page URL.

2. Google analytics data

Using the googleAnalyticsR package, we grab the SEO landing pages from Google Analytics (see the "How to use googleAnalyticsR" article).
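The original code for this step is not shown here, so what follows is only a minimal sketch of what the request could look like. It assumes a Universal Analytics view (landingPagePath is a dimension of that API), it assumes "SEO traffic" means sessions whose medium is "organic", and the function name fetch_seo_landing_pages, the view ID and the dates are all hypothetical placeholders.

```r
# Sketch: fetch organic sessions per landing page from a Universal Analytics view.
# Assumption: "SEO traffic" = sessions where medium is exactly "organic".
fetch_seo_landing_pages <- function(view_id, start_date, end_date) {
  # Dimension filter that keeps only organic search sessions
  organic_only <- googleAnalyticsR::filter_clause_ga4(
    list(googleAnalyticsR::dim_filter("medium", "EXACT", "organic"))
  )

  googleAnalyticsR::google_analytics(
    viewId      = view_id,
    date_range  = c(start_date, end_date),
    metrics     = "sessions",
    dimensions  = "landingPagePath",
    dim_filters = organic_only,
    max         = -1  # -1 requests all rows instead of a limited sample
  )
}

# Usage (needs interactive authentication and a real view ID;
# the ID and dates below are placeholders):
# googleAnalyticsR::ga_auth()
# ga <- fetch_seo_landing_pages(123456789, "2021-01-01", "2021-03-31")
```

The result should be a data frame with one row per landing page, with a landingPagePath column and a sessions column.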

3. Fuuuuu...sion!

First, you need to define the common ground between the two datasets. On the crawl side we have the Url column, and on the GA side we have landingPagePath.

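These two don't match as they are: Url holds the full URL, hostname included, while landingPagePath holds only the path. So we need to make a conversion. We'll remove the hostname from Url using the path() function from the urltools package.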
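Here is a small sketch of that conversion. It uses a made-up two-row INDEX so it can run on its own, and "/some/page" is a hypothetical page. As far as I know, path() returns the path without its leading slash and gives NA for the homepage, whereas GA paths start with "/", so the sketch handles both cases explicitly.

```r
library(urltools)

# Made-up stand-in for the crawl output; skip this if you have a real INDEX
INDEX <- data.frame(
  Url = c("https://www.rforseo.com/", "https://www.rforseo.com/some/page"),
  stringsAsFactors = FALSE
)

# Extract the path part of each URL (no hostname, no query string)
url_path <- path(INDEX$Url)

# The homepage has no path component, so treat a missing path as empty
url_path[is.na(url_path)] <- ""

# GA landing page paths start with "/", so add it back
INDEX$landingPagePath <- paste0("/", url_path)
```

Naming the new column landingPagePath, the same as on the GA side, keeps the merge call short. Note that path() drops any query string, so pages that GA reports with parameters would not match without extra cleaning.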

And now we can merge.
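The merge itself could look like the sketch below. Both data frames are toy data invented for illustration (the paths, depths and session counts are not real), and the Level column name for crawl depth is an assumption about the crawl output. The sketch uses a left join so that crawled pages without any SEO traffic are kept rather than dropped.

```r
# Toy crawl data, already carrying the landingPagePath column
INDEX <- data.frame(
  landingPagePath = c("/", "/a", "/b"),
  Level = c(0, 1, 1),  # assumed name of the crawl depth column
  stringsAsFactors = FALSE
)

# Toy Google Analytics data: "/b" received no SEO sessions, so it is absent
ga <- data.frame(
  landingPagePath = c("/", "/a"),
  sessions = c(120, 30),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every crawled page, even those missing from the GA data
crawl_ga <- merge(INDEX, ga, by = "landingPagePath", all.x = TRUE)

# Pages with no match get NA sessions; treat them as zero traffic
crawl_ga$sessions[is.na(crawl_ga$sessions)] <- 0

print(crawl_ga)
```

Using all = TRUE instead would also keep pages that GA knows about but the crawler never reached, which can be a handy way to spot orphan pages.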

That's it, really. Let's display the data.
