Grab Google Suggest Search Queries using R
To make things easier, I've created two dedicated functions:
  • getGSQueries: grabs the suggested queries for a single search term
  • suggestGSQueries: calls getGSQueries several times and merges each request's results
Just copy and paste these two functions into your RStudio console:
getGSQueries <- function(search_query, code_lang) {
  # install any missing dependency, then load both packages
  packages <- c("XML", "httr")
  missing_packages <- setdiff(packages, rownames(installed.packages()))
  if (length(missing_packages) > 0) {
    install.packages(missing_packages)
  }
  library(httr)
  library(XML)

  # percent-encode the seed, including reserved characters such as & and +
  query <- URLencode(search_query, reserved = TRUE)
  url <- paste0(
    "http://suggestqueries.google.com/complete/search?output=toolbar&hl=",
    code_lang,
    "&q=",
    query
  )
  # message(url)

  # use GET method
  req <- GET(url)
  # message(req$status_code)

  # extract the response body as plain text, then parse it as XML
  xml <- content(req, as = "text")
  doc <- xmlParse(xml, asText = TRUE)

  # extract the data attributes from
  # <CompleteSuggestion><suggestion data="XXXXXX"/></CompleteSuggestion>
  suggestions <- xpathSApply(doc, "//CompleteSuggestion/suggestion", xmlGetAttr, "data")

  # always return a character vector, even when there is no suggestion
  return(as.character(unlist(suggestions)))
}
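If you want to see what the parsing step does without sending anything to Google, here is a minimal sketch. The XML string is a hand-written sample that follows the <CompleteSuggestion> structure mentioned in the comments above; it is an assumption about the shape of the response, not real Google output, and the variable names are only there for illustration.

```r
library(XML)

# hand-written sample (an assumption about the response shape), not real Google output
sample_xml <- paste0(
  '<?xml version="1.0"?>',
  '<toplevel>',
  '<CompleteSuggestion><suggestion data="covid test"/></CompleteSuggestion>',
  '<CompleteSuggestion><suggestion data="covid symptoms"/></CompleteSuggestion>',
  '</toplevel>'
)

# asText = TRUE tells xmlParse that the input is XML content, not a file name
doc <- xmlParse(sample_xml, asText = TRUE)

# read the "data" attribute of every <suggestion> node
suggestions <- xpathSApply(doc, "//CompleteSuggestion/suggestion", xmlGetAttr, "data")
print(suggestions)

# when no node matches, the result has length 0
empty_doc <- xmlParse("<toplevel/>", asText = TRUE)
no_suggestions <- xpathSApply(empty_doc, "//CompleteSuggestion/suggestion", xmlGetAttr, "data")
print(length(no_suggestions))
```

The empty case is worth keeping in mind: as far as I can tell, the empty result is a list rather than a character vector, which is why it is safer to wrap the results in unlist() before putting them in a data frame.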
suggestGSQueries <- function(search_query, code_lang, level) {
  if (length(search_query) == 1) {
    # level 1: suggestions for the seed itself
    all_suggestion <- getGSQueries(search_query, code_lang)
    message("level 1")
    if (level > 1) {
      # level 2: seed followed by each letter ('covid a', 'covid b', ...)
      for (l in letters) {
        message("level 2 ", l)
        Sys.sleep(runif(1, 0, 2))
        local_suggestion <- getGSQueries(paste0(search_query, " ", l), code_lang)
        all_suggestion <- c(all_suggestion, local_suggestion)
      }
      if (level > 2) {
        # level 3: seed followed by each pair of letters ('covid aa', 'covid ab', ...)
        for (l1 in letters) {
          for (l2 in letters) {
            Sys.sleep(1 + runif(1, 0, 9))
            message("level 3 ", l1, l2)
            local_suggestion <- getGSQueries(paste0(search_query, " ", l1, l2), code_lang)
            all_suggestion <- c(all_suggestion, local_suggestion)
          }
        }
      }
    }
  } else {
    # vector input: one request per element, the level is ignored
    all_suggestion <- c()
    for (word in seq_along(search_query)) {
      # pause between requests, but not before the first one
      if (word > 1) Sys.sleep(1 + runif(1, 0, 9))
      message(word, " ", search_query[word])
      all_suggestion <- c(all_suggestion, getGSQueries(as.character(search_query[word]), code_lang))
    }
  }
  # return the merged results visibly and without duplicates, whatever the input type
  return(unique(all_suggestion))
}
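To make the merging step easier to follow, here is a small offline sketch of the same idea. merge_suggestions and fake_fetch are hypothetical helpers I made up for this illustration (they are not part of the two functions above): fake_fetch stands in for getGSQueries and returns canned answers, so nothing is sent to Google.

```r
# hypothetical helper: run one fetch per query, then merge and de-duplicate
merge_suggestions <- function(queries, fetch) {
  # coerce every result to a character vector, so an empty result cannot turn the merge into a list
  results <- lapply(queries, function(q) as.character(unlist(fetch(q))))
  unique(unlist(results))
}

# hypothetical stand-in for getGSQueries, with made-up canned answers
fake_fetch <- function(q) {
  canned <- list(
    "covid"   = c("covid test", "covid symptoms"),
    "covid a" = c("covid app", "covid test"),
    "covid b" = list()  # a query with no suggestion at all
  )
  canned[[q]]
}

merged <- merge_suggestions(c("covid", "covid a", "covid b"), fake_fetch)
print(merged)
```

"covid test" comes back from two different requests but only shows up once in the merged result, and the empty answer for "covid b" is simply dropped.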
This is how you can use it:
kwd <- suggestGSQueries('covid', 'en', 2)
View(as.data.frame(unlist(kwd)))
The first parameter is the seed keyword, the second one is the language code (Google's hl, or host language, parameter), and the last one is the level of detail (1, 2 or 3).
Level 1 just grabs the suggestion list for the seed itself. Level 2 also grabs the suggestions you get when you add another letter ('covid a', 'covid b', 'covid c', ...). Level 3, which I don't recommend, also adds every pair of letters ('covid aa', 'covid ab', ...).
It's also possible to pass a vector instead of a string. In this example, we ask for Google suggestions for each of the results from the previous step. It will drastically increase the keyword list and... it might take a little bit of time too :)
deeper_kwd <- suggestGSQueries(kwd, 'en', 1)
View(as.data.frame(unlist(deeper_kwd)))
Use these functions with caution because they can send a lot of queries to Google and you might get your IP banned.
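To get a feel for how many queries each level sends, here is a rough sketch. expand_queries and expected_wait_seconds are hypothetical helpers written for this illustration only; they rebuild the list of queries that suggestGSQueries would send for a single seed and add up the average pauses from its Sys.sleep calls, without contacting Google. The time spent on the requests themselves is not counted.

```r
# hypothetical helper: the queries sent for one seed at a given level
expand_queries <- function(seed, level) {
  queries <- seed
  if (level > 1) {
    # seed followed by each letter: 'covid a', 'covid b', ...
    queries <- c(queries, paste0(seed, " ", letters))
  }
  if (level > 2) {
    # seed followed by each pair of letters, in the order aa, ab, ac, ...
    pairs <- as.vector(t(outer(letters, letters, paste0)))
    queries <- c(queries, paste0(seed, " ", pairs))
  }
  queries
}

# hypothetical helper: average total pause, in seconds, for one seed
expected_wait_seconds <- function(level) {
  wait <- 0
  if (level > 1) wait <- wait + 26 * 1      # runif(1, 0, 2) averages 1 second
  if (level > 2) wait <- wait + 26^2 * 5.5  # 1 + runif(1, 0, 9) averages 5.5 seconds
  wait
}

for (level in 1:3) {
  message(
    "level ", level, ": ",
    length(expand_queries("covid", level)), " requests, about ",
    expected_wait_seconds(level), " seconds of pauses"
  )
}
```

That gives 1 request for level 1, 27 for level 2 and 703 for level 3, with about 3744 seconds (roughly an hour) of pauses for level 3 alone, which is the main reason I don't recommend it.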