To make things easier, I've created two dedicated functions:
getGSQueries: sends one request and returns the suggestions for a single query
suggestGSQueries: calls getGSQueries repeatedly and merges each request's results
Just copy and paste these two functions into your RStudio console:
getGSQueries <- function(search_query, code_lang) {
  # Install any missing dependency, then load both packages
  packages <- c("XML", "httr")
  missing_packages <- setdiff(packages, rownames(installed.packages()))
  if (length(missing_packages) > 0) {
    install.packages(missing_packages)
  }
  library(httr)
  library(XML)

  # Percent-encode the query, including reserved characters such as & and +
  query <- URLencode(search_query, reserved = TRUE)
  url <- paste0(
    "http://suggestqueries.google.com/complete/search?output=toolbar&hl=",
    code_lang,
    "&q=",
    query
  )
  # message(url)

  # Send the request with the GET method
  req <- GET(url)
  # message(req$status_code)

  # Read the response body as text, then parse it as XML
  xml <- content(req, as = "text")
  doc <- xmlParse(xml, asText = TRUE)

  # Extract the "data" attribute from each
  # <CompleteSuggestion><suggestion data="XXXXXX"/></CompleteSuggestion>
  suggestions <-
    xpathSApply(doc, "//CompleteSuggestion/suggestion", xmlGetAttr, "data")
  # print(suggestions)
  return(suggestions)
}

suggestGSQueries <- function(search_query, code_lang, level) {
  if (length(search_query) == 1) {
    # Single seed keyword. Level 1: query the seed itself
    message("level 1")
    all_suggestion <- getGSQueries(search_query, code_lang)
    if (level > 1) {
      # Level 2: query the seed followed by each letter of the alphabet
      for (l in letters) {
        message("level 2 ", l)
        Sys.sleep(runif(1, 0, 2))
        local_suggestion <-
          getGSQueries(paste0(search_query, " ", l), code_lang)
        all_suggestion <- c(all_suggestion, local_suggestion)
      }
      if (level > 2) {
        # Level 3: query the seed followed by every two-letter combination
        for (l1 in letters) {
          for (l2 in letters) {
            message("level 3 ", l1, l2)
            Sys.sleep(1 + runif(1, 0, 9))
            local_suggestion <-
              getGSQueries(paste0(search_query, " ", l1, l2), code_lang)
            all_suggestion <- c(all_suggestion, local_suggestion)
          }
        }
      }
    }
    # Drop the duplicates collected across requests
    all_suggestion <- unique(all_suggestion)
  } else {
    # Vector of keywords: query each element in turn (level is ignored here)
    message(1, " ", search_query[1])
    all_suggestion <- getGSQueries(as.character(search_query[1]), code_lang)
    for (word in seq_along(search_query)[-1]) {
      Sys.sleep(1 + runif(1, 0, 9))
      message(word, " ", search_query[word])
      all_suggestion <-
        c(all_suggestion, getGSQueries(as.character(search_query[word]), code_lang))
    }
  }
  # Return the merged suggestions explicitly in both cases
  return(all_suggestion)
}
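To see what the parsing step in getGSQueries does without sending any request, here is a minimal sketch that runs the same XPath expression on a hand-written response. The sample suggestions are made up for illustration, and the <toplevel> root element is an assumption; only the <CompleteSuggestion><suggestion data="..."/></CompleteSuggestion> structure comes from the comment in the function.

```r
library(XML)

# Hand-written sample (hypothetical values); the <toplevel> root is an assumption
sample_xml <- paste0(
  '<?xml version="1.0"?>',
  '<toplevel>',
  '<CompleteSuggestion><suggestion data="covid test"/></CompleteSuggestion>',
  '<CompleteSuggestion><suggestion data="covid vaccine"/></CompleteSuggestion>',
  '</toplevel>'
)

# Parse the string as XML, then read the "data" attribute of every suggestion node
doc <- xmlParse(sample_xml, asText = TRUE)
suggestions <-
  xpathSApply(doc, "//CompleteSuggestion/suggestion", xmlGetAttr, "data")
print(suggestions)
```

Each suggestion node contributes one string, so the result can be merged with c() and deduplicated with unique(), which is what suggestGSQueries does.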
This is how you can use it:
kwd <- suggestGSQueries("covid", "en", 2)
View(as.data.frame(unlist(kwd)))
The first parameter is the seed keyword, the second one is the language (or host language), and the last one is the level of detail (1, 2 or 3).
Level 1 just grabs the first suggestion list. Level 2 also grabs the suggestions you get when you add another letter ('covid a', 'covid b', 'covid c', ...). Level 3, which I don't recommend, also adds two letters ('covid aa', 'covid ab', ...).
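To make the three levels concrete, here is a small sketch that lists the queries sent for a seed keyword without contacting Google. The helper expand_seed is hypothetical (it is not part of the two functions); it only mirrors the loops over letters in suggestGSQueries.

```r
# Hypothetical helper: lists the queries sent for a seed keyword at a given level
expand_seed <- function(seed, level) {
  queries <- seed                                  # level 1: the seed alone
  if (level > 1) {
    queries <- c(queries, paste(seed, letters))    # level 2: "seed a" ... "seed z"
  }
  if (level > 2) {
    # level 3: "seed aa", "seed ab", ... "seed zz" (first letter varies slowest)
    pairs <- paste0(rep(letters, each = 26), rep(letters, times = 26))
    queries <- c(queries, paste(seed, pairs))
  }
  queries
}

head(expand_seed("covid", 2), 4)
# "covid" "covid a" "covid b" "covid c"

sapply(1:3, function(level) length(expand_seed("covid", level)))
# 1 27 703
```

Level 3 sends 703 requests for a single seed, which is why it is rarely worth it.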
It's also possible to pass a vector instead of a string (the level is then ignored). In this example, we ask for Google suggestions for each of the results from the previous step.
It will drastically increase the keyword list and... it might take a little bit of time too :)
deeper_kwd <- suggestGSQueries(kwd, "en", 1)
View(as.data.frame(unlist(deeper_kwd)))
Use these functions with caution because they can send a lot of queries to Google and you might get your IP banned.
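Before launching a run, it can help to estimate how many requests it will send and how long the pauses will last. The sketch below is only a rough estimate: the two helpers are hypothetical, the averages are derived from the Sys.sleep calls in suggestGSQueries, and the time taken by the requests themselves is not counted.

```r
# Hypothetical helper: requests sent and average pause time for one seed keyword
estimate_seed_cost <- function(level) {
  requests <- 1
  pause_seconds <- 0
  if (level > 1) {
    requests <- requests + 26
    pause_seconds <- pause_seconds + 26 * 1       # runif(1, 0, 2) averages 1 s
  }
  if (level > 2) {
    requests <- requests + 26^2
    pause_seconds <- pause_seconds + 26^2 * 5.5   # 1 + runif(1, 0, 9) averages 5.5 s
  }
  c(requests = requests, pause_minutes = pause_seconds / 60)
}

# Hypothetical helper: same estimate when a vector of n keywords is passed
estimate_vector_cost <- function(n) {
  c(requests = n, pause_minutes = (n - 1) * 5.5 / 60)
}

estimate_seed_cost(3)      # 703 requests, about 62 minutes of pauses
estimate_vector_cost(200)  # 200 requests, about 18 minutes of pauses
```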