Grab Google Suggest Search Queries using R
To make things easier, I've created two dedicated functions:
  • getGSQueries: sends one request and grabs the queries it returns
  • suggestGSQueries: runs several requests and merges each request's results
Just copy and paste these two functions into your RStudio console:
getGSQueries <- function(search_query, code_lang) {
  # install any missing dependency, then load both packages
  packages <- c("XML", "httr")
  missing_packages <- setdiff(packages, rownames(installed.packages()))
  if (length(missing_packages) > 0) {
    install.packages(missing_packages)
  }
  library(httr)
  library(XML)

  # percent-encode the query, including reserved characters such as & and +
  query <- URLencode(search_query, reserved = TRUE)
  url <-
    paste0(
      "http://suggestqueries.google.com/complete/search?output=toolbar&hl=",
      code_lang,
      "&q=",
      query
    )

  # message(url)
  # use GET method
  req <- GET(url)

  # message(req$status_code)

  # extract the XML as text so that xmlParse() receives a string
  xml <- content(req, as = "text")
  # parse xml
  doc <- xmlParse(xml, asText = TRUE)

  # extract attributes from
  # <CompleteSuggestion><suggestion data="XXXXXX"/></CompleteSuggestion>
  suggestions <-
    xpathSApply(doc, "//CompleteSuggestion/suggestion", xmlGetAttr, "data")

  # print results
  # print(suggestions)

  # always return a character vector, even when nothing matched
  return(as.character(unlist(suggestions)))
}
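To see what the XPath step does without calling Google, here is a minimal sketch that parses a hand-written response. The sample XML is an assumption modelled on the structure shown in the function's comment; the `<toplevel>` wrapper and the suggestion values are made up for illustration.

```r
library(XML)

# Hypothetical sample response, made up for illustration; it follows the
# <CompleteSuggestion><suggestion data="..."/></CompleteSuggestion> shape.
sample_xml <- paste0(
  '<?xml version="1.0"?>',
  '<toplevel>',
  '<CompleteSuggestion><suggestion data="covid test"/></CompleteSuggestion>',
  '<CompleteSuggestion><suggestion data="covid symptoms"/></CompleteSuggestion>',
  '</toplevel>'
)

# Parse the string (asText = TRUE) and read the "data" attribute
# of every <suggestion> node.
doc <- xmlParse(sample_xml, asText = TRUE)
sample_suggestions <-
  xpathSApply(doc, "//CompleteSuggestion/suggestion", xmlGetAttr, "data")
print(sample_suggestions)
```

Each `<suggestion>` node contributes one element to the result, so the two nodes above give a vector of two strings.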
suggestGSQueries <- function(search_query, code_lang, level) {
  if (length(search_query) == 1) {
    # single seed keyword: expand it according to `level`
    all_suggestion <- getGSQueries(search_query, code_lang)
    message("level 1")

    if (level > 1) {
      # level 2: append each letter of the alphabet to the seed
      for (l in letters) {
        message("level 2 ", l)
        Sys.sleep(runif(1, 0, 2))
        local_suggestion <-
          getGSQueries(paste0(search_query, " ", l), code_lang)
        all_suggestion <- c(all_suggestion, local_suggestion)
      }

      if (level > 2) {
        # level 3: append every two-letter combination (676 requests)
        for (l1 in letters) {
          for (l2 in letters) {
            Sys.sleep(1 + runif(1, 0, 9))
            message("level 3 ", l1, l2)
            local_suggestion <-
              getGSQueries(paste0(search_query, " ", l1, l2), code_lang)
            all_suggestion <- c(all_suggestion, local_suggestion)
          }
        }
      }
    }

    # drop duplicates and return the merged list
    unique(all_suggestion)
  } else {
    # vector of keywords: one request per element; `level` is not used here
    message(1, " ", search_query[1])
    all_suggestion <- getGSQueries(search_query[1], code_lang)
    for (word in 2:length(search_query)) {
      Sys.sleep(1 + runif(1, 0, 9))
      message(word, " ", search_query[word])
      all_suggestion <- c(all_suggestion, getGSQueries(search_query[word], code_lang))
    }

    all_suggestion
  }
}
This is how you can use it:
kwd <- suggestGSQueries('covid', 'en', 2)

View(as.data.frame(unlist(kwd)))
The first parameter is the seed keyword, the second one is the language (or host language, the hl parameter in the URL), and the last one is the level of detail (1, 2 or 3).
Level 1 will just grab the first suggestion list. Level 2 will also grab the suggestions you get by adding one letter ('covid a', 'covid b', 'covid c', ...). Level 3, which I don't recommend, will add two letters ('covid aa', 'covid ab', ...).
It's also possible to pass a vector instead of a string. In this example, we ask for Google suggestions for each of the results from the previous step.
It will drastically increase the keyword list and... it might take a little bit of time too :)
deeper_kwd <- suggestGSQueries(kwd, 'en', 1)

View(as.data.frame(unlist(deeper_kwd)))
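To get a feel for what each level costs before running it, here is a rough sketch that counts the requests and the average pause time for a single seed keyword. The helper names `requests_for_level` and `expected_sleep_seconds` are hypothetical, and the numbers only use the mean of the random `Sys.sleep()` delays in suggestGSQueries, so real runs will vary.

```r
# Number of requests suggestGSQueries() sends for a single seed keyword.
requests_for_level <- function(level) {
  n <- 1                                      # level 1: the seed itself
  if (level > 1) n <- n + length(letters)     # level 2: seed + one letter
  if (level > 2) n <- n + length(letters)^2   # level 3: seed + two letters
  n
}

# Expected total pause in seconds, using the mean of each Sys.sleep() call:
# runif(1, 0, 2) averages 1 s, and 1 + runif(1, 0, 9) averages 5.5 s.
expected_sleep_seconds <- function(level) {
  s <- 0
  if (level > 1) s <- s + length(letters) * 1
  if (level > 2) s <- s + length(letters)^2 * 5.5
  s
}

sapply(1:3, requests_for_level)      # 1, 27, 703
sapply(1:3, expected_sleep_seconds)  # 0, 26, 3744
```

So level 3 means about an hour of pauses alone, before counting the requests themselves. When you pass a vector of n keywords, the function sends n requests and pauses about 5.5 seconds on average before each one after the first.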
Use these functions with caution because they can send a lot of queries to Google and you might get your IP banned.
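If a request fails partway through a long run, the whole loop stops and you lose what was collected so far. One possible way around that, sketched here as an assumption and not as part of the original functions, is a small wrapper that turns an error into an empty result. `safe_suggestions` and `fake_fetch` are hypothetical names; `fake_fetch` only stands in for a real call so the example runs offline.

```r
# Hypothetical helper: run fetch(query) and fall back to an empty
# character vector if it raises an error, so a loop can continue.
safe_suggestions <- function(fetch, query) {
  tryCatch(
    fetch(query),
    error = function(e) {
      message("request failed for '", query, "': ", conditionMessage(e))
      character(0)
    }
  )
}

# Stand-in fetcher for illustration only; in practice you might pass
# function(q) getGSQueries(q, "en") instead.
fake_fetch <- function(query) {
  if (query == "bad") stop("simulated failure")
  paste(query, c("a", "b"))
}

results <- c(
  safe_suggestions(fake_fetch, "covid"),
  safe_suggestions(fake_fetch, "bad")
)
print(results)
```

The failing query contributes nothing to `results`, while the successful one is kept.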