unable to get splashr to render content
Created by: david-jankoski
Hey Bob,
Thank you for making this package (and all your others). I'm a big fan and use it regularly to scrape stuff. It's so much easier and faster than the Selenium route. This might be the wrong channel for this question, so please feel free to ignore or close this issue.
Here's my problem:
Recently Spencer Graves asked on the R-help mailing list how to scrape this site:
https://www.battleforthenet.com/scoreboard/
At first I thought it would be a breeze with splashr, so I tried:
library("splashr")
url <- "https://www.battleforthenet.com/scoreboard"
page <- splashr::render_html(url = url, wait = 10)
res <-
  page %>%
  rvest::html_nodes("#senate") %>%
  rvest::html_nodes(".politicians")
# returns
> {xml_nodeset (0)}
But this doesn't seem to work. To see what's going on, I looked at the structure of the node:
res <-
  page %>%
  rvest::html_nodes("#senate") %>%
  xml2::html_structure()
# returns
> [[1]]
<div#senate .politicians>
{text}
<h2>
{text}
<em>
{text}
{text}
{text}
<team-legend>
<p>
{text}
{text}
<politician-card [v-for, :politician, v-if]>
I'm not very knowledgeable about web development, so I apologize if this is something obvious. I'm trying to understand what's going on and why this does not work with splashr. My best guess is that some JavaScript is populating the page after load and the real content never makes it into what splashr returns: the only thing left where the cards should be is a single <politician-card> element that still carries its v-for and v-if attributes.
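To make that guess concrete, here is a minimal sketch that needs no Splash server. It parses a small inline HTML string with xml2; the string is a hypothetical stand-in that only mimics the structure printed above (the attribute values and text are made up for illustration). It checks two things that can be read off the html_structure() output: the "politicians" class sits on the #senate div itself rather than on a descendant, and the politician-card element still carries v-for / v-if, which are Vue.js template directives, so it looks like a template placeholder rather than rendered content.

```r
library(xml2)

# Hypothetical stand-in for the rendered page; attribute values are invented.
html <- '
<div id="senate" class="politicians">
  <h2>Senate <em>scoreboard</em></h2>
  <team-legend></team-legend>
  <p>Some text</p>
  <politician-card v-for="p in senators" :politician="p" v-if="p"></politician-card>
</div>'

page <- read_html(html)

# The node with id "senate" is the one that has class "politicians".
senate <- xml_find_first(page, "//*[@id='senate']")
senate_class <- xml_attr(senate, "class")

# Searching *inside* #senate for a descendant with class "politicians"
# (what the two chained html_nodes() calls do) therefore finds nothing.
nested <- xml_find_all(
  senate,
  ".//*[contains(concat(' ', normalize-space(@class), ' '), ' politicians ')]"
)

# Elements that still carry a v-for attribute are unrendered Vue templates.
cards <- xml_find_all(page, "//politician-card[@v-for]")

print(senate_class)
print(length(nested))
print(length(cards))
```

If the same checks on the real splashr output find a politician-card element with v-for still attached, that would suggest the Vue templates had not been expanded at the moment the HTML was captured. That is only an inference from the structure, not something I have confirmed.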
I would be thankful if you could just point me in some direction on how to do this.
Thanks again for all your work!
david