Scraping time... Selenium too slow!
Level 5 - SHARP
Do you have a friend who likes betting and would like to sharpen their action? Help them on their path by sharing the BowTiedBettor Substack. Win and help win!
Welcome Degen Gambler!
Today we will extend our current web scraping knowledge by discussing new ways of gathering online information. In our first scraping post, “Build your first odds scraper”, we turned to Selenium, an open-source umbrella project for a range of tools and libraries aimed at supporting browser automation, to solve our problems. However, Selenium has two major disadvantages:
It is slow. Selenium drives a real browser, so every page has to load and render before any data can be read.
It is unintuitive. You are basically extracting data from arbitrary HTML sections containing elements with more or less random names. This makes error checking and updating the code unnecessarily difficult. Example below.
today_object = driver.find_element(By.CLASS_NAME, "_79bb0")
games = today_object.find_elements(By.CLASS_NAME, "f9aec._0c119.bd9c6")
for game in games:
    team_names = game.find_elements(By.CLASS_NAME, "_6548b")
    ...
What if instead we could find a way to get around this Selenium stuff, and do something more similar to what our web browsers are doing when we are visiting a website? After all, the data we see is derived from somewhere, right? Rarely do things just magically appear.
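To make the idea concrete, here is a rough sketch of what "doing what the browser does" could look like. It assumes the site fetches its odds from a JSON endpoint in the background, which is not true of every site. The URL, the payload layout (`games`, `home_team`, `away_team`, `odds`) and the sample data are all made up for illustration; the real ones would have to be looked up in the Network tab of your browser's developer tools.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical endpoint, used only for illustration. The real URL would be
# found by watching the requests your browser makes while the page loads.
ODDS_URL = "https://example.com/api/odds/today"


def fetch_json(url, timeout=10.0):
    """Download and decode a JSON document with a plain HTTP request."""
    request = Request(
        url,
        headers={"User-Agent": "Mozilla/5.0", "Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_games(payload):
    """Pull teams and odds out of the payload (field names are assumptions)."""
    games = []
    for game in payload.get("games", []):
        games.append(
            {
                "home": game["home_team"],
                "away": game["away_team"],
                "odds": game["odds"],
            }
        )
    return games


# Made-up sample response so the parsing step can be tried offline.
SAMPLE_RESPONSE = """
{"games": [
  {"home_team": "Arsenal", "away_team": "Chelsea",
   "odds": {"1": 2.10, "X": 3.40, "2": 3.60}}
]}
"""

games = extract_games(json.loads(SAMPLE_RESPONSE))
print(games)
```

Against a real site you would call `extract_games(fetch_json(ODDS_URL))` with the actual endpoint. Note how the fields have readable names instead of class names like "_79bb0", and no browser has to start up at all.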