Scraping Google Reviews in Python with Livescraper

Introduction: Scraping Google Reviews Using Livescraper

Scraping Google reviews is challenging because the review content is loaded dynamically through JavaScript. The official Google Places API returns at most 5 reviews per business listing, which is often insufficient, so developers turn to scraping methods that can extract all of a listing’s reviews. Among the available tools, Livescraper is an efficient and easy-to-use option: it simplifies scraping Google reviews, among other data types, without requiring you to set up and maintain complex scraping infrastructure. In this post, we’ll walk you through how to use Livescraper to scrape Google reviews effectively.

Install Livescraper and Other Necessary Packages

To get started, you’ll need to install Livescraper. You may also need a supporting package such as Parsel to parse the HTML. The commands below install both.
pip install livescraper
pip install parsel  # to extract data from HTML using XPath or CSS selectors

Start the Browser

Livescraper uses a headless browser to render dynamic pages, just like Selenium does. However, the setup and execution are simpler. To get started, you’ll first need to initialize the browser.
from livescraper import Browser

# Initialize Livescraper browser
browser = Browser(driver_path='./chromedriver')  # Provide the path to your ChromeDriver
browser.start()  # Start the browser

Download All Reviews Page

Once the browser is started, you’re ready to open Google Maps pages and scrape the reviews. To do this, use the following code to navigate to any Google Maps listing URL.
# Define the URL of the Google Maps place
url = 'https://www.google.com/maps/place/Central+Park+Zoo/@40.7712318,-73.9674707,15z/data=!3m1!5s0x89c259a1e735d943:0xb63f84c661f84258'

# Open the page
browser.get(url)
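
Before opening a URL in the browser, it can help to confirm that it really points at a place listing. The snippet below is a minimal sketch using only the standard library. The helper name `describe_place_url` is hypothetical, and the `/maps/place/<name>/@<lat>,<lng>` layout is inferred from the example URLs in this post rather than from any documented Google contract, so treat it as an assumption.

```python
import re
from urllib.parse import unquote_plus, urlparse


def describe_place_url(url):
    """Return the place name and coordinates encoded in a Google Maps place URL.

    Assumes the path looks like /maps/place/<name>/@<lat>,<lng>,... (inferred
    from the example URLs in this post, not a documented format).
    """
    path = urlparse(url).path
    match = re.match(
        r"/maps/place/([^/]+)/@(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)", path
    )
    if match is None:
        raise ValueError(f"Not a Google Maps place URL: {url}")
    name, lat, lng = match.groups()
    # Place names are URL-encoded with '+' standing in for spaces
    return {"name": unquote_plus(name), "lat": float(lat), "lng": float(lng)}


url = 'https://www.google.com/maps/place/Central+Park+Zoo/@40.7712318,-73.9674707,15z/data=!3m1!5s0x89c259a1e735d943:0xb63f84c661f84258'
info = describe_place_url(url)
print(info)
```

A URL that does not match the assumed layout raises a `ValueError`, so a batch job can skip it instead of spending a browser page load on it.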

Parse Reviews

Once the page is loaded, you can start scraping the review data. Livescraper makes it easy to parse the HTML content and extract review information.
from parsel import Selector

# Get the page content
page_content = browser.page_source
selector = Selector(page_content)

# Parse the reviews
reviews = []

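# Note: Google changes these class names often; adjust the selectors to the current markup
for review in selector.xpath('//div[@class="section-review"]'):
    # The rating is read from an aria-label that contains "stars"; it may be missing
    rating_label = review.xpath('.//span[contains(@aria-label, "stars")]/@aria-label').get()
    reviews.append({
        'author': review.xpath('.//span[@class="section-review-title"]/text()').get(),
        'rating': rating_label.replace('stars', '').strip() if rating_label else None,
        'review_text': review.xpath('.//span[@class="section-review-text"]/text()').get(),
    })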

# Print the results
for review in reviews:
    print(review)
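
The rating extracted above is still a string. The sketch below shows one way to turn such a label into a number while tolerating missing values. The helper name `parse_rating` and the label formats it handles (for example "5 stars" or "4,5 stars") are assumptions for illustration; check them against the markup you actually receive.

```python
import re


def parse_rating(aria_label):
    """Extract the numeric rating from a label such as ' 5 stars ' or '4,5 stars'.

    Returns None when the label is missing or contains no number.
    """
    if not aria_label:
        return None
    match = re.search(r"\d+(?:[.,]\d+)?", aria_label)
    if match is None:
        return None
    # Some locales use a decimal comma; normalize it before converting
    return float(match.group().replace(",", "."))


print(parse_rating(" 5 stars "))
print(parse_rating("4,5 stars"))
print(parse_rating(None))
```

Returning `None` instead of raising keeps one malformed review from aborting the whole parsing loop.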

Stop the Browser

It’s essential to stop the browser once your scraping task is complete. Use the following code to close the browser after scraping:
# Stop the browser
browser.quit()
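
If the scraping code raises an exception, the call to quit() may never run and the browser process can be left behind. One way to guard against that is a small context manager, sketched below. It assumes only that the browser object exposes the start() and quit() methods used in this post; `managed_browser` and `FakeBrowser` are hypothetical names, and `FakeBrowser` is a stand-in so the example runs without a real browser.

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(browser):
    """Start the browser, hand it to the caller, and always quit it afterwards."""
    browser.start()
    try:
        yield browser
    finally:
        # Runs even if the body of the with-block raises
        browser.quit()


class FakeBrowser:
    """Stand-in used only to demonstrate the pattern; not part of Livescraper."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")

    def quit(self):
        self.calls.append("quit")


fake = FakeBrowser()
try:
    with managed_browser(fake):
        raise RuntimeError("simulated scraping failure")
except RuntimeError:
    pass

print(fake.calls)  # ['start', 'quit']
```

With a real browser object, the scraping code would go inside the with-block in place of the simulated failure.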

Multiprocessing and Other Recommendations

To scale your scraping efforts, consider using multiprocessing. Note that each browser instance will keep roughly one CPU core busy, so make sure the machine has enough cores and memory for the number of processes you plan to run. If you’re scraping at a large scale, also use proxies: this helps you avoid being blocked by Google for sending frequent requests from the same IP address.
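
As a rough sketch of how these two recommendations could fit together, the snippet below sizes a worker pool from the number of CPU cores and pairs each URL with a proxy in round-robin order. The helper names, the placeholder URLs, and the proxy addresses are all hypothetical, and the "leave one core free" rule is a starting assumption rather than a measured optimum.

```python
import os
from itertools import cycle


def worker_count(n_jobs, cpu_count=None):
    """Pick a pool size: about one browser per core, one core left free,
    and never more workers than there are jobs."""
    cpus = cpu_count or os.cpu_count() or 1
    return max(1, min(n_jobs, cpus - 1))


def build_jobs(urls, proxies):
    """Pair each URL with a proxy, cycling through the proxy list round-robin."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    return list(zip(urls, cycle(proxies)))


# Placeholder inputs for illustration only
urls = [f"https://www.google.com/maps/place/example-{i}" for i in range(5)]
proxies = ["http://proxy-a.example:8080", "http://proxy-b.example:8080"]

jobs = build_jobs(urls, proxies)
for url, proxy in jobs:
    print(url, "->", proxy)
print("workers:", worker_count(len(jobs)))
```

Each (url, proxy) pair can then be handed to a worker function that starts its own browser, for example through `concurrent.futures.ProcessPoolExecutor(max_workers=worker_count(len(jobs)))` and its `map` method. How a proxy is passed to the browser depends on the browser setup and is not shown here.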

The Easiest Way of Scraping Google Reviews with Livescraper

Although scraping Google reviews using browser emulation provides great flexibility, it can be costly in terms of resources, especially for large-scale scraping operations. Maintaining a scraper that keeps up with frequent changes to the Google website can also be time-consuming. If you want an even easier solution, Livescraper offers an SDK and API that make it easy to access Google reviews without the hassle of browser setup or worrying about proxies.

Scrape Reviews in Python Using Livescraper SDK

Livescraper’s SDK provides a straightforward method to fetch reviews directly from Google Maps without the need for handling dynamic content manually. Here’s how you can use the SDK to scrape reviews.

Install the SDK:
pip install livescraper-sdk

Get Your API Key: Visit the Livescraper platform and retrieve your API key from your profile page.

Use the SDK to Scrape Reviews:
from livescraper_sdk import ApiClient

# Initialize the API client with your API key
api_client = ApiClient(api_key='YOUR_API_KEY')

# Define the Google Maps URL or place ID
place_url = 'https://www.google.com/maps/place/Do+or+Dive+Bar/@40.6867831,-73.9570104,17z/'

# Fetch reviews using the API
reviews = api_client.get_reviews(
    place_url=place_url,
    language='en',
    limit=100  # Set a limit on the number of reviews
)

# Print reviews
for review in reviews['reviews_data']:
    print(f"Author: {review['author_title']}")  # the sample response uses 'author_title'
    print(f"Rating: {review['review_rating']}")
    print(f"Review: {review['review_text']}")
    print(f"Link: {review['review_link']}")
    print("-" * 80)
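
To keep the results instead of only printing them, the reviews can be written to CSV. The following is a minimal sketch using the standard library. The field names are taken from the sample response in this post, while `reviews_to_csv` and the sample record are hypothetical. It writes to an in-memory buffer so it runs on its own; pass an open file instead to save to disk.

```python
import csv
import io

# Field names as they appear in the sample response in this post
FIELDS = ["author_title", "review_rating", "review_text", "review_link"]


def reviews_to_csv(reviews_data, out):
    """Write the selected review fields to a file-like object as CSV."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for review in reviews_data:
        row = {}
        for field in FIELDS:
            value = review.get(field)  # tolerate missing keys
            row[field] = "" if value is None else value
        writer.writerow(row)


# Hypothetical record for illustration, not real API output
sample = [
    {
        "author_title": "Sample Author",
        "review_rating": 5,
        "review_text": "Great place, would visit again",
        "review_link": None,
        "review_likes": None,
    }
]

buffer = io.StringIO()
reviews_to_csv(sample, buffer)
print(buffer.getvalue())
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends, for example `open("reviews.csv", "w", newline="", encoding="utf-8")`.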
API Response:

{
    "reviews_data": [
        {
            "query": "real estate agents in Los Angeles, CA",
            "business_name": "Prevu",
            "google_id": "0x89c25a18440df38d:0x41db57ca0d7213a0",
            "place_id": "ChIJjfMNRBhawokRoBNyDcpX20E",
            "place_cid": 4745483157685540000,
            "google_place_url": "https://www.google.com/maps?cid=4745483157685539744",
            "review_url": "https://search.google.com/local/reviews?placeid=ChIJjfMNRBhawokRoBNyDcpX20E&q=real+estate+agents+in+Los+Angeles,+CA&authuser=0&hl=en&gl=US",
            "reviews_per_score": "{1: 2, 2: 1, 3: 2, 4: 1, 5: 623}",
            "total_reviews": 629,
            "average_rating": 5,
            "review_id": "ChdDSUhNMG9nS0VJQ0FnSUN2anFIUW1BRRAB",
            "author_link": "https://www.google.com/maps/contrib/100735152414342745869/reviews?hl=en",
            "author_title": "Donna Marie",
            "author_id": "100735152414342745869",
            "author_image": "https://lh3.googleusercontent.com/a/ACg8ocJEQZazUKq5OxvV3RO-EL04yW3EQuSqdQwkEdnjy7jz0VL15A=s120-c-rp-mo-br100",
            "review_text": "Very glad I chose Prevu as my real estate agency when looking to purchase a co-op in NYC. Sarah, my agent was incredible helping me find the right place and assisting me with the process of the purchase. And on top of this great service I also got a nice rebate check back. Highly recommend",
            "review_img_url": null,
            "review_img_urls": null,
            "owner_answer": null,
            "owner_answer_timestamp": null,
            "owner_answer_timestamp_datetime_utc": null,
            "review_link": "https://www.google.com/maps/reviews/data=!4m8!14m7!1m6!2m5!1sChdDSUhNMG9nS0VJQ0FnSUN2anFIUW1BRRAB!2m1!1s0x0:0x41db57ca0d7213a0!3m1!1s2@1:CIHM0ogKEICAgICvjqHQmAE%7CCgwI5rzjugYQsJX5lgE%7C?hl=en",
            "review_rating": 5,
            "review_timestamp": 1733877350316558,
            "review_datetime_utc": "11-12-2024 06:05:50",
            "review_likes": null,
            "reviews_id": 4745483157685540000
        },
        ...
    ]
}
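
Two fields in this response need a little post-processing before they are convenient to use: `reviews_per_score` arrives as a string rather than an object, and `review_timestamp` is a large integer. The sketch below shows one way to handle both with the standard library. The helper names are hypothetical, and reading `review_timestamp` as microseconds since the Unix epoch is an assumption inferred from the magnitude of the sample value, so the converted time need not line up exactly with the `review_datetime_utc` string; the API documentation is the authority on units and time zones.

```python
import ast
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_review_timestamp(microseconds):
    """Convert an integer timestamp to an aware UTC datetime.

    Assumes the value counts microseconds since the Unix epoch.
    """
    # Integer timedelta arithmetic avoids float rounding on large values
    return EPOCH + timedelta(microseconds=microseconds)


def parse_reviews_per_score(raw):
    """Turn a string like '{1: 2, 2: 1}' into a dict of star rating -> count."""
    # literal_eval only accepts Python literals, so it is safe on untrusted text
    return ast.literal_eval(raw)


# Values copied from the sample response above
posted = parse_review_timestamp(1733877350316558)
per_score = parse_reviews_per_score("{1: 2, 2: 1, 3: 2, 4: 1, 5: 623}")

print(posted.isoformat())
print(per_score[5])
print(sum(per_score.values()))  # 629, matching total_reviews in the sample
```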

Video Tutorial

Check out our video tutorial to get a detailed, step-by-step guide on how to set up and use Livescraper for scraping Google reviews.

FAQ

How do I scrape all Google reviews?
With Livescraper, you can scrape all Google reviews either by using the SDK or by controlling a browser window. For larger-scale scraping, the SDK provides a straightforward way to access the data without complex browser setups.

Is there an API for Google reviews?
Yes, Livescraper provides an API that allows you to fetch Google reviews directly without worrying about browser rendering or dealing with JavaScript. You can access this API with an API key.

How to scrape reviews using Livescraper?
Using Livescraper, you can scrape reviews either by controlling a headless browser or by using the Livescraper SDK to access the data directly. The SDK is the easiest option if you want to avoid handling browsers and proxies yourself.