Which GMB Categories Is the Competition Using?

Picking the right GMB categories can be a little daunting, especially when there are thousands to choose from.  There is often no obvious choice, so the best option is sometimes to examine what the competition is doing.

You can view the primary category of a business within Google Maps, but secondary categories are not displayed anywhere.  Going through Maps listing by listing is a tedious way to discover categories in any case.

Note to the reader: You will need to enable billing in the Google Cloud Console in order to use the Google Places API.

ADDENDUM: In April 2022, Google is deprecating the GMB API. Code in this article may not work properly after that date.

Full code for this post is available on GitHub.




Using Python to Automate GMB Category Discovery

With a little bit of Python, we can automatically build a list of the categories our competition is using.

To do this, we’ll need to use the Google Places API.  Although the API can be expensive when used at scale, the good news is that (at least as of this writing) there is a $200 credit which should be more than enough for our purposes.

If you haven’t already, follow the instructions on this page to get an API key.

Gathering the categories will be a two-step process.  First, we’ll get a list of places from the Google Places API, and then we’ll do some web scraping to gather the full list of categories for each of those places.

Finding the Competition

We first need to find the GMB listings that represent the local competition.  To do that, we’ll perform the API equivalent of a Google Maps search.  This will involve two calls to the API: one to surface local results, and another to get the CID for each of the resulting locations.

The CID is a unique identifier that we’ll need later on when gathering categories. There are a few different ways to get CID numbers, but using the API is probably the most efficient.

Let’s take a look at the first step of the process.

import requests
import sys
import os


def get_places(keyword, lat, lng):
    place_results = requests.get(
        "https://maps.googleapis.com/maps/api/place/nearbysearch/json",
        params={
            "key": os.environ["GOOGLE_MAPS_API_KEY"],
            "keyword": keyword,
            "location": "{},{}".format(lat, lng),
            "radius": 5000,
        },
    ).json()

    return place_results["results"]


def get_place_details(place_id):
    place_details = requests.get(
        "https://maps.googleapis.com/maps/api/place/details/json",
        params={
            "key": os.environ["GOOGLE_MAPS_API_KEY"],
            "place_id": place_id,
            "fields": "url",
        },
    ).json()

    return place_details["result"]


def find_categories(keyword, lat, lng):
    for place in get_places(keyword, lat, lng):
        details = get_place_details(place["place_id"])


if __name__ == "__main__":
    find_categories(sys.argv[1], sys.argv[2], sys.argv[3])

From this combination of API calls we’ll end up with everything we need to gather the categories.
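One thing the functions above don’t do is look at the `status` field that the Places API includes in its JSON responses. Here is a minimal sketch of such a check. The helper name `results_or_raise` is hypothetical, and the payloads you would pass it are whatever `.json()` returned from the nearby search call.

```python
def results_or_raise(payload):
    """Return the list of places from a Nearby Search payload, or raise.

    `payload` is the decoded JSON body. The helper name is hypothetical.
    """
    status = payload.get("status")
    if status == "OK":
        return payload.get("results", [])
    if status == "ZERO_RESULTS":
        # A valid search that simply matched nothing.
        return []
    # Typically REQUEST_DENIED when the key is invalid or billing is off.
    message = payload.get("error_message", "no error message supplied")
    raise RuntimeError("Places API returned {}: {}".format(status, message))
```

Failing loudly here makes a missing API key or disabled billing much easier to spot than a `KeyError` further down the script.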

Here’s an example of the JSON that’s returned from the first API call.

{'business_status': 'OPERATIONAL',
 'geometry': {'location': {'lat': 35.853308, 'lng': -78.71187599999999},
              'viewport': {'northeast': {'lat': 35.85466172989272,
                                         'lng': -78.71042762010727},
                           'southwest': {'lat': 35.85196207010728,
                                         'lng': -78.71312727989272}}},
 'icon': 'https://maps.gstatic.com/mapfiles/place_api/icons/v1/png_71/generic_business-71.png',
 'name': 'Hemmings & Stevens PLLC',
 'opening_hours': {'open_now': True},
 'photos': [{'height': 1151,
             'html_attributions': ['<a '
                                   'href="https://maps.google.com/maps/contrib/114397351403243610237">A '
                                   'Google User</a>'],
             'photo_reference': 'ATtYBwLRuwkayhhwmCCOnWRVQQmoqhrrUi7TZHcH3WDhUc_C6Q8av_YQQki1GGI_8Sv72pLU1JCnpyQt1HOEQRIqEpb4s4i6rLOvOj3UIVkF_1e-F_NOXvBHarlzGfPiBgw5vmrgvoplVUL8y8wl3QqpZGwkOFifiK0ceS3oPuJkm2PeDLcc',
             'width': 2048}],
 'place_id': 'ChIJfVE2Xvv2rIkRJ0_At3o4gLw',
 'plus_code': {'compound_code': 'V73Q+86 Raleigh, North Carolina',
               'global_code': '8773V73Q+86'},
 'rating': 4.9,
 'reference': 'ChIJfVE2Xvv2rIkRJ0_At3o4gLw',
 'scope': 'GOOGLE',
 'types': ['lawyer', 'point_of_interest', 'establishment'],
 'user_ratings_total': 20,
 'vicinity': '5540 McNeely Dr #202, Raleigh'}
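Only a couple of these fields matter for the rest of the process. As a rough sketch (the helper name `summarize_place` is made up for illustration), the snippet below pulls the place ID and name out of a trimmed copy of the result above. Note that `types` holds generic Places API types such as `lawyer`, which don’t appear to correspond to the GMB categories we’re after.

```python
def summarize_place(place):
    """Keep only the fields the rest of the script relies on."""
    return {
        "place_id": place["place_id"],
        "name": place["name"],
        # Generic Places API types, not the GMB category list.
        "types": place.get("types", []),
    }


# Trimmed copy of the sample result shown above.
sample = {
    "name": "Hemmings & Stevens PLLC",
    "place_id": "ChIJfVE2Xvv2rIkRJ0_At3o4gLw",
    "types": ["lawyer", "point_of_interest", "establishment"],
    "vicinity": "5540 McNeely Dr #202, Raleigh",
}

summary = summarize_place(sample)
print(summary["place_id"], summary["name"])
```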

Despite having all of this info, we’re still missing the CID, and that’s where the call to the details endpoint comes into play.  We set the fields parameter to just url, which is the field that carries the CID, so that we’re not billed for info that we don’t need.

Calling the details endpoint yields the last bit of info that we’ll need before gathering the categories.

{'url': 'https://maps.google.com/?cid=13582918575869415207'}
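The CID is the value of the `cid` query parameter in that URL. As a small sketch, it can be read with the standard library’s URL parser, which should keep working if the URL ever carries additional parameters. `extract_cid` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlparse


def extract_cid(maps_url):
    """Return the value of the `cid` query parameter, or None if absent."""
    query = parse_qs(urlparse(maps_url).query)
    values = query.get("cid")
    return values[0] if values else None


print(extract_cid("https://maps.google.com/?cid=13582918575869415207"))
# prints 13582918575869415207
```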

With the CID in hand, we can turn to the web scraping part of the process.  Although Google Maps won’t show you all of the categories, the full list is actually embedded inside the page it returns.

If you inspect the response to the CID URL request, you should find the categories buried deep inside their nested data structure.

The tricky part is extracting the categories from the big soup of JSON that Google returns.

Through trial and error, I’ve found a combination of text and JSON parsing that seems to work reliably.
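To make the two-stage parse easier to follow without hitting Google, here is a sketch that runs the same steps against a synthetic page. The markers and index positions are the ones used in this post’s script. The fake page is invented purely to mimic the nesting, and the `)]}'` prefix is my assumption about what the five skipped characters are.

```python
import json

START = "window.APP_INITIALIZATION_STATE="
END = "window.APP_FLAGS"


def extract_categories(page_text):
    """Slice out the JS assignment, then decode the JSON string nested in it.

    The indices may break whenever Google changes the page layout.
    """
    start = page_text.find(START)
    end = page_text.find(END, start)
    if start == -1 or end == -1:
        return []
    # Drop the marker itself and the ";" that precedes window.APP_FLAGS.
    outer = json.loads(page_text[start + len(START):end - 1])
    # The inner payload is a string; skip its 5-character prefix first.
    inner = json.loads(outer[3][6][5:])
    return inner[6][13]


# Synthetic page built only to mimic the nesting; not real Google output.
inner = [None] * 7
inner[6] = [None] * 14
inner[6][13] = ["Plumber", "Drainage service"]
outer = [None] * 4
outer[3] = [None] * 7
outer[3][6] = ")]}'\n" + json.dumps(inner)
fake_page = "<script>" + START + json.dumps(outer) + ";" + END + "=[];</script>"

print(extract_categories(fake_page))
```

Keeping the parsing in a function that takes plain text means it can be checked against a saved copy of a page whenever the scraper stops returning results.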

Here’s the script after adding the requests and scraping logic.

import requests
import json
import sys
import os


def get_places(keyword, lat, lng):
    place_results = requests.get(
        "https://maps.googleapis.com/maps/api/place/nearbysearch/json",
        params={
            "key": os.environ["GOOGLE_MAPS_API_KEY"],
            "keyword": keyword,
            "location": "{},{}".format(lat, lng),
            "radius": 5000,
        },
    ).json()

    return place_results["results"]


def get_place_details(place_id):
    place_details = requests.get(
        "https://maps.googleapis.com/maps/api/place/details/json",
        params={
            "key": os.environ["GOOGLE_MAPS_API_KEY"],
            "place_id": place_id,
            "fields": "url",
        },
    ).json()

    return place_details["result"]


def get_location_categories(cid):
    response = requests.get(
        "https://www.google.com/maps?cid={}&hl=en".format(cid),
        proxies={"http": os.environ["PROXY_URL"], "https": os.environ["PROXY_URL"]},
        timeout=10,
    )

    start = response.text.find("window.APP_INITIALIZATION_STATE=")
    end = response.text.find("window.APP_FLAGS", start)

    if start != -1 and end != -1:
        content = json.loads(
            response.text[
                start + len("window.APP_INITIALIZATION_STATE=") : end - 1
            ]
        )
        content = json.loads(content[3][6][5:])
        return content[6][13]

    return []


def find_categories(keyword, lat, lng):
    for place in get_places(keyword, lat, lng):
        details = get_place_details(place["place_id"])
        categories = get_location_categories(details["url"].split("=")[1])


if __name__ == "__main__":
    find_categories(sys.argv[1], sys.argv[2], sys.argv[3])

You’ll notice that I’m using a proxy for the web requests.  If you want to scale this process up, I would highly recommend doing so from behind a proxy, just in case Google decides you’re making too many requests.
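As written, the script reads `PROXY_URL` unconditionally, so it fails with a `KeyError` when the variable is unset. If you’d like to try it out without a proxy first, one option is a small helper like the sketch below; `build_proxies` is a hypothetical name, not part of the original script.

```python
import os


def build_proxies(environ=os.environ):
    """Return a proxies dict for requests, or None when no proxy is set."""
    proxy_url = environ.get("PROXY_URL")
    if not proxy_url:
        return None
    return {"http": proxy_url, "https": proxy_url}
```

You would then pass `proxies=build_proxies()` to `requests.get` in place of the hard-coded dictionary.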

At this point we have the categories, but we’re not doing anything with them yet.

Let’s take the script a bit further and write the results to a CSV file we can view in Excel.

import requests
import json
import csv
import sys
import os


def get_places(keyword, lat, lng):
    place_results = requests.get(
        "https://maps.googleapis.com/maps/api/place/nearbysearch/json",
        params={
            "key": os.environ["GOOGLE_MAPS_API_KEY"],
            "keyword": keyword,
            "location": "{},{}".format(lat, lng),
            "radius": 5000,
        },
    ).json()

    return place_results["results"]


def get_place_details(place_id):
    place_details = requests.get(
        "https://maps.googleapis.com/maps/api/place/details/json",
        params={
            "key": os.environ["GOOGLE_MAPS_API_KEY"],
            "place_id": place_id,
            "fields": "url",
        },
    ).json()

    return place_details["result"]


def get_location_categories(cid):
    response = requests.get(
        "https://www.google.com/maps?cid={}&hl=en".format(cid),
        proxies={"http": os.environ["PROXY_URL"], "https": os.environ["PROXY_URL"]},
        timeout=10,
    )

    start = response.text.find("window.APP_INITIALIZATION_STATE=")
    end = response.text.find("window.APP_FLAGS", start)

    if start != -1 and end != -1:
        content = json.loads(
            response.text[start + len("window.APP_INITIALIZATION_STATE=") : end - 1]
        )
        content = json.loads(content[3][6][5:])
        return content[6][13]

    return []


def find_categories(keyword, lat, lng):
    # newline="" stops the csv module from writing blank rows on Windows.
    with open("categories.csv", "w", newline="") as csv_file:
        writer = csv.writer(csv_file, dialect="excel")
        writer.writerow(["URL", "Name", "Categories"])

        for place in get_places(keyword, lat, lng):
            details = get_place_details(place["place_id"])

            try:
                categories = get_location_categories(details["url"].split("=")[1])
            except Exception:
                # Skip listings whose page could not be fetched or parsed.
                continue

            writer.writerow([details["url"], place["name"], ", ".join(categories)])


if __name__ == "__main__":
    find_categories(sys.argv[1], sys.argv[2], sys.argv[3])
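The script reads its arguments straight from `sys.argv`, so a missing or mistyped coordinate only surfaces once the API is called. As an optional sketch, `argparse` can validate the three arguments up front and convert the coordinates to floats. `parse_args` and the help strings are my own additions.

```python
import argparse


def parse_args(argv=None):
    """Parse the keyword and coordinates; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser(
        description="List the GMB categories used by nearby competitors."
    )
    parser.add_argument("keyword", help='search term, e.g. "plumber near me"')
    parser.add_argument("lat", type=float, help="latitude of the search center")
    parser.add_argument("lng", type=float, help="longitude of the search center")
    return parser.parse_args(argv)


args = parse_args(["plumber near me", "35.864", "-78.728"])
print(args.keyword, args.lat, args.lng)
```

Negative coordinates such as `-78.728` are accepted as positional values here because the parser defines no options that look like negative numbers.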

Now you can run the script by providing a search term and location in the form of latitude/longitude, and you should get a nicely formatted CSV file in return.  Here’s an example invocation of the script.

$ python categories.py "plumber near me" 35.864 -78.728
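Once the CSV exists, a quick tally shows which categories come up most often among the competition. The sketch below assumes the column layout written by the script above, and the sample rows are made up for illustration. Because the categories are joined with ", ", a category name that itself contains a comma followed by a space would be split incorrectly.

```python
import csv
import io
from collections import Counter


def count_categories(csv_file):
    """Tally how many listings use each category in the script's CSV output."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        for category in row["Categories"].split(", "):
            if category:
                counts[category] += 1
    return counts


# Made-up rows in the same layout as categories.csv.
sample = io.StringIO(
    "URL,Name,Categories\r\n"
    'https://maps.google.com/?cid=1,Example Plumbing,"Plumber, Drainage service"\r\n'
    "https://maps.google.com/?cid=2,Sample Pipes,Plumber\r\n"
)

for category, n in count_categories(sample).most_common():
    print(n, category)
```

To run it on real output, pass `open("categories.csv", newline="")` instead of the in-memory sample.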

I hope this little automation hack can help you discover the right GMB categories!
