Use the Google Analytics API with Python

In this post you will learn to create a Python script that lets users connect to their Google Analytics account and retrieve information from it.

In order to do that, we will create a Project in the Google Developers Console and authorize it to use the Analytics API.

Next, we will use the OAuth 2.0 protocol to allow users to connect to their Analytics account through our Project.

And finally, we will retrieve the number of sessions of our view, segmented by traffic source.

Let’s start!

Create a Project in Google Developers Console

Go to the Google Developers Console and log in with your account.

Click on Create Project, enter your project name and, if you want, choose your project ID.

Next, on your new project menu, go to APIs & auth –> Credentials. Here, in the OAuth section, click on Create new Client ID.

In this case, as we are creating a script that will run on our computer, we will choose Installed application as the application type, and Other as the installed application type.

Finally, click on Create Client ID.

You will see, next to the OAuth section, the credentials for your project, which contain your Client ID, the Client Secret, and the redirect URIs. Click on Download JSON to download them, and save the file as client_secrets.json.

From here, go to APIs & auth –> Consent screen and personalize the message that your users will see when requesting access to their accounts.

Next, we need to activate the Google Analytics API in your Project. Go to APIs & auth –> APIs and look for the Analytics API. You just need to activate it by clicking on the OFF button on the right.

OK! Now that we have our Project created, we can move on to our Python script!

The Google API Python client library

In order to use the Analytics API with Python, we will use the Google API Python Client library. You can install it in your working environment using pip (pip install google-api-python-client). If you are new to this, first learn how to install Python, virtualenv and virtualenvwrapper to work with virtual environments.

We also install the python-gflags library (pip install python-gflags), which we will use later in the code.

Next create the file analytics_service_object.py in your working directory (where client_secrets.json is located). This file will create an authorized Analytics Service object, used to interact with the user’s analytics accounts.
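A minimal sketch of what such a file could look like, assuming the oauth2client, httplib2 and google-api-python-client packages are installed. The token file name credentials.dat and the read-only scope are assumptions, and the third-party imports sit inside the functions only so that the file can be imported before those packages are available.

```python
"""analytics_service_object.py: build an authorized Analytics service object."""

# Credentials file downloaded from the Developers Console.
CLIENT_SECRETS = 'client_secrets.json'
# File where the user-specific token is stored (name is an assumption).
TOKEN_FILE_NAME = 'credentials.dat'
# Read-only access to the user's Analytics data (assumed scope).
SCOPE = 'https://www.googleapis.com/auth/analytics.readonly'


def prepare_credentials():
    """Load stored credentials, or run the OAuth 2.0 flow to create them."""
    from oauth2client import tools
    from oauth2client.client import flow_from_clientsecrets
    from oauth2client.file import Storage

    storage = Storage(TOKEN_FILE_NAME)
    credentials = storage.get()
    if credentials is None or credentials.invalid:
        # No usable token yet: open the consent screen in a browser.
        flow = flow_from_clientsecrets(
            CLIENT_SECRETS, scope=SCOPE,
            message='%s is missing' % CLIENT_SECRETS)
        flags = tools.argparser.parse_args([])
        credentials = tools.run_flow(flow, storage, flags)
    return credentials


def initialize_service():
    """Return an Analytics service object authorized with the credentials."""
    import httplib2
    from googleapiclient.discovery import build

    credentials = prepare_credentials()
    http = credentials.authorize(httplib2.Http())
    return build('analytics', 'v3', http=http)
```

Calling initialize_service(), from a Python shell or from a main block added at the bottom of the file, is what triggers the authorization flow.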

In the previous script:

  • CLIENT_SECRETS loads your credentials from the client_secrets.json file.
  • TOKEN_FILE_NAME is the file where the user-specific credentials will be stored (this file also includes some project-specific credentials).
  • the prepare_credentials() function tries to load the credentials from the TOKEN_FILE_NAME and if they don’t exist it creates new ones using the run_flow function.
  • the initialize_service() function uses the credentials to build an authorized Analytics Service object, and returns this object.

Now, when you run this script and initialize_service() is called, you will see, in a browser window, the consent screen you customized before. This means that your Project is asking for your permission to access your Analytics account through the API. After clicking yes, your new credentials will be stored in TOKEN_FILE_NAME so that you won’t have to enter them again (except when the access_token expires).

The Analytics Service object

Once we have an authorized Analytics service object, we can use it to retrieve all the data in the user’s analytics accounts.

For example, to get a list of all the existing accounts of the user, just type:
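A minimal sketch of that call, assuming service is the object returned by initialize_service(). The wrapper name list_accounts is only illustrative.

```python
def list_accounts(service):
    """Return the accounts response for the authorized user."""
    # management().accounts().list() builds the request; execute() sends it.
    return service.management().accounts().list().execute()


# Typical use, with `service` coming from initialize_service():
# accounts = list_accounts(service)
```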

This will give you a dictionary containing the following keys:

  • username: the email address of the user
  • kind: analytics#accounts
  • items: a list of the user’s accounts.
  • totalResults
  • itemsPerPage
  • startIndex

As we will see, this is a common structure when getting data from analytics, even when we ask for properties or views instead of accounts (the returned object has the same keys).

Moreover, the items value is a list of accounts, each of which is in turn a dictionary with keys:

  • id: your account id
  • kind: analytics#account
  • childLink
  • created
  • permissions
  • selfLink
  • updated

Therefore, you can get a list of your user’s account ids with:
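A small sketch of that extraction, run here on a made-up response trimmed to the keys it needs. The helper name get_account_ids and the sample ids are hypothetical.

```python
def get_account_ids(accounts):
    """Extract the account ids from an accounts response dictionary."""
    # .get() guards against a response that carries no 'items' key.
    return [account['id'] for account in accounts.get('items', [])]


# Made-up response, for illustration only.
sample_accounts = {
    'kind': 'analytics#accounts',
    'items': [
        {'id': '1111111', 'kind': 'analytics#account'},
        {'id': '2222222', 'kind': 'analytics#account'},
    ],
}
account_ids = get_account_ids(sample_accounts)
print(account_ids)  # ['1111111', '2222222']
```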

You can also see the account ids in the Google Analytics web interface. Go to the Admin tab and open the top-left drop-down menu. There, your different accounts will be displayed, with their id on the right.

But as you may know, each Account can have multiple Properties, each of which has a different tracking code. To obtain a list of the Properties inside the Account with an id of account_id, you can use:
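A minimal sketch of that request, assuming service is the authorized Analytics service object. The wrapper name list_webproperties is only illustrative.

```python
def list_webproperties(service, account_id):
    """Return the web properties response for one account."""
    return service.management().webproperties().list(
        accountId=account_id).execute()


# Typical use:
# webproperties = list_webproperties(service, account_id)
```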

where webproperties is a dictionary with the same keys as accounts, but in which

  • kind: analytics#webproperties
  • items: list of web properties for this account

Again, each web property is a dictionary that contains the keys:

  • id: the web property id
  • kind: analytics#webproperty

and many more (you can print the webproperties object to see its keys).

You’ll see that the web property id is the tracking code of this property, which you can also obtain in the Google Analytics Admin tab.

But there is another level! Inside each Property there can be multiple views! You can obtain a list of views (or profiles) of each web property with:
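A minimal sketch of that request, again assuming service is the authorized Analytics service object. The wrapper name list_profiles is only illustrative.

```python
def list_profiles(service, account_id, webproperty_id):
    """Return the views (profiles) response for one web property."""
    return service.management().profiles().list(
        accountId=account_id,
        webPropertyId=webproperty_id).execute()


# Typical use:
# profiles = list_profiles(service, account_id, webproperty_id)
```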

The profiles dictionary contains the same keys as accounts and webproperties, but with

  • kind: analytics#profiles
  • items: list of profiles for this account and web property

and each profile has:

  • id: the profile id
  • kind: analytics#profile
  • name: the profile name

Get the number of Sessions of a Google Analytics View

Now that we know how to get information about our accounts, properties and views, let’s obtain the number of sessions of a view during a period of time.

Create the file get_sessions.py and write:
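A minimal sketch of what get_sessions.py could contain, assuming service is the authorized Analytics service object. The function name and the dates in the usage comment are illustrative, not taken from the original listing.

```python
def get_sessions(service, profile_id, start_date, end_date):
    """Return the total number of sessions of a view between two dates."""
    result = service.data().ga().get(
        ids='ga:' + profile_id,   # view ids are prefixed with 'ga:'
        start_date=start_date,    # dates use the 'YYYY-MM-DD' format
        end_date=end_date,
        metrics='ga:sessions',
    ).execute()
    # Totals come back as strings, keyed by metric name.
    return int(result['totalsForAllResults']['ga:sessions'])


# Typical use inside get_sessions.py:
# from analytics_service_object import initialize_service
# service = initialize_service()
# print(get_sessions(service, 'your_profile_id', '2014-09-01', '2014-09-30'))
```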

Note: you have to replace “your_profile_id” with your view id, and then run the script from your terminal (for example, python get_sessions.py).

Check all the options of the service.data().ga().get() method, and retrieve all the data you want from your view!

Get the number of Sessions for each traffic source

Obtaining the number of sessions for each traffic source (i.e. organic, referral, social, direct, email and other) is a little bit trickier. You have to work with filters in order to segment your data.

Here’s a little script that does this, thanks to Michael for the update 🙂
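A minimal sketch of the dimension-based approach, assuming service is the authorized Analytics service object. Using ga:channelGrouping as the dimension is an assumption; another dimension such as ga:medium can be swapped in if it matches your definition of traffic source better.

```python
def sessions_by_channel(service, profile_id, start_date, end_date):
    """Return a {channel: sessions} dictionary for a view."""
    result = service.data().ga().get(
        ids='ga:' + profile_id,
        start_date=start_date,
        end_date=end_date,
        metrics='ga:sessions',
        dimensions='ga:channelGrouping',  # assumed dimension
    ).execute()
    # Each row is [dimension value, metric value]; 'rows' may be
    # missing when the query matches no data.
    return {channel: int(sessions)
            for channel, sessions in result.get('rows', [])}


# Typical use:
# for channel, sessions in sessions_by_channel(
#         service, 'your_profile_id', '2014-09-01', '2014-09-30').items():
#     print(channel, sessions)
```

A single request returns every channel at once, which is why this version is cheaper than querying once per source.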

Again, add your view’s id in “your_profile_id”, and change the start_date and end_date to match the time interval you want.

After running this script, you’ll see the desired information in your terminal.

Another solution to get the number of sessions by traffic source, less optimized but instructive, is to use filters instead of dimensions:
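A minimal sketch of the filter-based approach, assuming service is the authorized Analytics service object. The filter chosen for each traffic source is an assumption and should be adapted to your own definitions; the "other" category is left out for brevity.

```python
# Assumed filter expression for each traffic source.
SOURCE_FILTERS = {
    'organic': 'ga:medium==organic',
    'referral': 'ga:medium==referral;ga:hasSocialSourceReferral==No',
    'social': 'ga:hasSocialSourceReferral==Yes',
    'direct': 'ga:medium==(none)',
    'email': 'ga:medium==email',
}


def sessions_by_source(service, profile_id, start_date, end_date):
    """Return a {source: sessions} dictionary, one request per source."""
    sessions = {}
    for source, source_filter in SOURCE_FILTERS.items():
        result = service.data().ga().get(
            ids='ga:' + profile_id,
            start_date=start_date,
            end_date=end_date,
            metrics='ga:sessions',
            filters=source_filter,
        ).execute()
        # The total for the filtered data is returned as a string.
        sessions[source] = int(
            result['totalsForAllResults']['ga:sessions'])
    return sessions
```

Each traffic source costs one API request here, which is what makes this version less optimized than a single query with a dimension.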

Again, add your view’s id in “your_profile_id”, and change the start_date and end_date to match the time interval you want.

After running this script, you’ll see the desired information in your terminal.

Some information you may find useful when working with filters:

  • , means OR
  • ; means AND
  • == means exact match
  • != means does not match
  • =@ means contains substring
  • !@ means does not contain substring
  • learn more in the Google Reference Guide
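The operators above can also be combined in code. Here is a small sketch with two hypothetical helpers, any_of and all_of, that only join strings and do not validate the expressions.

```python
def any_of(*expressions):
    """Join filter expressions with ',' (OR)."""
    return ','.join(expressions)


def all_of(*expressions):
    """Join filter expressions with ';' (AND)."""
    return ';'.join(expressions)


# Referral or email traffic that did not come from a social network.
example_filter = all_of(
    any_of('ga:medium==referral', 'ga:medium==email'),
    'ga:hasSocialSourceReferral==No',
)
print(example_filter)
# ga:medium==referral,ga:medium==email;ga:hasSocialSourceReferral==No
```

In this filter syntax OR binds more tightly than AND, so it is safe to pass any_of() groups into all_of(), but not the other way around.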

That’s all for today! 🙂

Please +1 this post if it was useful and share it with your friends! Thaaanks!

Please, add +Marina Mele in your comments. This way I will get a notification email and I will answer you as soon as possible! :-)