Dec. 23, 2020
Events can be defined as any action taken by a user on a website or in an app (landing on a page, clicking a link, submitting a form, etc.). The resulting log is typically referred to as the "clickstream" because it is a raw stream of events recorded as a user interacts with and navigates through web pages. Landing on this blog post, for example, has allowed me to track what you clicked on to get here!
Google Analytics defines a session as a collection of events that lasts until there are 30 minutes of inactivity. As long as events keep firing and are logged with less than 30 minutes between consecutive events, they are grouped into a single session. This methodology assumes that if a user has been inactive on a site for a long period of time, they either physically left their device or navigated away to a different site and aren’t coming back soon. The time window can be adjusted to whatever you want, but 30 minutes is the Google Analytics default and is widely treated as the industry standard.
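To make the inactivity rule concrete, here is a minimal Python sketch that splits one user's event timestamps into sessions. The function name `split_into_sessions` is hypothetical, and treating a gap of exactly 30 minutes as the start of a new session is an assumption; this is not Google Analytics' actual implementation.

```python
from datetime import datetime, timedelta

# Assumed cutoff: the 30-minute default described above.
SESSION_TIMEOUT = timedelta(minutes=30)


def split_into_sessions(timestamps, timeout=SESSION_TIMEOUT):
    """Group one user's event timestamps into sessions.

    A new session starts whenever the gap since the previous event
    is at least `timeout` (boundary handling is an assumption).
    Returns a list of sessions, each a list of timestamps.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < timeout:
            # Still within the inactivity window: extend the current session.
            sessions[-1].append(ts)
        else:
            # First event, or the user was inactive too long: open a new session.
            sessions.append([ts])
    return sessions
```

Passing a different `timeout` is how the window would be adjusted in this sketch.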
Sessions make it easier to group and analyze the events and actions users take, which helps you understand the customer journey and how customers interact with your product. Common analyses include web traffic patterns, cohort analysis, and attribution. These analyses enable you to optimize marketing campaigns and user flows to increase conversion rates and close more deals.
For the most part, Google Analytics provides a great out-of-the-box solution for analyzing your web traffic, but if you are part of a larger organization with more complex data pipelines and third-party tracking vendors, you will eventually need an analyst to pull, standardize, and drill down into these disparate data sets. Most likely this work will be done in SQL, Python, or R.
If you use Snowflake as your database, you can use the CONDITIONAL_TRUE_EVENT function to easily sessionize events in one line of SQL code. Historically, I've had to create many sub-queries or CTEs (Common Table Expressions) to get the same end result, but window functions make queries much cleaner and more efficient.
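Those intermediate steps (looking up the previous event's timestamp, measuring the gap, and counting the gaps that exceed the cutoff) can be consolidated into one line of code in Snowflake by combining the CONDITIONAL_TRUE_EVENT and LAG window functions.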
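For readers who want to see the logic outside the database, the following is a rough Python emulation of what the two window functions contribute. It is a sketch under stated assumptions, not Snowflake code and not the query from this post. The helper names (`lag`, `conditional_true_event`, `sessionize`), the `(user_id, timestamp)` event shape, and the 30-minute comparison are all hypothetical. It relies on two documented behaviors: LAG returns the previous row's value within a partition, and CONDITIONAL_TRUE_EVENT returns a counter that starts at 0 and increases by 1 each time its condition is true.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def lag(values):
    """Emulate LAG over one ordered partition: previous value, None for the first row."""
    return [None] + list(values[:-1])


def conditional_true_event(flags):
    """Emulate CONDITIONAL_TRUE_EVENT: a running counter that starts at 0
    and is incremented each time the flag is true."""
    counter = 0
    numbers = []
    for flag in flags:
        if flag:
            counter += 1
        numbers.append(counter)
    return numbers


def sessionize(events, timeout=timedelta(minutes=30)):
    """events: iterable of (user_id, timestamp) pairs (hypothetical shape).

    Returns (user_id, timestamp, session_number) rows, numbered per user.
    """
    # PARTITION BY user_id
    partitions = defaultdict(list)
    for user_id, ts in events:
        partitions[user_id].append(ts)

    rows = []
    for user_id in sorted(partitions):
        ordered = sorted(partitions[user_id])  # ORDER BY timestamp
        previous = lag(ordered)
        # True when the gap since the previous event reaches the cutoff;
        # the first row has no previous event, so it never counts as true.
        flags = [
            prev is not None and ts - prev >= timeout
            for ts, prev in zip(ordered, previous)
        ]
        for ts, number in zip(ordered, conditional_true_event(flags)):
            rows.append((user_id, ts, number))
    return rows
```

Because the counter starts at 0, each user's first session is numbered 0 in this sketch; add 1 if you prefer 1-based session numbers.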
The breakdown should end up looking something like this: [image]
Now imagine analyzing this for thousands of other users and hydrating website activity with marketing, sales, and purchase data. We can then answer questions such as “How many sessions/conversions were generated from a Google Ad Campaign?” or “Where in the checkout process are users most likely to abandon their cart?” This is why a good data analyst or data scientist is key: manipulating the data to answer these questions can help optimize spend and drive outcomes that will help scale the business!
Have any questions about how to aggregate touchpoints for marketing attribution? Contact me!
If you have more questions about analyzing website activity, I suggest reading Web Analytics 2.0 as a starting point.
James Roselle is a data engineer based in Boston.