Raw Data
Anywhere: Warehouse offers two primary methods for accessing your data:
- Raw Data: normalized event data without transformations.
- Ready to Analyze Views: pre-transformed data ready for analysis.
Raw Data is the original method of getting your Fullstory data into your data warehouse or storage system. This approach provides normalized event data in its base format (often referred to as "Bronze schema") without additional transformations.
See the Getting Started page for a detailed comparison to determine which method is right for you.
Destinations
Anywhere: Warehouse supports the following Raw Data destinations. For detailed instructions on configuring each destination, visit the specific destination documentation linked below.
File Storage Destinations
Destination | Raw Data Documentation |
---|---|
Amazon S3 | Amazon S3 Raw Data |
Azure Blob Storage | Azure Blob Storage Raw Data |
Google Cloud Storage | Google Cloud Storage Raw Data |
Legacy Data Warehouse Destinations with Views Alternatives
The following destinations initially supported Raw Data and are now available as Ready to Analyze Views:
Destination | Views Documentation |
---|---|
Amazon Redshift | Amazon Redshift Views |
BigQuery | BigQuery Views |
Snowflake | Snowflake Views |
Data Model
The Raw Data data model includes a combination of events, which represent the user interactions on your website or app, and defined objects, which are defined within the Fullstory application.
Events
The foundation of Fullstory's data model is events:
- Events: This data serves as a record of user interactions with your website or app as well as any server-side events you send to Fullstory. Each event comes with a set of properties that describe the event and the user's context at the time of the event.
The event data model is designed to be stable at the outer level, meaning that the fields and their types will not change. However, the data model is extensible, so new fields can be added to the data model without breaking existing integrations or requiring expensive schema migrations.
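One way to benefit from this extensibility is to read only the fields you depend on by name and ignore everything else. The sketch below is illustrative only: it assumes, purely for the example, that each record arrives as a JSON object, and `pick_known_fields` and `some_new_field` are hypothetical names that are not part of Fullstory's schema.

```python
import json

# Fields this hypothetical consumer relies on (the three timestamps
# documented for destinations).
KNOWN_FIELDS = ("event_time", "processed_time", "updated_time")


def pick_known_fields(raw_record):
    """Parse one JSON record and keep only the fields this consumer uses."""
    record = json.loads(raw_record)
    # Keys outside KNOWN_FIELDS are ignored, so a newly added field
    # does not change what this reader returns.
    return {name: record.get(name) for name in KNOWN_FIELDS}


# Hand-written sample records; "some_new_field" stands in for a future addition.
before = json.dumps({
    "event_time": "2024-05-01T10:00:00Z",
    "processed_time": "2024-05-01T10:05:00Z",
    "updated_time": "2024-05-01T12:00:00Z",
})
after = json.dumps({
    "event_time": "2024-05-01T10:00:00Z",
    "processed_time": "2024-05-01T10:05:00Z",
    "updated_time": "2024-05-01T12:00:00Z",
    "some_new_field": 1,
})

same_result = pick_known_fields(before) == pick_known_fields(after)
```

A reader written this way returns the same result before and after a field is added, which is the property the stable outer schema is meant to provide.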
Defined Objects
Fullstory's data model also includes the following defined objects, which are defined within the Fullstory application and can be synced to your data warehouse:
- Element Definitions: These correspond to named elements that represent complex CSS selectors, making it easier to analyze user interactions with specific UI elements.
- Event Definitions: These correspond to defined events that are created within Fullstory to track specific user behaviors.
- Page Definitions: These are custom page groupings, allowing you to analyze user behavior on specific sections of your website.
Relationship between events and defined objects
Certain event types may be related to the defined objects referenced above. The id from each defined object record will uniquely link back to a corresponding property in the events table:
Defined object | Event object | Event object property name |
---|---|---|
Element Definitions | EventTarget | element_definition_id |
Event Definitions | Common Event Properties | event_definition_id, additional_event_definition_ids |
Page Definitions | Common Source Properties | page_definition_id |
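The mapping in the table above amounts to a lookup from a definition id to the events that reference it. Here is a minimal sketch for Element Definitions using hand-written sample rows. The `target` key, the `name` field, and the `label_events` helper are assumptions made for illustration; only `id` and `element_definition_id` come from the table.

```python
# Hypothetical sample rows; real records carry many more fields.
element_definitions = [
    {"id": "el-1", "name": "Checkout button"},
]
events = [
    {"event_time": "2024-05-01T10:00:00Z",
     "target": {"element_definition_id": "el-1"}},
    {"event_time": "2024-05-01T10:01:00Z",
     "target": {"element_definition_id": None}},
]


def label_events(events, element_definitions):
    """Attach the matching element definition name to each event, if any."""
    names_by_id = {d["id"]: d["name"] for d in element_definitions}
    labeled = []
    for event in events:
        definition_id = event.get("target", {}).get("element_definition_id")
        # Events without a matching definition get element_name = None.
        labeled.append({**event, "element_name": names_by_id.get(definition_id)})
    return labeled


labeled = label_events(events, element_definitions)
```

In a warehouse the same idea would typically be a join on the id column; the Python version only shows which fields line up.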
Sync Expectations
Anywhere: Warehouse syncs with destinations on an hourly interval. See each destination's documentation page for details.
Important Timestamps
There are three timestamp fields that are relevant for destinations: event_time, processed_time, and updated_time.
event_time
is an immutable field that records the timestamp of each event according to the user's device.processed_time
is a timestamp field indicating when the event was processed by Fullstory's servers. On average, 95% of events are captured and processed within 20 minutes of the original event time. Several events, including server side events, may reach Fullstory's servers much later than the original event time. Depending on the contents of these late events, Fullstory may need to reprocess them to report on the most accurate metrics, including session length, page active time, etc. In these scenarios, theprocessed_time
for all events in the session or on the page will be updated and the events will be re-synced to the warehouse.updated_time
indicates when a record was last modified (inserted or updated) and is populated by a built-in function in the warehouse during the sync. This field tracks the last time a record in your warehouse was changed and can be used as a filter to determine which new records to pull into your query.
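Using updated_time as a filter usually means keeping a high-water mark between runs. The following is a sketch of that pattern over in-memory rows, not a Fullstory API: `rows_changed_since` and the `event_id` field are hypothetical names chosen for the example.

```python
from datetime import datetime, timezone


def rows_changed_since(rows, watermark):
    """Return rows modified after the watermark, plus the next watermark."""
    changed = [row for row in rows if row["updated_time"] > watermark]
    # If nothing changed, keep the old watermark.
    new_watermark = max(
        (row["updated_time"] for row in changed), default=watermark
    )
    return changed, new_watermark


# Hand-written sample rows; "event_id" is an assumed identifier.
rows = [
    {"event_id": "a",
     "updated_time": datetime(2024, 5, 1, 11, 5, tzinfo=timezone.utc)},
    {"event_id": "b",
     "updated_time": datetime(2024, 5, 1, 12, 5, tzinfo=timezone.utc)},
]
watermark = datetime(2024, 5, 1, 11, 30, tzinfo=timezone.utc)

changed, new_watermark = rows_changed_since(rows, watermark)
```

Because re-synced events receive a new updated_time, a filter like this also picks up records that were reprocessed after a late event arrived, not only records that are new.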
Sync Latency
Sync latency describes how long it takes for newly captured events to be synced to the destination.
Captured events flow into Fullstory constantly, then are processed on a defined cadence to be sent to the destination.
The interval is dependent on processed_time, which indicates when our servers processed the event for the particular destination.
For example, when syncing to Snowflake, the sync latency is approximately processed_time rounded to the next hour + 1 hour.
This is because Fullstory writes data to a file based on the processed_time, with each file containing events that were processed in a given hour. The file is then merged into the database table within the next hour. This timing is approximate because syncing to the destination depends on how large the file is and how much compute is available to run the merge.
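As a worked example of the rule of thumb for Snowflake, the sketch below estimates when an event should be queryable. `approximate_sync_time` is a hypothetical helper, and treating a processed_time that falls exactly on the hour as already rounded is an assumption of this sketch, not documented behavior.

```python
from datetime import datetime, timedelta, timezone


def approximate_sync_time(processed_time):
    """Estimate availability: processed_time rounded up to the hour, plus 1 hour."""
    hour_start = processed_time.replace(minute=0, second=0, microsecond=0)
    # Assumption: a time exactly on the hour is not rounded any further.
    if processed_time == hour_start:
        rounded = hour_start
    else:
        rounded = hour_start + timedelta(hours=1)
    return rounded + timedelta(hours=1)


processed = datetime(2024, 5, 1, 10, 20, tzinfo=timezone.utc)
estimate = approximate_sync_time(processed)
# 10:20 rounds up to 11:00, and adding 1 hour gives 12:00 UTC.
```

The result is an estimate only, since the actual merge time depends on file size and available compute.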
Duplicate events
Fullstory guarantees at-least-once delivery for each event. For Raw Data destinations, this means a small share of events (less than 1%) may be duplicated in your warehouse.
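If duplicates matter for your analysis, one common approach is to keep a single row per event, preferring the most recently updated copy. The sketch below assumes each event carries a unique identifier, here called `event_id`. That name and the `deduplicate` helper are hypothetical, so check your destination's schema for the actual key.

```python
def deduplicate(rows, key="event_id"):
    """Keep one row per key, preferring the latest updated_time."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row["updated_time"] > current["updated_time"]:
            latest[row[key]] = row
    return list(latest.values())


# Hand-written sample rows. The timestamps are ISO-8601 strings in one
# consistent UTC format, so plain string comparison orders them correctly.
rows = [
    {"event_id": "e1", "updated_time": "2024-05-01T12:00:00Z"},
    {"event_id": "e2", "updated_time": "2024-05-01T12:00:00Z"},
    {"event_id": "e1", "updated_time": "2024-05-01T14:00:00Z"},
]

unique_rows = deduplicate(rows)
```

Preferring the latest updated_time also means that when an event is re-synced after reprocessing, the reprocessed copy is the one that survives.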