lorddata.com / schema
Schema v2.4 · Current

Data Schema &
Field Dictionary

Complete reference for every field in the Lord Data clickstream feed. All fields are delivered as flat, normalized records in Parquet or NDJSON format via S3 or authenticated API. Zero PII in any field.

22
Fields
5
Categories
0
PII Fields
Parquet
NDJSON
CSV
REST API
S3 Drop
Field Name
Type
Description
Nullable
🆔
Identity — 1 field
OBJECT_ID
Primary Key
UUID

Universally Unique Identifier generated per event using UUID v4. Guarantees global uniqueness across all records. Use as a deduplication key when joining datasets.

No
🌍
Geographic — 7 fields
COUNTRY
String

Full English country name resolved from the anonymized network signal. Follows ISO 3166-1 country naming conventions.

Yes
COUNTRY_CODE
String

Two-letter ISO 3166-1 Alpha-2 country code. Consistent, machine-readable identifier suitable for geo-segmentation and JOIN operations with external country dimension tables.

Yes
REGION
String

State, province, or administrative region within the country. Useful for sub-national geographic segmentation in retail and consumer trend analysis.

Yes
CITY
String

City-level geographic resolution. Derived from coarse IP geolocation lookup. Accuracy is city-level; never street or postal-code precision. No precise location data stored.

Yes
POSTAL_CODE
String

Postal or ZIP code approximated from coarse geolocation data. Provided on a best-effort basis; may be null in regions where postal code inference is unreliable at city-level resolution.

Yes
LOCATION_LATITUDE
Float

Coarse latitude coordinate in decimal degrees (WGS 84). Resolution is intentionally limited to ~10–50km radius to prevent re-identification. Never GPS-precise. Suitable for regional heatmaps.

Yes
LOCATION_LONGITUDE
Float

Coarse longitude coordinate in decimal degrees (WGS 84). Always paired with LOCATION_LATITUDE. Apply the same resolution caveat: city-block or GPS precision is never available.

Yes
⏱️
Temporal — 3 fields
TIMESTAMP
Int64

Unix epoch timestamp in milliseconds (ms) since 1970-01-01T00:00:00Z. High-resolution event time for session reconstruction, journey velocity analysis, and time-series modelling. Divide by 1000 for seconds.

No
DATE
String

Calendar date of the event in ISO 8601 format (YYYY-MM-DD), normalized to UTC. Derived from TIMESTAMP for convenience. Use for daily aggregation, date-range partitioning, and S3 prefix-based partitioning.

No
TIME
String

Time-of-day component in UTC, formatted as HH:MM:SS. Always UTC — apply timezone offsets using COUNTRY or CITY fields if local time analysis is required. Millisecond component is not preserved in this field.

No
📱
Device — 6 fields
METHOD
Enum

HTTP request method associated with the event. In the context of clickstream data, this is almost always GET. Other values (POST, HEAD) indicate non-navigation requests such as form submissions or prefetch requests.

Yes
USER_AGENT
String

Raw HTTP User-Agent string as reported by the client browser. Contains browser engine, version, operating system, and device tokens. The parsed, categorical fields (BROWSER, OS, DEVICE_NAME) are derived from this value. Max length 512 characters.

Yes
BROWSER
Parsed from UA
Object

Structured object containing browser name (e.g. Chrome, Safari, Firefox, Edge) and major version number parsed from the User-Agent string. Minor and patch versions are omitted to reduce cardinality.

{ "name": "Chrome", "version": "120" }
Yes
OS
Parsed from UA
Object

Operating system name and version derived from the User-Agent. Name values include Windows, macOS, iOS, Android, Linux. Version is the major OS version only. Useful for platform split analysis in retail and media clients.

{ "name": "iOS", "version": "17" }
Yes
DEVICE_NAME
String

Commercial device model name inferred from the User-Agent string or client hints API. Examples: iPhone 15 Pro, Galaxy S24, MacBook Pro. Null for unrecognized or generic desktop browser signatures where model is not disclosed.

Yes
DEVICE_BRAND
String

Manufacturer brand name of the device. Values include Apple, Samsung, Google, Microsoft, Huawei, Xiaomi, and others. Normalized to consistent capitalization. Distinct from DEVICE_NAME which carries the model; this field carries only the brand.

Yes
PLATFORM
Enum

Normalized device form-factor classification. One of three values: Desktop Mobile Tablet. Current panel distribution: Desktop 92%, Mobile 8%. Tablets are classified as Mobile in the default aggregation.

No
🔗
Web & URL — 5 fields
REFERRER
URL

The full URL of the page that linked to the current destination, as reported by the HTTP Referer header. Null when the navigation was direct (typed URL, bookmark, or new tab). Critical for attribution modeling and traffic-source analysis. Query parameters are preserved but sanitized to remove tokens that match PII patterns.

Yes
URL
URL

Complete destination URL of the event including protocol, subdomain, domain, path, and query string. Query string parameters that match session tokens, user IDs, or email patterns are redacted to [REDACTED] before storage.

No
DOMAIN_NAME
String

Registered domain extracted from the URL (e.g. amazon.com). Subdomains are stripped. This is the primary field for brand-level aggregation, domain visit counting, and competitive share-of-traffic analysis. Pre-indexed for fast GROUP BY queries.

No
SUB_DOMAIN
String

Subdomain component of the URL if present (e.g. www, m, app). Null when the URL has no subdomain. Useful for distinguishing desktop vs mobile subdomain traffic and identifying product sub-sections within large platforms.

Yes
PROTOCOL
Enum

URL scheme component. In practice almost always https. http records indicate legacy or misconfigured origins and may warrant quality filtering before use in consumer brand analysis.

No
PATH
String

URL path component after the domain (e.g. /dp/B09X7). Enables page-level traffic analysis, funnel stage classification, and product category detection. Path segments that match UUID or session token patterns are redacted.

Yes
QUERY_STRING
String

The query string portion of the URL (everything after ?). Parameters identified as PII vectors (email=, user_id=, token=, auth=) are automatically redacted. Preserved parameters include search terms (q=), category (cat=), and UTM tags for attribution.

Yes
{ "results": 0, "query": "" }
No fields match your search. Try a different term.
sample_record.json
Live Record Preview
{
  "OBJECT_ID":         "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "COUNTRY":           "United States",
  "COUNTRY_CODE":      "US",
  "REGION":            "California",
  "CITY":              "San Francisco",
  "POSTAL_CODE":       "94105",
  "LOCATION_LATITUDE": 37.77,
  "LOCATION_LONGITUDE":-122.41,
  "TIMESTAMP":         1735905600000,
  "DATE":              "2026-01-03",
  "TIME":              "14:23:07",
  "METHOD":            "GET",
  "USER_AGENT":        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
  "BROWSER":           { "name": "Chrome", "version": "120" },
  "OS":                { "name": "macOS",  "version": "14"  },
  "DEVICE_NAME":       "MacBook Pro",
  "DEVICE_BRAND":      "Apple",
  "PLATFORM":          "Desktop",
  "REFERRER":          "https://google.com/search?q=enterprise+data+platform",
  "URL":               "https://amazon.com/dp/B09X7MFLJZ",
  "DOMAIN_NAME":       "amazon.com",
  "SUB_DOMAIN":        "www",
  "PROTOCOL":          "https",
  "PATH":              "/dp/B09X7MFLJZ",
  "QUERY_STRING":      "ref=nav_bestseller"
}

Ready to integrate?

Request a free sample dataset and receive a Parquet file with 10,000 real records matching this schema.