Flat File Schema

Overview

This document describes the star schema architecture for Sayari entity and relationship data modeling. The schema uses entity_id as the universal primary key, with all data organized around entities (organizations, people, or other objects) and their relationships.


Fact Tables

The star schema contains two fact tables that serve as the central hub for analysis:

Entity Summary Table

Purpose: Primary fact table containing core entity metrics and attributes.

FieldTypeDescription
entity_idstringPrimary key - Unique identifier for each entity
labelstringDisplay name of the entity
label_enstringEnglish/Latin script version of entity name
degreelongNumber of outgoing relationships
num_documentsintegerCount of source documents
sanctionedbooleanSanctions list indicator
pepbooleanPolitically Exposed Person flag
closedstringClosure/dissolution status
typestringEntity classification (company, person, etc.)
DoBstringDate of birth (person entities)

Key Relationships: All entity dimension tables join via entity_id

Relationship Summary Table

Purpose: Fact table capturing connections between entities with temporal tracking.

FieldTypeDescription
src_idstringForeign key - Source entity
dst_idstringForeign key - Destination entity
typestringRelationship type (owns, employs, etc.)
datestringAs-of date (YYYY-MM-DD)
from_datestringRelationship start date
to_datestringRelationship end date

Key Relationships:

  • src_id and dst_id reference entity_id in Entity Summary
  • Relationship dimensions join via (src_id, dst_id) composite key

Entity Dimension Tables

Core Attributes

Name Table

Purpose: Stores name variations including translations and transliterations.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
namestringThe name value
languagestringLanguage code of the name
contextstringContext where name is used
translatedstringTranslated version
transliteratedstringTransliterated version
originalstringOriginal name in native script

Identifier Table

Purpose: Various identifiers (tax IDs, registration numbers, etc.).

FieldTypeDescription
entity_idstringForeign key to Entity Summary
identifierstringThe identifier value
identifier_typestringType of identifier
datestringAs-of date
from_datestringValid from date
to_datestringValid to date

Status Table

Purpose: Entity status conditions over time.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
statusstringStatus value
status_typestringStatus category
textstringDescriptive text
contextstringAdditional context
from_datestringStatus start date
to_datestringStatus end date
datestringAs-of date

Address Table

Purpose: Comprehensive geographic information for entities.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
addressstringFull address string
translatedstringTranslated address
transliteratedstringTransliterated address
typestringAddress type
languagestringLanguage of address
housestringHouse name
house_numberstringHouse/building number
po_boxstringPost office box
buildingstringBuilding name
entrancestringEntrance identifier
staircasestringStaircase identifier
levelstringFloor/level
unitstringUnit/apartment number
roadstringStreet/road name
metro_stationstringNearest metro station
suburbstringSuburb/neighborhood
city_districtstringCity district
citystringCity name
state_districtstringState district
islandstringIsland name
statestringState/province
postcodestringPostal code
country_regionstringCountry region
countrystringCountry
world_regionstringWorld region
categorystringLocation category
nearstringNearby landmark
xstringLongitude coordinate
ystringLatitude coordinate
precision_codestringGeographic precision level

Business Attributes

Business Purpose Table

Purpose: Industry classifications and business activities.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
business_purposestringBusiness activity description
codestringIndustry code
standardstringClassification standard
datestringAs-of date
from_datestringValid from date
to_datestringValid to date

Company Type Table

Purpose: Company type classification.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
company_typestringCompany type description
datestringAs-of date
from_datestringValid from date
to_datestringValid to date

Country Table

Purpose: Country context and jurisdictions.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
countrystringCountry name
statestringState/province
contextstringRelationship context
datestringAs-of date
from_datestringValid from date
to_datestringValid to date

Financial Attributes

Financials Table

Purpose: Comprehensive financial metrics.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
revenuedecimalTotal revenue
net_incomedecimalNet income
assetsdecimalTotal assets
liabilitiesdecimalTotal liabilities
registered_capitaldecimalRegistered capital
paid_up_capitaldecimalPaid-up capital
employeesintegerNumber of employees
currencystringCurrency code
total_current_liabilitiesdecimalCurrent liabilities
common_stockdecimalCommon stock value
operating_incomedecimalOperating income
total_debtdecimalTotal debt
inventorydecimalInventory value
cash_and_equivalentdecimalCash and equivalents
gross_profitdecimalGross profit
reporting_period_typestringPeriod type

Monetary Value Table

Purpose: Monetary amounts in various contexts.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
valuedecimalMonetary amount
currencystringCurrency code
contextstringContext of the value

Shares Table (Entity)

Purpose: Share ownership information.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
monetary_valuedecimalValue of shares
currencystringCurrency code
percentagedecimalOwnership percentage
typestringShare type

Supplementary Attributes

Contact Table

Purpose: Contact information.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
valuestringContact information
typestringContact type

Measurement Table

Purpose: Measurement information.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
valuestringMeasurement value
typestringType of measurement
unitstringUnit of measurement

Additional Information Table

Purpose: Additional information data.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
valuestringInformation value
typestringInformation type

Edges Table

Purpose: Relationship connection summary.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
edge_namestringType of edge/relationship
countintegerNumber of such relationships

Risk Attributes

Risk Intelligence Table

Purpose: Detailed risk intelligence information.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
typestringRisk type
authoritystringIssuing authority
programstringSanctions/watchlist program
liststringSpecific list name
reasonstringReason for listing
datestringAs-of date
from_datestringListing start date
to_datestringListing end date

Risk Values Table

Purpose: Wide-format table with individual risk factors as columns.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
sanctionedbooleanSanctioned entity flag
pepbooleanPolitically exposed person
export_controlsbooleanExport control violations
forced_labor_xinjiang_origin_directintegerXinjiang forced labor indicator
regulatory_actionbooleanRegulatory action flag
various200+ additional risk factor columns

Note: This table contains 200+ risk factor columns including sanctions indicators, PEP flags, export control violations, forced labor indicators, state ownership markers, and financial crime risks.

Risk Paths Table

Purpose: Risk propagation through entity networks.

FieldTypeDescription
entity_idstringForeign key to Entity Summary
[risk_path_columns]variousRisk propagation path columns

Note: Similar structure to Risk Values but tracks how risks propagate through ownership and relationship chains.


Relationship Dimension Tables

Position Table

Purpose: Employment and governance positions.

FieldTypeDescription
src_idstringForeign key - Person/entity holding position
dst_idstringForeign key - Organization where position is held
datestringAs-of date
from_datestringPosition start date
to_datestringPosition end date
valuestringPosition title or role

Shares Table (Relationship)

Purpose: Ownership stakes between entities.

FieldTypeDescription
src_idstringForeign key - Owner entity
dst_idstringForeign key - Owned entity
num_sharesdoubleNumber of shares owned
monetary_valuedoubleValue of shares
currencystringCurrency of monetary value
percentagedoubleOwnership percentage
typestringType of shares
datestringAs-of date
from_datestringOwnership start date
to_datestringOwnership end date

Business Purpose Table (Relationship)

Purpose: Business purpose for relationships.

FieldTypeDescription
src_idstringForeign key - Source entity
dst_idstringForeign key - Destination entity
valuestringBusiness purpose description
codestringClassification code
standardstringCoding standard used
datestringAs-of date
from_datestringValid from date
to_datestringValid to date

Additional Information Table (Relationship)

Purpose: Additional information relationship details.

FieldTypeDescription
src_idstringForeign key - Source entity
dst_idstringForeign key - Destination entity
typestringType of additional information
valuestringInformation content
datestringAs-of date
from_datestringValid from date
to_datestringValid to date