Understanding Project Entity | Sayari

What is a Project Entity?

A Project Entity serves as an “envelope” for the one or more possible matches returned from Sayari’s Knowledge Graph by the match resolution process. Its primary purpose is to give clients a single, unified representation of a given real-world entity.

Three-Tier Hierarchy

Project Entities operate within a three-tier structure:

Project Level

The top-level container where project_entity_id values are stored. Projects organize related entities for specific business units, use cases, or investigations. Each project has a unique project_id.

Project Entity Level

The mid-level “envelope” that houses matched entity_ids. This layer not only groups related entities but also summarizes and de-duplicates the risk and supply chain data of the matches below it. The project_entity_id also functions as a stable key within a given project.

Match Level

The base level of the structure. Each match represents a specific entity from Sayari’s Knowledge Graph with its own unique entity_id. These matched entities contain the underlying attributes, risk factors, relationships, and source data that power your risk assessment and due diligence workflows.
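The three tiers can be pictured as nested containers. The sketch below is only an illustration: the class names (`Project`, `ProjectEntity`, `Match`) and the `add` method are hypothetical and not part of any Sayari client library, although the field names mirror the identifiers described above.

```python
from dataclasses import dataclass, field


@dataclass
class Match:
    """Base level: one entity from the Knowledge Graph."""
    entity_id: str


@dataclass
class ProjectEntity:
    """Mid level: the envelope grouping matches for one real-world entity."""
    project_entity_id: str
    matches: list[Match] = field(default_factory=list)


@dataclass
class Project:
    """Top level: the container that holds project entities."""
    project_id: str
    entities: dict[str, ProjectEntity] = field(default_factory=dict)

    def add(self, entity: ProjectEntity) -> None:
        # project_entity_id acts as the stable key within the project.
        self.entities[entity.project_entity_id] = entity


project = Project(project_id="0n4473")
project.add(ProjectEntity("yebNPJ", [Match("e0WEtIhAvDUjWs-hg12jSA")]))
print(project.entities["yebNPJ"].matches[0].entity_id)
```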

Key Concepts

Uniqueness

Each project_entity_id is unique to a specific project_id and set of input parameters provided during matching.

Example

// This creates one project_entity_id
{
  "name": "Apple Inc",
  "address": "One Apple Park Way, Cupertino CA 95014"
}

// Changing any parameter creates a new project_entity_id
{
  "name": "Apple Inc",
  "address": "1 Infinite Loop, Cupertino CA 95014"  // Different address
}

Tip: Use the Project Entity Exists endpoint as a duplicate check to avoid redundant matching.

Stability

Within a project, the project_entity_id remains static for the same set of input parameters, providing reliable reference points for integration.

Risk Summary

Risk factors from individual matched entities are consolidated at the Project Entity level, providing a comprehensive risk profile for the real-world entity.
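One way to picture this consolidation is as a de-duplicated union of the risk factor IDs found on each match. The sketch below assumes each match carries a `risk_factors` list of `{"id": ...}` objects, as in the response examples on this page. The function name `roll_up_risk` and the entity IDs `"A"` and `"B"` are hypothetical, and Sayari's actual roll-up logic may differ.

```python
def roll_up_risk(matches: list[dict]) -> list[str]:
    """Return the sorted, de-duplicated risk factor IDs across all matches."""
    seen: set[str] = set()
    for match in matches:
        for factor in match.get("risk_factors", []):
            seen.add(factor["id"])
    return sorted(seen)


# Hypothetical matches; "A" and "B" stand in for real entity IDs.
matches = [
    {"sayari_entity_id": "A", "risk_factors": [
        {"id": "psa_forced_labor_uflpa_origin_subtier"},
    ]},
    {"sayari_entity_id": "B", "risk_factors": [
        {"id": "psa_forced_labor_uflpa_origin_subtier"},
        {"id": "exports_eudr_shipment_wood"},
    ]},
]

print(roll_up_risk(matches))
# Recomputing without match "B" drops the risk that only "B" contributed.
print(roll_up_risk(matches[:1]))
```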

Working with Project Entities

Best Practices

Use Project Entity ID as Primary Key

When integrating with internal systems, use the project_entity_id as your primary reference key rather than individual entity_id values, ensuring consistency even if underlying matches change.

Minimize Redundant Matching

Store previously matched parameters and their resulting project_entity_id values to avoid unnecessary re-matching of similar inputs.
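A local lookup keyed on the input parameters is one way to do this. The sketch below is an illustration under stated assumptions: the `params_key` helper, the in-memory `known_entities` dictionary, and the placeholder ID are all hypothetical. It treats two inputs as identical only when every key and value matches exactly (key order aside), which may not be how Sayari itself compares parameters.

```python
import hashlib
import json


def params_key(params: dict) -> str:
    """Build a deterministic key from the match input parameters."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical local store: params key -> project_entity_id
known_entities: dict[str, str] = {}

first = {"name": "Apple Inc", "address": "One Apple Park Way, Cupertino CA 95014"}
known_entities[params_key(first)] = "yebNPJ"  # placeholder ID for illustration

# Same parameters in a different key order map to the same stored ID.
repeat = {"address": "One Apple Park Way, Cupertino CA 95014", "name": "Apple Inc"}
print(known_entities.get(params_key(repeat)))

# A different address is a different input, so nothing is found.
changed = {"name": "Apple Inc", "address": "1 Infinite Loop, Cupertino CA 95014"}
print(known_entities.get(params_key(changed)))
```

A lookup miss here only means the input has not been seen locally; the Project Entity Exists endpoint remains the authoritative check.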

Provide Comprehensive Attributes

Include as many identifying attributes as possible (name, address, identifiers, country) to improve match precision.

Response Structure

The Project Entity response follows a hierarchical structure:

Top-Level Properties

Project Entity Response

{
  "data": {
    "project_entity_id": "yebNPJ",
    "project_id": "0n4473",
    "label": "Marvel Garment",
    "upload_ids": [],
    "strength": "strong",
    "created_at": "2025-08-26 00:35:50.72865+00",
    "attributes": { ... },
    "countries": [ ... ],
    "risk_categories": [ ... ],
    "risk_factors": [ ... ],
    "upstream": { ... },
    "tags": [],
    "case": { ... },
    "matches": [ ... ]
  }
}

Field	Type	Description
`project_entity_id`	String	Unique identifier for the Project Entity within a given project
`project_id`	String	Identifier for the parent project
`label`	String	Display name for the Project Entity
`upload_ids`	Array	List of upload keys if created via batch upload
`strength`	String	Match confidence level (`strong`, `partial`, `no_match`)
`created_at`	String	Timestamp when the Project Entity was created
`attributes`	Object	Input parameters used during match resolution
`countries`	Array	List of countries associated with the entity
`risk_categories`	Array	Categorized risk factors with labels
`risk_factors`	Array	Individual risk factor identifiers
`upstream`	Object	Supply chain and trade information
`tags`	Array	User-defined tags for organization
`case`	Object	Case management information
`matches`	Array	Collection of matched entities from Sayari’s Knowledge Graph
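As a rough illustration of reading these top-level properties, the sketch below parses a trimmed copy of the example response using only Python's standard library. The elided fields are left out of the payload rather than guessed, and the `summary` dictionary is a hypothetical shape chosen for this example.

```python
import json

# Trimmed copy of the example response; the elided fields are omitted.
raw = """
{
  "data": {
    "project_entity_id": "yebNPJ",
    "project_id": "0n4473",
    "label": "Marvel Garment",
    "upload_ids": [],
    "strength": "strong",
    "created_at": "2025-08-26 00:35:50.72865+00",
    "tags": []
  }
}
"""

entity = json.loads(raw)["data"]

# .get() with a default covers fields that are absent from this trimmed payload.
summary = {
    "key": (entity["project_id"], entity["project_entity_id"]),
    "label": entity["label"],
    "strength": entity["strength"],
    "match_count": len(entity.get("matches", [])),
}
print(summary)
```

Keying on the pair of project_id and project_entity_id reflects that a project_entity_id is only unique within its project.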

Attributes Object

The attributes object contains the input parameters used during the matching process:

Attributes Structure

"attributes": {
  "name": {
    "resolve": true,
    "values": [
      {
        "value": "Marvel Garment"
      }
    ]
  },
  "country": {
    "resolve": true,
    "values": [
      {
        "value": "KHM"
      }
    ]
  },
  "address": {
    "resolve": true,
    "values": [
      {
        "value": "Beung Thom 3 Village, Sangkat Beung Thom, Posenchey, Phnom Penh"
      }
    ]
  }
}

Each attribute includes:

resolve (Boolean): Indicates if this attribute was used in match resolution
values (Array): Input values, each an object whose value field holds the actual data
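To recover the plain inputs from this structure, a small helper can flatten each attribute to its list of values. This is a minimal sketch; the function name `resolved_inputs` is hypothetical, and it assumes every attribute follows the `resolve` / `values` shape shown above.

```python
attributes = {
    "name": {"resolve": True, "values": [{"value": "Marvel Garment"}]},
    "country": {"resolve": True, "values": [{"value": "KHM"}]},
}


def resolved_inputs(attrs: dict) -> dict[str, list[str]]:
    """Map each attribute used in resolution to its plain list of input values."""
    return {
        name: [item["value"] for item in spec.get("values", [])]
        for name, spec in attrs.items()
        if spec.get("resolve")
    }


print(resolved_inputs(attributes))
```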

Risk Categories Structure

Risk categories provide organized groupings of related risk factors:

Risk Categories

"risk_categories": [
  {
    "id": "forced_labor",
    "label": "Forced labor",
    "risk_factors": [
      "psa_forced_labor_uflpa_origin_subtier",
      "psa_forced_labor_xinjiang_origin_subtier",
      "forced_labor_sheffield_hallam_university_reports_origin_subtier"
    ]
  },
  {
    "id": "environmental_risk",
    "label": "Environmental risk",
    "risk_factors": [
      "psa_exports_eudr_shipment_wood",
      "exports_eudr_shipment_wood"
    ]
  }
]
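When working from an individual risk factor back to its grouping, a reverse index can help. The sketch below is one possible approach, assuming the category structure shown above; the function name `categories_by_factor` is hypothetical. It maps each factor to a list of category IDs so that it makes no assumption about whether a factor can appear in more than one category.

```python
risk_categories = [
    {"id": "forced_labor", "label": "Forced labor", "risk_factors": [
        "psa_forced_labor_uflpa_origin_subtier",
        "psa_forced_labor_xinjiang_origin_subtier",
    ]},
    {"id": "environmental_risk", "label": "Environmental risk", "risk_factors": [
        "exports_eudr_shipment_wood",
    ]},
]


def categories_by_factor(categories: list[dict]) -> dict[str, list[str]]:
    """Build a reverse index: risk factor ID -> IDs of categories containing it."""
    index: dict[str, list[str]] = {}
    for category in categories:
        for factor_id in category.get("risk_factors", []):
            index.setdefault(factor_id, []).append(category["id"])
    return index


index = categories_by_factor(risk_categories)
print(index["exports_eudr_shipment_wood"])
```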

Risk Factors Structure

Individual risk factors are listed as simple objects:

Risk Factors

"risk_factors": [
  {
    "id": "psa_forced_labor_uflpa_origin_subtier"
  },
  {
    "id": "psa_forced_labor_xinjiang_origin_subtier"
  },
  {
    "id": "forced_labor_sheffield_hallam_university_reports_origin_subtier"
  }
]

To retrieve additional risk metadata, such as descriptions, use the Get Risk Factors endpoint.

Upstream Object Structure

The upstream object contains supply chain and trade information:

Upstream Structure

"upstream": {
  "countries": [],
  "risk_factors": [],
  "products": [
    "6006", "5808", "9607", "6117", "6004", "3506"
  ],
  "has_upstream": true
}

Field	Type	Description
`countries`	Array	Countries found in the supply chain
`risk_factors`	Array	Risk factors identified in the supply chain
`products`	Array	HS codes for products in the supply chain
`has_upstream`	Boolean	Whether upstream supply chain data is available
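Because the first two digits of an HS code identify its chapter in the Harmonized System, the `products` array can be grouped for a quick overview of what moves through the supply chain. The sketch below is illustrative only; the function name `group_by_chapter` is hypothetical, and it assumes the codes arrive as strings, as in the example above.

```python
from collections import defaultdict

upstream = {
    "countries": [],
    "risk_factors": [],
    "products": ["6006", "5808", "9607", "6117", "6004", "3506"],
    "has_upstream": True,
}


def group_by_chapter(hs_codes: list[str]) -> dict[str, list[str]]:
    """Group HS codes by their two-digit chapter prefix."""
    chapters: dict[str, list[str]] = defaultdict(list)
    for code in hs_codes:
        chapters[code[:2]].append(code)
    return dict(chapters)


# Only inspect products when upstream data is flagged as available.
if upstream["has_upstream"]:
    print(group_by_chapter(upstream["products"]))
```

Keeping the codes as strings preserves any leading zeros, which an integer conversion would drop.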

Matches Array

Each match entry represents a potential entity match from Sayari’s Knowledge Graph:

Match Structure

{
  "match_id": "yebNPJ:e0WEtIhAvDUjWs-hg12jSA",
  "sayari_entity_id": "e0WEtIhAvDUjWs-hg12jSA",
  "type": "company",
  "label": "MARVEL GARMENT CO., LTD",
  "matched_attributes": { ... },
  "countries": [ ... ],
  "risk_categories": [ ... ],
  "risk_factors": [ ... ],
  "upstream": { ... },
  "business_purpose": [],
  "addresses": [ ... ],
  "sources": [ ... ],
  "shipped_hs_codes": [ ... ],
  "received_hs_codes": [ ... ],
  "created_at": "2025-08-26 00:35:50.72865+00",
  "updated_at": "2025-08-26 00:50:39.296139+00",
  "relationship_count": { ... },
  "match_profile": "corporate"
}

Key Match Properties

Field	Type	Description
`match_id`	String	Composite identifier (project_entity_id:entity_id)
`sayari_entity_id`	String	Unique identifier for the matched entity in Sayari’s Knowledge Graph
`type`	String	Entity type (e.g., “company”, “person”)
`label`	String	Display name for the entity
`matched_attributes`	Object	Specific attributes that matched, with highlighted elements using `<em>` tags
`countries`	Array	List of countries associated with the entity
`risk_categories`	Array	Same structure as project-level risk categories
`risk_factors`	Array	Same structure as project-level risk factors
`upstream`	Object	Supply chain data including `trade_counts`
`business_purpose`	Array	Business classification codes
`addresses`	Array	Physical addresses with value objects
`sources`	Array	Data sources with metadata including source type and country
`shipped_hs_codes`	Array	HS codes for products shipped by this entity
`received_hs_codes`	Array	HS codes for products received by this entity
`relationship_count`	Object	Counts of relationships by type (e.g., ships_to, receives_from)
`match_profile`	String	Matching profile used (e.g., “corporate”, “suppliers”)
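Two of these properties often need light post-processing: the composite `match_id` and the `<em>` highlighting in matched attribute text. The sketch below shows one way to handle both. The helper names `split_match_id` and `strip_highlight` are hypothetical, and the highlighted string is an invented sample, since the contents of `matched_attributes` are elided in the example above.

```python
import re


def split_match_id(match_id: str) -> tuple[str, str]:
    """Split 'project_entity_id:entity_id' into its two parts."""
    project_entity_id, sep, entity_id = match_id.partition(":")
    if not sep or not project_entity_id or not entity_id:
        raise ValueError(f"unexpected match_id format: {match_id!r}")
    return project_entity_id, entity_id


def strip_highlight(text: str) -> str:
    """Remove <em> and </em> highlight tags, keeping the enclosed text."""
    return re.sub(r"</?em>", "", text)


print(split_match_id("yebNPJ:e0WEtIhAvDUjWs-hg12jSA"))

# Hypothetical highlighted value, for illustration only.
print(strip_highlight("<em>MARVEL</em> <em>GARMENT</em> CO., LTD"))
```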

Match Upstream Structure

The upstream object within matches includes additional trade count information:

Match Upstream with Trade Counts

"upstream": {
  "risk_factors": [],
  "countries": [],
  "trade_counts": {
    "shipper_of": 58,
    "receiver_of": 34
  },
  "has_upstream": true,
  "products": ["6006", "5808", "9607"]
}

Sources Array Structure

Sources provide detailed provenance information:

Sources Structure

"sources": [
  {
    "id": "9615bab28dddcc89548c928ab192ee7c",
    "label": "Sri Lanka Imports & Exports (January 2023 - Present)",
    "source_type": "trade_data",
    "country": "LKA"
  }
]
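For a quick view of where a match's evidence comes from, the sources can be tallied by type and country. This is a minimal sketch using the example entry above; the variable name `provenance` is hypothetical.

```python
from collections import Counter

sources = [
    {
        "id": "9615bab28dddcc89548c928ab192ee7c",
        "label": "Sri Lanka Imports & Exports (January 2023 - Present)",
        "source_type": "trade_data",
        "country": "LKA",
    }
]

# Count sources per (source_type, country) pair.
provenance = Counter((s["source_type"], s["country"]) for s in sources)

for (source_type, country), count in provenance.items():
    print(f"{source_type} / {country}: {count}")
```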

Addresses Array Structure

Addresses Structure

"addresses": [
  {
    "value": "PPSEZ, PHASE 3, NATIONAL ROAD NO.4, PHUM BOEUNG THOM 3, SANGKAT BOEUNG THOM, KHAN POSENCHEY, PHNOM PENH, CAMBODIA., KH"
  }
]

Relationship Count Structure

Relationship Count

"relationship_count": {
  "ships_to": 30,
  "receives_from": 23,
  "notify_party_of": 2
}

Frequently Asked Questions

How stable is the project_entity_id?

Within a given project, the project_entity_id remains stable for a specific set of input parameters and values, ensuring consistency for ongoing operations.

Why do multiple matches appear in the project entity?

Multiple matches typically appear when the provided attributes are less specific (e.g., name: “Sovcom Bank”, country: “RUS”) and target multinational organizations made up of several distinct legal entities. Clients who provide more specific attributes, such as identifiers, are less likely to encounter multiple matches.

How many matches can appear in a given project entity?

Up to 10 matches can appear in a given project entity.

Can I remove or add entities from a match group?

Individual matches (entity_ids) can be removed from the group through the API, allowing for manual refinement when necessary. Support for adding matches to the group is planned for future development.

What happens to rolled up risk if I add or remove entities from the match group?

When you remove entities from a match group, the system automatically recalculates the rolled-up risk to accurately reflect the cumulative risk profile of the remaining entities. This ensures that risk assessments remain current and comprehensive at the Project Entity level.

How many project entity IDs can I store within a single project?

Current guidance is to store up to 100,000 project entities per project.

Can the Project Entity be used for ongoing monitoring?

Yes, the Project Entity can serve as the stable reference point for monitoring entities identified in a screening process.
