You can query a knowledge graph to find a subset of the entities and relationships it contains and identify how different entities are connected. Provenance records can be used in a query and can optionally be included in the query results. See the following examples:
- From a knowledge graph representing the spread of an infectious disease, work with humans and animals associated through any relationship with a given facility.
- From a knowledge graph representing a manufacturing supply chain, work with any content associated with a specific part including suppliers, means of delivery, warehouses, and so on.
- From a knowledge graph representing an organization, work with devices of a given type, and list their properties, including the name of the responsible employee.
- From a knowledge graph representing tortoises and their habitats, identify habitats where the level of risk was established using information in a specific environmental impact assessment.
You can identify the subset of entities and relationships, or their properties, by querying the knowledge graph. Write queries in the openCypher query language to discover related entities and their properties, and work with this restricted set of information in the knowledge graph, a map, or a link chart.
Write an openCypher query
openCypher queries are to graph databases what SQL queries are to relational databases. The basic structure of the query is to find, or match, entities and return those entities, where the entities you want to find are identified in parentheses. For example, the query MATCH (e) RETURN e returns entities of any type. The number of entities returned is limited only by the knowledge graph's configuration. To restrict the number of graph items returned, use a LIMIT clause. For example, the query MATCH (e) RETURN e LIMIT 5 returns at most five entities of any type.
The query can identify entities that are related using symbols that create an arrow. For example, the query MATCH (e1)-->(e2) RETURN e1,e2 will return pairs of entities, e1 and e2, where a relationship of any type connects the origin entity e1 to the destination entity e2. If the query were written with the arrow pointing in the other direction, e2 would be the origin entity and e1 the destination entity: MATCH (e1)<--(e2) RETURN e1,e2. The manner in which entities are related to each other is referred to as a pattern.
Use square brackets to identify the specific relationships that should be considered. For example, the query MATCH (e1)-[]->(e2) RETURN e1,e2 will return pairs of entities, e1 and e2, where a single relationship of any type connects the two entities. This query is another way to write the first query illustrated above, and it uses the preferred syntax. The query can be amended to return the entire tuple describing the relationship by returning the origin entity, e1, the relationship, r, and the destination entity, e2, as follows: MATCH (e1)-[r]->(e2) RETURN e1,r,e2. The similar queries MATCH (e1)-[ ]->( )-[ ]->(e2) RETURN e1,e2 and MATCH (e1)-[*2]->(e2) RETURN e1,e2 will return pairs of entities that are connected by two relationships in the same direction. Queries can also identify patterns where relationships have different directions, such as MATCH (e1)-[ ]->(e2)<-[ ]-(e3) RETURN e1,e2,e3.
The example queries above can be used with any knowledge graph.
Tailor a query to a specific knowledge graph by referencing the entity types, relationship types, and properties defined in its data model. Include the name of a specific entity type in your query to constrain the graph items that are considered. For example, the query MATCH (e1:Person)-[r]->(e2) RETURN e1,r,e2 will return all Person entities, e1, in which any relationship, r, connects the Person to another entity, e2, which can be an entity of any type. Compared to the previous example, relationships in which a Pet, Vehicle, or Document entity is the origin of a relationship aren't included in the results.
You can constrain the query to consider specific relationship types and specific related entities by adding relationship types and entity types to the other parts of the query. For example, MATCH (p:Person)-[v:HasVehicle]->(e) RETURN p,v,e will return all Person entities, p, for which a HasVehicle relationship, v, connects the Person to another entity of any type, e. The variables p and v are assigned to the Person entities and HasVehicle relationships, respectively, so information about them can be returned by the query. Compared to the previous example, relationships in which a Pet or Document entity is the destination of a relationship aren't included in the results. Depending on the knowledge graph's data model, the destination entity, e, could be a generic Vehicle entity, or it could be one of a series of specific entity types such as Automobile, Motorcycle, Boat, Airplane, Commercial Vehicle, and so on.
Specific properties of entities and relationships can be included in the query results. For example, MATCH (p:Person)-[:HasVehicle]->(e) RETURN p,e.make,e.model,e.year will run the same query defined previously. However, instead of showing the destination entity itself, the results will show the values stored in several of its properties: the make, model, and year of the vehicle, respectively. In this example, a variable was not assigned for the specific relationship being considered by the query because the relationship's data is not included in the query results or evaluated elsewhere in the query.
Similarly, you can constrain the entities and relationships that are evaluated by specifying properties that define the entities and relationships of interest. The properties to consider are defined by adding a WHERE clause to the query. As with the examples above, variables must be assigned to reference specific information about entities and relationships in the WHERE clause. For example, in the following query, only Person entities with a specific lastName property value are evaluated; HasVehicle relationships are only considered if they have a NULL value in the endDate property; and related Vehicle entities are only considered if the year property has a value that is earlier than 1980: MATCH (p:Person)-[hv:HasVehicle]->(v:Vehicle) WHERE p.lastName = 'Doe' and hv.endDate IS NULL and v.year < 1980 RETURN p,p.firstName,v,v.make,v.year.
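The MATCH, WHERE, RETURN, and LIMIT clauses described so far combine in a fixed order, so a client application can assemble a query from its parts. The following Python sketch shows one way to do that. The build_query function is a hypothetical helper written for this illustration; it is not part of ArcGIS Knowledge or any ArcGIS API, and it only joins text, so each condition must already be valid openCypher.

```python
def build_query(pattern, returns, conditions=(), limit=None):
    """Assemble an openCypher query string from its clauses (hypothetical helper)."""
    parts = [f"MATCH {pattern}"]
    if conditions:
        # All conditions must hold, so they are joined with AND.
        parts.append("WHERE " + " AND ".join(conditions))
    parts.append("RETURN " + ", ".join(returns))
    if limit is not None:
        parts.append(f"LIMIT {int(limit)}")
    return " ".join(parts)


query = build_query(
    pattern="(p:Person)-[hv:HasVehicle]->(v:Vehicle)",
    conditions=["p.lastName = 'Doe'", "hv.endDate IS NULL", "v.year < 1980"],
    returns=["p", "p.firstName", "v", "v.make", "v.year"],
)
print(query)
```

The printed query is equivalent to the example above, with the keyword AND written in uppercase.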
Instead of returning a series of individual entities and relationships, your query can return the complete path represented by a pattern. To do this, assign the pattern defined in the MATCH statement to a variable and return that variable. For example, the query MATCH path = (:Person)-[:HasVehicle]->(:Vehicle) RETURN path will return a list of paths for all entity and relationship combinations that satisfy the specified pattern. Each path will contain all parts of the matched pattern: the Person, HasVehicle relationship, and Vehicle. You do not need to assign variables to the individual parts of this pattern since they are not returned by the query.
Use date-time values in a query
When you create a knowledge graph using a hosted graph store or a NoSQL data store with ArcGIS-managed data, date-time values are always expected to be provided in coordinated universal time (UTC). Similarly, when date-time values are returned by a query, they are always provided as UTC times. That is, date-time values stored in a property of an entity or relationship cannot be associated with a specific time zone. Date-time values are stored exactly as provided; the knowledge graph does not automatically convert a value provided in local time to UTC. You must perform the conversion yourself and store the correct value to avoid problems when querying the knowledge graph later.
For example, if data is collected about a specific event, everyone editing the knowledge graph should provide date-time values related to that event as the appropriate UTC time. Editors who are located in different time zones should not provide the date-time value associated with their local time zone. Similarly, when querying the knowledge graph, queries should reference the appropriate UTC time associated with the event and not the local time for your time zone. This way, everyone will store correct values and be able to query correct values regardless of their physical location and local time.
To query a knowledge graph using values stored in date properties, you must use the datetime() utility, which interprets the provided value as a date. Specify the value using the date-time format YYYY-MM-DDThh:mm:ss.sssZ to indicate the provided date-time is UTC time. Use the date-time format YYYY-MM-DDThh:mm:ss.sss+00:00 to indicate the provided date-time is a local time, replacing +00:00 with the appropriate time zone offset from UTC, such as -08:00. A full date-time value must be provided in ISO 8601 format with a specified time zone. If parts of the date-time value are omitted or if the time zone is omitted, the query returns an error.
For example, to find all vehicles purchased after a specific date-time, use a query such as MATCH path=(:Person)-[hv:HasVehicle]->(:Vehicle) WHERE hv.acquisitionDate > datetime('2014-10-18T12:36-08:00') RETURN path. All paths are returned where the HasVehicle relationship has an acquisitionDate property with a value after 12:36 p.m. Pacific Standard Time on October 18, 2014. Pacific Standard Time is eight hours behind UTC, so this is 20:36 UTC.
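Because the knowledge graph stores date-time values as provided, it can help to normalize values to UTC in client code before they are stored or used in a query. The following Python sketch shows one way to do this, assuming the value starts as a time zone-aware datetime object. The to_utc_literal function is a hypothetical helper, not an ArcGIS function.

```python
from datetime import datetime, timedelta, timezone


def to_utc_literal(value):
    """Return a datetime() call in UTC for a time zone-aware datetime (hypothetical helper)."""
    if value.tzinfo is None:
        # A value without a time zone cannot be converted reliably.
        raise ValueError("a time zone is required to convert the value to UTC")
    utc_value = value.astimezone(timezone.utc)
    # isoformat() yields YYYY-MM-DDThh:mm:ss.sss+00:00; use the Z suffix for UTC.
    text = utc_value.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    return f"datetime('{text}')"


# 12:36 p.m. at a fixed offset of eight hours behind UTC.
pst = timezone(timedelta(hours=-8))
observed = datetime(2014, 10, 18, 12, 36, tzinfo=pst)
print(to_utc_literal(observed))  # datetime('2014-10-18T20:36:00.000Z')
```

A fixed offset is used here to keep the sketch self-contained. An application that handles daylight saving time would typically use named time zones instead.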
To find all vehicles purchased in the year 1998, use a query such as MATCH path=(:Person)-[hv:HasVehicle]->(:Vehicle) WHERE hv.acquisitionDate >= datetime('1998-01-01T00:00Z') and hv.acquisitionDate < datetime('1999-01-01T00:00Z') RETURN path. All paths are returned where the HasVehicle relationship has an acquisitionDate property with a value on or after midnight UTC on January 1, 1998, and before midnight UTC on January 1, 1999.
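A range that covers a whole calendar year can be generated rather than typed by hand. The Python sketch below builds a half-open range (on or after the start of the year, and before the start of the next year) so that values in the final minute of December 31 are not left out. It assumes the >= comparison is accepted for date-time properties, which this topic does not state explicitly. The year_range_condition function is a hypothetical helper.

```python
from datetime import datetime, timezone


def year_range_condition(prop, year):
    """Build a WHERE condition matching one UTC calendar year (hypothetical helper)."""
    start = datetime(year, 1, 1, tzinfo=timezone.utc)
    end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)  # exclusive upper bound
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (
        f"{prop} >= datetime('{start.strftime(fmt)}') "
        f"AND {prop} < datetime('{end.strftime(fmt)}')"
    )


condition = year_range_condition("hv.acquisitionDate", 1998)
print(f"MATCH path=(:Person)-[hv:HasVehicle]->(:Vehicle) WHERE {condition} RETURN path")
```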
At this time, ArcGIS Knowledge does not support querying a knowledge graph using a time interval or time period.
Use provenance records in a query
When you create a knowledge graph using a hosted graph store or a NoSQL data store with ArcGIS-managed data, provenance can be enabled. Provenance records allow you to indicate where the content of the knowledge graph originated. You can specify that the value stored in a property of an entity or a relationship was derived from a specific document or website. This information can be used later when you query the knowledge graph. You can find entities and relationships with property values that originated from a specific source of information.
Provenance records are excluded from all queries by default. In ArcGIS Pro, you must check the Include Provenance option to leverage information stored in provenance records in your query and include provenance records in the query results. If you create a custom application that communicates with an ArcGIS Knowledge Server site using one of the available developer APIs, your client application must specify that the query results should include provenance records.
Your query must determine which entities and relationships have properties associated with provenance records by comparing the instance identifiers of the graph items to the instance identifiers stored in the provenance records. The query can also evaluate properties of the graph items and the provenance records to get results. Provenance records themselves or their properties can optionally be included in the query results.
For example, to find the owners of all vehicles where the California Department of Motor Vehicles is the source of the information stored in properties of the HasVehicle relationship, use a query such as MATCH (p:Person)-[hv:HasVehicle]->(v:Vehicle), (pr:Provenance) WHERE ID(hv) = pr.instanceID and pr.sourceName = "California Department of Motor Vehicles" RETURN p,hv,v,pr. The query determines which HasVehicle relationships have provenance records by comparing the instance identifiers of the relationships to the instanceID property of the provenance records. Only provenance records whose sourceName property has the specified value are evaluated. The Person and Vehicle entities associated with the matching HasVehicle relationships, and the provenance records themselves, are also included in the query results.
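When the source name comes from user input, quotation marks in the value can break the query text. The Python sketch below escapes the value before placing it in the provenance query shown above. Both functions are hypothetical helpers written for this illustration. The escaping follows the backslash escape sequences of openCypher string literals, and it is an assumption that ArcGIS Knowledge handles them in the same way.

```python
def cypher_string(value):
    """Quote a Python string as an openCypher string literal (hypothetical helper)."""
    # Escape backslashes first so the escapes added for quotes are not doubled.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def provenance_query(source_name):
    """Find vehicle ownership whose provenance names the given source."""
    return (
        "MATCH (p:Person)-[hv:HasVehicle]->(v:Vehicle), (pr:Provenance) "
        "WHERE ID(hv) = pr.instanceID "
        f"AND pr.sourceName = {cypher_string(source_name)} "
        "RETURN p, hv, v, pr"
    )


print(provenance_query("California Department of Motor Vehicles"))
```

If the client API you use supports query parameters, passing the value as a parameter is generally preferable to building the query text by hand.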
Use spatial operators in a query
ArcGIS Knowledge supports using spatial operators in openCypher queries with point, multipoint, line, and polygon geometries. These are the same geometry types supported for entity types created in a knowledge graph using a hosted graph store or a NoSQL data store with ArcGIS-managed data. If your knowledge graph uses a NoSQL data store with user-managed data, different geometry types may be supported.
The following spatial operators are supported:
- ST_Equals—Returns entities with equal geometries. The syntax is esri.graph.ST_Equals(geometry1, geometry2).
- ST_Intersects—Returns entities with intersecting geometries. The syntax is esri.graph.ST_Intersects(geometry1, geometry2).
- ST_Contains—Returns entities whose geometries are contained by the specified geometry. The syntax is esri.graph.ST_Contains(geometry1, geometry2).
Spatial operators can be incorporated into the WHERE clause of a query. The geometry parameters can reference an entity's geometry or you can specify a geometry that represents a spatial location. You can construct a geometry from a string using the operator esri.graph.ST_WKTToGeometry(string) where the string parameter is an OGC simple feature specified in the well-known text format. For example, to create a geometry representing the coordinates 117.1964763°W 34.0572046°N, you would use the operator esri.graph.ST_WKTToGeometry("POINT (-117.1964763 34.0572046)"). A geometry constructed in this manner can only be specified in the first geometry argument for the spatial operators. The second geometry argument must always reference the geometry associated with an entity in the knowledge graph.
Consider the following examples where entities of the Person type can have point geometries and entities of the Facility type can have polygon geometries:
- The query MATCH (p1:Person), (p2:Person) WHERE esri.graph.ST_Equals(p1.shape, p2.shape) RETURN p1, p2 returns pairs of Person entities, p1 and p2, with equal shapes; that is, both Person entities have identical location geometries.
- The query MATCH (e:Employee), (f:Facility) WHERE esri.graph.ST_Intersects(e.shape, f.shape) RETURN e, f returns Employee entities and Facility entities, e and f, respectively, where geometries for the Employee and Facility entities intersect.
- The query MATCH (f:Facility) WHERE esri.graph.ST_Contains(esri.graph.ST_WKTToGeometry("POINT (-117.1964763 34.0572046)"), f.shape) RETURN f returns Facility entities, f, whose geometries contain the specified point.
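A common mistake when writing well-known text by hand is to swap the coordinates: WKT lists x (longitude) before y (latitude), and longitudes west of the prime meridian are negative. The Python sketch below builds the point string and the ST_Contains query from the last example. The wkt_point and facilities_containing functions are hypothetical helpers, and the range checks assume geographic coordinates in decimal degrees.

```python
def wkt_point(longitude, latitude):
    """Return a WKT point with x (longitude) first (hypothetical helper)."""
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    return f"POINT ({longitude} {latitude})"


def facilities_containing(longitude, latitude):
    """Build a query for Facility entities whose geometries contain the point."""
    # The constructed geometry goes in the first argument, as described above.
    point = f'esri.graph.ST_WKTToGeometry("{wkt_point(longitude, latitude)}")'
    return f"MATCH (f:Facility) WHERE esri.graph.ST_Contains({point}, f.shape) RETURN f"


# 117.1964763 degrees west, 34.0572046 degrees north
print(facilities_containing(-117.1964763, 34.0572046))
```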
The spatial utility ST_GeoDistance is also available, which has the syntax esri.graph.ST_GeoDistance(geometry, geometry). This utility returns the geodesic distance between the two geometries. For example, in the query MATCH (n), (e) RETURN n, e, esri.graph.ST_GeoDistance(n.shape, e.shape) as distance, the distance value returned with each pair of entities is the geodesic distance calculated between the entities n and e.
Learn more about openCypher queries
You can learn more about the openCypher query language using a document provided by the openCypher Implementers Group. ArcGIS Knowledge does not support all aspects of the openCypher query language. For example, queries can't be used to update the knowledge graph, only to return values.
In ArcGIS Pro, you can learn about openCypher by seeing the queries that retrieve data from a knowledge graph to build histograms. In the Search and Filter pane, on the Histogram tab, click the Settings button, and click Send query to Query tab. The query used to retrieve data for the current set of histograms appears in the Query text box.