GeoAnalytics Tools in Run Python Script—REST APIs | Documentation for ArcGIS Enterprise

The Run Python Script task allows you to programmatically execute most GeoAnalytics Tools with Python using an API that is available when you run the task. A geoanalytics object is instantiated automatically and gives you access to each tool using the syntax shown in the example and table below. Each tool accepts input layers as Spark DataFrames and will return results as a Spark DataFrame or collection of Spark DataFrames. To learn more, see Reading and writing layers in pyspark. DataFrames are held in memory and can be written to a data store at any time. This allows you to chain together multiple GeoAnalytics Tools without writing out intermediate results.

Note:

The API described in this topic can only be used within the Run Python Script task and should not be confused with the ArcGIS API for Python, which uses a different syntax to execute stand-alone GeoAnalytics Tools and is intended for use outside of the Run Python Script task.

In the example below, the Detect Incidents task and Find Hot Spots task are used together and only the final DataFrame is written to a data store as a feature service layer. The input layer (represented in the example below by layers[0]) is a big data file share dataset of city bus locations recorded at 1-minute intervals for 15 days. To learn more about using layers, see Reading and writing layers in pyspark.

Chaining together GeoAnalytics Tools with DataFrames


import time

# Run Detect Incidents to find all bus locations where delay status has changed from False to True
exp = "var dly = $track.field[\"dly\"].history(-2); return dly[0]==\"False\" && dly[1]==\"True\""
delay_incidents = geoanalytics.detect_incidents(
    input_layer = layers[0],
    track_fields = ["vid"],
    start_condition_expression = exp,
    output_mode = "Incidents")

# Use the resulting DataFrame as input to the Find Hot Spots task
delay_hotspots = geoanalytics.find_hot_spots(
    point_layer = delay_incidents,
    bin_size = 0.1, bin_size_unit = "Miles",
    neighborhood_distance = 1, neighborhood_distance_unit = "Miles",
    time_step_interval = 1, time_step_interval_unit = "Days")

# Write the Find Hot Spots result to the spatiotemporal big data store
delay_hotspots.write.format("webgis").save("Bus_Delay_HS_{0}".format(time.time()))

For more examples, see Examples: Scripting custom analysis with the Run Python Script task.
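
The example above builds the output name from time.time(), which returns a float, so the saved name includes a fractional part. If a name with a whole-second suffix is preferred, a minimal sketch could look like the following. The helper name unique_output_name and its now parameter are hypothetical and are not part of the geoanalytics API.

```python
import time

def unique_output_name(prefix, now=None):
    """Return the prefix followed by a whole-second Unix timestamp."""
    # 'now' is a hypothetical hook that lets the timestamp be supplied
    # explicitly (for example in tests); by default the current time is used.
    timestamp = int(time.time() if now is None else now)
    return "{0}_{1}".format(prefix, timestamp)
```

Inside the Run Python Script task, this could be used as delay_hotspots.write.format("webgis").save(unique_output_name("Bus_Delay_HS")).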

The table below describes the method signature for GeoAnalytics Tools in Run Python Script. All tools can be called except for Copy To Data Store and Append Data. The parameter syntax is the same as that of the REST API except where noted. See the documentation for each tool for descriptions of parameter syntax and tool outputs.

Note:

For all tool methods that accept time_step_repeat and time_step_repeat_unit arguments, these arguments correspond to the timeStepRepeatInterval and timeStepRepeatIntervalUnit REST parameters, respectively.
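
When cross-referencing a Python argument with the REST documentation for a tool, the naming relationship can be sketched as a small lookup. The two overrides come from the note above. The snake_case to camelCase rule for all other arguments is an assumption based on the names listed in this topic, and the function name rest_parameter_name is hypothetical; confirm each name against the REST documentation for the tool.

```python
# Exceptions stated in the note above: Python argument -> REST parameter.
REST_NAME_OVERRIDES = {
    "time_step_repeat": "timeStepRepeatInterval",
    "time_step_repeat_unit": "timeStepRepeatIntervalUnit",
}

def rest_parameter_name(python_arg):
    """Return the likely REST parameter name for a Python argument name."""
    if python_arg in REST_NAME_OVERRIDES:
        return REST_NAME_OVERRIDES[python_arg]
    # Assumption: every other argument maps by snake_case -> camelCase.
    first, *rest = python_arg.split("_")
    return first + "".join(part.capitalize() for part in rest)
```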


Tool	Syntax	Returns	Notes
Aggregate Points	aggregate_points(point_layer, bin_type = None, bin_size = None, bin_size_unit = None, polygon_layer = None, time_step_interval = None, time_step_interval_unit = None, time_step_repeat = None, time_step_repeat_unit = None, time_step_reference = None, summary_fields = None)	DataFrame
Build Multi-Variable Grid	build_multi_variable_grid(bin_type = "Square", bin_size = None, bin_size_unit = None, input_layers = None, variable_calculations = None)	DataFrame	input_layers should be a list of DataFrames.
Calculate Density	calculate_density(input_layer, fields = None, weight = "Uniform", bin_type = "Square", bin_size = None, bin_size_unit = None, time_step_interval = None, time_step_interval_unit = None, time_step_repeat = None, time_step_repeat_unit = None, time_step_reference = None, radius = None, radius_unit = None, area_units = "SquareKilometers")	DataFrame
Calculate Field	calculate_field(input_layer, field_name, data_type, expression, track_aware = None, track_fields = None, time_boundary_split = None, time_boundary_split_unit = None, time_boundary_reference = None)	DataFrame
Calculate Motion Statistics	calculate_motion_statistics(input_layer, track_fields, track_history_window = 3, motion_statistics = ["All"], idle_distance_tolerance = None, idle_distance_tolerance_unit = None, idle_time_tolerance = None, idle_time_tolerance_unit = None, time_boundary_split = None, time_boundary_split_unit = None, time_boundary_reference = None, distance_method = "Geodesic", distance_unit = "Meters", duration_unit = "Seconds", speed_unit = "MetersPerSecond", acceleration_unit = "MetersPerSecondSquared", elevation_unit = "Meters")	DataFrame
Clip Layer	clip_layer(input_layer, clip_layer)	DataFrame
Create Buffers	create_buffers(input_layer, distance = None, distance_unit = None, field = None, method = "Planar", dissolve_option = "None", dissolve_fields = None, summary_fields = None, multipart = False)	DataFrame
Create Space Time Cube	create_space_time_cube(point_layer, bin_size, bin_size_unit, time_step_interval, time_step_interval_unit, time_step_alignment = None, time_step_reference = None, summary_fields = None, output_name = None)	String	Returns the local path to the resulting space-time cube on an ArcGIS GeoAnalytics Server machine. The cube is written to a temporary directory and will be deleted if not copied to a different location.
Describe Dataset	describe_dataset(input_layer, sample_size = None, extent_output = False)	Dictionary	Example result: {"output":<DataFrame>, "outputJSON":<string>,"extentLayer":<DataFrame>,"sampleLayer":<DataFrame>}
Detect Incidents	detect_incidents(input_layer, track_fields, start_condition_expression, end_condition_expression = None, output_mode = "AllFeatures", time_boundary_split = None, time_boundary_split_unit = None, time_boundary_reference = None)	DataFrame
Dissolve Boundaries	dissolve_boundaries(input_layer, dissolve_fields = None, summary_fields = None, multipart = False)	DataFrame
Enrich From Multi-Variable Grid	enrich_from_multi_variable_grid(input_features, grid_layer, enrich_attributes = None)	DataFrame
Find Dwell Locations	find_dwell_locations(input_layer, track_fields, distance_method = "Planar", distance_tolerance, distance_tolerance_unit, time_tolerance, time_tolerance_unit, summary_fields = None, output_type = "DwellMeanCenters", time_boundary_split = None, time_boundary_split_unit = None, time_boundary_reference = None)	DataFrame
Find Hot Spots	find_hot_spots(point_layer, bin_size, bin_size_unit, neighborhood_distance, neighborhood_distance_unit, time_step_interval = None, time_step_interval_unit = None, time_step_alignment = None, time_step_reference = None)	DataFrame
Find Point Clusters	find_point_clusters(input_layer, cluster_method = "DBSCAN", time_method = None, search_duration = None, search_duration_unit = None, min_features_cluster = None, search_distance = None, search_distance_unit = None)	DataFrame
Find Similar Locations	find_similar_locations(input_layer, search_layer, analysis_fields, most_or_least_similar = "MostSimilar", match_method = "AttributeValues", number_of_results = 10, append_fields = None)	Dictionary	Example result: {"output":<DataFrame>, "processInfo":<string>}
Forest-based Classification And Regression	forest_based_classification_and_regression(prediction_type = "Train", in_features = None, features_to_predict = None, variable_predict = None, explanatory_variables = None, number_of_trees = 100, minimum_leaf_size = None, maximum_tree_depth = None, sample_size = 100, random_variables = None, percentage_for_validation = 10, create_variable_importance_table = False, explanatory_variable_matching = None)	Dictionary	Example result: {"outputTrained":<DataFrame>, "variableOfImportance":<DataFrame>,"outputPredicted":<DataFrame>,"processInfo":<string>}
Generalized Linear Regression	generalized_linear_regression(input_layer, features_to_predict = None, dependent_variable = None, explanatory_variables = None, regression_family = "Continuous", generate_coefficient_table = False, explanatory_variable_matching = None, dependent_mapping = None)	Dictionary	Example result: {"output":<DataFrame>, "coefficientTable":<DataFrame>,"outputPredicted":<DataFrame>, "processInfo":<string>}
Geocode Locations	geocode_locations(input_layer, geocode_service_url, geocode_parameters, source_country = None, category = None, include_attributes = None, locator_parameters = None)	DataFrame
Geographically Weighted Regression	geographically_weighted_regression(input_layer, explanatory_variables, dependent_variable, model_type = "Continuous", neighborhood_type = "NumberOfNeighbors", neighborhood_selection_method = "UserDefined", distance_band = None, distance_band_unit = None, number_of_neighbors = None, local_weighting_scheme = "Bisquare")	DataFrame
Group By Proximity	group_by_proximity(input_layer, spatial_relationship, spatial_near_distance = None, spatial_near_distance_unit = None, temporal_relationship = None, temporal_near_distance = None, temporal_near_distance_unit = None)	DataFrame
Join Features	join_features(target_layer, join_layer, join_operation = "JoinOneToOne", keep_all_target_features = False, join_fields = None, summary_fields = None, spatial_relationship = None, spatial_near_distance = None, spatial_near_distance_unit = None, temporal_relationship = None, temporal_near_distance = None, temporal_near_distance_unit = None, attribute_relationship = None, join_condition = None)	DataFrame
Merge Layers	merge_layers(input_layer, merge_layer, merging_attributes = None)	DataFrame
Overlay Layers	overlay_layers(input_layer, overlay_layer, overlay_type = "Intersect", include_overlaps = True)	DataFrame
Reconstruct Tracks	reconstruct_tracks(input_layer, track_fields, method = "Planar", buffer_field = None, summary_fields = None, time_split = None, time_split_unit = None, distance_split = None, distance_split_unit = None, time_boundary_split = None, time_boundary_split_unit = None, time_boundary_reference = None)	DataFrame
Summarize Attributes	summarize_attributes(input_layer, fields, summary_fields = None, time_step_interval = None, time_step_interval_unit = None, time_step_repeat = None, time_step_repeat_unit = None, time_step_reference = None)	DataFrame
Summarize Center And Dispersion	summarize_center_and_dispersion(input_layer, summary_type, ellipse_size = None, weight_field = None, group_fields = None)	Dictionary	Example result: {"centralFeatureLayer":<DataFrame>, "meanCenterLayer":<DataFrame>, "medianCenterLayer":<DataFrame>, "ellipseLayer":<DataFrame>}
Summarize Within	summarize_within(summary_polygons = None, bin_type = None, bin_size = None, bin_size_unit = None, summarized_layer = None, standard_summary_fields = None, weighted_summary_fields = None, sum_shape = True, shape_units = None, group_by_field = None, minority_majority = False, percent_shape = False)	Dictionary	Example result: {"output":<DataFrame>, "groupBySummary":<DataFrame>}
Trace Proximity Events	trace_proximity_events(input_points, entity_id_field, entities_of_interest_ids = None, entities_of_interest_layer = None, distance_method, spatial_search_distance, spatial_search_distance_unit, temporal_search_distance, temporal_search_distance_unit, include_tracks_layer = False, max_trace_depth = 2147483647, attribute_match_criteria = None)	Dictionary	Example result: {"output":<DataFrame>, "tracksLayer":<DataFrame>}
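
Several tools in the table return a dictionary rather than a single DataFrame, so the DataFrame to chain or write must be looked up by key first. The following is a minimal sketch of one way to do that defensively. The helper name get_output is hypothetical, and whether optional outputs are omitted from the dictionary or present as None is not stated in this topic, so the sketch treats both cases the same way.

```python
def get_output(result, key="output"):
    """Return result[key], raising a descriptive error if it is unavailable."""
    if not isinstance(result, dict):
        raise TypeError(
            "Expected a dictionary result, got {0}".format(type(result).__name__))
    value = result.get(key)
    if value is None:
        # Assumption: a missing key and a None value both mean "not produced".
        raise KeyError(
            "No '{0}' in result; available keys: {1}".format(key, sorted(result)))
    return value
```

For example, given result = geoanalytics.summarize_within(...) called with your own arguments inside the task, get_output(result, "groupBySummary") would return the group-by summary DataFrame or raise an error that lists the keys that are present.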

In addition to the tools listed above, a project tool is provided with the geoanalytics package that allows you to project the geometry of a DataFrame into the specified spatial reference.


Tool	Syntax	Returns	Notes
Project	project(input_features, output_coord_system)	DataFrame	input_features is the DataFrame to project and output_coord_system is the WKT or WKID of the spatial reference to use. Example: geoanalytics.project(df, 2796)
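
As noted in the first table, Create Space Time Cube returns a local path in a temporary directory, and the cube is deleted unless it is copied elsewhere. A minimal sketch of preserving the file with the Python standard library is shown below. The helper name keep_cube and the destination directory are hypothetical, and the sketch assumes the script has write access to a destination that is reachable from the ArcGIS GeoAnalytics Server machine.

```python
import os
import shutil

def keep_cube(cube_path, destination_dir):
    """Copy the file at cube_path into destination_dir and return the new path."""
    # Create the destination directory if it does not exist yet.
    os.makedirs(destination_dir, exist_ok=True)
    # copy2 preserves file metadata and returns the path of the copied file.
    return shutil.copy2(cube_path, destination_dir)
```

Inside the task, cube_path would be the string returned by geoanalytics.create_space_time_cube called with your own arguments.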