- URL: https://<geoanalytics-url>/RunPythonScript
- Version Introduced: 10.7
Description
The RunPythonScript operation runs a Python script on an ArcGIS GeoAnalytics Server site. In the script, you can create an analysis pipeline by chaining together multiple GeoAnalytics Tools without writing intermediate results to a data store. You can also use other Python functionality in the script that can be distributed across the GeoAnalytics Server.
For example, suppose that each week you receive a new dataset of vehicle locations containing billions of point features. Each time you receive a new dataset, you must perform the same workflow involving multiple GeoAnalytics Tools to create an information product that you share within your organization. This workflow creates several large intermediate layers that take up a large amount of space in your data store. By scripting this workflow in Python and running the code in the RunPythonScript operation, you can avoid creating these unnecessary intermediate layers, while simplifying the steps to create the information product.
When you use RunPythonScript, the Python code runs on the GeoAnalytics Server. The script runs in the Python 3.9 environment that is installed with GeoAnalytics Server, and all console output is returned as job messages. Some Python modules can be used in the script to distribute code across multiple cores of one or more machines in the GeoAnalytics Server site using Apache Spark 3.3.0 (the compute platform that distributes analysis for GeoAnalytics Tools).
A geoanalytics module is available that allows you to run GeoAnalytics Tools in the script. This module is imported automatically when you use RunPythonScript. To learn more, see Using GeoAnalytics Tools in Run Python Script.
Note:
Some analysis tool optimizations are not available in the RunPythonScript operation.
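For example, a single GeoAnalytics Tool can be invoked through the geoanalytics object and its result inspected. The following is a minimal sketch; the snake_case tool name and parameters follow the pattern described in Using GeoAnalytics Tools in Run Python Script, but the specific tool and argument values are illustrative assumptions:
```python
# Sketch: run one GeoAnalytics Tool against the first input layer.
# The geoanalytics object and the layers list are provided automatically
# by RunPythonScript; the tool name and parameters below are illustrative.
result = geoanalytics.aggregate_points(
    point_layer=layers[0],      # first layer passed in inputLayers
    bin_type="Square",
    bin_size=1,
    bin_size_unit="Kilometers"
)

# result is a Spark DataFrame; printed output is returned as a job message.
print(result.count())
```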
To interact directly with Spark in the RunPythonScript operation, use the pyspark module, which is imported automatically when you run the task. The pyspark module is the Python API for Spark and provides a collection of distributed analysis tools for data management, clustering, regression, and more that can be called in RunPythonScript and run across GeoAnalytics Server.
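For instance, standard pyspark DataFrame operations can run directly on an input layer. In this sketch, Month is a hypothetical attribute field on the input layer:
```python
# Sketch: use pyspark DataFrame operations on an input layer.
# layers[0] is a Spark DataFrame; "Month" is a hypothetical column.
monthly_counts = (
    layers[0]
    .where("Month = 'September'")   # filter rows with a SQL expression
    .groupBy("Month")               # group by the hypothetical column
    .count()                        # count features per group
)
monthly_counts.show()               # printed output is returned as job messages
```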
For examples demonstrating how to use the geoanalytics and pyspark packages, see Examples: Scripting custom analysis with the Run Python Script task.
When using the geoanalytics and pyspark packages, most functions return analysis results in memory as Spark DataFrames. Spark DataFrames can be written to a data store or used further in the script. This allows you to chain together multiple geoanalytics and pyspark tools while writing out only the final result to a data store, eliminating the need to create intermediate result layers. To learn more, see Reading and writing layers in pyspark.
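A minimal sketch of this pattern is shown below. The "webgis" writer follows the pattern described in Reading and writing layers in pyspark; the tool call, field names, and output layer name are illustrative assumptions:
```python
# Sketch: chain analysis in memory and write only the final result.
# The tool call and field names are illustrative; the "webgis" writer
# is described in Reading and writing layers in pyspark.
tracks = geoanalytics.reconstruct_tracks(
    input_layer=layers[0],
    track_fields="VehicleID"   # hypothetical track identifier field
)

# Keep only long tracks, then persist the final DataFrame to a data store.
long_tracks = tracks.where("COUNT > 100")
long_tracks.write.format("webgis").save("long_vehicle_tracks")
```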
For advanced users, an instance of SparkContext is instantiated automatically as sc and can be used in the script to interact with Spark. This allows custom distributed analysis across GeoAnalytics Server.
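As a simple sketch, the following uses sc to distribute a computation with core Spark (RDD) operations:
```python
# Sketch: use the automatically created SparkContext (sc) directly.
# Square the numbers 0-999 in parallel and sum the results on the driver.
rdd = sc.parallelize(range(1000))
total = rdd.map(lambda x: x * x).reduce(lambda a, b: a + b)
print(total)  # printed as a job message
```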
It is recommended that you use an integrated development environment (IDE) to write the Python script and copy the script text into the RunPythonScript tool. This way, you can identify syntax and typographical errors before running the script. It is also recommended that you first run the script using a small subset of the input data to verify that there are no logic errors or exceptions. You can use the DescribeDataset task to create a sample layer for this purpose.
Note:
To run the RunPythonScript operation, you must have the administrative privilege to publish web tools.
When ArcGIS GeoAnalytics Server is installed on Linux, additional configuration steps are required before using the RunPythonScript operation. These steps are not required in Windows environments. To use RunPythonScript on Linux, install and configure Python 3.7+ for Linux on each machine in the GeoAnalytics Server site, ensuring that Python is installed in the same directory on each machine. Then, update the ArcGIS Server Properties on the GeoAnalytics Server site with the pysparkPython property. The value of this property should be the path to the Python executable on the GeoAnalytics Server machines, for example, {"pysparkPython":"/usr/bin/python"}.
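As a sketch of the final step, server properties can be updated through the ArcGIS Server Administrator API. The admin URL and token below are placeholders, and the properties update endpoint shown is an assumption based on the standard Administrator API:
```python
import json
import requests

# Sketch: set the pysparkPython server property via the ArcGIS Server
# Administrator API. The admin URL and token are placeholders.
admin_url = "https://gaserver.domain.com:6443/arcgis/admin/system/properties/update"

payload = {
    "properties": json.dumps({"pysparkPython": "/usr/bin/python"}),
    "f": "json",
    "token": "<admin token>"  # placeholder; admin authentication is required
}

print(requests.post(admin_url, data=payload).json())
```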
Request parameters
Parameter | Details |
---|---|
pythonScript (Required) | The Python script that will run on GeoAnalytics Server. This must be the full script as a string. The layers provided in inputLayers can be accessed in the script using the layers object. To learn more, see Reading and writing layers in pyspark. GeoAnalytics Tools can be accessed with the geoanalytics object, which is instantiated in the script environment automatically. To learn more, see Using GeoAnalytics Tools in Run Python Script. For a collection of example scripts, see Examples: Scripting custom analysis with the Run Python Script task. |
inputLayers (Required) | A list of input layers that will be used in the Python script. Each input layer follows the same formatting as described in the Feature input topic. The layers provided can be accessed in the script using the layers object. For example, when two layers are used in the analysis, the layer at index 0 can be filtered to only use features where OID > 2. |
userParameters (Optional) | A JSON object that will be automatically loaded into the script environment as a local variable named user_variables. |
context (Optional) | Note: This parameter is not used by the RunPythonScript tool. To control the output data store, use the "dataStore" option when writing Spark DataFrames. To set the processing or output spatial reference, use the project tool in the geoanalytics package. To filter a layer when converting it to a Spark DataFrame, use the "where" or "fields" option when loading the layer's URL. To limit the extent of a layer when converting it to a Spark DataFrame, use the "extent" option when loading the layer's URL. See the sketch after this table. |
f | The response format. The default response format is html. Values: html \| json |
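As a sketch of the alternatives noted for the context parameter, the following assumes the "webgis" reader and writer described in Reading and writing layers in pyspark; the option names come from the note above, while the URL, field list, extent format, and data store value are illustrative:
```python
from pyspark.sql import SparkSession

# Sketch: options when loading a layer's URL and writing results.
# Runs inside RunPythonScript; option names follow the context note above,
# while the URL, fields, extent format, and data store value are illustrative.
spark = SparkSession.builder.getOrCreate()

df = (
    spark.read.format("webgis")
    .option("where", "Month = 'September'")           # attribute filter
    .option("fields", "OBJECTID, Month, WindSpeed")   # subset of fields (hypothetical names)
    .option("extent", "-180, -90, 180, 90")           # limit the spatial extent
    .load("https://myportal.domain.com/server/rest/services/Hosted/hurricaneTrack/FeatureServer/0")
)

# Choose the output data store when writing (value is illustrative).
df.write.format("webgis").option("dataStore", "spatiotemporal").save("filtered_tracks")
```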
Example usage
The following is a sample request URL for RunPythonScript:
https://hostname.domain.com/webadaptor/rest/services/System/GeoAnalyticsTools/GPServer/RunPythonScript/submitJob?pythonScript=print("Hello world!")&inputLayers=[{"url":"https://myportal.domain.com/server/rest/services/Hosted/hurricaneTrack/FeatureServer/0", "filter":"Month = 'September'"}]
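A job can also be submitted programmatically. The following is a minimal sketch using the Python requests library; the URLs and token are placeholders:
```python
import json
import requests

# Sketch: submit a RunPythonScript job over REST. URLs and token are placeholders.
submit_url = ("https://hostname.domain.com/webadaptor/rest/services/System/"
              "GeoAnalyticsTools/GPServer/RunPythonScript/submitJob")

params = {
    "pythonScript": 'print("Hello world!")',
    "inputLayers": json.dumps([{
        "url": "https://myportal.domain.com/server/rest/services/Hosted/hurricaneTrack/FeatureServer/0",
        "filter": "Month = 'September'"
    }]),
    "f": "json",
    "token": "<your token>"  # placeholder; authentication is required
}

job = requests.post(submit_url, data=params).json()
print(job["jobId"], job["jobStatus"])
```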
Response
When you submit a request, the service assigns a unique job ID for the transaction.
{
"jobId": "<unique job identifier>",
"jobStatus": "<job status>"
}
After the initial request is submitted, you can use jobId to periodically check the status of the job and messages as described in Check job status. Once the job has successfully completed, use jobId to retrieve the results. To track the status, you can make a request of the following form:
https://<analysis url>/RunPythonScript/jobs/<jobId>
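For example, the job can be polled until it completes. This sketch assumes the same requests-based approach as the submission example above:
```python
import time
import requests

# Sketch: poll the job until it reaches a terminal status.
# The job URL and token are placeholders.
job_url = ("https://hostname.domain.com/webadaptor/rest/services/System/"
           "GeoAnalyticsTools/GPServer/RunPythonScript/jobs/<jobId>")

while True:
    status = requests.get(job_url, params={"f": "json", "token": "<your token>"}).json()
    if status["jobStatus"] in ("esriJobSucceeded", "esriJobFailed", "esriJobCancelled"):
        break
    time.sleep(10)  # wait before checking again

# Job messages include any Python console output from the script.
for message in status.get("messages", []):
    print(message["description"])
```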
Any Python console output will be returned as an informative job message. In the following example, "Hello World!" is printed to the console using pythonScript and a job message containing the print statement is returned as shown:
{
"type": "esriJobMessageTypeInformative",
"description": "{\"messageCode\":\"BD_101138\",\"message\":\"[Python] Hello World!\",\"params\":{\"text\":\"Hello World!\"}}"
}
Access results
All results written to ArcGIS Enterprise are available in your portal contents.