synapse.ml.services.geospatial package

Submodules

synapse.ml.services.geospatial.AddressGeocoder module

class synapse.ml.services.geospatial.AddressGeocoder.AddressGeocoder(java_obj=None, AADToken=None, AADTokenCol=None, address=None, addressCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AddressGeocoder_57bf0c740037_error', initialPollingDelay=300, maxPollingRetries=1000, outputCol='AddressGeocoder_57bf0c740037_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url='https://atlas.microsoft.com/search/address/batch/json')[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • AADToken (object) – AAD Token used for authentication

  • address (object) – the address to geocode

  • backoffs (list) – array of backoffs to use in the handler

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • errorCol (str) – column to hold http errors

  • initialPollingDelay (int) – number of milliseconds to wait before first poll for result

  • maxPollingRetries (int) – number of times to poll

  • outputCol (str) – The name of the output column

  • pollingDelay (int) – number of milliseconds to wait between polling

  • subscriptionKey (object) – the API key to use

  • suppressMaxRetriesException (bool) – set to true to suppress the maximum retries exception and report it in the error column

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service
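
A minimal usage sketch, assuming SynapseML is installed and a Spark session is active. It only uses keyword arguments that appear in the constructor signature above. The helper names `geocoder_kwargs` and `make_geocoder`, the column names, and the placeholder key are illustrative assumptions, not part of the library.

```python
def geocoder_kwargs(subscription_key, address_col="address"):
    """Collect constructor keyword arguments for AddressGeocoder.

    Only names from the documented signature are used; the column
    names are hypothetical.
    """
    return {
        "subscriptionKey": subscription_key,  # API key (placeholder value below)
        "addressCol": address_col,            # input column holding the addresses
        "outputCol": "geocoded",              # hypothetical output column name
        "errorCol": "geocode_error",          # hypothetical error column name
    }


def make_geocoder(**kwargs):
    """Build the transformer; needs synapseml and a running Spark session."""
    # Imported lazily so geocoder_kwargs works without SynapseML installed.
    from synapse.ml.services.geospatial.AddressGeocoder import AddressGeocoder
    return AddressGeocoder(**kwargs)


kwargs = geocoder_kwargs("<your-subscription-key>")
print(sorted(kwargs))
```

Because the class derives from JavaTransformer, the object returned by `make_geocoder(**kwargs)` would then be applied to a DataFrame with `transform`. The same settings can also be applied through the documented setters such as `setAddressCol` and `setOutputCol`.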

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
address = Param(parent='undefined', name='address', doc='ServiceParam: the address to geocode')
backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns:

AAD Token used for authentication

Return type:

AADToken

getAddress()[source]
Returns:

the address to geocode

Return type:

address

getBackoffs()[source]
Returns:

array of backoffs to use in the handler

Return type:

backoffs

getConcurrency()[source]
Returns:

max number of concurrent calls

Return type:

concurrency

getConcurrentTimeout()[source]
Returns:

max number of seconds to wait on futures if concurrency >= 1

Return type:

concurrentTimeout

getErrorCol()[source]
Returns:

column to hold http errors

Return type:

errorCol

getInitialPollingDelay()[source]
Returns:

number of milliseconds to wait before first poll for result

Return type:

initialPollingDelay

static getJavaPackage()[source]

Returns package name String.

getMaxPollingRetries()[source]
Returns:

number of times to poll

Return type:

maxPollingRetries

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getPollingDelay()[source]
Returns:

number of milliseconds to wait between polling

Return type:

pollingDelay

getSubscriptionKey()[source]
Returns:

the API key to use

Return type:

subscriptionKey

getSuppressMaxRetriesException()[source]
Returns:

set to true to suppress the maximum retries exception and report it in the error column

Return type:

suppressMaxRetriesException

getTimeout()[source]
Returns:

number of seconds to wait before closing the connection

Return type:

timeout

getUrl()[source]
Returns:

Url of the service

Return type:

url

initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters:

AADToken – name of the column holding the AAD Token used for authentication

setAddress(value)[source]
Parameters:

address – the address to geocode

setAddressCol(value)[source]
Parameters:

address – name of the column holding the address to geocode

setBackoffs(value)[source]
Parameters:

backoffs – array of backoffs to use in the handler
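
The sketch below illustrates how a backoff list such as the default `[100, 500, 1000]` is commonly consumed by a retry handler. It is a generic Python illustration, not SynapseML's handler: the function name `call_with_backoffs`, the injected `sleep` argument, and the reading of the values as milliseconds are all assumptions.

```python
import time


def call_with_backoffs(request, backoffs_ms=(100, 500, 1000), sleep=time.sleep):
    """Try `request` once, then once more after each backoff delay."""
    last_error = None
    for delay_ms in (0, *backoffs_ms):
        if delay_ms:
            sleep(delay_ms / 1000.0)  # assumed unit: milliseconds
        try:
            return request()
        except Exception as err:  # a real handler would retry only transient errors
            last_error = err
    raise last_error


# Usage: a request that fails twice and then succeeds.
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


slept = []  # record the delays instead of actually sleeping
result = call_with_backoffs(flaky, sleep=slept.append)
print(result, slept)  # ok [0.1, 0.5]
```

With three backoff values the request is attempted at most four times; the last error is re-raised if every attempt fails.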

setConcurrency(value)[source]
Parameters:

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters:

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters:

errorCol – column to hold http errors

setInitialPollingDelay(value)[source]
Parameters:

initialPollingDelay – number of milliseconds to wait before first poll for result

setMaxPollingRetries(value)[source]
Parameters:

maxPollingRetries – number of times to poll

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, address=None, addressCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AddressGeocoder_57bf0c740037_error', initialPollingDelay=300, maxPollingRetries=1000, outputCol='AddressGeocoder_57bf0c740037_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url='https://atlas.microsoft.com/search/address/batch/json')[source]

Set the (keyword only) parameters

setPollingDelay(value)[source]
Parameters:

pollingDelay – number of milliseconds to wait between polling
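
As a rough worked example, the three polling parameters together bound how long a single request may spend waiting for a result. The estimate below assumes one `pollingDelay` per retry after the initial delay and ignores the time taken by the HTTP calls themselves; the function name is hypothetical.

```python
def max_polling_wait_seconds(initial_polling_delay_ms=300,
                             polling_delay_ms=300,
                             max_polling_retries=1000):
    """Upper bound on time spent sleeping between polls, in seconds."""
    total_ms = initial_polling_delay_ms + polling_delay_ms * max_polling_retries
    return total_ms / 1000.0


# With the documented defaults: 300 ms + 1000 * 300 ms
print(max_polling_wait_seconds())  # 300.3
```

Under these assumptions the defaults allow roughly five minutes of polling per request, so lowering `maxPollingRetries` is the most direct way to shorten the worst case.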

setSubscriptionKey(value)[source]
Parameters:

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters:

subscriptionKey – name of the column holding the API key to use

setSuppressMaxRetriesException(value)[source]
Parameters:

suppressMaxRetriesException – set to true to suppress the maximum retries exception and report it in the error column

setTimeout(value)[source]
Parameters:

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters:

url – Url of the service

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.services.geospatial.CheckPointInPolygon module

class synapse.ml.services.geospatial.CheckPointInPolygon.CheckPointInPolygon(java_obj=None, AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='CheckPointInPolygon_ce370b2f1349_error', handler=None, latitude=None, latitudeCol=None, longitude=None, longitudeCol=None, outputCol='CheckPointInPolygon_ce370b2f1349_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url='https://atlas.microsoft.com/', userDataIdentifier=None, userDataIdentifierCol=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • AADToken (object) – AAD Token used for authentication

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • errorCol (str) – column to hold http errors

  • handler (object) – Which strategy to use when handling requests

  • latitude (object) – the latitude of location

  • longitude (object) – the longitude of location

  • outputCol (str) – The name of the output column

  • subscriptionKey (object) – the API key to use

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service

  • userDataIdentifier (object) – the identifier for the user uploaded data
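
Since each request carries a latitude and a longitude, it can help to reject out-of-range coordinates before any call is made. The helper below is a hypothetical client-side check, not part of SynapseML; it relies only on the standard geographic ranges (latitude within ±90 degrees, longitude within ±180 degrees).

```python
def validate_point(latitude, longitude):
    """Return (lat, lon) as floats, or raise ValueError if out of range."""
    lat, lon = float(latitude), float(longitude)
    # NaN fails both comparisons, so it is rejected as well.
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return lat, lon


print(validate_point("47.6", -122.3))  # (47.6, -122.3)
```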

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns:

AAD Token used for authentication

Return type:

AADToken

getConcurrency()[source]
Returns:

max number of concurrent calls

Return type:

concurrency

getConcurrentTimeout()[source]
Returns:

max number of seconds to wait on futures if concurrency >= 1

Return type:

concurrentTimeout

getErrorCol()[source]
Returns:

column to hold http errors

Return type:

errorCol

getHandler()[source]
Returns:

Which strategy to use when handling requests

Return type:

handler

static getJavaPackage()[source]

Returns package name String.

getLatitude()[source]
Returns:

the latitude of location

Return type:

latitude

getLongitude()[source]
Returns:

the longitude of location

Return type:

longitude

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getSubscriptionKey()[source]
Returns:

the API key to use

Return type:

subscriptionKey

getTimeout()[source]
Returns:

number of seconds to wait before closing the connection

Return type:

timeout

getUrl()[source]
Returns:

Url of the service

Return type:

url

getUserDataIdentifier()[source]
Returns:

the identifier for the user uploaded data

Return type:

userDataIdentifier

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
latitude = Param(parent='undefined', name='latitude', doc='ServiceParam: the latitude of location')
longitude = Param(parent='undefined', name='longitude', doc='ServiceParam: the longitude of location')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters:

AADToken – name of the column holding the AAD Token used for authentication

setConcurrency(value)[source]
Parameters:

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters:

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
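
The relationship between `concurrency` and `concurrentTimeout` can be pictured with Python's standard `concurrent.futures`. This is only an analogy under stated assumptions: the transformer itself runs on the JVM, and the function name `run_concurrently` is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(calls, concurrency=1, concurrent_timeout=None):
    """Run zero-argument callables with at most `concurrency` in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(call) for call in calls]
        # timeout=None blocks until each call finishes; a number of seconds
        # makes result() raise if the future is not done in time.
        return [f.result(timeout=concurrent_timeout) for f in futures]


results = run_concurrently([lambda i=i: i * i for i in range(4)], concurrency=2)
print(results)  # [0, 1, 4, 9]
```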

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters:

errorCol – column to hold http errors

setGeography(value)[source]
setHandler(value)[source]
Parameters:

handler – Which strategy to use when handling requests

setLatitude(value)[source]
Parameters:

latitude – the latitude of location

setLatitudeCol(value)[source]
Parameters:

latitude – name of the column holding the latitude of location

setLongitude(value)[source]
Parameters:

longitude – the longitude of location

setLongitudeCol(value)[source]
Parameters:

longitude – name of the column holding the longitude of location

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='CheckPointInPolygon_ce370b2f1349_error', handler=None, latitude=None, latitudeCol=None, longitude=None, longitudeCol=None, outputCol='CheckPointInPolygon_ce370b2f1349_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url='https://atlas.microsoft.com/', userDataIdentifier=None, userDataIdentifierCol=None)[source]

Set the (keyword only) parameters

setSubscriptionKey(value)[source]
Parameters:

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters:

subscriptionKey – name of the column holding the API key to use

setTimeout(value)[source]
Parameters:

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters:

url – Url of the service

setUserDataIdentifier(value)[source]
Parameters:

userDataIdentifier – the identifier for the user uploaded data

setUserDataIdentifierCol(value)[source]
Parameters:

userDataIdentifier – name of the column holding the identifier for the user uploaded data

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')
userDataIdentifier = Param(parent='undefined', name='userDataIdentifier', doc='ServiceParam: the identifier for the user uploaded data')

synapse.ml.services.geospatial.ReverseAddressGeocoder module

class synapse.ml.services.geospatial.ReverseAddressGeocoder.ReverseAddressGeocoder(java_obj=None, AADToken=None, AADTokenCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='ReverseAddressGeocoder_8a8645c4203f_error', initialPollingDelay=300, latitude=None, latitudeCol=None, longitude=None, longitudeCol=None, maxPollingRetries=1000, outputCol='ReverseAddressGeocoder_8a8645c4203f_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url='https://atlas.microsoft.com/search/address/reverse/batch/json')[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • AADToken (object) – AAD Token used for authentication

  • backoffs (list) – array of backoffs to use in the handler

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • errorCol (str) – column to hold http errors

  • initialPollingDelay (int) – number of milliseconds to wait before first poll for result

  • latitude (object) – the latitude of location

  • longitude (object) – the longitude of location

  • maxPollingRetries (int) – number of times to poll

  • outputCol (str) – The name of the output column

  • pollingDelay (int) – number of milliseconds to wait between polling

  • subscriptionKey (object) – the API key to use

  • suppressMaxRetriesException (bool) – set to true to suppress the maximum retries exception and report it in the error column

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service
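
The constructor accepts each coordinate either as a value (`latitude`, `longitude`) or as a column name (`latitudeCol`, `longitudeCol`). The sketch below builds keyword arguments while insisting that exactly one form is given per coordinate. That strictness is a choice made by this hypothetical helper to avoid ambiguity, not documented library behavior.

```python
def reverse_geocoder_kwargs(subscription_key,
                            latitude=None, latitude_col=None,
                            longitude=None, longitude_col=None):
    """Build ReverseAddressGeocoder keyword arguments from documented names."""

    def pick(name, value, col):
        # Exactly one of the value form and the column form must be supplied.
        if (value is None) == (col is None):
            raise ValueError(f"set exactly one of {name} or {name}Col")
        return {name: value} if col is None else {f"{name}Col": col}

    kwargs = {"subscriptionKey": subscription_key}
    kwargs.update(pick("latitude", latitude, latitude_col))
    kwargs.update(pick("longitude", longitude, longitude_col))
    return kwargs


print(reverse_geocoder_kwargs("<key>", latitude_col="lat", longitude_col="lon"))
# {'subscriptionKey': '<key>', 'latitudeCol': 'lat', 'longitudeCol': 'lon'}
```

The resulting dictionary could be passed as `ReverseAddressGeocoder(**kwargs)` once SynapseML is installed and a Spark session is running.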

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns:

AAD Token used for authentication

Return type:

AADToken

getBackoffs()[source]
Returns:

array of backoffs to use in the handler

Return type:

backoffs

getConcurrency()[source]
Returns:

max number of concurrent calls

Return type:

concurrency

getConcurrentTimeout()[source]
Returns:

max number of seconds to wait on futures if concurrency >= 1

Return type:

concurrentTimeout

getErrorCol()[source]
Returns:

column to hold http errors

Return type:

errorCol

getInitialPollingDelay()[source]
Returns:

number of milliseconds to wait before first poll for result

Return type:

initialPollingDelay

static getJavaPackage()[source]

Returns package name String.

getLatitude()[source]
Returns:

the latitude of location

Return type:

latitude

getLongitude()[source]
Returns:

the longitude of location

Return type:

longitude

getMaxPollingRetries()[source]
Returns:

number of times to poll

Return type:

maxPollingRetries

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getPollingDelay()[source]
Returns:

number of milliseconds to wait between polling

Return type:

pollingDelay

getSubscriptionKey()[source]
Returns:

the API key to use

Return type:

subscriptionKey

getSuppressMaxRetriesException()[source]
Returns:

set to true to suppress the maximum retries exception and report it in the error column

Return type:

suppressMaxRetriesException

getTimeout()[source]
Returns:

number of seconds to wait before closing the connection

Return type:

timeout

getUrl()[source]
Returns:

Url of the service

Return type:

url

initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
latitude = Param(parent='undefined', name='latitude', doc='ServiceParam: the latitude of location')
longitude = Param(parent='undefined', name='longitude', doc='ServiceParam: the longitude of location')
maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters:

AADToken – name of the column holding the AAD Token used for authentication

setBackoffs(value)[source]
Parameters:

backoffs – array of backoffs to use in the handler

setConcurrency(value)[source]
Parameters:

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters:

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters:

errorCol – column to hold http errors
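
Because HTTP errors are written to their own column rather than raised, downstream code usually separates successful rows from failed ones. The sketch below uses plain dictionaries as stand-ins for rows and assumes the error column is empty (`None`) when a call succeeds; the function and column names are hypothetical.

```python
def split_by_error(rows, error_col="error"):
    """Partition rows into (succeeded, failed) by the error column."""
    ok, failed = [], []
    for row in rows:
        (failed if row.get(error_col) is not None else ok).append(row)
    return ok, failed


rows = [
    {"id": 1, "output": {"address": "..."}, "error": None},
    {"id": 2, "output": None, "error": {"statusCode": 429}},
]
ok, failed = split_by_error(rows)
print([r["id"] for r in ok], [r["id"] for r in failed])  # [1] [2]
```

On a Spark DataFrame the equivalent split would be a pair of filters on the error column, for example with `Column.isNull()` and `Column.isNotNull()`.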

setInitialPollingDelay(value)[source]
Parameters:

initialPollingDelay – number of milliseconds to wait before first poll for result

setLatitude(value)[source]
Parameters:

latitude – the latitude of location

setLatitudeCol(value)[source]
Parameters:

latitude – name of the column holding the latitude of location

setLongitude(value)[source]
Parameters:

longitude – the longitude of location

setLongitudeCol(value)[source]
Parameters:

longitude – name of the column holding the longitude of location

setMaxPollingRetries(value)[source]
Parameters:

maxPollingRetries – number of times to poll

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='ReverseAddressGeocoder_8a8645c4203f_error', initialPollingDelay=300, latitude=None, latitudeCol=None, longitude=None, longitudeCol=None, maxPollingRetries=1000, outputCol='ReverseAddressGeocoder_8a8645c4203f_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url='https://atlas.microsoft.com/search/address/reverse/batch/json')[source]

Set the (keyword only) parameters

setPollingDelay(value)[source]
Parameters:

pollingDelay – number of milliseconds to wait between polling

setSubscriptionKey(value)[source]
Parameters:

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters:

subscriptionKey – name of the column holding the API key to use

setSuppressMaxRetriesException(value)[source]
Parameters:

suppressMaxRetriesException – set to true to suppress the maximum retries exception and report it in the error column
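
The effect of this flag can be sketched as a polling loop that either raises when the retry budget is exhausted or hands the failure back as data. This is an illustrative model only: the function name, the `(result, error)` return shape, and the use of `TimeoutError` are assumptions, and the delays between polls are omitted.

```python
def poll_until_done(check, max_polling_retries=1000,
                    suppress_max_retries_exception=False):
    """Call `check()` until it returns a non-None result or retries run out.

    Returns a (result, error) pair.
    """
    for _ in range(max_polling_retries):
        result = check()
        if result is not None:
            return result, None
    message = f"no result after {max_polling_retries} polls"
    if suppress_max_retries_exception:
        return None, message  # surfaced as data, like an error-column value
    raise TimeoutError(message)


never_ready = lambda: None
print(poll_until_done(never_ready, max_polling_retries=3,
                      suppress_max_retries_exception=True))
# (None, 'no result after 3 polls')
```

Suppressing the exception lets one slow request fail on its own row instead of aborting the whole job, at the cost of having to inspect the error column afterwards.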

setTimeout(value)[source]
Parameters:

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters:

url – Url of the service

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

Module contents

SynapseML is an ecosystem of tools that extends the distributed computing framework Apache Spark in several new directions. It adds many deep learning and data science tools to the Spark ecosystem, including integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM, and OpenCV. These tools enable powerful, highly scalable predictive and analytical models for a variety of data sources.

SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond-latency web services backed by your Spark cluster.

SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.