synapse.ml.services.openai package
Submodules
synapse.ml.services.openai.OpenAIChatCompletion module
- class synapse.ml.services.openai.OpenAIChatCompletion.OpenAIChatCompletion(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, apiVersion=None, apiVersionCol=None, bestOf=None, bestOfCol=None, cacheLevel=None, cacheLevelCol=None, concurrency=1, concurrentTimeout=None, customHeaders=None, customHeadersCol=None, customUrlRoot=None, deploymentName=None, deploymentNameCol=None, echo=None, echoCol=None, errorCol='OpenAIChatCompletion_d97fa46800a8_error', frequencyPenalty=None, frequencyPenaltyCol=None, handler=None, logProbs=None, logProbsCol=None, maxTokens=None, maxTokensCol=None, messagesCol=None, n=None, nCol=None, outputCol='OpenAIChatCompletion_d97fa46800a8_output', presencePenalty=None, presencePenaltyCol=None, responseFormat=None, responseFormatCol=None, stop=None, stopCol=None, subscriptionKey=None, subscriptionKeyCol=None, temperature=None, temperatureCol=None, timeout=360.0, topP=None, topPCol=None, url=None, user=None, userCol=None)[source]
Bases:
ComplexParamsMixin
,JavaMLReadable
,JavaMLWritable
,JavaTransformer
- Parameters:
CustomAuthHeader¶ (object) – A Custom Value for Authorization Header
bestOf¶ (object) – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
cacheLevel¶ (object) – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
concurrentTimeout¶ (float) – max number seconds to wait on futures if concurrency >= 1
customHeaders¶ (object) – Map of Custom Header Key-Value Tuples.
customUrlRoot¶ (str) – The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.
echo¶ (object) – Echo back the prompt in addition to the completion
frequencyPenalty¶ (object) – How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.
handler¶ (object) – Which strategy to use when handling requests
logProbs¶ (object) – Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
maxTokens¶ (object) – The maximum number of tokens to generate. Has minimum of 0.
messagesCol¶ (str) – The column messages to generate chat completions for, in the chat format. This column should have type Array(Struct(role: String, content: String)).
n¶ (object) – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
presencePenalty¶ (object) – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
responseFormat¶ (object) – Response format for the completion. Can be ‘json_object’ or ‘text’.
stop¶ (object) – A sequence which indicates the end of the current document.
temperature¶ (object) – What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
timeout¶ (float) – number of seconds to wait before closing the connection
topP¶ (object) – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
user¶ (object) – The ID of the end-user, for use in tracking and rate-limiting.
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- apiVersion = Param(parent='undefined', name='apiVersion', doc='ServiceParam: version of the api')
- bestOf = Param(parent='undefined', name='bestOf', doc='ServiceParam: How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.')
- cacheLevel = Param(parent='undefined', name='cacheLevel', doc='ServiceParam: can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customHeaders = Param(parent='undefined', name='customHeaders', doc='ServiceParam: Map of Custom Header Key-Value Tuples.')
- customUrlRoot = Param(parent='undefined', name='customUrlRoot', doc='The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.')
- deploymentName = Param(parent='undefined', name='deploymentName', doc='ServiceParam: The name of the deployment')
- echo = Param(parent='undefined', name='echo', doc='ServiceParam: Echo back the prompt in addition to the completion')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- frequencyPenalty = Param(parent='undefined', name='frequencyPenalty', doc='ServiceParam: How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.')
- getBestOf()[source]
- Returns:
How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- Return type:
bestOf
- getCacheLevel()[source]
- Returns:
can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- Return type:
cacheLevel
- getConcurrentTimeout()[source]
- Returns:
max number seconds to wait on futures if concurrency >= 1
- Return type:
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns:
A Custom Value for Authorization Header
- Return type:
CustomAuthHeader
- getCustomHeaders()[source]
- Returns:
Map of Custom Header Key-Value Tuples.
- Return type:
customHeaders
- getCustomUrlRoot()[source]
- Returns:
The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.
- Return type:
customUrlRoot
- getFrequencyPenalty()[source]
- Returns:
How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.
- Return type:
frequencyPenalty
- getLogProbs()[source]
- Returns:
Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- Return type:
logProbs
- getMaxTokens()[source]
- Returns:
The maximum number of tokens to generate. Has minimum of 0.
- Return type:
maxTokens
- getMessagesCol()[source]
- Returns:
The column messages to generate chat completions for, in the chat format. This column should have type Array(Struct(role: String, content: String)).
- Return type:
messagesCol
- getN()[source]
- Returns:
How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- Return type:
n
- getPresencePenalty()[source]
- Returns:
How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
- Return type:
presencePenalty
- getResponseFormat()[source]
- Returns:
Response format for the completion. Can be ‘json_object’ or ‘text’.
- Return type:
responseFormat
- getStop()[source]
- Returns:
A sequence which indicates the end of the current document.
- Return type:
stop
- getTemperature()[source]
- Returns:
What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- Return type:
temperature
- getTimeout()[source]
- Returns:
number of seconds to wait before closing the connection
- Return type:
timeout
- getTopP()[source]
- Returns:
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- Return type:
topP
- getUser()[source]
- Returns:
The ID of the end-user, for use in tracking and rate-limiting.
- Return type:
user
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- logProbs = Param(parent='undefined', name='logProbs', doc='ServiceParam: Include the log probabilities on the `logprobs` most likely tokens, as well the chosen tokens. So for example, if `logprobs` is 10, the API will return a list of the 10 most likely tokens. If `logprobs` is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.')
- maxTokens = Param(parent='undefined', name='maxTokens', doc='ServiceParam: The maximum number of tokens to generate. Has minimum of 0.')
- messagesCol = Param(parent='undefined', name='messagesCol', doc='The column messages to generate chat completions for, in the chat format. This column should have type Array(Struct(role: String, content: String)).')
- n = Param(parent='undefined', name='n', doc='ServiceParam: How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- presencePenalty = Param(parent='undefined', name='presencePenalty', doc='ServiceParam: How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.')
- responseFormat = Param(parent='undefined', name='responseFormat', doc="ServiceParam: Response format for the completion. Can be 'json_object' or 'text'.")
- setBestOf(value)[source]
- Parameters:
bestOf¶ – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- setBestOfCol(value)[source]
- Parameters:
bestOf¶ – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- setCacheLevel(value)[source]
- Parameters:
cacheLevel¶ – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- setCacheLevelCol(value)[source]
- Parameters:
cacheLevel¶ – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- setConcurrentTimeout(value)[source]
- Parameters:
concurrentTimeout¶ – max number seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters:
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters:
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomHeaders(value)[source]
- Parameters:
customHeaders¶ – Map of Custom Header Key-Value Tuples.
- setCustomHeadersCol(value)[source]
- Parameters:
customHeaders¶ – Map of Custom Header Key-Value Tuples.
- setCustomUrlRoot(value)[source]
- Parameters:
customUrlRoot¶ – The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.
- setFrequencyPenalty(value)[source]
- Parameters:
frequencyPenalty¶ – How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.
- setFrequencyPenaltyCol(value)[source]
- Parameters:
frequencyPenalty¶ – How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.
- setLogProbs(value)[source]
- Parameters:
logProbs¶ – Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- setLogProbsCol(value)[source]
- Parameters:
logProbs¶ – Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- setMaxTokens(value)[source]
- Parameters:
maxTokens¶ – The maximum number of tokens to generate. Has minimum of 0.
- setMaxTokensCol(value)[source]
- Parameters:
maxTokens¶ – The maximum number of tokens to generate. Has minimum of 0.
- setMessagesCol(value)[source]
- Parameters:
messagesCol¶ – The column messages to generate chat completions for, in the chat format. This column should have type Array(Struct(role: String, content: String)).
- setN(value)[source]
- Parameters:
n¶ – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- setNCol(value)[source]
- Parameters:
n¶ – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, apiVersion=None, apiVersionCol=None, bestOf=None, bestOfCol=None, cacheLevel=None, cacheLevelCol=None, concurrency=1, concurrentTimeout=None, customHeaders=None, customHeadersCol=None, customUrlRoot=None, deploymentName=None, deploymentNameCol=None, echo=None, echoCol=None, errorCol='OpenAIChatCompletion_d97fa46800a8_error', frequencyPenalty=None, frequencyPenaltyCol=None, handler=None, logProbs=None, logProbsCol=None, maxTokens=None, maxTokensCol=None, messagesCol=None, n=None, nCol=None, outputCol='OpenAIChatCompletion_d97fa46800a8_output', presencePenalty=None, presencePenaltyCol=None, responseFormat=None, responseFormatCol=None, stop=None, stopCol=None, subscriptionKey=None, subscriptionKeyCol=None, temperature=None, temperatureCol=None, timeout=360.0, topP=None, topPCol=None, url=None, user=None, userCol=None)[source]
Set the (keyword only) parameters
- setPresencePenalty(value)[source]
- Parameters:
presencePenalty¶ – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
- setPresencePenaltyCol(value)[source]
- Parameters:
presencePenalty¶ – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
- setResponseFormat(value)[source]
- Parameters:
responseFormat¶ – Response format for the completion. Can be ‘json_object’ or ‘text’.
- setResponseFormatCol(value)[source]
- Parameters:
responseFormat¶ – Response format for the completion. Can be ‘json_object’ or ‘text’.
- setStop(value)[source]
- Parameters:
stop¶ – A sequence which indicates the end of the current document.
- setStopCol(value)[source]
- Parameters:
stop¶ – A sequence which indicates the end of the current document.
- setTemperature(value)[source]
- Parameters:
temperature¶ – What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- setTemperatureCol(value)[source]
- Parameters:
temperature¶ – What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- setTimeout(value)[source]
- Parameters:
timeout¶ – number of seconds to wait before closing the connection
- setTopP(value)[source]
- Parameters:
topP¶ – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- setTopPCol(value)[source]
- Parameters:
topP¶ – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- setUser(value)[source]
- Parameters:
user¶ – The ID of the end-user, for use in tracking and rate-limiting.
- setUserCol(value)[source]
- Parameters:
user¶ – The ID of the end-user, for use in tracking and rate-limiting.
- stop = Param(parent='undefined', name='stop', doc='ServiceParam: A sequence which indicates the end of the current document.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- temperature = Param(parent='undefined', name='temperature', doc='ServiceParam: What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or `top_p` but not both. Minimum of 0 and maximum of 2 allowed.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- topP = Param(parent='undefined', name='topP', doc='ServiceParam: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or `temperature` but not both. Minimum of 0 and maximum of 1 allowed.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- user = Param(parent='undefined', name='user', doc='ServiceParam: The ID of the end-user, for use in tracking and rate-limiting.')
synapse.ml.services.openai.OpenAICompletion module
- class synapse.ml.services.openai.OpenAICompletion.OpenAICompletion(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, apiVersion=None, apiVersionCol=None, batchPrompt=None, batchPromptCol=None, bestOf=None, bestOfCol=None, cacheLevel=None, cacheLevelCol=None, concurrency=1, concurrentTimeout=None, customHeaders=None, customHeadersCol=None, customUrlRoot=None, deploymentName=None, deploymentNameCol=None, echo=None, echoCol=None, errorCol='OpenAICompletion_595eda8b403c_error', frequencyPenalty=None, frequencyPenaltyCol=None, handler=None, logProbs=None, logProbsCol=None, maxTokens=None, maxTokensCol=None, n=None, nCol=None, outputCol='OpenAICompletion_595eda8b403c_output', presencePenalty=None, presencePenaltyCol=None, prompt=None, promptCol=None, stop=None, stopCol=None, subscriptionKey=None, subscriptionKeyCol=None, temperature=None, temperatureCol=None, timeout=360.0, topP=None, topPCol=None, url=None, user=None, userCol=None)[source]
Bases:
ComplexParamsMixin
,JavaMLReadable
,JavaMLWritable
,JavaTransformer
- Parameters:
CustomAuthHeader¶ (object) – A Custom Value for Authorization Header
bestOf¶ (object) – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
cacheLevel¶ (object) – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
concurrentTimeout¶ (float) – max number seconds to wait on futures if concurrency >= 1
customHeaders¶ (object) – Map of Custom Header Key-Value Tuples.
customUrlRoot¶ (str) – The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.
echo¶ (object) – Echo back the prompt in addition to the completion
frequencyPenalty¶ (object) – How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.
handler¶ (object) – Which strategy to use when handling requests
logProbs¶ (object) – Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
maxTokens¶ (object) – The maximum number of tokens to generate. Has minimum of 0.
n¶ (object) – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
presencePenalty¶ (object) – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
stop¶ (object) – A sequence which indicates the end of the current document.
temperature¶ (object) – What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
timeout¶ (float) – number of seconds to wait before closing the connection
topP¶ (object) – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
user¶ (object) – The ID of the end-user, for use in tracking and rate-limiting.
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- apiVersion = Param(parent='undefined', name='apiVersion', doc='ServiceParam: version of the api')
- batchPrompt = Param(parent='undefined', name='batchPrompt', doc='ServiceParam: Sequence of prompts to complete')
- bestOf = Param(parent='undefined', name='bestOf', doc='ServiceParam: How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.')
- cacheLevel = Param(parent='undefined', name='cacheLevel', doc='ServiceParam: can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customHeaders = Param(parent='undefined', name='customHeaders', doc='ServiceParam: Map of Custom Header Key-Value Tuples.')
- customUrlRoot = Param(parent='undefined', name='customUrlRoot', doc='The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.')
- deploymentName = Param(parent='undefined', name='deploymentName', doc='ServiceParam: The name of the deployment')
- echo = Param(parent='undefined', name='echo', doc='ServiceParam: Echo back the prompt in addition to the completion')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- frequencyPenalty = Param(parent='undefined', name='frequencyPenalty', doc='ServiceParam: How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.')
- getBestOf()[source]
- Returns:
How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- Return type:
bestOf
- getCacheLevel()[source]
- Returns:
can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- Return type:
cacheLevel
- getConcurrentTimeout()[source]
- Returns:
max number seconds to wait on futures if concurrency >= 1
- Return type:
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns:
A Custom Value for Authorization Header
- Return type:
CustomAuthHeader
- getCustomHeaders()[source]
- Returns:
Map of Custom Header Key-Value Tuples.
- Return type:
customHeaders
- getCustomUrlRoot()[source]
- Returns:
The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.
- Return type:
customUrlRoot
- getFrequencyPenalty()[source]
- Returns:
How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.
- Return type:
frequencyPenalty
- getLogProbs()[source]
- Returns:
Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- Return type:
logProbs
- getMaxTokens()[source]
- Returns:
The maximum number of tokens to generate. Has minimum of 0.
- Return type:
maxTokens
- getN()[source]
- Returns:
How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- Return type:
n
- getPresencePenalty()[source]
- Returns:
How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
- Return type:
presencePenalty
- getStop()[source]
- Returns:
A sequence which indicates the end of the current document.
- Return type:
stop
- getTemperature()[source]
- Returns:
What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- Return type:
temperature
- getTimeout()[source]
- Returns:
number of seconds to wait before closing the connection
- Return type:
timeout
- getTopP()[source]
- Returns:
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- Return type:
topP
- getUser()[source]
- Returns:
The ID of the end-user, for use in tracking and rate-limiting.
- Return type:
user
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- logProbs = Param(parent='undefined', name='logProbs', doc='ServiceParam: Include the log probabilities on the `logprobs` most likely tokens, as well the chosen tokens. So for example, if `logprobs` is 10, the API will return a list of the 10 most likely tokens. If `logprobs` is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.')
- maxTokens = Param(parent='undefined', name='maxTokens', doc='ServiceParam: The maximum number of tokens to generate. Has minimum of 0.')
- n = Param(parent='undefined', name='n', doc='ServiceParam: How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- presencePenalty = Param(parent='undefined', name='presencePenalty', doc='ServiceParam: How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.')
- prompt = Param(parent='undefined', name='prompt', doc='ServiceParam: The text to complete')
- setBestOf(value)[source]
- Parameters:
bestOf¶ – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- setBestOfCol(value)[source]
- Parameters:
bestOf¶ – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- setCacheLevel(value)[source]
- Parameters:
cacheLevel¶ – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- setCacheLevelCol(value)[source]
- Parameters:
cacheLevel¶ – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- setConcurrentTimeout(value)[source]
- Parameters:
concurrentTimeout¶ – max number seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters:
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters:
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomHeaders(value)[source]
- Parameters:
customHeaders¶ – Map of Custom Header Key-Value Tuples.
- setCustomHeadersCol(value)[source]
- Parameters:
customHeaders¶ – Map of Custom Header Key-Value Tuples.
- setCustomUrlRoot(value)[source]
- Parameters:
customUrlRoot¶ – The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.
- setFrequencyPenalty(value)[source]
- Parameters:
frequencyPenalty¶ – How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.
- setFrequencyPenaltyCol(value)[source]
- Parameters:
frequencyPenalty¶ – How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.
- setLogProbs(value)[source]
- Parameters:
logProbs¶ – Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- setLogProbsCol(value)[source]
- Parameters:
logProbs¶ – Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- setMaxTokens(value)[source]
- Parameters:
maxTokens¶ – The maximum number of tokens to generate. Has minimum of 0.
- setMaxTokensCol(value)[source]
- Parameters:
maxTokens¶ – The maximum number of tokens to generate. Has minimum of 0.
- setN(value)[source]
- Parameters:
n¶ – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- setNCol(value)[source]
- Parameters:
n¶ – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, apiVersion=None, apiVersionCol=None, batchPrompt=None, batchPromptCol=None, bestOf=None, bestOfCol=None, cacheLevel=None, cacheLevelCol=None, concurrency=1, concurrentTimeout=None, customHeaders=None, customHeadersCol=None, customUrlRoot=None, deploymentName=None, deploymentNameCol=None, echo=None, echoCol=None, errorCol='OpenAICompletion_595eda8b403c_error', frequencyPenalty=None, frequencyPenaltyCol=None, handler=None, logProbs=None, logProbsCol=None, maxTokens=None, maxTokensCol=None, n=None, nCol=None, outputCol='OpenAICompletion_595eda8b403c_output', presencePenalty=None, presencePenaltyCol=None, prompt=None, promptCol=None, stop=None, stopCol=None, subscriptionKey=None, subscriptionKeyCol=None, temperature=None, temperatureCol=None, timeout=360.0, topP=None, topPCol=None, url=None, user=None, userCol=None)[source]
Set the (keyword only) parameters
- setPresencePenalty(value)[source]
- Parameters:
presencePenalty¶ – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
- setPresencePenaltyCol(value)[source]
- Parameters:
presencePenalty¶ – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
- setStop(value)[source]
- Parameters:
stop¶ – A sequence which indicates the end of the current document.
- setStopCol(value)[source]
- Parameters:
stop¶ – A sequence which indicates the end of the current document.
- setTemperature(value)[source]
- Parameters:
temperature¶ – What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- setTemperatureCol(value)[source]
- Parameters:
temperature¶ – What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- setTimeout(value)[source]
- Parameters:
timeout¶ – number of seconds to wait before closing the connection
- setTopP(value)[source]
- Parameters:
topP¶ – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- setTopPCol(value)[source]
- Parameters:
topP¶ – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- setUser(value)[source]
- Parameters:
user¶ – The ID of the end-user, for use in tracking and rate-limiting.
- setUserCol(value)[source]
- Parameters:
user¶ – The ID of the end-user, for use in tracking and rate-limiting.
- stop = Param(parent='undefined', name='stop', doc='ServiceParam: A sequence which indicates the end of the current document.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- temperature = Param(parent='undefined', name='temperature', doc='ServiceParam: What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or `top_p` but not both. Minimum of 0 and maximum of 2 allowed.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- topP = Param(parent='undefined', name='topP', doc='ServiceParam: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or `temperature` but not both. Minimum of 0 and maximum of 1 allowed.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- user = Param(parent='undefined', name='user', doc='ServiceParam: The ID of the end-user, for use in tracking and rate-limiting.')
synapse.ml.services.openai.OpenAIDefaults module
synapse.ml.services.openai.OpenAIEmbedding module
- class synapse.ml.services.openai.OpenAIEmbedding.OpenAIEmbedding(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, apiVersion=None, apiVersionCol=None, concurrency=1, concurrentTimeout=None, customHeaders=None, customHeadersCol=None, customUrlRoot=None, deploymentName=None, deploymentNameCol=None, dimensions=None, dimensionsCol=None, errorCol='OpenAIEmbedding_83cb7f23aee9_error', handler=None, outputCol='OpenAIEmbedding_83cb7f23aee9_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=360.0, url=None, user=None, userCol=None)[source]
Bases:
ComplexParamsMixin
,JavaMLReadable
,JavaMLWritable
,JavaTransformer
- Parameters:
CustomAuthHeader¶ (object) – A Custom Value for Authorization Header
concurrentTimeout¶ (float) – max number seconds to wait on futures if concurrency >= 1
customHeaders¶ (object) – Map of Custom Header Key-Value Tuples.
customUrlRoot¶ (str) – The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.
dimensions¶ (object) – Number of dimensions for output embeddings.
handler¶ (object) – Which strategy to use when handling requests
timeout¶ (float) – number of seconds to wait before closing the connection
user¶ (object) – The ID of the end-user, for use in tracking and rate-limiting.
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- apiVersion = Param(parent='undefined', name='apiVersion', doc='ServiceParam: version of the api')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customHeaders = Param(parent='undefined', name='customHeaders', doc='ServiceParam: Map of Custom Header Key-Value Tuples.')
- customUrlRoot = Param(parent='undefined', name='customUrlRoot', doc='The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.')
- deploymentName = Param(parent='undefined', name='deploymentName', doc='ServiceParam: The name of the deployment')
- dimensions = Param(parent='undefined', name='dimensions', doc='ServiceParam: Number of dimensions for output embeddings.')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns:
max number seconds to wait on futures if concurrency >= 1
- Return type:
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns:
A Custom Value for Authorization Header
- Return type:
CustomAuthHeader
- getCustomHeaders()[source]
- Returns:
Map of Custom Header Key-Value Tuples.
- Return type:
customHeaders
- getCustomUrlRoot()[source]
- Returns:
The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.
- Return type:
customUrlRoot
- getDimensions()[source]
- Returns:
Number of dimensions for output embeddings.
- Return type:
dimensions
- getTimeout()[source]
- Returns:
number of seconds to wait before closing the connection
- Return type:
timeout
- getUser()[source]
- Returns:
The ID of the end-user, for use in tracking and rate-limiting.
- Return type:
user
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters:
concurrentTimeout¶ – max number seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters:
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters:
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomHeaders(value)[source]
- Parameters:
customHeaders¶ – Map of Custom Header Key-Value Tuples.
- setCustomHeadersCol(value)[source]
- Parameters:
customHeaders¶ – Map of Custom Header Key-Value Tuples.
- setCustomUrlRoot(value)[source]
- Parameters:
customUrlRoot¶ – The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.
- setDimensionsCol(value)[source]
- Parameters:
dimensions¶ – Number of dimensions for output embeddings.
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, apiVersion=None, apiVersionCol=None, concurrency=1, concurrentTimeout=None, customHeaders=None, customHeadersCol=None, customUrlRoot=None, deploymentName=None, deploymentNameCol=None, dimensions=None, dimensionsCol=None, errorCol='OpenAIEmbedding_83cb7f23aee9_error', handler=None, outputCol='OpenAIEmbedding_83cb7f23aee9_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=360.0, url=None, user=None, userCol=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters:
timeout¶ – number of seconds to wait before closing the connection
- setUser(value)[source]
- Parameters:
user¶ – The ID of the end-user, for use in tracking and rate-limiting.
- setUserCol(value)[source]
- Parameters:
user¶ – The ID of the end-user, for use in tracking and rate-limiting.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: Input text to get embeddings for.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- user = Param(parent='undefined', name='user', doc='ServiceParam: The ID of the end-user, for use in tracking and rate-limiting.')
synapse.ml.services.openai.OpenAIPrompt module
- class synapse.ml.services.openai.OpenAIPrompt.OpenAIPrompt(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, apiVersion=None, apiVersionCol=None, bestOf=None, bestOfCol=None, cacheLevel=None, cacheLevelCol=None, concurrency=1, concurrentTimeout=None, customHeaders=None, customHeadersCol=None, customUrlRoot=None, deploymentName=None, deploymentNameCol=None, dropPrompt=True, echo=None, echoCol=None, errorCol='OpenAIPrompt_d28aaeda34bb_error', frequencyPenalty=None, frequencyPenaltyCol=None, logProbs=None, logProbsCol=None, maxTokens=None, maxTokensCol=None, messagesCol='OpenAIPrompt_d28aaeda34bb_messages', n=None, nCol=None, outputCol='OpenAIPrompt_d28aaeda34bb_output', postProcessing='', postProcessingOptions={}, presencePenalty=None, presencePenaltyCol=None, promptTemplate=None, responseFormat=None, responseFormatCol=None, stop=None, stopCol=None, subscriptionKey=None, subscriptionKeyCol=None, systemPrompt="You are an AI chatbot who wants to answer user's questions and complete tasks. Follow their instructions carefully and be brief if they don't say otherwise.", temperature=None, temperatureCol=None, timeout=360.0, topP=None, topPCol=None, url=None, user=None, userCol=None)[source]
Bases:
ComplexParamsMixin
,JavaMLReadable
,JavaMLWritable
,JavaTransformer
- Parameters:
CustomAuthHeader¶ (object) – A Custom Value for Authorization Header
bestOf¶ (object) – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
cacheLevel¶ (object) – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
concurrentTimeout¶ (float) – max number seconds to wait on futures if concurrency >= 1
customHeaders¶ (object) – Map of Custom Header Key-Value Tuples.
customUrlRoot¶ (str) – The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.
dropPrompt¶ (bool) – whether to drop the column of prompts after templating (when using legacy models)
echo¶ (object) – Echo back the prompt in addition to the completion
frequencyPenalty¶ (object) – How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.
logProbs¶ (object) – Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
maxTokens¶ (object) – The maximum number of tokens to generate. Has minimum of 0.
messagesCol¶ (str) – The column messages to generate chat completions for, in the chat format. This column should have type Array(Struct(role: String, content: String)).
n¶ (object) – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
postProcessing¶ (str) – Post processing options: csv, json, regex
postProcessingOptions¶ (dict) – Options (default): delimiter=’,’, jsonSchema, regex, regexGroup=0
presencePenalty¶ (object) – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
promptTemplate¶ (str) – The prompt. supports string interpolation {col1}: {col2}.
responseFormat¶ (object) – Response format for the completion. Can be ‘json_object’ or ‘text’.
stop¶ (object) – A sequence which indicates the end of the current document.
temperature¶ (object) – What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
timeout¶ (float) – number of seconds to wait before closing the connection
topP¶ (object) – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
user¶ (object) – The ID of the end-user, for use in tracking and rate-limiting.
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- apiVersion = Param(parent='undefined', name='apiVersion', doc='ServiceParam: version of the api')
- bestOf = Param(parent='undefined', name='bestOf', doc='ServiceParam: How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.')
- cacheLevel = Param(parent='undefined', name='cacheLevel', doc='ServiceParam: can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customHeaders = Param(parent='undefined', name='customHeaders', doc='ServiceParam: Map of Custom Header Key-Value Tuples.')
- customUrlRoot = Param(parent='undefined', name='customUrlRoot', doc='The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.')
- deploymentName = Param(parent='undefined', name='deploymentName', doc='ServiceParam: The name of the deployment')
- dropPrompt = Param(parent='undefined', name='dropPrompt', doc='whether to drop the column of prompts after templating (when using legacy models)')
- echo = Param(parent='undefined', name='echo', doc='ServiceParam: Echo back the prompt in addition to the completion')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- frequencyPenalty = Param(parent='undefined', name='frequencyPenalty', doc='ServiceParam: How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.')
- getBestOf()[source]
- Returns:
How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- Return type:
bestOf
- getCacheLevel()[source]
- Returns:
can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- Return type:
cacheLevel
- getConcurrentTimeout()[source]
- Returns:
max number seconds to wait on futures if concurrency >= 1
- Return type:
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns:
A Custom Value for Authorization Header
- Return type:
CustomAuthHeader
- getCustomHeaders()[source]
- Returns:
Map of Custom Header Key-Value Tuples.
- Return type:
customHeaders
- getCustomUrlRoot()[source]
- Returns:
The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.
- Return type:
customUrlRoot
- getDropPrompt()[source]
- Returns:
whether to drop the column of prompts after templating (when using legacy models)
- Return type:
dropPrompt
- getFrequencyPenalty()[source]
- Returns:
How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.
- Return type:
frequencyPenalty
- getLogProbs()[source]
- Returns:
Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- Return type:
logProbs
- getMaxTokens()[source]
- Returns:
The maximum number of tokens to generate. Has minimum of 0.
- Return type:
maxTokens
- getMessagesCol()[source]
- Returns:
The column messages to generate chat completions for, in the chat format. This column should have type Array(Struct(role: String, content: String)).
- Return type:
messagesCol
- getN()[source]
- Returns:
How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- Return type:
n
- getPostProcessing()[source]
- Returns:
Post processing options: csv, json, regex
- Return type:
postProcessing
- getPostProcessingOptions()[source]
- Returns:
Options (default): delimiter=’,’, jsonSchema, regex, regexGroup=0
- Return type:
postProcessingOptions
- getPresencePenalty()[source]
- Returns:
How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
- Return type:
presencePenalty
- getPromptTemplate()[source]
- Returns:
The prompt. supports string interpolation {col1}: {col2}.
- Return type:
promptTemplate
- getResponseFormat()[source]
- Returns:
Response format for the completion. Can be ‘json_object’ or ‘text’.
- Return type:
responseFormat
- getStop()[source]
- Returns:
A sequence which indicates the end of the current document.
- Return type:
stop
- getTemperature()[source]
- Returns:
What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- Return type:
temperature
- getTimeout()[source]
- Returns:
number of seconds to wait before closing the connection
- Return type:
timeout
- getTopP()[source]
- Returns:
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- Return type:
topP
- getUser()[source]
- Returns:
The ID of the end-user, for use in tracking and rate-limiting.
- Return type:
user
- logProbs = Param(parent='undefined', name='logProbs', doc='ServiceParam: Include the log probabilities on the `logprobs` most likely tokens, as well the chosen tokens. So for example, if `logprobs` is 10, the API will return a list of the 10 most likely tokens. If `logprobs` is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.')
- maxTokens = Param(parent='undefined', name='maxTokens', doc='ServiceParam: The maximum number of tokens to generate. Has minimum of 0.')
- messagesCol = Param(parent='undefined', name='messagesCol', doc='The column messages to generate chat completions for, in the chat format. This column should have type Array(Struct(role: String, content: String)).')
- n = Param(parent='undefined', name='n', doc='ServiceParam: How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- postProcessing = Param(parent='undefined', name='postProcessing', doc='Post processing options: csv, json, regex')
- postProcessingOptions = Param(parent='undefined', name='postProcessingOptions', doc="Options (default): delimiter=',', jsonSchema, regex, regexGroup=0")
- presencePenalty = Param(parent='undefined', name='presencePenalty', doc='ServiceParam: How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.')
- promptTemplate = Param(parent='undefined', name='promptTemplate', doc='The prompt. supports string interpolation {col1}: {col2}.')
- responseFormat = Param(parent='undefined', name='responseFormat', doc="ServiceParam: Response format for the completion. Can be 'json_object' or 'text'.")
- setBestOf(value)[source]
- Parameters:
bestOf¶ – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- setBestOfCol(value)[source]
- Parameters:
bestOf¶ – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- setCacheLevel(value)[source]
- Parameters:
cacheLevel¶ – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- setCacheLevelCol(value)[source]
- Parameters:
cacheLevel¶ – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- setConcurrentTimeout(value)[source]
- Parameters:
concurrentTimeout¶ – max number seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters:
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters:
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomHeaders(value)[source]
- Parameters:
customHeaders¶ – Map of Custom Header Key-Value Tuples.
- setCustomHeadersCol(value)[source]
- Parameters:
customHeaders¶ – Map of Custom Header Key-Value Tuples.
- setCustomUrlRoot(value)[source]
- Parameters:
customUrlRoot¶ – The custom URL root for the service. This will not append OpenAI specific model path completions (i.e. /chat/completions) to the URL.
- setDropPrompt(value)[source]
- Parameters:
dropPrompt¶ – whether to drop the column of prompts after templating (when using legacy models)
- setFrequencyPenalty(value)[source]
- Parameters:
frequencyPenalty¶ – How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.
- setFrequencyPenaltyCol(value)[source]
- Parameters:
frequencyPenalty¶ – How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model to talk about new topics.
- setLogProbs(value)[source]
- Parameters:
logProbs¶ – Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- setLogProbsCol(value)[source]
- Parameters:
logProbs¶ – Include the log probabilities on the logprobs most likely tokens, as well the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- setMaxTokens(value)[source]
- Parameters:
maxTokens¶ – The maximum number of tokens to generate. Has minimum of 0.
- setMaxTokensCol(value)[source]
- Parameters:
maxTokens¶ – The maximum number of tokens to generate. Has minimum of 0.
- setMessagesCol(value)[source]
- Parameters:
messagesCol¶ – The column messages to generate chat completions for, in the chat format. This column should have type Array(Struct(role: String, content: String)).
- setN(value)[source]
- Parameters:
n¶ – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- setNCol(value)[source]
- Parameters:
n¶ – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, apiVersion=None, apiVersionCol=None, bestOf=None, bestOfCol=None, cacheLevel=None, cacheLevelCol=None, concurrency=1, concurrentTimeout=None, customHeaders=None, customHeadersCol=None, customUrlRoot=None, deploymentName=None, deploymentNameCol=None, dropPrompt=True, echo=None, echoCol=None, errorCol='OpenAIPrompt_d28aaeda34bb_error', frequencyPenalty=None, frequencyPenaltyCol=None, logProbs=None, logProbsCol=None, maxTokens=None, maxTokensCol=None, messagesCol='OpenAIPrompt_d28aaeda34bb_messages', n=None, nCol=None, outputCol='OpenAIPrompt_d28aaeda34bb_output', postProcessing='', postProcessingOptions={}, presencePenalty=None, presencePenaltyCol=None, promptTemplate=None, responseFormat=None, responseFormatCol=None, stop=None, stopCol=None, subscriptionKey=None, subscriptionKeyCol=None, systemPrompt="You are an AI chatbot who wants to answer user's questions and complete tasks. Follow their instructions carefully and be brief if they don't say otherwise.", temperature=None, temperatureCol=None, timeout=360.0, topP=None, topPCol=None, url=None, user=None, userCol=None)[source]
Set the (keyword only) parameters
- setPostProcessing(value)[source]
- Parameters:
postProcessing¶ – Post processing options: csv, json, regex
- setPostProcessingOptions(value)[source]
- Parameters:
postProcessingOptions¶ – Options (default): delimiter=’,’, jsonSchema, regex, regexGroup=0
- setPresencePenalty(value)[source]
- Parameters:
presencePenalty¶ – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
- setPresencePenaltyCol(value)[source]
- Parameters:
presencePenalty¶ – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
- setPromptTemplate(value)[source]
- Parameters:
promptTemplate¶ – The prompt. supports string interpolation {col1}: {col2}.
- setResponseFormat(value)[source]
- Parameters:
responseFormat¶ – Response format for the completion. Can be ‘json_object’ or ‘text’.
- setResponseFormatCol(value)[source]
- Parameters:
responseFormat¶ – Response format for the completion. Can be ‘json_object’ or ‘text’.
- setStop(value)[source]
- Parameters:
stop¶ – A sequence which indicates the end of the current document.
- setStopCol(value)[source]
- Parameters:
stop¶ – A sequence which indicates the end of the current document.
- setTemperature(value)[source]
- Parameters:
temperature¶ – What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- setTemperatureCol(value)[source]
- Parameters:
temperature¶ – What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- setTimeout(value)[source]
- Parameters:
timeout¶ – number of seconds to wait before closing the connection
- setTopP(value)[source]
- Parameters:
topP¶ – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- setTopPCol(value)[source]
- Parameters:
topP¶ – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- setUser(value)[source]
- Parameters:
user¶ – The ID of the end-user, for use in tracking and rate-limiting.
- setUserCol(value)[source]
- Parameters:
user¶ – The ID of the end-user, for use in tracking and rate-limiting.
- stop = Param(parent='undefined', name='stop', doc='ServiceParam: A sequence which indicates the end of the current document.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- systemPrompt = Param(parent='undefined', name='systemPrompt', doc='The initial system prompt to be used.')
- temperature = Param(parent='undefined', name='temperature', doc='ServiceParam: What sampling temperature to use. Higher values means the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or `top_p` but not both. Minimum of 0 and maximum of 2 allowed.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- topP = Param(parent='undefined', name='topP', doc='ServiceParam: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or `temperature` but not both. Minimum of 0 and maximum of 1 allowed.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- user = Param(parent='undefined', name='user', doc='ServiceParam: The ID of the end-user, for use in tracking and rate-limiting.')
Module contents
SynapseML is an ecosystem of tools aimed towards expanding the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM and OpenCV. These tools enable powerful and highly-scalable predictive and analytical models for a variety of datasources.
SynapseML also brings new networking capabilities to the Spark Ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy to use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production grade deployment, the Spark Serving project enables high throughput, sub-millisecond latency web services, backed by your Spark cluster.
SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.