Module ai (2.29.0)

This module integrates BigQuery built-in AI functions for use with Series/DataFrame objects, such as AI.GENERATE_BOOL: https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-ai-generate-bool

Modules Functions

classify

classify(
    input: typing.Union[
        str,
        bigframes.series.Series,
        pandas.core.series.Series,
        typing.List[typing.Union[str, bigframes.series.Series, pandas.core.series.Series]],
        typing.Tuple[typing.Union[str, bigframes.series.Series, pandas.core.series.Series], ...],
    ],
    categories: tuple[str, ...] | list[str],
    *,
    connection_id: str | None = None
) -> bigframes.series.Series

Classifies a given input into one of the specified categories. It always returns the provided category that best fits the input.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.bigquery as bbq
>>> df = bpd.DataFrame({'creature': ['Cat', 'Salmon']})
>>> df['type'] = bbq.ai.classify(df['creature'], ['Mammal', 'Fish'])
>>> df
  creature    type
0      Cat  Mammal
1   Salmon    Fish
<BLANKLINE>
[2 rows x 2 columns]

Returns
  Type: bigframes.series.Series
  Description: A new series of strings.
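
Because classify is documented to return one of the supplied categories for every row, a local tally of the output makes a cheap sanity check. The sketch below is plain Python operating on a list of labels, such as one obtained by pulling the result column into a local list. The helper name tally_labels and the extra "Bird" category are hypothetical and not part of bigframes.

```python
from collections import Counter


def tally_labels(labels, categories):
    """Count labels per category; raise if a label is outside the categories.

    Hypothetical helper for illustration, not part of bigframes.
    """
    allowed = list(categories)
    unexpected = sorted(set(labels) - set(allowed))
    if unexpected:
        raise ValueError(f"labels outside the provided categories: {unexpected}")
    counts = Counter(labels)
    # Report every category, including those that never occurred.
    return {category: counts.get(category, 0) for category in allowed}


# Labels as they might look after pulling the 'type' column into a local list.
print(tally_labels(["Mammal", "Fish", "Mammal"], ["Mammal", "Fish", "Bird"]))
```

A category with a count of zero is often a hint that its name is ambiguous or overlaps with another category.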

forecast

forecast(
    df: bigframes.dataframe.DataFrame | pandas.core.frame.DataFrame,
    *,
    data_col: str,
    timestamp_col: str,
    model: str = "TimesFM 2.0",
    id_cols: typing.Optional[typing.Iterable[str]] = None,
    horizon: int = 10,
    confidence_level: float = 0.95,
    context_window: int | None = None
) -> bigframes.dataframe.DataFrame

Forecasts a time series over a future horizon, using Google Research's open source TimesFM model (https://github.com/google-research/timesfm).

Exceptions
  Type: ValueError
  Description: Raised when any column ID does not exist in the dataframe.

Returns
  Type: DataFrame
  Description: The forecast dataframe. Its schema matches the output of the BigQuery AI.FORECAST function. See: https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-ai-forecast
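
Since forecast raises ValueError for unknown column IDs, it can be convenient to check the names locally and report all missing ones at once. The following is a minimal sketch under stated assumptions: missing_columns and forecast_checked are hypothetical helpers, the column names in the usage line are made up, and only the keyword arguments shown in the signature above are passed through.

```python
def missing_columns(columns, data_col, timestamp_col, id_cols=None):
    """Return the requested column names that are absent from `columns`."""
    requested = [data_col, timestamp_col, *(id_cols or [])]
    available = set(columns)
    return [name for name in requested if name not in available]


def forecast_checked(df, *, data_col, timestamp_col, id_cols=None, horizon=10):
    """Validate column names locally, then delegate to bbq.ai.forecast.

    Hypothetical wrapper for illustration, not part of bigframes.
    """
    # Materialize id_cols so a one-shot iterable is not consumed by the check.
    id_cols = list(id_cols) if id_cols is not None else None
    missing = missing_columns(df.columns, data_col, timestamp_col, id_cols)
    if missing:
        raise ValueError(f"columns not found in dataframe: {missing}")
    # Imported lazily so the local check works without BigQuery access.
    import bigframes.bigquery as bbq

    return bbq.ai.forecast(
        df,
        data_col=data_col,
        timestamp_col=timestamp_col,
        id_cols=id_cols,
        horizon=horizon,
    )


# Made-up column names: "time" and "region" are not in the available list.
print(missing_columns(["ts", "sales", "store"], "sales", "time", ["store", "region"]))
```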

generate

generate(
    prompt: typing.Union[
        str,
        bigframes.series.Series,
        pandas.core.series.Series,
        typing.List[typing.Union[str, bigframes.series.Series, pandas.core.series.Series]],
        typing.Tuple[typing.Union[str, bigframes.series.Series, pandas.core.series.Series], ...],
    ],
    *,
    connection_id: str | None = None,
    endpoint: str | None = None,
    request_type: typing.Literal["dedicated", "shared", "unspecified"] = "unspecified",
    model_params: typing.Optional[typing.Mapping[typing.Any, typing.Any]] = None,
    output_schema: typing.Optional[typing.Mapping[str, str]] = None
) -> bigframes.series.Series

Returns the AI analysis based on the prompt, which can be any combination of text and unstructured data.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.bigquery as bbq
>>> country = bpd.Series(["Japan", "Canada"])
>>> bbq.ai.generate(("What's the capital city of ", country, " one word only"))
0    {'result': 'Tokyo\n', 'full_response': '{"cand...
1    {'result': 'Ottawa\n', 'full_response': '{"can...
dtype: struct<result: string, full_response: extension<dbjson<JSONArrowType>>, status: string>[pyarrow]

>>> bbq.ai.generate(("What's the capital city of ", country, " one word only")).struct.field("result")
0     Tokyo\n
1    Ottawa\n
Name: result, dtype: string 
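
The tuple prompt in the example above mixes literal strings with a Series. One way to picture it is that each row receives the literals joined with that row's value. The plain-Python sketch below models that mental picture with lists standing in for Series. It is an illustration under that assumption, not a description of how bigframes or BigQuery build the request, and expand_prompt is a hypothetical name.

```python
def expand_prompt(parts):
    """Expand a tuple of literals and columns into one prompt string per row.

    Strings are treated as literals repeated on every row; lists stand in
    for Series and must all have the same length.
    """
    columns = [part for part in parts if not isinstance(part, str)]
    lengths = {len(column) for column in columns}
    if len(lengths) > 1:
        raise ValueError("all columns in a prompt must have the same length")
    n_rows = lengths.pop() if lengths else 1
    prompts = []
    for row in range(n_rows):
        pieces = [part if isinstance(part, str) else str(part[row]) for part in parts]
        prompts.append("".join(pieces))
    return prompts


country = ["Japan", "Canada"]
for prompt in expand_prompt(("What's the capital city of ", country, " one word only")):
    print(prompt)
```

Printing the expanded prompts locally makes missing spaces around the column value easy to spot before any rows are sent to the model.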

You get structured output when the output_schema parameter is set:

>>> animals = bpd.Series(["Rabbit", "Spider"])
>>> bbq.ai.generate(animals, output_schema={"number_of_legs": "INT64", "is_herbivore": "BOOL"})
0    {'is_herbivore': True, 'number_of_legs': 4, 'f...
1    {'is_herbivore': False, 'number_of_legs': 8, '...
dtype: struct<is_herbivore: bool, number_of_legs: int64, full_response: extension<dbjson<JSONArrowType>>, status: string>[pyarrow]

Returns
  Type: bigframes.series.Series
  Description: A new struct Series with the result data. The struct contains these fields:
    * "result": a STRING value containing the model's response to the prompt. The result is None if the request fails or is filtered by responsible AI. If you specify an output schema, result is replaced by your custom schema.
    * "full_response": a JSON value containing the response from the projects.locations.endpoints.generateContent call to the model. The generated text is in the text element.
    * "status": a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
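
The field descriptions above imply a simple per-row check: a row succeeded when "status" is empty and "result" is not None. The sketch below applies that rule to a list of dicts standing in for the struct rows. How the struct Series is turned into local dicts is left out and is an assumption, the error text is made up, and split_by_status is a hypothetical helper.

```python
def split_by_status(rows):
    """Separate successful results from failed rows.

    `rows` is a list of dicts with "result" and "status" keys, mirroring
    the documented struct fields. A row counts as successful when its
    status is empty and its result is not None.
    """
    results, failures = [], []
    for position, row in enumerate(rows):
        if not row.get("status") and row.get("result") is not None:
            # String results in the examples end with a newline, so trim it.
            results.append(row["result"].strip())
        else:
            failures.append((position, row.get("status")))
    return results, failures


rows = [
    {"result": "Tokyo\n", "status": ""},
    {"result": None, "status": "hypothetical error message"},
]
print(split_by_status(rows))
```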

generate_bool

generate_bool(
    prompt: typing.Union[
        str,
        bigframes.series.Series,
        pandas.core.series.Series,
        typing.List[typing.Union[str, bigframes.series.Series, pandas.core.series.Series]],
        typing.Tuple[typing.Union[str, bigframes.series.Series, pandas.core.series.Series], ...],
    ],
    *,
    connection_id: str | None = None,
    endpoint: str | None = None,
    request_type: typing.Literal["dedicated", "shared", "unspecified"] = "unspecified",
    model_params: typing.Optional[typing.Mapping[typing.Any, typing.Any]] = None
) -> bigframes.series.Series
 

Returns the AI analysis based on the prompt, which can be any combination of text and unstructured data.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.bigquery as bbq
>>> df = bpd.DataFrame({
...     "col_1": ["apple", "bear", "pear"],
...     "col_2": ["fruit", "animal", "animal"]
... })
>>> bbq.ai.generate_bool((df["col_1"], " is a ", df["col_2"]))
0    {'result': True, 'full_response': '{"candidate...
1    {'result': True, 'full_response': '{"candidate...
2    {'result': False, 'full_response': '{"candidat...
dtype: struct<result: bool, full_response: extension<dbjson<JSONArrowType>>, status: string>[pyarrow]

>>> bbq.ai.generate_bool((df["col_1"], " is a ", df["col_2"])).struct.field("result")
0     True
1     True
2    False
Name: result, dtype: boolean

Returns
  Type: bigframes.series.Series
  Description: A new struct Series with the result data. The struct contains these fields:
    * "result": a BOOL value containing the model's response to the prompt. The result is None if the request fails or is filtered by responsible AI.
    * "full_response": a JSON value containing the response from the projects.locations.endpoints.generateContent call to the model. The generated text is in the text element.
    * "status": a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.
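
Because "result" can be None as well as True or False, a filter built from it has three states, and it helps to decide explicitly what a missing answer means. The sketch below is plain Python on local lists, with made-up values. to_mask is a hypothetical helper, and treating None as False is only one possible policy.

```python
def to_mask(results, default=False):
    """Turn nullable boolean results into a plain True/False mask.

    None (a failed or filtered request) is replaced by `default`.
    """
    return [default if value is None else bool(value) for value in results]


items = ["apple", "bear", "pear"]
results = [True, None, False]  # made-up results; None marks a failed row

mask = to_mask(results)
print([item for item, keep in zip(items, mask) if keep])
```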

generate_double

generate_double(
    prompt: typing.Union[
        str,
        bigframes.series.Series,
        pandas.core.series.Series,
        typing.List[typing.Union[str, bigframes.series.Series, pandas.core.series.Series]],
        typing.Tuple[typing.Union[str, bigframes.series.Series, pandas.core.series.Series], ...],
    ],
    *,
    connection_id: str | None = None,
    endpoint: str | None = None,
    request_type: typing.Literal["dedicated", "shared", "unspecified"] = "unspecified",
    model_params: typing.Optional[typing.Mapping[typing.Any, typing.Any]] = None
) -> bigframes.series.Series
 

Returns the AI analysis based on the prompt, which can be any combination of text and unstructured data.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.bigquery as bbq
>>> animal = bpd.Series(["Kangaroo", "Rabbit", "Spider"])
>>> bbq.ai.generate_double(("How many legs does a ", animal, " have?"))
0    {'result': 2.0, 'full_response': '{"candidates...
1    {'result': 4.0, 'full_response': '{"candidates...
2    {'result': 8.0, 'full_response': '{"candidates...
dtype: struct<result: double, full_response: extension<dbjson<JSONArrowType>>, status: string>[pyarrow]

>>> bbq.ai.generate_double(("How many legs does a ", animal, " have?")).struct.field("result")
0    2.0
1    4.0
2    8.0
Name: result, dtype: Float64

Returns
  Type: bigframes.series.Series
  Description: A new struct Series with the result data. The struct contains these fields:
    * "result": a DOUBLE value containing the model's response to the prompt. The result is None if the request fails or is filtered by responsible AI.
    * "full_response": a JSON value containing the response from the projects.locations.endpoints.generateContent call to the model. The generated text is in the text element.
    * "status": a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.

generate_int

generate_int(
    prompt: typing.Union[
        str,
        bigframes.series.Series,
        pandas.core.series.Series,
        typing.List[typing.Union[str, bigframes.series.Series, pandas.core.series.Series]],
        typing.Tuple[typing.Union[str, bigframes.series.Series, pandas.core.series.Series], ...],
    ],
    *,
    connection_id: str | None = None,
    endpoint: str | None = None,
    request_type: typing.Literal["dedicated", "shared", "unspecified"] = "unspecified",
    model_params: typing.Optional[typing.Mapping[typing.Any, typing.Any]] = None
) -> bigframes.series.Series
 

Returns the AI analysis based on the prompt, which can be any combination of text and unstructured data.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.bigquery as bbq
>>> animal = bpd.Series(["Kangaroo", "Rabbit", "Spider"])
>>> bbq.ai.generate_int(("How many legs does a ", animal, " have?"))
0    {'result': 2, 'full_response': '{"candidates":...
1    {'result': 4, 'full_response': '{"candidates":...
2    {'result': 8, 'full_response': '{"candidates":...
dtype: struct<result: int64, full_response: extension<dbjson<JSONArrowType>>, status: string>[pyarrow]

>>> bbq.ai.generate_int(("How many legs does a ", animal, " have?")).struct.field("result")
0    2
1    4
2    8
Name: result, dtype: Int64

Returns
  Type: bigframes.series.Series
  Description: A new struct Series with the result data. The struct contains these fields:
    * "result": an integer (INT64) value containing the model's response to the prompt. The result is None if the request fails or is filtered by responsible AI.
    * "full_response": a JSON value containing the response from the projects.locations.endpoints.generateContent call to the model. The generated text is in the text element.
    * "status": a STRING value that contains the API response status for the corresponding row. This value is empty if the operation was successful.

if_

if_(
    prompt: typing.Union[
        str,
        bigframes.series.Series,
        pandas.core.series.Series,
        typing.List[typing.Union[str, bigframes.series.Series, pandas.core.series.Series]],
        typing.Tuple[typing.Union[str, bigframes.series.Series, pandas.core.series.Series], ...],
    ],
    *,
    connection_id: str | None = None
) -> bigframes.series.Series
 

Evaluates the prompt to True or False. Compared to ai.generate_bool(), this function is optimized so that not every row has to be evaluated with the LLM.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.bigquery as bbq
>>> us_state = bpd.Series(["Massachusetts", "Illinois", "Hawaii"])
>>> bbq.ai.if_((us_state, " has a city called Springfield"))
0     True
1     True
2    False
dtype: boolean

>>> us_state[bbq.ai.if_((us_state, " has a city called Springfield"))]
0    Massachusetts
1         Illinois
dtype: string

Returns
  Type: bigframes.series.Series
  Description: A new series of bools.

score

score(
    prompt: typing.Union[
        str,
        bigframes.series.Series,
        pandas.core.series.Series,
        typing.List[typing.Union[str, bigframes.series.Series, pandas.core.series.Series]],
        typing.Tuple[typing.Union[str, bigframes.series.Series, pandas.core.series.Series], ...],
    ],
    *,
    connection_id: str | None = None
) -> bigframes.series.Series
 

Computes a score based on a rubric described in natural language and returns it as a double value. There is no fixed range for the returned score. To get high-quality results, provide a scoring rubric with examples in the prompt.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.bigquery as bbq
>>> animal = bpd.Series(["Tiger", "Rabbit", "Blue Whale"])
>>> bbq.ai.score(("Rank the relative weights of ", animal, " on the scale from 1 to 3")) # doctest: +SKIP
0    2.0
1    1.0
2    3.0
dtype: Float64

Returns
  Type: bigframes.series.Series
  Description: A new series of double (float) values.
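
The advice to include a scoring rubric with examples can be followed by assembling the prompt tuple programmatically. The sketch below builds such a tuple in plain Python. rubric_prompt is a hypothetical helper, the rubric wording and examples are made up, and a plain string stands in for the Series that would normally supply one item per row.

```python
def rubric_prompt(subject, task, rubric, examples):
    """Build a tuple prompt: instructions, the per-row subject, a closing line.

    `subject` is passed through untouched, so it can be a Series or a string.
    """
    rubric_lines = "\n".join(f"- {score}: {meaning}" for score, meaning in rubric.items())
    example_lines = "\n".join(f"- {text} -> {score}" for text, score in examples)
    prefix = f"{task}\nRubric:\n{rubric_lines}\nExamples:\n{example_lines}\nItem to score: "
    return (prefix, subject, "\nAnswer with the score only.")


prompt = rubric_prompt(
    subject="Tiger",  # in real use this would be a Series such as `animal`
    task="Score the typical adult weight of the animal.",
    rubric={1: "lighter than a human", 2: "heavier than a human", 3: "heavier than a car"},
    examples=[("Mouse", 1), ("Horse", 2)],
)
print("".join(prompt))
```

The resulting tuple has the same shape as the prompts in the examples above, so it could be passed as the prompt argument of score.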