QueryJob(job_id, query, client, job_config=None)
Asynchronous job: query tables.
Parameters
job_id
str
The job's ID, within the project belonging to client.
query
str
SQL query string.
client
google.cloud.bigquery.client.Client
A client which holds credentials and project configuration for the dataset (which requires a project).
job_config
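A QueryJob is usually obtained from Client.query() rather than built by hand. The following is a minimal sketch under that assumption; the helper name run_query and the _FakeClient/_FakeJob stand-ins are hypothetical and exist only so the example runs without credentials. With the real library, pass a google.cloud.bigquery.Client instead.

```python
from types import SimpleNamespace


def run_query(client, sql, job_config=None):
    """Start a query through client.query() and wait for its rows.

    `client` is expected to behave like google.cloud.bigquery.Client.
    """
    job = client.query(sql, job_config=job_config)  # returns a QueryJob
    rows = list(job.result())  # blocks until the job finishes
    return job, rows


class _FakeJob:
    """Hypothetical stand-in for a QueryJob (not part of the library)."""

    job_id = "job-123"

    def result(self):
        return iter([SimpleNamespace(n=1)])


class _FakeClient:
    """Hypothetical stand-in for a Client (not part of the library)."""

    def query(self, sql, job_config=None):
        return _FakeJob()


job, rows = run_query(_FakeClient(), "SELECT 1 AS n")
print(job.job_id, rows[0].n)
```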
Inheritance
builtins.object > google.api_core.future.base.Future > google.api_core.future.polling.PollingFuture > google.cloud.bigquery.job.base._AsyncJob > QueryJob
Properties
allow_large_results
See QueryJobConfig.allow_large_results.
billing_tier
Return billing tier from job statistics, if present.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.billing_tier
Optional[int]
cache_hit
Return whether or not query results were served from cache.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.cache_hit
Optional[bool]
clustering_fields
See QueryJobConfig.clustering_fields.
connection_properties
See QueryJobConfig.connection_properties.
.. versionadded:: 2.29.0
create_disposition
See QueryJobConfig.create_disposition.
create_session
See QueryJobConfig.create_session.
.. versionadded:: 2.29.0
created
Datetime at which the job was created.
Optional[datetime.datetime]
ddl_operation_performed
Optional[str]: Return the DDL operation performed.
ddl_target_routine
Optional[google.cloud.bigquery.routine.RoutineReference]: Return the DDL target routine, present for CREATE/DROP FUNCTION/PROCEDURE queries.
ddl_target_table
Optional[google.cloud.bigquery.table.TableReference]: Return the DDL target table, present for CREATE/DROP TABLE/VIEW queries.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.ddl_target_table
default_dataset
See QueryJobConfig.default_dataset.
destination
See QueryJobConfig.destination.
destination_encryption_configuration
google.cloud.bigquery.encryption_configuration.EncryptionConfiguration: Custom encryption configuration for the destination table.
Custom encryption configuration (e.g., Cloud KMS keys), or None if using default encryption.
dry_run
See QueryJobConfig.dry_run.
ended
Datetime at which the job finished.
Optional[datetime.datetime]
error_result
Error information about the job as a whole.
Optional[Mapping]
errors
Information about individual errors generated by the job.
Optional[List[Mapping]]
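As a hedged illustration of how error_result and errors might be read together, the sketch below formats both into log lines. It assumes each mapping carries the "reason" and "message" keys used by BigQuery error objects; the helper name describe_failure and the SimpleNamespace stand-in for a job are hypothetical.

```python
from types import SimpleNamespace


def describe_failure(job):
    """Return readable lines for a failed job, or [] when nothing failed."""
    lines = []
    if job.error_result:
        lines.append(
            "job failed: {}: {}".format(
                job.error_result.get("reason", "unknown"),
                job.error_result.get("message", ""),
            )
        )
    # `errors` may be None, so fall back to an empty list.
    for err in job.errors or []:
        lines.append(
            "detail: {}: {}".format(
                err.get("reason", "unknown"), err.get("message", "")
            )
        )
    return lines


# Hypothetical stand-in for a finished QueryJob.
failed_job = SimpleNamespace(
    error_result={"reason": "invalidQuery", "message": "Syntax error"},
    errors=[{"reason": "invalidQuery", "message": "Syntax error"}],
)
ok_job = SimpleNamespace(error_result=None, errors=None)

failure_lines = describe_failure(failed_job)
ok_lines = describe_failure(ok_job)
```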
estimated_bytes_processed
Return the estimated number of bytes processed by the query.
Optional[int]
etag
ETag for the job resource.
Optional[str]
flatten_results
See QueryJobConfig.flatten_results.
job_id
str: ID of the job.
job_type
Type of job.
str
labels
Dict[str, str]: Labels for the job.
location
str: Location where the job runs.
maximum_billing_tier
See QueryJobConfig.maximum_billing_tier.
maximum_bytes_billed
See QueryJobConfig.maximum_bytes_billed.
num_child_jobs
The number of child jobs executed.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.num_child_jobs
num_dml_affected_rows
Return the number of DML rows affected by the job.
Optional[int]
parent_job_id
Return the ID of the parent job.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.parent_job_id
Optional[str]
path
URL path for the job's APIs.
str
priority
See QueryJobConfig.priority.
project
Project bound to the job.
str
query
str: The query text used in this query job.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery.FIELDS.query
query_parameters
See QueryJobConfig.query_parameters.
query_plan
Return query plan from job statistics, if present.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.query_plan
range_partitioning
See QueryJobConfig.range_partitioning.
referenced_tables
Return referenced tables from job statistics, if present.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.referenced_tables
List[Dict]
reservation_usage
Job resource usage breakdown by reservation.
schema
The schema of the results.
Present only for successful dry run of non-legacy SQL queries.
schema_update_options
See QueryJobConfig.schema_update_options.
script_statistics
Statistics for a child job of a script.
self_link
URL for the job resource.
Optional[str]
session_info
[Preview] Information of the session if this job is part of one.
.. versionadded:: 2.29.0
slot_millis
Union[int, None]: Slot-milliseconds used by this query job.
started
Datetime at which the job was started.
Optional[datetime.datetime]
state
Status of the job.
Optional[str]
statement_type
Return statement type from job statistics, if present.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.statement_type
Optional[str]
table_definitions
See QueryJobConfig.table_definitions.
time_partitioning
See QueryJobConfig.time_partitioning.
timeline
List[TimelineEntry]: Return the query execution timeline from job statistics.
total_bytes_billed
Return total bytes billed from job statistics, if present.
Optional[int]
total_bytes_processed
Return total bytes processed from job statistics, if present.
Optional[int]
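Several of these statistics are Optional and stay None until the job has finished, so code that reports on them needs to tolerate missing values. A small sketch under that assumption follows; the helper name summarize_stats and the SimpleNamespace stand-in for a job are hypothetical.

```python
from types import SimpleNamespace

GIB = 2 ** 30


def summarize_stats(job):
    """Collect a few job statistics, leaving missing values as None."""
    processed = job.total_bytes_processed
    billed = job.total_bytes_billed
    slot_ms = job.slot_millis
    return {
        "cache_hit": bool(job.cache_hit),
        "gib_processed": None if processed is None else processed / GIB,
        "gib_billed": None if billed is None else billed / GIB,
        "slot_seconds": None if slot_ms is None else slot_ms / 1000,
    }


# Hypothetical stand-in for a finished QueryJob.
stats = summarize_stats(
    SimpleNamespace(
        cache_hit=None,
        total_bytes_processed=GIB,
        total_bytes_billed=None,
        slot_millis=1500,
    )
)
```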
transaction_info
Information of the multi-statement transaction if this job is part of one.
Since a scripting query job can execute multiple transactions, this property is only expected on child jobs. Use the list_jobs method with the parent_job parameter to iterate over child jobs.
.. versionadded:: 2.24.0
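The child-job iteration described above could look roughly like this. The sketch assumes a client exposing list_jobs(parent_job=...) as named in the text; the helper name child_transactions and the _FakeClient stand-in are hypothetical.

```python
from types import SimpleNamespace


def child_transactions(client, parent_job):
    """Map each child job ID to its transaction_info (which may be None)."""
    return {
        child.job_id: child.transaction_info
        for child in client.list_jobs(parent_job=parent_job)
    }


class _FakeClient:
    """Hypothetical stand-in for a Client (not part of the library)."""

    def list_jobs(self, parent_job=None):
        return iter(
            [
                SimpleNamespace(job_id="child-1", transaction_info=None),
                SimpleNamespace(job_id="child-2", transaction_info="txn-info"),
            ]
        )


transactions = child_transactions(_FakeClient(), parent_job="script-job")
```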
udf_resources
See QueryJobConfig.udf_resources.
undeclared_query_parameters
Return undeclared query parameters from job statistics, if present.
List[Union[google.cloud.bigquery.query.ArrayQueryParameter, google.cloud.bigquery.query.ScalarQueryParameter, google.cloud.bigquery.query.StructQueryParameter]]
use_legacy_sql
See QueryJobConfig.use_legacy_sql.
use_query_cache
See QueryJobConfig.use_query_cache.
user_email
E-mail address of user who submitted the job.
Optional[str]
write_disposition
See QueryJobConfig.write_disposition.
bi_engine_stats
API documentation for bigquery.job.QueryJob.bi_engine_stats property.
dml_stats
API documentation for bigquery.job.QueryJob.dml_stats property.
Methods
add_done_callback
add_done_callback(fn)
Add a callback to be executed when the operation is complete.
If the operation is not already complete, this will start a helper thread to poll for the status of the operation in the background.
fn
Callable[Future]
The callback to execute when the operation is complete.
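To illustrate the callback shape, the sketch below uses the standard library's concurrent.futures.Future as a stand-in for a QueryJob, since both invoke the callback with the completed future as its only argument. The names on_done and finished are hypothetical; a real job completes on the server rather than through a local set_result call.

```python
from concurrent.futures import Future

finished = []


def on_done(future):
    # The callback receives the completed future itself.
    finished.append(future.result())


job = Future()  # stand-in for a QueryJob, used only for illustration
job.add_done_callback(on_done)
job.set_result("rows ready")  # simulates the job completing
print(finished)
```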
cancel
cancel(client=None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)
API call: cancel job via a POST request
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/cancel
timeout
Optional[float]
The number of seconds to wait for the underlying HTTP transport before using retry.
client
Optional[google.cloud.bigquery.client.Client]
The client to use. If not passed, falls back to the client stored on the current dataset.
retry
Optional[google.api_core.retry.Retry]
How to retry the RPC.
bool
cancelled
cancelled()
Check if the job has been cancelled.
This always returns False. It's not possible to check if a job was cancelled in the API. This method is here to satisfy the interface for google.api_core.future.Future.
bool
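Because cancelled() always returns False, one way to follow up on a cancel request is to refresh the job and inspect its state. The sketch below assumes cancel() returns a boolean for the sent request, as the signature above suggests; cancellation may not take effect immediately, so the state can still be RUNNING. The helper name request_cancel and the _FakeJob stand-in are hypothetical.

```python
def request_cancel(job):
    """Send a cancel request, refresh the job, and report its state."""
    sent = job.cancel()
    job.reload()  # refresh job properties from the API
    return sent, job.state


class _FakeJob:
    """Hypothetical stand-in for a QueryJob (not part of the library)."""

    def __init__(self):
        self.state = "RUNNING"
        self._cancel_sent = False

    def cancel(self):
        self._cancel_sent = True
        return True

    def cancelled(self):
        return False  # mirrors the documented behaviour

    def reload(self):
        if self._cancel_sent:
            self.state = "DONE"


fake_job = _FakeJob()
sent, state = request_cancel(fake_job)
```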
done
done(retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None, reload: bool = True)
Checks if the job is complete.
timeout
Optional[float]
The number of seconds to wait for the underlying HTTP transport before using retry.
reload
Optional[bool]
If True, make an API call to refresh the job state of unfinished jobs before checking. Default True.
retry
Optional[google.api_core.retry.Retry]
How to retry the RPC. If the job state is DONE, retrying is aborted early, as the job will not change anymore.
bool
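A simple polling loop built on done() might look like the following sketch. It relies on the default reload=True so each call refreshes the job state; the helper name wait_until_done, its poll_seconds and max_polls parameters, and the _FakeJob stand-in are hypothetical.

```python
import time


def wait_until_done(job, poll_seconds=1.0, max_polls=60, sleep=time.sleep):
    """Poll job.done() until it returns True or max_polls is exhausted."""
    for _ in range(max_polls):
        if job.done():  # refreshes the job state by default
            return True
        sleep(poll_seconds)
    return False


class _FakeJob:
    """Hypothetical stand-in that finishes after a fixed number of polls."""

    def __init__(self, polls_needed):
        self._remaining = polls_needed

    def done(self):
        self._remaining -= 1
        return self._remaining <= 0


# Skip real sleeping so the example finishes instantly.
finished_in_time = wait_until_done(_FakeJob(3), sleep=lambda seconds: None)
gave_up = wait_until_done(_FakeJob(3), max_polls=2, sleep=lambda seconds: None)
```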
exception
exception(timeout=None)
Get the exception from the operation, blocking if necessary.
timeout
int
How long to wait for the operation to complete. If None, wait indefinitely.
Optional[google.api_core.GoogleAPICallError]
exists
exists(client=None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)
API call: test for the existence of the job via a GET request
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
timeout
Optional[float]
The number of seconds to wait for the underlying HTTP transport before using retry.
client
Optional[google.cloud.bigquery.client.Client]
The client to use. If not passed, falls back to the client stored on the current dataset.
retry
Optional[google.api_core.retry.Retry]
How to retry the RPC.
bool
from_api_repr
from_api_repr(resource: dict, client: Client)
Factory: construct a job given its API representation
resource
Dict
dataset job representation returned from the API
client
google.cloud.bigquery.client.Client
Client which holds credentials and project configuration for the dataset.
google.cloud.bigquery.job.QueryJob
Job parsed from resource.
reload
reload(client=None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None)
API call: refresh job properties via a GET request.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
timeout
Optional[float]
The number of seconds to wait for the underlying HTTP transport before using retry.
client
Optional[google.cloud.bigquery.client.Client]
The client to use. If not passed, falls back to the client stored on the current dataset.
retry
Optional[google.api_core.retry.Retry]
How to retry the RPC.
result
result(page_size: int = None, max_results: int = None, retry: retries.Retry = <google.api_core.retry.Retry object>, timeout: float = None, start_index: int = None, job_retry: retries.Retry = <google.api_core.retry.Retry object>)
Start the job and wait for it to complete and get the result.
page_size
Optional[int]
The maximum number of rows in each page of results from this request. Non-positive values are ignored.
max_results
Optional[int]
The maximum total number of rows from this request.
timeout
Optional[float]
The number of seconds to wait for the underlying HTTP transport before using retry. If multiple requests are made under the hood, timeout applies to each individual request.
start_index
Optional[int]
The zero-based index of the starting row to read.
retry
Optional[google.api_core.retry.Retry]
How to retry the call that retrieves rows. This only applies to making RPC calls. It isn't used to retry failed jobs. This has a reasonable default that should only be overridden with care. If the job state is DONE, retrying is aborted early even if the results are not available, as this will not change anymore.
job_retry
Optional[google.api_core.retry.Retry]
How to retry failed jobs. The default retries rate-limit-exceeded errors. Passing None disables job retry. Not all jobs can be retried. If job_id was provided to the query that created this job, then the job returned by the query will not be retryable, and an exception will be raised if a non-None, non-default job_retry is also provided.
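As a rough sketch of how the page_size and max_results parameters might be combined, the helper below reads a bounded number of rows and reports the iterator's total_rows. It assumes each row exposes items() like a mapping; the helper name fetch_rows and the _FakeJob/_FakeRows stand-ins are hypothetical.

```python
def fetch_rows(job, limit=100, page_size=50):
    """Wait for the job, then read at most `limit` rows as plain dicts."""
    iterator = job.result(page_size=page_size, max_results=limit)
    rows = [dict(row.items()) for row in iterator]
    # total_rows is read after iterating, once pages have been fetched.
    return iterator.total_rows, rows


class _FakeRows:
    """Hypothetical stand-in for the returned row iterator."""

    total_rows = 3

    def __iter__(self):
        return iter([{"n": 1}, {"n": 2}])


class _FakeJob:
    """Hypothetical stand-in for a QueryJob (not part of the library)."""

    def result(self, page_size=None, max_results=None):
        return _FakeRows()


total, rows = fetch_rows(_FakeJob(), limit=2)
```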
google.cloud.exceptions.GoogleAPICallError
concurrent.futures.TimeoutError
TypeError
If a non-None and non-default job_retry is provided and the job is not retryable.
The returned iterator has its total_rows attribute set, which counts the total number of rows in the result set (this is distinct from the total number of rows in the current page: iterator.page.num_items). If the query is a special query that produces no results, e.g. a DDL query, an _EmptyRowIterator instance is returned.
running
running()
True if the operation is currently running.
set_exception
set_exception(exception)
Set the Future's exception.
set_result
set_result(result)
Set the Future's result.
to_api_repr
to_api_repr()
Generate a resource for _begin.
to_arrow
to_arrow(progress_bar_type: str = None, bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None, create_bqstorage_client: bool = True, max_results: Optional[int] = None)
[Beta] Create a pyarrow.Table by loading all pages of a table or query.
progress_bar_type
Optional[str]
If set, use the tqdm (https://tqdm.github.io/) library to display a progress bar while the data downloads. Install the tqdm package to use this feature. Possible values of progress_bar_type include:
None: No progress bar.
'tqdm': Use the tqdm.tqdm function to print a progress bar to sys.stdout.
'tqdm_notebook': Use the tqdm.notebook.tqdm function to display a progress bar as a Jupyter notebook widget.
'tqdm_gui': Use the tqdm.tqdm_gui function to display a progress bar as a graphical dialog box.
bqstorage_client
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the google-cloud-bigquery-storage library. Reading from a specific partition or snapshot is not currently supported by this method.
create_bqstorage_client
Optional[bool]
If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information. This argument does nothing if bqstorage_client is supplied.
.. versionadded:: 1.24.0
max_results
Optional[int]
Maximum number of rows to include in the result. No limit by default.
.. versionadded:: 2.21.0
to_dataframe
to_dataframe(bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None, dtypes: Dict[str, Any] = None, progress_bar_type: str = None, create_bqstorage_client: bool = True, max_results: Optional[int] = None, geography_as_object: bool = False)
Return a pandas DataFrame from a QueryJob.
bqstorage_client
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the fastavro and google-cloud-bigquery-storage libraries. Reading from a specific partition or snapshot is not currently supported by this method.
dtypes
Optional[Map[str, Union[str, pandas.Series.dtype]]]
A dictionary mapping column names to pandas dtypes. The provided dtype is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.
progress_bar_type
Optional[str]
If set, use the tqdm (https://tqdm.github.io/) library to display a progress bar while the data downloads. Install the tqdm package to use this feature. See to_dataframe for details.
.. versionadded:: 1.11.0
create_bqstorage_client
Optional[bool]
If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information. This argument does nothing if bqstorage_client is supplied.
.. versionadded:: 1.24.0
max_results
Optional[int]
Maximum number of rows to include in the result. No limit by default.
.. versionadded:: 2.21.0
geography_as_object
Optional[bool]
If True, convert GEOGRAPHY data to shapely geometry objects. If False (default), don't cast geography data to shapely geometry objects.
.. versionadded:: 2.24.0
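The dtypes, create_bqstorage_client, and max_results parameters can be combined as in the sketch below, which requests float32 for selected columns and opts out of creating a Storage API client. The helper name results_as_dataframe and the _FakeJob stand-in (which only echoes the keyword arguments it receives) are hypothetical; with a real QueryJob and pandas installed, the call would return a DataFrame.

```python
def results_as_dataframe(job, float32_columns=(), max_rows=None):
    """Call job.to_dataframe() with explicit dtypes and a row cap."""
    dtypes = {name: "float32" for name in float32_columns}
    return job.to_dataframe(
        dtypes=dtypes,
        create_bqstorage_client=False,  # do not create a Storage API client
        max_results=max_rows,
    )


class _FakeJob:
    """Hypothetical stand-in that returns the kwargs it was called with."""

    def to_dataframe(self, **kwargs):
        return kwargs


call_kwargs = results_as_dataframe(
    _FakeJob(), float32_columns=["score"], max_rows=10
)
```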
ValueError
If the pandas library cannot be imported, or the bigquery_storage_v1 module is required but cannot be imported. Also if geography_as_object is True, but the shapely library cannot be imported.
pandas.DataFrame
A pandas.DataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table's schema.
to_geodataframe
to_geodataframe(bqstorage_client: bigquery_storage.BigQueryReadClient = None, dtypes: Dict[str, Any] = None, progress_bar_type: str = None, create_bqstorage_client: bool = True, max_results: Optional[int] = None, geography_column: Optional[str] = None)
Return a GeoPandas GeoDataFrame from a QueryJob.
dtypes
Optional[Map[str, Union[str, pandas.Series.dtype]]]
A dictionary mapping column names to pandas dtypes. The provided dtype is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.
progress_bar_type
Optional[str]
If set, use the tqdm (https://tqdm.github.io/) library to display a progress bar while the data downloads. Install the tqdm package to use this feature. See to_dataframe for details.
.. versionadded:: 1.11.0
create_bqstorage_client
Optional[bool]
If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information. This argument does nothing if bqstorage_client is supplied.
.. versionadded:: 1.24.0
max_results
Optional[int]
Maximum number of rows to include in the result. No limit by default.
.. versionadded:: 2.21.0
geography_column
Optional[str]
If there is more than one GEOGRAPHY column, identifies which one to use to construct a GeoPandas GeoDataFrame. This option can be omitted if there's only one GEOGRAPHY column.
bqstorage_client
Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the fastavro and google-cloud-bigquery-storage libraries. Reading from a specific partition or snapshot is not currently supported by this method.
ValueError
If the geopandas library cannot be imported, or the bigquery_storage_v1 module is required but cannot be imported.
.. versionadded:: 2.24.0
geopandas.GeoDataFrame
A geopandas.GeoDataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table's schema.
__init__
__init__(job_id, query, client, job_config=None)
Initialize self. See help(type(self)) for accurate signature.