Changelog
2.17.0 (2025-08-22)
Features
-
Add reset_index names, col_level, col_fill, allow_duplicates args ( #2017 ) ( c02a1b6 )
-
Support callable for series mask method ( #2014 ) ( 5ac32eb )
2.16.0 (2025-08-20)
Features
-
Add `bigframes.pandas.options.display.precision` option ( #1979 ) ( 15e6175 )
-
Add level, inplace params to reset_index ( #1988 ) ( 3446950 )
-
Add ML code samples from dbt blog post ( #1978 ) ( ebaa244 )
-
Add where, coalesce, fillna, casewhen, invert local impl ( #1976 ) ( f7f686c )
-
Adjust anywidget CSS to prevent overflow ( #1981 ) ( 204f083 )
-
Support callable bigframes function for dataframe where ( #1990 ) ( 44c1ec4 )
-
Support callable for series where method ( #2005 ) ( 768b82a )
-
When using `repr_mode = "anywidget"`, numeric values align right ( 15e6175 )
Bug Fixes
-
Address the packages issue for bigframes function ( #1991 ) ( 68f1d22 )
-
Correct pypdf dependency specifier for remote PDF functions ( #1980 ) ( 0bd5e1b )
-
Enable default retries in calls to BQ Storage Read API ( #1985 ) ( f25d7bd )
-
Fix the copyright year in dbt sample files ( #1996 ) ( fad5722 )
Performance Improvements
Documentation
-
Add examples of running bigframes in kaggle ( #2002 ) ( 7d89d76 )
-
Remove preview warning from partial ordering mode sample notebook ( #1986 ) ( 132e0ed )
2.15.0 (2025-08-11)
Features
-
Add `st_buffer`, `st_centroid`, and `st_convexhull` and their corresponding GeoSeries methods ( #1963 ) ( c4c7fa5 )
-
Allow callable as a conditional or replacement input in DataFrame.where ( #1971 ) ( a8d57d2 )
Bug Fixes
-
Add warnings for duplicated or conflicting type hints in bigfram… ( #1956 ) ( d38e42c )
-
Make `remote_function` more robust when there are `create_function` retries ( #1973 ) ( cd954ac )
-
Make ExecutionMetrics stats tracking more robust to missing stats ( #1977 ) ( feb3ff4 )
Performance Improvements
Documentation
2.14.0 (2025-08-05)
Features
-
Dynamic table width for better display across devices ( #1948 ) ( a6d30ae )
-
Support series input in managed function ( #1920 ) ( 62a189f )
Bug Fixes
Performance Improvements
Documentation
-
Add code snippet for storing dataframes to a CSV file ( #1943 ) ( a511e09 )
-
Add code snippet for storing dataframes to a CSV file ( #1953 ) ( a298a02 )
2.13.0 (2025-07-25)
Features
-
_read_gbq_colab creates hybrid session ( #1901 ) ( 31b17b0 )
-
Add CSS styling for TableWidget pagination interface ( #1934 ) ( 5b232d7 )
-
Add row numbering local pushdown in hybrid execution ( #1932 ) ( 92a2377 )
Bug Fixes
Dependencies
2.12.0 (2025-07-23)
Features
-
Add code samples for dbt bigframes integration ( #1898 ) ( 7e03252 )
-
Add isin local execution to hybrid engine ( #1915 ) ( c0cefd3 )
-
Add ml.metrics.mean_absolute_error method ( #1910 ) ( 15b8449 )
-
Allow local arithmetic execution in hybrid engine ( #1906 ) ( ebdcd02 )
-
Provide day_of_year and day_of_week for dt accessor ( #1911 ) ( 40e7638 )
-
Support params `max_batching_rows`, `container_cpu`, and `container_memory` for `udf` ( #1897 ) ( 8baa912 )
-
Support typed pyarrow.Scalar in assignment ( #1930 ) ( cd28e12 )
Bug Fixes
-
Correct min field from max() to min() in remote function tests ( #1917 ) ( d5c54fc )
-
Resolve location reset issue in bigquery options ( #1914 ) ( c15cb8a )
-
Series.str.isdigit in unicode superscripts and fractions ( #1924 ) ( 8d46c36 )
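For context on the `isdigit` entry above, here is a minimal sketch of how Python's built-in string predicates classify superscripts and fractions. It uses only the standard library and does not call bigframes. The entry does not say which semantics the fix targets, so treating the built-in behavior as the reference is an assumption.

```python
# Built-in str predicates on an ASCII digit, a superscript, and a fraction.
# Standard-library behavior only; not a bigframes call.
samples = ["7", "²", "½"]

for ch in samples:
    # isdecimal is the strictest predicate, isnumeric the most permissive.
    print(repr(ch), ch.isdecimal(), ch.isdigit(), ch.isnumeric())

# Collect the isdigit result for each sample character.
digit_flags = {ch: ch.isdigit() for ch in samples}
```

In plain Python the superscript counts as a digit while the fraction is numeric only, which is why these characters are easy to get wrong when the check is translated to another engine.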
Documentation
-
Add code snippets for session and IO public docs ( #1919 ) ( 6e01cbe )
-
Add snippets for performance optimization doc ( #1923 ) ( 4da309e )
2.11.0 (2025-07-15)
Features
-
Add `__contains__` to Index, Series, DataFrame ( #1899 ) ( 07222bf )
-
Add pagination buttons (prev/next) to anywidget mode for DataFrames ( #1841 ) ( 8eca767 )
-
Add total_rows property to pandas batches iterator ( #1888 ) ( e3f5e65 )
-
Support `date` data type for to_datetime() ( #1902 ) ( 24050cb )
-
Support bpd.Series(json_data, dtype="json") ( #1882 ) ( 05cb7d0 )
Bug Fixes
-
DataFrame string addition respects order ( #1894 ) ( 52c8233 )
-
Show slot_millis_sum warning only when `allow_large_results=False` ( #1892 ) ( 25efabc )
-
Used query row count metadata instead of table metadata ( #1893 ) ( e1ebc53 )
2.10.0 (2025-07-08)
Features
-
`df.to_pandas_batches()` returns one empty DataFrame if `df` is empty ( #1878 ) ( e43d15d )
-
Add simple stats support to hybrid local pushdown ( #1873 ) ( 8715105 )
Bug Fixes
Documentation
2.9.0 (2025-06-30)
Features
-
Add `bpd.read_arrow` to convert an Arrow object into a bigframes DataFrame ( #1855 ) ( 633bf98 )
-
Create `deploy_remote_function` and `deploy_udf` functions to immediately deploy functions to BigQuery ( #1832 ) ( c706759 )
-
Support local execution of comparison ops ( #1849 ) ( 1c45ccb )
Bug Fixes
-
Fix bug with DataFrame.agg for string values ( #1870 ) ( 81e4d64 )
-
Generate GoogleSQL instead of legacy SQL data types for `dry_run=True` from `bpd._read_gbq_colab` with local pandas DataFrame ( #1867 ) ( fab3c38 )
-
Revert dict back to protobuf in the iam binding update ( #1838 ) ( 9fb3cb4 )
Documentation
2.8.0 (2025-06-23)
⚠ BREAKING CHANGES
- add required param ‘engine’ to multimodal functions ( #1834 )
Features
-
Add `bpd.options.compute.maximum_result_rows` option to limit client data download ( #1829 ) ( e22a3f6 )
-
Add `bpd.options.display.repr_mode = "anywidget"` to create an interactive display of the results ( #1820 ) ( be0a3cf )
-
Add required param ‘engine’ to multimodal functions ( #1834 ) ( 37666e4 )
Performance Improvements
Documentation
2.7.0 (2025-06-16)
Features
-
Add bbq.json_query_array and warn bbq.json_extract_array deprecated ( #1811 ) ( dc9eb27 )
-
Add bbq.json_value_array and deprecate bbq.json_extract_string_array ( #1818 ) ( 019051e )
-
Support custom build service account in `remote_function` ( #1796 ) ( e586151 )
Bug Fixes
-
Correct read_csv behaviours with use_cols, names, index_col ( #1804 ) ( 855031a )
-
Fix single row broadcast with null index ( #1803 ) ( 080eb7b )
Documentation
-
Document how to use ai.map() for information extraction ( #1808 ) ( b586746 )
-
Rearrange README.rst to include a short code sample ( #1812 ) ( f6265db )
-
Use pandas API instead of pandas-like or pandas-compatible ( #1825 ) ( aa32369 )
2.6.0 (2025-06-09)
Features
-
Implement ST_ISCLOSED geography function ( #1789 ) ( 36bc179 )
-
Implement ST_LENGTH geography function ( #1791 ) ( c5b7fda )
-
Support isin with bigframes.pandas.Index arg ( #1779 ) ( e480d29 )
Bug Fixes
-
Address `read_csv` with both `index_col` and `use_cols` behavior inconsistency with pandas ( #1785 ) ( ba7c313 )
-
Allow KMeans model init parameter as k-means++ alias ( #1790 ) ( 0b59cf1 )
-
Replace function now can handle bpd.NA value. ( #1786 ) ( 7269512 )
Documentation
-
Adjust strip method examples to match latest pandas ( #1797 ) ( 817b0c0 )
-
Fix docstrings to improve html rendering of code examples ( #1788 ) ( 38d9b73 )
2.5.0 (2025-05-30)
⚠ BREAKING CHANGES
- the updated `ai.map()` parameter list is not backward-compatible
Features
-
Add `bpd.options.bigquery.requests_transport_adapters` option ( #1755 ) ( bb45db8 )
-
Add bbq.json_query and warn bbq.json_extract deprecated ( #1756 ) ( ec81dd2 )
-
Add deprecation warning to Gemini-1.5-X, text-embedding-004, and remove legacy models in notebooks and docs ( #1723 ) ( 80aad9a )
-
Add structured output for ai map, ai filter and ai join ( #1746 ) ( 133ac6b )
-
Add support for df.loc list, column(s) ( 768a757 )
-
Include bq schema and query string in dry run results ( #1752 ) ( bb51147 )
-
Support `inplace=True` in `rename` and `rename_axis` ( #1744 ) ( 734cc65 )
-
Support astype conversions to and from JSON dtypes ( #1716 ) ( 8ef4de1 )
-
Support dict param for dataframe.agg() ( #1772 ) ( f9c29c8 )
-
Support dtype parameter in read_csv for bigquery engine ( #1749 ) ( 50dca4c )
Bug Fixes
-
Fix the default value for na_value for numpy conversions ( #1766 ) ( 0629cac )
-
Include location in Session-based temporary storage manager DDL queries ( #1780 ) ( acba032 )
-
Prevent creating unnecessary client objects in multithreaded environments ( #1757 ) ( 1cf9f5e )
-
Reduce bigquery table modification via DML for to_gbq ( #1737 ) ( 545cdca )
-
Stop ignoring arguments to `MatrixFactorization.score(X, y)` ( #1726 ) ( 55c07e9 )
-
Support JSON and STRUCT for bbq.sql_scalar ( #1754 ) ( 190390b )
-
Support str.replace re.compile with flags ( #1736 ) ( f8d2cd2 )
Performance Improvements
-
Faster local data comparison using identity ( #1738 ) ( 2858b1e )
-
Use JOB_CREATION_OPTIONAL when `allow_large_results=False` ( #1763 ) ( 15f3f2a )
Dependencies
Documentation
-
Add MatrixFactorization to the table of contents ( #1725 ) ( 611e43b )
-
Fix typo for “population” in the `GeminiTextGenerator.predict(..., output_schema={...})` sample notebook ( #1748 ) ( bd07e05 )
-
Integrations notebook extracts token from `bqclient._http.credentials` instead of `bqclient._credentials` ( #1784 ) ( 6e63eca )
-
Updated multimodal notebook instructions ( #1745 ) ( 1df8ca6 )
-
Use partial ordering mode in the quickstart sample ( #1734 ) ( 476b7dd )
2.4.0 (2025-05-12)
Features
-
Add “dayofyear” property for `dt` accessors ( #1692 ) ( 9d4a59d )
-
Add `.dt.days`, `.dt.seconds`, `dt.microseconds`, and `dt.total_seconds()` for timedelta series. ( #1713 ) ( 2b3a45f )
-
Add inplace arg support to sort methods ( #1710 ) ( d1ccb52 )
-
Improve error message in `Series.apply` for direct udfs ( #1673 ) ( 1a658b2 )
-
Publish bigframes blob(Multimodal) to preview ( #1693 ) ( e4c85ba )
-
Support () operator between timedeltas ( #1702 ) ( edaac89 )
-
Support forecast_limit_lower_bound and forecast_limit_upper_bound in ARIMA_PLUS (and ARIMA_PLUS_XREG) models ( #1305 ) ( b16740e )
-
Support to_strip parameter for str.strip, str.lstrip and str.rstrip ( #1705 ) ( a84ee75 )
Bug Fixes
Performance Improvements
Dependencies
Documentation
-
Add snippets for Matrix Factorization tutorials ( #1630 ) ( 24b37ae )
-
Deprecate `bpd.options.bigquery.allow_large_results` in favor of `bpd.options.compute.allow_large_results` ( #1597 ) ( 18780b4 )
-
Include import statement in the bigframes code snippet ( #1699 ) ( 08d70b6 )
-
Include the clean-up step in the udf code snippet ( #1698 ) ( 48992e2 )
-
Move multimodal notebook out of experimental folder ( #1712 ) ( 68b6532 )
-
Update blob_display option in snippets ( #1714 ) ( 8b30143 )
2.3.0 (2025-05-06)
Features
Bug Fixes
-
Guarantee guid thread safety across threads ( #1684 ) ( cb0267d )
-
Support large lists of lists in bpd.Series() constructor ( #1662 ) ( 0f4024c )
-
Use value equality to check types for unix epoch functions and timestamp diff ( #1690 ) ( 81e8fb8 )
Performance Improvements
-
`to_datetime()` now avoids caching inputs unless data is inspected to infer format ( #1667 ) ( dd08857 )
Documentation
-
Add a visualization notebook to BigFrame samples ( #1675 ) ( ee062bf )
-
Update snippet for `Create a k-means` model tutorial ( #1664 ) ( 761c364 )
2.2.0 (2025-04-30)
Features
-
Add gemini-2.0-flash-001 and gemini-2.0-flash-lite-001 to fine tune score endpoints and multimodal endpoints ( #1650 ) ( 4fb54df )
-
Add GeminiTextGenerator.predict structured output ( #1653 ) ( 6199023 )
-
DataFrames.__getitem__ support for slice input ( #1668 ) ( 563f0cb )
-
Print right origin of `PreviewWarning` for the `bpd.udf` ( #1629 ) ( 48d10d1 )
-
Session.bytes_processed_sum will be updated when allow_large_re… ( #1669 ) ( ae312db )
-
Support names parameter in read_csv for bigquery engine ( #1659 ) ( 3388191 )
-
Support passing list of values to bigframes.core.sql.simple_literal ( #1641 ) ( 102d363 )
Bug Fixes
-
Prefer remote schema instead of throwing on materialize conflicts ( #1644 ) ( 53fc25b )
-
Resolve issue where pre-release versions of google-auth are installed ( #1491 ) ( ebb7a5e )
Performance Improvements
Dependencies
Documentation
-
Fix `bq_dataframes_template` notebook to work if partial ordering mode is enabled ( #1665 ) ( f442e7a )
-
Note that `udf` is in preview and must be python 3.11 compatible ( #1629 ) ( 48d10d1 )
2.1.0 (2025-04-22)
Features
-
Add `bigframes.bigquery.st_distance` function ( #1637 ) ( bf1ae70 )
-
Enhance `read_csv` `index_col` parameter support ( #1631 ) ( f4e5b26 )
Bug Fixes
-
Add retry for test_clean_up_via_context_manager ( #1627 ) ( 58e7cb0 )
-
Improve robustness of managed udf code extraction ( #1634 ) ( 8cc56d5 )
Documentation
2.0.0 (2025-04-17)
⚠ BREAKING CHANGES
-
make `dataset` and `name` params mandatory in `udf` ( #1619 )
-
Locational endpoints support is not available in BigFrames 2.0.
-
change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator ( #1558 )
-
change default ingress setting for `remote_function` to internal-only ( #1544 )
-
make `remote_function` params keyword only ( #1537 )
-
make `remote_function` default service account explicit ( #1537 )
-
set `allow_large_results=False` by default ( #1541 )
Features
-
Add `on` parameter in `dataframe.rolling()` and `dataframe.groupby.rolling()` ( #1556 ) ( 45c9d9f )
-
Add component to manage temporary tables ( #1559 ) ( 0a4e245 )
-
Add support for creating a Matrix Factorization model ( #1330 ) ( b5297f9 )
-
Allow `input_types`, `output_type`, and `dataset` to be used positionally in `remote_function` ( #1560 ) ( bcac8c6 )
-
Allow pandas.cut ‘labels’ parameter to accept a list of string ( #1549 ) ( af842b1 )
-
Change default ingress setting for `remote_function` to internal-only ( #1544 ) ( c848a80 )
-
Detect duplicate column/index names in read_gbq before send query. ( #1615 ) ( 40d6960 )
-
Enable time range rolling for DataFrame, DataFrameGroupBy and SeriesGroupBy ( #1605 ) ( b4b7073 )
-
Make `remote_function` default service account explicit ( #1537 ) ( 9eb9089 )
-
Set `allow_large_results=False` by default ( #1541 ) ( e9fb712 )
-
Support bigquery connection in managed function ( #1554 ) ( f6f697a )
-
Support inlining small list, struct, json data ( #1589 ) ( 2ce891f )
-
Use session temp tables for all ephemeral storage ( #1569 ) ( 9711b83 )
-
Use validated local storage for data uploads ( #1612 ) ( aee4159 )
-
Warn the deprecated `max_download_size`, `random_state` and `sampling_method` parameters in `(DataFrame|Series).to_pandas()` ( #1573 ) ( b9623da )
Bug Fixes
-
`to_pandas_batches()` respects `page_size` and `max_results` again ( #1572 ) ( 27c5905 )
-
Ensure `page_size` works correctly in `to_pandas_batches` when `max_results` is not set ( #1588 ) ( 570cff3 )
-
Include role and service account in IAM exception ( #1564 ) ( 8c50755 )
-
Make `dataset` and `name` params mandatory in `udf` ( #1619 ) ( 637e860 )
-
Pandas.cut returns labels index for numeric breaks when labels=False ( #1548 ) ( b2375de )
-
Prevent `KeyError` in `bpd.concat` with empty DF and struct/array types DF ( #1568 ) ( b4da1cf )
-
Read_csv supports for tilde local paths and includes index for bigquery_stream write engine ( #1580 ) ( 352e8e4 )
-
Use dictionaries to avoid problematic google.iam namespace ( #1611 ) ( b03e44f )
Performance Improvements
Dependencies
Documentation
-
Add details for `bigquery_connection` in `@bpd.udf` docstring ( #1609 ) ( ef63772 )
-
Add explain forecast snippet to multiple time series tutorial ( #1586 ) ( 40c55a0 )
-
Add message to remove default model for version 3.0 ( #1563 ) ( 910be2b )
-
Add samples for ArimaPlus `time_series_id_col` feature ( #1577 ) ( 1e4cd9c )
-
Deprecate default model in `TextEmbeddingGenerator`, `GeminiTextGenerator`, and other `bigframes.ml.llm` classes ( #1570 ) ( 89ab33e )
-
Include all licenses for vendored packages in the root LICENSE file ( #1626 ) ( 8116ed0 )
-
Remove gemini-1.5 deprecation warning for `GeminiTextGenerator` ( #1562 ) ( 0cc6784 )
Use restructured text to allow publishing to PyPI ( #1565 ) ( d1e9ec2 )
Miscellaneous Chores
1.42.0 (2025-03-27)
Features
-
Add `GeoSeries.difference()` and `bigframes.bigquery.st_difference()` ( #1471 ) ( e9fe815 )
-
Add `GeoSeries.intersection()` and `bigframes.bigquery.st_intersection()` ( #1529 ) ( 8542bd4 )
-
Add Linear_Regression.global_explain() ( #1446 ) ( 7e5b6a8 )
-
Allow iloc to support lists of negative indices ( #1497 ) ( a9cf215 )
-
Support window partition by geo column ( #1512 ) ( bdcb1e7 )
Bug Fixes
-
Add deprecation warning to TextEmbeddingGenerator model, especially gemini-1.0-X and gemini-1.5-X ( #1534 ) ( c93e720 )
-
Change the default value for pdf extract/chunk ( #1517 ) ( a70a607 )
-
Local data always has sequential index ( #1514 ) ( 014bd33 )
-
Read_pandas inline returns None when exceeds limit ( #1525 ) ( 578081e )
-
Temporary fix for StreamingDataFrame not working backend bug ( #1533 ) ( 6ab4ffd )
-
Tolerate BQ connection service account propagation delay ( #1505 ) ( 6681f1f )
Performance Improvements
Documentation
1.41.0 (2025-03-19)
Features
-
Add support for the ‘right’ parameter in ‘pandas.cut’ ( #1496 ) ( 8aff128 )
-
Support BQ managed functions through `read_gbq_function` ( #1476 ) ( 802183d )
-
Warn when the BigFrames version is more than a year old ( #1455 ) ( 00e0750 )
Bug Fixes
Performance Improvements
Documentation
1.40.0 (2025-03-11)
⚠ BREAKING CHANGES
- reading JSON data as a custom arrow extension type ( #1458 )
Features
-
Reading JSON data as a custom arrow extension type ( #1458 ) ( e720f41 )
-
Support list output for managed function ( #1457 ) ( 461e9e0 )
Bug Fixes
-
Fix list-like indexers in partial ordering mode ( #1456 ) ( fe72ada )
-
Fix the merge issue between 1424 and 1373 ( #1461 ) ( 7b6e361 )
-
Use `==` instead of `is` for timedelta type equality checks ( #1480 ) ( 0db248b )
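The distinction behind the `==` versus `is` entry above can be sketched in plain Python. `DurationType` is a hypothetical stand-in for a parameterized type object, not a bigframes or pyarrow class; the sketch only assumes that two separately constructed type objects can carry the same value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DurationType:
    """Hypothetical stand-in for a parameterized type object."""

    unit: str


# Two separately constructed objects that describe the same type.
first = DurationType("us")
second = DurationType("us")

same_value = first == second    # value equality: compares the fields
same_object = first is second   # identity: compares the objects themselves
```

An identity check passes only when both sides are the very same object, so it can reject types that are equal in every meaningful way.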
Performance Improvements
1.39.0 (2025-03-05)
Features
-
(Preview) Support `diff()` for date series ( #1423 ) ( 521e987 )
-
(Preview) Support aggregations over timedeltas ( #1418 ) ( 1251ded )
-
(Preview) Support arithmetics between dates and timedeltas ( #1413 ) ( 962b152 )
-
(Preview) Support automatic load of timedelta from BQ tables. ( #1429 ) ( b2917bb )
-
Add `allow_large_results` option to many I/O methods. Set to `False` to reduce latency ( #1428 ) ( dd2f488 )
-
Support interface for BigQuery managed functions ( #1373 ) ( 2bbf53f )
-
Warn if default ingress_settings is used in remote_functions ( #1419 ) ( dfd891a )
Bug Fixes
-
Do not compare schema description during schema validation ( #1452 ) ( 03a3a56 )
-
Remove warnings for null index and partial ordering mode in prep for GA ( #1431 ) ( 6785aee )
-
Warn if default `cloud_function_service_account` is used in `remote_function` ( #1424 ) ( fe7463a )
-
Write chunked text instead of dummy text for pdf chunk ( #1444 ) ( 96b0e8a )
Performance Improvements
Documentation
1.38.0 (2025-02-24)
Features
-
(Preview) Support diff aggregation for timestamp series. ( #1405 ) ( abe48d6 )
-
Add `GeoSeries.from_wkt()` and `GeoSeries.to_wkt()` ( #1401 ) ( 2993b28 )
-
Support routines with ARRAY return type in `read_gbq_function` ( #1412 ) ( 4b60049 )
Bug Fixes
-
Calling to_timedelta() over timedeltas no longer changes their values ( #1411 ) ( 650a190 )
-
Replace empty dict with None to avoid mutable default arguments ( #1416 ) ( fa4e3ad )
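The mutable-default pitfall named in the last entry above can be sketched in plain Python. The function names are hypothetical and are not taken from the bigframes code base.

```python
def add_label_shared(key, labels={}):
    # Pitfall: the default dict is created once, at definition time,
    # and is then shared by every call that omits `labels`.
    labels[key] = True
    return labels


def add_label_fresh(key, labels=None):
    # Fix: use None as the sentinel and build a new dict per call.
    if labels is None:
        labels = {}
    labels[key] = True
    return labels


shared_first = dict(add_label_shared("a"))   # snapshot after the first call
shared_second = dict(add_label_shared("b"))  # also contains "a"
fresh_first = add_label_fresh("a")
fresh_second = add_label_fresh("b")          # contains only "b"
```

With the `None` sentinel, state no longer leaks between unrelated calls.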
Performance Improvements
Dependencies
Documentation
-
Add samples using SQL methods via the `bigframes.bigquery` module ( #1358 ) ( f54e768 )
-
Add snippets for visualizing a time series and creating a time series model for the Limit forecasted values in time series model tutorial ( #1310 ) ( c6c9120 )
1.37.0 (2025-02-19)
Features
-
(Preview) Support add, sub, mult, div, and more between timedeltas ( #1396 ) ( ffa63d4 )
-
(Preview) Support comparison, ordering, and filtering for timedeltas ( #1387 ) ( 34d01b2 )
-
(Preview) Support subtraction in DATETIME/TIMESTAMP columns with timedelta columns ( #1390 ) ( 50ad3a5 )
-
JSON dtype support for read_pandas and Series constructor ( #1391 ) ( 44f4137 )
Bug Fixes
Performance Improvements
Documentation
1.36.0 (2025-02-11)
Features
-
(Preview) Support addition between a timestamp and a timedelta ( #1369 ) ( b598aa8 )
-
(Preview) Support casting floats and list-likes to timedelta series ( #1362 ) ( 65933b6 )
-
(Preview) Support timestamp subtractions ( #1346 ) ( 86b7e72 )
-
Add `bigframes.bigquery.st_area` and suggest it from `GeoSeries.area` ( #1318 ) ( 8b5ffa8 )
Bug Fixes
-
Dtype parameter ineffective in Series/DataFrame construction ( #1354 ) ( b9bdca8 )
-
Translate labels to col ids when copying dataframes ( #1372 ) ( 0c55b07 )
Performance Improvements
1.35.0 (2025-02-04)
Features
-
(Preview) Support timedeltas for read_pandas() ( #1349 ) ( 866ba9e )
-
Allow `case_when` to change dtypes if case list contains the condition `(True, some_default_value)` ( #1311 ) ( 5c2a2c6 )
-
Support time_series_id_col in ARIMAPlus ( #1282 ) ( 97532c9 )
Bug Fixes
-
Exclude `DataFrame` and `Series` `__call__` from unimplemented API metrics ( #1351 ) ( f2d5264 )
-
Make `DataFrame` `__getattr__` and `__setattr__` more robust to subclassing ( #1352 ) ( 417de3a )
Performance Improvements
Dependencies
Documentation
-
Add link to DataFrames intro to improve SEO ( #1176 ) ( aafb5be )
-
Add snippet to explain the univariate model’s forecast result in the Forecast a single time series with a univariate model tutorial ( #1272 ) ( c22126b )
1.34.0 (2025-01-27)
⚠ BREAKING CHANGES
- Enable reading JSON data with `dbjson` extension dtype ( #1139 )
Features
-
(df|s).hist(), (df|s).line(), (df|s).area(), (df|s).bar(), df.scatter() ( #1320 ) ( bd3f584 )
-
(Preview) Define timedelta type and to_timedelta function ( #1317 ) ( 3901951 )
-
Enable reading JSON data with `dbjson` extension dtype ( #1139 ) ( f672262 )
1.33.0 (2025-01-22)
Features
-
Add `bigframes.bigquery.sql_scalar()` to apply SQL syntax on Series objects ( #1293 ) ( aa2f73a )
-
Add unix_seconds, unix_millis and unix_micros for timestamp series. ( #1297 ) ( e4b0c8d )
-
Support array output in `remote_function` ( #1057 ) ( bdee173 )
Bug Fixes
-
Dataframe sort_values Series input keyerror. ( #1285 ) ( 5a2731b )
-
Fix read_gbq_function issue in dataframe apply method ( #1174 ) ( 0318764 )
-
Series sort_index and sort_values now raises when axis!=0 ( #1294 ) ( 94bc2f2 )
Documentation
-
Add snippet to forecast future time series in the Forecast a single time series with a univariate model tutorial ( #1271 ) ( a687050 )
1.32.0 (2025-01-13)
Features
Bug Fixes
-
Avoid global mutation in `BigQueryOptions.client_endpoints_override` ( #1280 ) ( 788f6e9 )
-
Fix erroneous window bounds removal during compilation ( #1163 ) ( f91756a )
Dependencies
Documentation
-
Add bq studio links that allow users to generate Jupyter notebooks in bq studio with github contents ( #1266 ) ( 58f13cb )
-
Add snippet to evaluate ARIMA plus model in the Forecast a single time series with a univariate model tutorial ( #1267 ) ( 3dcae2d )
-
Add snippet to see the ARIMA coefficients in the Forecast a single time series with a univariate model tutorial ( #1268 ) ( 059a564 )
-
Update `bigframes.pandas.pandas` docstrings ( #1247 ) ( c4bffc3 )
-
Use 002 model for better scalability in text generation ( #1270 ) ( bb7a850 )
1.31.0 (2025-01-05)
Features
Bug Fixes
-
Raise if trying to change `ordering_mode` after session has started ( #1252 ) ( 8cfaae8 )
-
Reduce the number of labels added to query jobs ( #1245 ) ( fdcdc18 )
Documentation
1.30.0 (2024-12-30)
Features
-
Add `LinearRegression.predict_explain()` to generate `ML.EXPLAIN_PREDICT` columns ( #1190 ) ( e13eca2 )
-
Add `LogisticRegression.predict_explain()` to generate `ML.EXPLAIN_PREDICT` columns ( #1222 ) ( bcbc732 )
-
Add `write_engine` parameter to `read_FORMATNAME` methods to control how data is written to BigQuery ( #371 ) ( ed47ef1 )
-
Add client side retry to GeminiTextGenerator ( #1242 ) ( 8193abe )
-
Add Gemini-pro-1.5 to GeminiTextGenerator Tuning and Support score() method in Gemini-pro-1.5 ( #1208 ) ( 298fc73 )
-
Add support for `LinearRegression.predict_explain` and `LogisticRegression.predict_explain` parameter, `top_k_features` ( #1228 ) ( 3068e19 )
Bug Fixes
-
Throw an error message when setting is_row_processor=True to read a multi param function ( #1160 ) ( b2816a5 )
Documentation
-
Add an “open in BQ Studio” link to all BigFrames sample notebooks ( #1223 ) ( e0a8288 )
-
Add bq studio link for a new ipynb file called “bq_dataframes_template.ipynb” ( #1239 ) ( 840aaff )
-
Add examples for ml PCA and SimpleImputer ( #1236 ) ( 0d84459 )
-
Add python snippet for “Create the time series model” section of the Forecast a single time series with a univariate model tutorial ( #1227 ) ( 20f3190 )
1.29.0 (2024-12-12)
Features
Documentation
-
Add Gemini 2.0 text gen sample notebook ( #1211 ) ( 9596b66 )
-
Update bigframes.pandas.index docs return types ( #1191 ) ( c63e7da )
1.28.0 (2024-12-11)
Features
-
`bigframes.bigquery.vector_search` supports `use_brute_force` and `fraction_lists_to_search` parameters ( #1158 ) ( 131edc3 )
-
Add `ARIMAPlus.predict_explain()` to generate forecasts with explanation columns ( #1177 ) ( 05f8b4d )
-
Add client_endpoints_override to bq options ( #1167 ) ( be74b99 )
-
Add support for temporal types in dataframe’s describe() method ( #1189 ) ( 2d564a6 )
-
Allow join-free alignment of analytic expressions ( #1168 ) ( daef4f0 )
-
Series.isin supports bigframes.Series arg ( #1195 ) ( 0d8a16b )
-
Update llm.TextEmbeddingGenerator to 005 ( #1186 ) ( 3072d38 )
Bug Fixes
-
Fix error loading local dataframes into bigquery ( #1165 ) ( 5b355ef )
-
Fix series.isin using local path always ( #1202 ) ( a44eafd )
Performance Improvements
Dependencies
- Remove `ibis-framework` by vendoring a fork of the package to `bigframes_vendored`. ( #1170 ) ( 421d24d )
Documentation
-
Add a code sample using `bpd.options.bigquery.ordering_mode = "partial"` ( #909 ) ( f80d705 )
-
Add snippet for creating boosted tree model ( #1142 ) ( a972668 )
-
Add snippet for evaluating a boosted tree model ( #1154 ) ( 9d8970a )
-
Add snippet for predicting classifications using a boosted tree model ( #1156 ) ( e7b83f1 )
-
Add third party `pandas.Index methods` and docstrings ( #1171 ) ( a970294 )
-
Fix Bigframes.Pandas.General_Function missing docs ( #1164 ) ( de923d0 )
-
Update `bigframes.pandas.Index` docstrings ( #1144 ) ( 557ab8d )
1.27.0 (2024-11-16)
Features
Bug Fixes
Documentation
1.26.0 (2024-11-12)
Features
Bug Fixes
-
Fix Series.to_frame generating string label instead of int where name is None ( #1118 ) ( 14e32b5 )
-
Update the API documentation with newly added rep ( #1120 ) ( 72c228b )
Performance Improvements
Documentation
-
Add file for Classification with a Boosted Tree Model and snippet for preparing sample data ( #1135 ) ( 7ac6639 )
-
Add snippet for Linear Regression tutorial Predict Outcomes section ( #1101 ) ( 108f4a9 )
-
Update `DataFrame` docstrings to include the errors section ( #1127 ) ( a38d4c4 )
-
Update Session docstrings to include exceptions ( #1130 ) ( a870421 )
1.25.0 (2024-10-29)
Features
-
Add the `ground_with_google_search` option for GeminiTextGenerator predict ( #1119 ) ( ca02cd4 )
-
Add warning when user tries to access struct series fields with `__getitem__` ( #1082 ) ( 20e5c58 )
-
Allow `fit` to take additional eval data in linear and ensemble models ( #1096 ) ( 254875c )
-
Support context manager for bigframes session ( #1107 ) ( 5f7b8b1 )
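The context-manager pattern mentioned in the last entry above can be sketched with Python's `with` protocol. `DemoSession` is a hypothetical class written for illustration; it is not the bigframes `Session`, and the entry does not specify what cleanup the real session performs on exit.

```python
class DemoSession:
    """Hypothetical session object; not the bigframes Session class."""

    def __init__(self):
        self.closed = False

    def close(self):
        # Stand-in for releasing whatever resources a session holds.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs on normal exit and when the block raises.
        self.close()
        return False  # do not swallow exceptions


with DemoSession() as session:
    open_inside = not session.closed

closed_after = session.closed
```

The benefit of the pattern is that cleanup runs even when the body of the `with` block raises.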
Performance Improvements
1.24.0 (2024-10-24)
Features
Documentation
1.23.0 (2024-10-23)
Features
-
Add `bigframes.bigquery.create_vector_index` to assist in creating vector index on `ARRAY<FLOAT64>` columns ( #1024 ) ( 863d694 )
-
Add gemini-1.5-pro-002 and gemini-1.5-flash-002 to known Gemini model list. ( #1105 ) ( 7094c85 )
-
Add support for pandas series & data frames as inputs for ml models. ( #1088 ) ( 30c8883 )
-
Cleanup temp resources with session deletion ( #1068 ) ( 1d5373d )
-
Show possible correct key(s) in `.__getitem__` KeyError message ( #1097 ) ( 32fab96 )
Bug Fixes
-
Escape ids more consistently in ml module ( #1074 ) ( 103e998 )
-
Remove index requirement from some dataframe APIs ( #1073 ) ( 2d16f6d )
-
Update session metrics in `read_gbq_query` ( #1084 ) ( dced460 )
Performance Improvements
-
Speed up tree transforms during sql compile ( #1071 ) ( d73fe9d )
-
Utilize ORDER BY LIMIT over ROW_NUMBER where possible ( #1077 ) ( 7003d1a )
Documentation
-
Add ml tutorial for Evaluate the model ( #1038 ) ( a120bae )
-
Show best practice of closing the session to cleanup resources in sample notebooks ( #1095 ) ( 62a88e8 )
-
Update docstrings of Session and related files ( #1087 ) ( bf93e80 )
1.22.0 (2024-10-09)
Features
-
Support regional endpoints for more bigquery locations ( #1061 ) ( 45b672a )
-
Update LLM generators to warn user about model name instead of raising error. ( #1048 ) ( 650d80d )
Bug Fixes
-
Access MATERIALIZED_VIEW with read_gbq ( #1070 ) ( 601e984 )
-
Correct zero row count in DataFrame from table view ( #1062 ) ( b536070 )
-
Fix generic error message when entering an incorrect column name ( #1031 ) ( 5ac217d )
-
Make invalid location warning case-insensitive ( #1044 ) ( b6cd55a )
-
Remove palm2 test case from llm load test ( #1063 ) ( 575a10a )
-
Show warning for unknown location set through .ctor ( #1052 ) ( 02c2da7 )
Performance Improvements
Documentation
1.21.0 (2024-10-02)
Features
-
Add deprecation warning to PaLM2TextGenerator model ( #1035 ) ( 1183b0f )
-
Add DeprecationWarning for PaLM2TextEmbeddingGenerator ( #1018 ) ( 4af5bbb )
-
Add ml.model_selection.cross_validate support ( #1020 ) ( 1a38063 )
-
Allow access of struct fields with dot operators on `Series` ( #1019 ) ( ef76f13 )
Bug Fixes
-
Ensure no double execution for to_pandas ( #1032 ) ( 4992cc2 )
-
Remove pre-caching of remote function results ( #1028 ) ( 0359bc8 )
Documentation
1.20.0 (2024-09-25)
Features
-
Add bigframes.bigquery.approx_top_count ( #1010 ) ( 3263bd7 )
-
Add bigframes.ml.compose.SQLScalarColumnTransformer to create custom SQL-based transformations ( #955 ) ( 1930b4e )
-
Allow multiple columns input for llm models ( #998 ) ( 2fe5e48 )
Bug Fixes
Documentation
-
Limit pypi notebook to 7 days and add more info about differences with partial ordering mode ( #1013 ) ( 3c54399 )
-
Move and edit existing linear-regression tutorial snippet ( #991 ) ( 4cb62fd )
1.19.0 (2024-09-24)
Features
-
Support bool and bytes types in `describe(include='all')` ( #994 ) ( cc48f58 )
-
Support ingress settings in `remote_function` ( #1011 ) ( 8e9919b )
Bug Fixes
Performance Improvements
Dependencies
1.18.0 (2024-09-18)
Features
-
Add “include” param to describe for string types ( #973 ) ( deac6d2 )
-
Add `subset` parameter to `DataFrame.dropna` to select which columns to consider ( #981 ) ( f7c03dc )
Bug Fixes
-
DataFrameGroupby.agg now works with unnamed tuples ( #985 ) ( 0f047b4 )
-
Fix a bug that raises exception when re-indexing columns with their original order ( #988 ) ( 596b03b )
-
Make the `Series.apply` outcome `assign`able to the original dataframe in partial ordering mode ( #874 ) ( c94ead9 )
Dependencies
1.17.0 (2024-09-11)
Features
-
Add `__version__` alias to bigframes.pandas ( #967 ) ( 9ce10b4 )
-
Define list accessor for bigframes Series ( #946 ) ( 8e8279d )
-
Enable read_csv() to process other files ( #940 ) ( 3b35860 )
-
Include the bigframes package version alongside the feedback link in error messages ( #936 ) ( 7b59b6d )
Bug Fixes
-
Make `read_gbq_function` work for multi-param functions ( #947 ) ( c750be6 )
-
Support `read_gbq_function` for axis=1 application ( #950 ) ( 86e54b1 )
Documentation
-
Add docstring returns section to Options ( #937 ) ( a2640a2 )
-
Update title of pypi notebook example to reflect use of the PyPI public dataset ( #952 ) ( cd62e60 )
1.16.0 (2024-09-04)
Features
-
Add `DataFrame.struct.explode` to add struct subfields to a DataFrame ( #916 ) ( ad2f75e )
-
Implement `bigframes.bigquery.json_extract_array` ( #910 ) ( 575a29e )
-
Recover struct column from exploded Series ( #904 ) ( 7dd304c )
Bug Fixes
-
Fix issue with iterating on >10gb dataframes ( #949 ) ( 2b0f0fa )
-
Unordered mode errors in ml train_test_split ( #925 ) ( 85d7c21 )
Performance Improvements
Dependencies
Documentation
-
Add Claude3 ML and RemoteFunc notebooks ( #930 ) ( cfd16c1 )
-
Create sample notebook to manipulate struct and array data ( #883 ) ( 3031903 )
-
Use unstack() from BigQuery DataFrames instead of pandas in the PyPI sample notebook ( #890 ) ( d1883cc )
1.15.0 (2024-08-20)
Features
Documentation
-
Add columns for “requires ordering/index” to supported APIs summary ( #892 ) ( d2fc51a )
-
Remove duplicate description for `kms_key_name` ( #898 ) ( 1053d56 )
1.14.0 (2024-08-14)
Features
Bug Fixes
Performance Improvements
Documentation
1.13.0 (2024-08-05)
Features
-
`df.apply(axis=1)` to support remote function with multiple params ( #851 ) ( 2158818 )
-
Allow windowing in ‘partial’ ordering mode ( #861 ) ( ca26fe5 )
-
Create a separate OrderingModePartialPreviewWarning for more fine-grained warning filters ( #879 ) ( 8753bdd )
Bug Fixes
Documentation
1.12.0 (2024-07-31)
Features
-
Add config option to set partial ordering mode ( #855 ) ( 823c0ce )
-
Add stratify param support to ml.model_selection.train_test_split method ( #815 ) ( 27f8631 )
-
Allow DataFrame.join for self-join on Null index ( #860 ) ( e950533 )
-
Support remote function cleanup with `session.close` ( #818 ) ( ed06436 )
-
Support to_csv/parquet/json to local files/objects ( #858 ) ( d0ab9cc )
Bug Fixes
-
Fewer relation joins from df self-operations ( #823 ) ( 0d24f73 )
-
Fix unordered mode using ordered path to print frame ( #839 ) ( 93785cb )
-
Reduce redundant `remote_function` deployments ( #856 ) ( cbf2d42 )
Documentation
-
Add partner attribution steps to integrations sample notebook ( #835 ) ( d7b333f )
-
Make `get_global_session`/`close_session`/`reset_session` appear in the docs ( #847 ) ( 01d6bbb )
1.11.1 (2024-07-08)
Documentation
-
Remove session and connection in llm notebook ( #821 ) ( 74170da )
-
Remove the experimental flask icon from the public docs ( #820 ) ( 067ff17 )
1.11.0 (2024-07-01)
Features
-
Add `bigframes.streaming.to_pubsub` method to create continuous query that writes to Pub/Sub ( #801 ) ( b47f32d )
-
Add `DataFrame.to_arrow` to create Arrow Table from DataFrame ( #807 ) ( 1e3feda )
-
Add `PolynomialFeatures` support to `to_gbq` and pipelines ( #805 ) ( 57d98b9 )
-
Add Series.peek to preview data efficiently ( #727 ) ( 580e1b9 )
-
Expose gcf memory param in `remote_function` ( #803 ) ( 014765c )
-
More informative error when query plan too complex ( #811 ) ( 136dc24 )
Bug Fixes
Documentation
1.10.0 (2024-06-21)
Features
-
Add ml.preprocessing.PolynomialFeatures class ( #793 ) ( b4fbb51 )
-
bigframes.streaming module for continuous queries ( #703 ) ( 0433a1c )
-
Include index columns in DataFrame.sql if they are named ( #788 ) ( c8d16c0 )
Bug Fixes
-
Allow `__repr__` to work with uninitialized DataFrame/Series/Index ( #778 ) ( e14c7a9 )
-
df.loc with the 2nd input as bigframes boolean Series ( #789 ) ( a4ac82e )
-
Ensure numpy version matches in `remote_function` deployment ( #798 ) ( 324d93c )
-
Fix temp table creation retries by now throwing if table already exists. ( #787 ) ( 0e57d1f )
-
Self-join optimization doesn’t needlessly invalidate caching ( #797 ) ( 1b96b80 )
1.9.0 (2024-06-10)
Features
-
Allow functions returned from `bpd.read_gbq_function` to execute outside of `apply` ( #706 ) ( ad7d8ac )
Bug Fixes
-
ARIMAPlus loads auto_arima_min_order param ( #752 ) ( 39d7013 )
-
Improve to_pandas_batches for large results ( #746 ) ( 61f18cb )
-
Resolve issue with unset thread-local options ( #741 ) ( d93dbaf )
Documentation
1.8.0 (2024-05-31)
Features
-
`merge` only generates a default index if both inputs already have an index ( #733 ) ( 25d049c )
-
Add `GroupBy.size()` to get number of rows in each group ( #479 ) ( 1fca588 )
-
Add slot_millis and add stats to session object ( #725 ) ( 72e9583 )
-
Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings ( #731 ) ( f12c906 )
-
Allow functions decorated with `bpd.remote_function()` to execute locally ( #704 ) ( d850da6 )
-
Ensure `"bigframes-api"` label is always set on jobs, even if the API is unknown ( #722 ) ( 1832778 )
-
Support type annotations to supply input and output types to `bpd.remote_function()` decorator ( #717 ) ( 4a12e3c )
-
Support type annotations with `bpd.remote_function()` and `axis=1` (a preview feature) ( #730 ) ( e5a2992 )
Bug Fixes
-
Correct index labels in multiple aggregations for DataFrameGroupBy ( #723 ) ( 6a78c89 )
-
Set `bpd.remote_function()`'s `input_types` and `output_types` default to `None` to allow omitting them when type annotations are present ( #729 ) ( 0e25a3b )
-
Warn and disable time travel for linked datasets ( #712 ) ( 085fa9d )
Performance Improvements
Documentation
1.7.0 (2024-05-20)
Features
-
`read_gbq_query` supports `filters` ( 9386373 )
-
`read_gbq` suggests a correct column name when one is not found ( 9386373 )
-
Add `DefaultIndexKind.NULL` to use as `index_col` in `read_gbq*`, creating an indexless DataFrame/Series ( #662 ) ( 29e4886 )
-
bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupby) ( #663 ) ( 412f28b )
-
to_datetime supports utc=False for string inputs ( #579 ) ( adf9889 )
Bug Fixes
-
`read_gbq_table` respects primary keys even when `filters` are set ( #689 ) ( 9386373 )
-
Improve escaping of literals and identifiers ( #682 ) ( da9b136 )
-
Properly identify non-unique index in tables without primary keys ( #699 ) ( 6e0f4d8 )
-
Remove a usage of the `resource` package when not available, such as on Windows ( #681 ) ( 96243f2 )
-
The imported samples error and use peek() ( #688 ) ( 1a0b744 )
Performance Improvements
-
Don’t run query immediately from `read_gbq_table` if `filters` is set ( 9386373 )
-
Use a `LIMIT` clause when `max_results` is set ( 9386373 )
Documentation
-
Add code snippets for imported onnx tutorials ( #684 ) ( cb36e46 )
-
Add code snippets for imported tensorflow model ( #679 ) ( b02c401 )
-
Use `class_weight="balanced"` in the logistic regression prediction tutorial ( #678 ) ( b951549 )
1.6.0 (2024-05-13)
Features
-
Add `strategy="quantile"` in KBinsDiscretizer ( #654 ) ( c6c487f )
-
Suggest correct options in bpd.options.bigquery.location ( #666 ) ( 57ccabc )
-
Support `axis=1` in `df.apply` for scalar outputs ( #629 ) ( f6bdc4a )
-
Support gcf vpc connector in `remote_function` ( #677 ) ( 9ca92d0 )
-
Warn with a more specific `DefaultLocationWarning` category when no location can be detected ( #648 ) ( e084e54 )
Bug Fixes
Dependencies
- Add jellyfish as a dependency for spelling correction ( 57ccabc )
Documentation
-
Add code snippets for llm text generation ( #669 ) ( 93416ed )
-
Document inlining of small data in `read_*` APIs ( #670 ) ( 306953a )
1.5.0 (2024-05-07)
Features
-
`bigframes.options` and `bigframes.option_context` now use thread-local variables to prevent context managers in separate threads from affecting each other ( #652 ) ( 651fd7d )
-
Add `ARIMAPlus.coef_` property exposing `ML.ARIMA_COEFFICIENTS` functionality ( #585 ) ( 81d1262 )
-
Add a unique session_id to Session and allow cleaning up sessions ( #553 ) ( c8d4e23 )
-
Add the `bigframes.bigquery` sub-package with a `bigframes.bigquery.array_length` function ( #630 ) ( 9963f85 )
-
Always do a query dry run when `option.repr_mode == "deferred"` ( #652 ) ( 651fd7d )
-
Custom query labels for compute options ( #638 ) ( f561799 )
-
Warn with `DefaultIndexWarning` from `read_gbq` on clustered/partitioned tables with no `index_col` or `filters` set ( #631 , #658 ) ( 2715d2b , 73064dd )
-
Support `index_col=False` in `read_csv` and `engine="bigquery"` ( 73064dd )
-
Support gcf max instance count in `remote_function` ( #657 ) ( 36578ab )
Bug Fixes
-
Don’t raise UnknownLocationWarning for US or EU multi-regions ( #653 ) ( 8e4616b )
-
Fix bug with na in the column labels in stack ( #659 ) ( 4a34293 )
-
Use explicit session in `PaLM2TextGenerator` ( #651 ) ( e4f13c3 )
Documentation
-
Add python code sample for multiple forecasting time series ( #531 ) ( 16866d2 )
-
Fix the Palm2TextGenerator output token size ( #649 ) ( c67e501 )
1.4.0 (2024-04-29)
Features
-
Add .cache() method to persist intermediate dataframe ( #626 ) ( a5c94ec )
-
Add transpose support for small homogeneously typed DataFrames. ( #621 ) ( 054075d )
-
Allow single input type in `remote_function` ( #641 ) ( 3aa643f )
-
Expose gcf max timeout in `remote_function` ( #639 ) ( dfeaad0 )
-
Series binary ops compatible with more types ( #618 ) ( 518d315 )
-
Support the `score` method for `PaLM2TextGenerator` ( #634 ) ( 3ffc1d2 )
Bug Fixes
-
Allow to_pandas to download more than 10GB ( #637 ) ( ce56495 )
-
Extend row hash to 128 bits to guarantee unique row id ( #632 ) ( 9005c6e )
Performance Improvements
-
Automatically condense internal expression representation ( #516 ) ( 03c1b0d )
-
Cache transpose to allow performant retranspose ( #635 ) ( 44b738d )
Documentation
-
Add supported pandas apis on the main page ( #628 ) ( 8d2a51c )
-
Add the first sample for the Single time-series forecasting from Google Analytics data tutorial ( #623 ) ( 2b84c4f )
-
Address more technical writers’ feedback ( #640 ) ( 1e7793c )
1.3.0 (2024-04-22)
Features
-
Add fine tuning `fit()` for Palm2TextGenerator ( #616 ) ( 9c106bd )
-
Expose `max_batching_rows` in `remote_function` ( #622 ) ( 240a1ac )
-
Support primary key(s) in `read_gbq` by using as the `index_col` by default ( #625 ) ( 75bb240 )
-
Warn if location is set to unknown location ( #609 ) ( 3706b4f )
Bug Fixes
-
Infer narrowest numeric type when combining numeric columns ( #602 ) ( 8f9ece6 )
-
Use exact median implementation by default ( #619 ) ( 9d205ae )
Documentation
-
Fix rendering of examples for multiple apis ( #620 ) ( 9665e39 )
-
Set `index_cols` in `read_gbq` as a best practice ( #624 ) ( 70015b7 )
1.2.0 (2024-04-15)
Features
Bug Fixes
-
Address more technical writers feedback ( #581 ) ( 4b08d92 )
-
Inverting int now does bitwise inversion rather than sign flip ( #574 ) ( 5f1db8b )
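The inversion fix above matches ordinary Python integer semantics, where `~x` is bitwise NOT (equal to `-x - 1`) rather than negation. A minimal sketch in plain Python, not bigframes code; the sample values are arbitrary:

```python
# Plain Python integers: ~x is bitwise NOT, which equals -x - 1,
# not the sign flip -x.
values = [0, 1, 5, -3]

bitwise = [~v for v in values]    # bitwise inversion
sign_flip = [-v for v in values]  # negation, shown only for contrast

print(bitwise)    # [-1, -2, -6, 2]
print(sign_flip)  # [0, -1, -5, 3]
```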
Documentation
-
Add code samples for `str` accessor methods ( #594 ) ( a557ea2 )
-
Add docs for `DataFrame` and `Series` dunder methods ( #562 ) ( 8fc26c4 )
1.1.0 (2024-04-04)
Features
-
Add support for numpy expm1, log1p, floor, ceil, arctan2 ops ( #505 ) ( e8e66cf )
-
Allow DataFrame binary ops to align on either axis and with loc… ( #544 ) ( 6d8f3af )
-
Expose `DataFrame.bqclient` to assist in integrations ( #519 ) ( 0be8911 )
-
read_pandas accepts pandas Series and Index objects ( #573 ) ( f8821fe )
-
Support `ML.GENERATE_EMBEDDING` in `PaLM2TextEmbeddingGenerator` ( #539 ) ( 1156c1e )
-
Support max_columns in repr and make repr more efficient ( #515 ) ( 54e49cf )
Bug Fixes
-
Don’t download 100gb onto local python machine in load test ( #537 ) ( 082c58b )
-
Exclude list-like s parameter in plot.scatter ( #568 ) ( 1caac27 )
-
Fix case where df.peek would fail to execute even with force=True ( #511 ) ( 8eca99a )
-
Plot.scatter s parameter cannot accept float-like column ( #563 ) ( 8d39187 )
-
Product operation produces float result for all input types ( #501 ) ( 6873b30 )
-
Rename PaLM2TextEmbeddingGenerator.predict output columns to be backward compatible ( #561 ) ( 4995c00 )
-
Respect hard stack size limit and swallow limit change exception. ( #558 ) ( 4833908 )
-
Restore string to date/time type coercion ( #565 ) ( 4ae0262 )
-
Sync the notebook with embedding changes ( #550 ) ( 347f2dd )
-
Use bytes limit on frame inlining rather than element count ( #576 ) ( 659a161 )
Performance Improvements
Dependencies
Documentation
-
`bigframes.options.bigquery.project` and `location` are optional in some circumstances ( #548 ) ( 90bcec5 )
-
Add “Supported pandas APIs” reference to the documentation ( #542 ) ( 74c3915 )
-
Add General Availability banner to README ( #507 ) ( 262ff59 )
-
Add the code samples for metrics{auc, roc_auc_score, roc_curve} ( #520 ) ( 5f37b09 )
-
Address more comments from technical writers to meet legal purposes ( #571 ) ( 9084df3 )
-
Migrate the overview page to Bigframes official landing page ( #536 ) ( a0fb8bb )
1.0.0 (2024-03-25)
⚠ BREAKING CHANGES
-
rename model parameter `min_rel_progress` to `tol`
-
`early_stop` setting no longer supported, always uses `True`
-
rename model parameter `n_parallell_trees` to `n_estimators`
-
rename `class_weights` to `class_weight`
-
rename `learn_rate` to `learning_rate`
-
PCA `n_components` supports float value and `None`, default to `None`
-
rename various ml model parameters for consistency with sklearn ( https://github.com/googleapis/python-bigquery-dataframes/pull/491 )
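For callers that still pass the pre-1.0.0 keyword names, the renames listed above could be applied mechanically. The sketch below is a hypothetical helper in plain Python: `migrate_kwargs`, `RENAMED`, and `REMOVED` are illustrative names and not part of bigframes, and the mapping contains only the renames this list spells out.

```python
# Old-to-new parameter names, taken from the 1.0.0 breaking changes list.
RENAMED = {
    "min_rel_progress": "tol",
    "n_parallell_trees": "n_estimators",
    "class_weights": "class_weight",
    "learn_rate": "learning_rate",
}
# Settings that are no longer accepted at all.
REMOVED = {"early_stop"}


def migrate_kwargs(old_kwargs):
    """Hypothetical helper: return a copy of old_kwargs using 1.0.0 names."""
    new_kwargs = {}
    for name, value in old_kwargs.items():
        if name in REMOVED:
            continue  # early_stop is always True from 1.0.0 on
        new_kwargs[RENAMED.get(name, name)] = value
    return new_kwargs


print(migrate_kwargs({"learn_rate": 0.1, "early_stop": True, "max_iterations": 20}))
# {'learning_rate': 0.1, 'max_iterations': 20}
```

Names that are not in the mapping pass through unchanged, so the helper is safe to run on keyword arguments that already use the new spelling.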
Features
-
Allow assigning directly to Series.name property ( #495 ) ( ad0e99e )
-
Ensure `Series.str.len()` can get length of array columns ( #497 ) ( 10c0446 )
-
Option to use bq connection without check ( #460 ) ( 0b3f8e5 )
-
PCA `n_components` supports float value and `None`, default to `None` ( 65c6f47 )
-
Rename `class_weights` to `class_weight` ( 65c6f47 )
-
Rename `learn_rate` to `learning_rate` ( 65c6f47 )
-
Rename model parameter `min_rel_progress` to `tol` ( 65c6f47 )
-
Rename model parameter `n_parallell_trees` to `n_estimators` ( 65c6f47 )
-
Rename various ml model parameters for consistency with sklearn ( https://github.com/googleapis/python-bigquery-dataframes/pull/491 ) ( 65c6f47 )
-
Support BQ regional endpoints for europe-west9, europe-west3, us-east4, and us-west1 ( #504 ) ( fbada4a )
Bug Fixes
-
`early_stop` setting no longer supported, always uses `True` ( 65c6f47 )
-
Plot.scatter `c` argument functionalities ( #494 ) ( d6ee994 )
-
Properly support format param for numerical input. ( #486 ) ( ae20c35 )
-
Re-enable to_csv and to_json related tests ( #468 ) ( 2b9a01d )
-
Sampling plot cannot preserve ordering if index is not ordered ( #475 ) ( a5345fe )
-
Use actual BigQuery types rather than ibis types in to_pandas ( #500 ) ( 82b4f91 )
Dependencies
Documentation
-
Add code samples for metrics.{accuracy_score, confusion_matrix} ( #478 ) ( 3e3329a )
-
Add code samples for metrics.{recall_score, precision_score, f1_score} ( #502 ) ( 370fe90 )
-
Update bigquery connection documentation ( #499 ) ( 4bfe094 )
-
Update LLM + K-means notebook to handle partial failures ( #496 ) ( 97afad9 )
0.26.0 (2024-03-20)
⚠ BREAKING CHANGES
- exclude remote models for .register() ( #465 )
Features
-
`read_gbq_table` supports `LIKE` as an operator in `filters` ( #454 ) ( d2d425a )
-
Set `force=True` by default in `DataFrame.peek()` ( #469 ) ( 4e8e97d )
-
Support datetime related casting in (Series|DataFrame|Index).astype ( #442 ) ( fde339b )
Bug Fixes
-
Any() on empty set now correctly returns False ( #471 ) ( f55680c )
-
Disable to_json and to_csv related tests ( #462 ) ( 874026d )
-
Fix grouping series on multiple other series ( #455 ) ( 3971bd2 )
-
Groupby aggregates no longer check if grouping keys are numeric ( #472 ) ( 4fbf938 )
-
Raise `ValueError` when `read_pandas()` receives a bigframes `DataFrame` ( #447 ) ( b28f9fd )
-
Series.(to_csv|to_json) leverages bq export ( #452 ) ( 718a00c )
-
Warn when `read_gbq`/`read_gbq_table` uses the snapshot time cache ( #441 ) ( e16a8c0 )
Documentation
-
Add code samples for `ml.metrics.r2_score` ( #459 ) ( 85fefa2 )
-
Add version information to bug template ( #437 ) ( 91bd39e )
-
Indicate that project and location are optional in example notebooks ( #451 ) ( 1df0140 )
0.25.0 (2024-03-14)
Features
-
(Series|DataFrame).plot.(line|area|scatter) ( #431 ) ( 0772510 )
-
Support CMEK for `remote_function` cloud functions ( #430 ) ( 2fd69f4 )
0.24.0 (2024-03-12)
⚠ BREAKING CHANGES
-
`read_parquet` uses a “pandas” engine to parse files by default. Use `engine="bigquery"` for the previous behavior
Features
Bug Fixes
-
Move `third_party.bigframes_vendored` to `bigframes_vendored` ( #424 ) ( 763edeb )
-
Only do row identity based joins when joining by index ( #356 ) ( 76b252f )
Documentation
-
Add predict sample to samples/snippets/bqml_getting_started_test.py ( #388 ) ( 6a3b0cc )
-
Fix the note rendering for DataFrames methods: nlargest, nsmallest ( #417 ) ( 38bd2ba )
0.23.0 (2024-03-05)
Features
-
Add ml.metrics.pairwise.euclidean_distance ( #397 ) ( 1726588 )
-
Add TextEmbedding model version support ( #394 ) ( e0f1ab0 )
Bug Fixes
-
Code exception in `remote_function` now prevents retry and surfaces in the client ( #387 ) ( dd3643d )
Dependencies
- Update ibis to version 8.0.0 and refactor `remote_function` to use ibis UDF method ( #277 ) ( 350499b )
Documentation
0.22.0 (2024-02-27)
⚠ BREAKING CHANGES
-
rename cosine_similarity to paired_cosine_distances ( #393 )
-
move model optional args to kwargs ( #381 )
Features
-
Add ml.metrics.pairwise.manhattan_distance ( #392 ) ( 9d31865 )
-
Enable regional endpoints for me-central2 ( #386 ) ( 469674d )
Bug Fixes
-
Avoid ibis warning for “database” table() method argument ( #390 ) ( a0490a4 )
-
Rename cosine_similarity to paired_cosine_distances ( #393 ) ( 81ece46 )
Performance Improvements
Dependencies
Documentation
-
Add a code sample for creating a kmeans model ( #267 ) ( 4291d65 )
-
Fix `bigframes.pandas.concat` documentation ( #382 ) ( 234b61c )
Miscellaneous Chores
Code Refactoring
0.21.0 (2024-02-13)
Features
-
Add ml.metrics.pairwise.cosine_similarity function ( #374 ) ( 126f566 )
-
Limited support of lambdas in `Series.apply` ( #345 ) ( 208e081 )
-
Support bigframes.pandas.to_datetime for scalars, iterables and series. ( #372 ) ( ffb0d15 )
Bug Fixes
Documentation
0.20.1 (2024-02-06)
Performance Improvements
Documentation
0.20.0 (2024-01-30)
Features
-
Add `DataFrame.peek()` as an efficient alternative to `head()` results preview ( #318 ) ( 9c34d83 )
-
Add ARIMA_EVALUATE options in forecasting models ( #336 ) ( 73e997b )
-
Add Index constructor, repr, copy, get_level_values, to_series ( #334 ) ( e5d054e )
-
Improve error message for drive based BQ table reads ( #344 ) ( 0794788 )
-
Update cut to work without labels = False and show intervals as dict ( #335 ) ( 4ff53db )
Bug Fixes
-
Change default connection name in getting_started.ipynb ( #347 ) ( 677f014 )
-
Series iteration correctly returns values instead of index ( #339 ) ( 2c6af9b )
Documentation
0.19.2 (2024-01-22)
Bug Fixes
Documentation
0.19.1 (2024-01-17)
Bug Fixes
Documentation
0.19.0 (2024-01-09)
Features
-
Add ‘columns’ as an alias for ‘col_order’ ( #298 ) ( a01b271 )
-
Add Series dt.tz and dt.unit properties ( #303 ) ( 2e1a403 )
-
Allow manually set clustering_columns in dataframe.to_gbq ( #302 ) ( 9c21323 )
-
Support assigning to columns like a property ( #304 ) ( f645c56 )
-
Support upcasting numeric columns in concat ( #294 ) ( e3a056a )
Bug Fixes
Documentation
0.18.0 (2024-01-02)
Features
-
Add IntervalIndex support to bigframes.pandas.cut ( #254 ) ( 6c1969a )
-
Specific pyarrow mappings for decimal, bytes types ( #283 ) ( a1c0631 )
Bug Fixes
-
Dataframes to_gbq now creates dataset if it doesn’t exist ( #222 ) ( bac62f7 )
-
Exclude pandas 2.2.0rc0 to unblock prerelease tests ( #292 ) ( ac1a745 )
-
Fix DataFrameGroupby.agg() issue with as_index=False ( #273 ) ( ab49350 )
-
Make `Series.str.replace` work for simple strings ( #285 ) ( ad67465 )
-
Update dataframe.to_gbq to dedup column names. ( #286 ) ( 746115d )
Dependencies
Documentation
-
Add code snippets for explore query result page ( #278 ) ( 7cbbb7d )
-
Code samples for `astype` common to DataFrame and Series ( #280 ) ( 95b673a )
-
Code samples for `DataFrame.copy` and `Series.copy` ( #290 ) ( 7cbc2b0 )
-
Code samples for `isna`, `isnull`, `dropna`, `isin` ( #289 ) ( ad51035 )
-
Code samples for `reset_index` and `sort_values` ( #282 ) ( acc0eb7 )
-
Code samples for `sample`, `get`, `Series.round` ( #295 ) ( c2b1892 )
-
Code samples for `Series.{add, replace, unique, T, transpose}` ( #287 ) ( 0e1bbfc )
-
Code samples for `Series.{map, to_list, count}` ( #290 ) ( 7cbc2b0 )
-
Code samples for `Series.{name, std, agg}` ( #293 ) ( eb69f60 )
-
Code samples for `Series.groupby` and `Series.{sum,mean,min,max}` ( #280 ) ( 95b673a )
-
Code samples for DataFrame `set_index`, `items` ( #295 ) ( c2b1892 )
0.17.0 (2023-12-14)
Features
Bug Fixes
-
Increase recursion limit, cache compilation tree hashes ( #184 ) ( b54791c )
-
Replaced raise `NotImplementedError` with return `NotImplemented` ( #258 ) ( a133822 )
Documentation
-
Add code samples for `values` and `value_counts` ( #249 ) ( f247d95 )
-
Add sample for getting started with BQML ( #141 ) ( fb14f54 )
0.16.0 (2023-12-12)
Features
-
Add DataFrame from_dict and from_records methods ( #244 ) ( 8d81e24 )
-
Add nunique method to Series/DataFrameGroupby ( #256 ) ( c8ec245 )
-
Support dataframe.loc with conditional columns selection ( #233 ) ( 3febea9 )
Bug Fixes
-
Enforce pandas version requirement <2.1.4 ( #265 ) ( 9dd63f6 )
-
Exclude pandas 2.1.4 from prerelease tests to unblock e2e tests ( b02fc2c )
-
Fix value_counts column label for normalize=True ( #245 ) ( d3fa6f2 )
-
Migrate e2e tests to bigframes-load-testing project ( 8766ac6 )
Documentation
-
Add example for dataframe.melt, dataframe.pivot, dataframe.stac… ( #252 ) ( 8c63697 )
-
Add example to dataframe.nlargest, dataframe.nsmallest, datafra… ( #234 ) ( e735412 )
-
Add examples for dataframe.cummin, dataframe.cummax, dataframe.cumsum, dataframe.cumprod ( #243 ) ( 0523a31 )
-
Add examples for dataframe.nunique, dataframe.diff, dataframe.a… ( #251 ) ( 77074ec )
-
Correct the params rendering for `ml.remote` and `ml.ensemble` modules ( #248 ) ( c2829e3 )
-
Fix return annotation in API docstrings ( #253 ) ( 89a1c67 )
0.15.0 (2023-11-29)
⚠ BREAKING CHANGES
- model.predict returns all the columns ( #204 )
Features
-
Add info and memory_usage methods to dataframe ( #219 ) ( 9d6613d )
-
Add the recent api method for ML component ( #225 ) ( ed8876d )
-
Send warnings on LLM prediction partial failures ( #216 ) ( 81125f9 )
Bug Fixes
-
Avoid unnecessary row_number() on sort key for io ( #211 ) ( a18d40e )
-
Make to_pandas override enable_downsampling when sampling_method is manually set. ( #200 ) ( ae03756 )
-
Update the llm+kmeans notebook with recent change ( #236 ) ( f8917ab )
-
Use anonymous dataset to create `remote_function` ( #205 ) ( 69b016e )
Documentation
-
Add code samples for `index` and `column` properties ( #212 ) ( c88d38e )
-
Add code samples for df reshaping, function, merge, and join methods ( #203 ) ( 010486c )
-
Add examples for dataframe.kurt, dataframe.std, dataframe.count ( #232 ) ( f9c6e72 )
-
Add examples for dataframe.mean, dataframe.median, dataframe.va… ( #228 ) ( edd0522 )
-
Add examples for dataframe.min, dataframe.max and dataframe.sum ( #227 ) ( 3a375e8 )
-
Code samples for `Series.dot` and `DataFrame.dot` ( #226 ) ( b62a07a )
-
Code samples for `Series.where` and `Series.mask` ( #217 ) ( 52dfad2 )
-
Code samples for dataframe.any, dataframe.all and dataframe.prod ( #223 ) ( d7957fa )
-
Make the code samples reflect default bq connection usage ( #206 ) ( 71844b0 )
Miscellaneous Chores
0.14.1 (2023-11-16)
Bug Fixes
Documentation
0.14.0 (2023-11-14)
Features
-
Add ‘index’, ‘pad’, ‘nearest’ interpolate methods ( #162 ) ( 6a28403 )
-
Add series.sample (identical to existing dataframe.sample) ( #187 ) ( 37914a4 )
-
Log most recent API calls as `recent-bigframes-api-xx` labels on BigQuery jobs ( #145 ) ( 4ea33b7 )
-
read_gbq creates order deterministically without table copy ( #191 ) ( 8ab81de )
-
Support `date_series.astype("string[pyarrow]")` to cast DATE to STRING ( #186 ) ( aee0e8e )
-
Temporary resources no longer use BigQuery Sessions ( #194 ) ( 4a02cac )
Bug Fixes
-
Default to 7 days expiration for `read_csv`, `read_json`, `read_parquet` ( #193 ) ( 03606cd )
-
Deprecate the `remote_service_type` in llm model ( #180 ) ( a8a409a )
-
For reset_index on unnamed multiindex, always use level_[n] label ( #182 ) ( f95000d )
-
Match pandas behavior when assigning listlike to empty dfs ( #172 ) ( c1d1f42 )
-
Use anonymous dataset instead of session dataset for temp tables ( #181 ) ( 800d44e )
-
Use random table when loading data for `read_csv`, `read_json`, `read_parquet` ( #175 ) ( 9d2e6dc )
Documentation
-
Add code samples for `read_gbq_function` using community UDFs ( #188 ) ( 7506eab )
-
Add docstring code samples for `Series.apply` and `DataFrame.map` ( #185 ) ( c816d84 )
-
Add llm kmeans notebook as an included example ( #177 ) ( d49ae42 )
-
Use `head()` to get top `n` results, not to preview results ( #190 ) ( 87f84c9 )
0.13.0 (2023-11-07)
Features
-
`to_gbq` without a destination table writes to a temporary table ( #158 ) ( e1817c9 )
-
Add `DataFrame.__iter__`, `DataFrame.iterrows`, `DataFrame.itertuples`, and `DataFrame.keys` methods ( #164 ) ( c065071 )
-
Add interpolate() to series and dataframe ( #157 ) ( b9cb55c )
-
Support 32k text-generation and multilingual embedding models ( #161 ) ( 5f0ea37 )
Bug Fixes
0.12.0 (2023-11-01)
Features
-
Add `DataFrame.to_pandas_batches()` to download large `DataFrame` objects ( #136 ) ( 3afd4a3 )
-
Add bigframes.options.compute.maximum_bytes_billed option that sets maximum bytes billed on query jobs ( #133 ) ( 63c7919 )
Bug Fixes
-
Don’t override the global logging config ( #138 ) ( 2ddbf74 )
-
Fix bug with column names under repeated column assignment ( #150 ) ( 29032d0 )
-
Resolve plotly rendering issue by using ipython html for job pro… ( #134 ) ( 39df43e )
-
Use indexee’s session for loc listlike cases ( #152 ) ( 27c5725 )
Documentation
-
Fix indentation on `read_gbq_function` code sample ( #163 ) ( 0801d96 )
-
Link to ML.EVALUATE BQML page for score() methods ( #137 ) ( 45c617f )
0.11.0 (2023-10-26)
Features
-
Add back `reset_session` as an alias for `close_session` ( #124 ) ( 694a85a )
-
Change `query` parameter to `query_or_table` in `read_gbq` ( #127 ) ( f9bb3c4 )
Bug Fixes
-
Expose `bigframes.pandas.reset_session` as a public API ( #128 ) ( b17e1f4 )
-
Use series’s own session in series.reindex listlike case ( #135 ) ( 95bff3f )
Documentation
-
Add runnable code samples for DataFrames I/O methods and property ( #129 ) ( 6fea8ef )
-
Add runnable code samples for reading methods ( #125 ) ( a669919 )
0.10.0 (2023-10-19)
Features
0.9.0 (2023-10-18)
⚠ BREAKING CHANGES
- rename `bigframes.pandas.reset_session` to `close_session` ( #101 )
Features
-
Add `bigframes.options.bigquery.application_name` for partner attribution ( #117 ) ( 52d64ff )
-
Rename `bigframes.pandas.reset_session` to `close_session` ( #101 ) ( 36693bf )
-
Send BigQuery cancel request when canceling bigframes process ( #103 ) ( e325fbb )
-
Support external packages in `remote_function` ( #98 ) ( ec10c4a )
-
Use ArrowDtype for STRUCT columns in `to_pandas` ( #85 ) ( 9238fad )
Bug Fixes
Performance Improvements
Documentation
0.8.0 (2023-10-12)
⚠ BREAKING CHANGES
- The default behavior of `to_parquet` is changing from no compression to `'snappy'` compression.
Features
- Support compression in `to_parquet` ( a8c286f )
Bug Fixes
0.7.0 (2023-10-11)
Features
-
Add aliases for several series properties ( #80 ) ( c0efec8 )
-
Add iat and iloc accessing by tuples of integers ( #90 ) ( 228aeba )
Bug Fixes
Documentation
0.6.0 (2023-10-04)
Features
-
Add update and align methods to dataframe ( #57 ) ( bf050cf )
-
Support STRUCT data type with `Series.struct.field` to extract child fields ( #71 ) ( 17afac9 )
Bug Fixes
-
Avoid `403 response too large to return` error with `read_gbq` and large query results ( #77 ) ( 8f3b5b2 )
-
Change return type of `Series.loc[scalar]` ( #40 ) ( fff3d45 )
-
Fix df/series.iloc by list with multiindex ( #79 ) ( 971d091 )
0.5.0 (2023-09-28)
Features
-
Add `DataFrame.kurtosis` / `DF.kurt` method ( c1900c2 )
-
Add `DataFrame.rolling` and `DataFrame.expanding` methods ( c1900c2 )
-
Add `items`, `apply` methods to `DataFrame`. ( #43 ) ( 3adc1b3 )
-
Add axis param to simple df aggregations ( #52 ) ( 9cf9972 )
-
Add index `dtype`, `astype`, `drop`, `fillna`, aggregate attributes. ( #38 ) ( 1a254a4 )
-
Support `calculate_p_values` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
-
Support `class_weights="balanced"` in `LogisticRegression` model ( c1900c2 )
-
Support `df[column_name] = df_only_one_column` ( c1900c2 )
-
Support `early_stop` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
-
Support `enable_global_explain` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
-
Support `l2_reg` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
-
Support `learn_rate_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
-
Support `ls_init_learn_rate` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
-
Support `max_iterations` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
-
Support `min_rel_progress` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
-
Support `optimize_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
-
Support casting string to integer or float ( #59 ) ( 3502f83 )
Bug Fixes
-
Generate unique ids on join to avoid id collisions ( #65 ) ( 7ab65e8 )
-
LabelEncoder params consistent with Sklearn ( #60 ) ( 632caec )
-
Loosen filter items tests to accommodate shifting pandas impl ( #41 ) ( edabdbb )
Performance Improvements
-
Add ability to cache dataframe and series to session table ( #51 ) ( 416d7cb )
-
Inline small `Series` and `DataFrames` in query text ( #45 ) ( 5e199ec )
-
Reimplement unpivot to use cross join rather than union ( #47 ) ( f9a93ce )
-
Simplify join order to use multiple order keys instead of string. ( #36 ) ( 5056da6 )
Documentation
- Link to Remote Functions code samples from README and API reference ( c1900c2 )
0.4.0 (2023-09-16)
Features
-
Add `axis` parameter to `droplevel` and `reorder_levels` ( 7c6b0dd )
-
Add `bfill` and `ffill` to `DataFrame` and `Series` ( 7c6b0dd )
-
Add `DataFrame.combine` and `DataFrame.combine_first` ( #27 ) ( 7c6b0dd )
-
Add `DataFrame.nlargest`, `nsmallest` ( 7c6b0dd )
-
Add `DataFrame.pct_change` and `Series.pct_change` ( 7c6b0dd )
-
Add `DataFrame.skew` and `GroupBy.skew` ( 7c6b0dd )
-
Add `DataFrame.to_dict`, `to_excel`, `to_latex`, `to_records`, `to_string`, `to_markdown`, `to_pickle`, `to_orc` ( 7c6b0dd )
-
Add `diff` method to `DataFrame` and `GroupBy` ( 7c6b0dd )
-
Add `filter` and `reindex` to `Series` and `DataFrame` ( 7c6b0dd )
-
Add `reindex_like` to `DataFrame` and `Series` ( 7c6b0dd )
-
Add `swaplevel` to `DataFrame` and `Series` ( 7c6b0dd )
-
Add partial support for `Series.replace` ( 7c6b0dd )
-
Support `DataFrame.loc[bool_series, column] = scalar` ( 7c6b0dd )
-
Support a persistent `name` in `remote_function` ( 7c6b0dd )
Bug Fixes
-
`remote_function` uses same credentials as other APIs ( 7c6b0dd )
-
Add type hints to models ( 7c6b0dd )
-
Raise error when ARIMAPlus is used with Pipeline ( 7c6b0dd )
-
Remove `transforms` parameter in `model.fit` (breaking change) ( 7c6b0dd )
-
Support column joins with “None indexer” ( 7c6b0dd )
-
Use `Int64Dtype` for literals in `cut` ( 7c6b0dd )
-
Use lowercase strings for parameter literals in `bigframes.ml` (breaking change) ( 7c6b0dd )
Performance Improvements
-
`bigframes-api` label to I/O query jobs ( 7c6b0dd )
Documentation
-
Document possible parameter values for PaLM2TextGenerator ( 7c6b0dd )
-
Document region logic in README ( 7c6b0dd )
-
Fix OneHotEncoder sample ( 7c6b0dd )
0.3.2 (2023-09-06)
Bug Fixes
0.3.1 (2023-09-05)
Bug Fixes
0.3.0 (2023-09-02)
Features
-
Add
bigframes.get_global_session()
andbigframes.reset_session()
aliases ( a32b747 ) -
Add
bigframes.pandas.read_pickle
function ( a32b747 ) -
Add
components_
,explained_variance_
, andexplained_variance_ratio_
properties tobigframes.ml.decomposition.PCA
( 89b9503 ) -
Add
fit_transform
tobigquery.ml
transformers ( a32b747 ) -
Add
Series.dropna
andDataFrame.fillna
( 8fab755 ) -
Add
Series.str
methodsisalpha
,isdigit
,isdecimal
,isalnum
,isspace
,islower
,isupper
,zfill
,center
( a32b747 ) -
Support
bigframes.pandas.merge()
( 8fab755 ) -
Support
DataFrame.isin
with list and dict inputs ( 8fab755 ) -
Support
DataFrame.pivot
( a32b747 ) -
Support
DataFrame.stack
( 89b9503 ) -
Support
DataFrame
-DataFrame
binary operations ( 8fab755 ) -
Support
df[my_column] = [a python list]
( 89b9503 ) -
Support
Index.is_monotonic
( 8fab755 ) -
Support
np.arcsin
,np.arccos
,np.arctan
,np.sinh
,np.cosh
,np.tanh
,np.arcsinh
,np.arccosh
,np.arctanh
,np.exp
with Series argument ( 89b9503 ) -
Support
np.sin
,np.cos
,np.tan
,np.log
,np.log10
,np.sqrt
,np.abs
with Series argument ( 89b9503 ) -
Support pow() and power operator in DataFrame and Series ( 8fab755 )
-
Support read_json with engine=bigquery for newline-delimited JSON files ( 89b9503 )
-
Support Series.corr ( 89b9503 )
-
Support Series.map ( 8fab755 )
-
Support for np.add, np.subtract, np.multiply, np.divide, np.power ( 8fab755 )
-
Support MultiIndex for DataFrame columns ( a32b747 )
-
Use pandas.Index for column labels ( a32b747 )
-
Use default session and connection in ml.llm and ml.imported ( 8fab755 )
Bug Fixes
-
Add error message to set_index ( a32b747 )
-
Align column names with pandas in DataFrame.agg results ( 89b9503 )
-
Allow ORDER BY (still not recommended) in read_gbq input when an index_col is defined ( 89b9503 )
-
Check for IAM role on the BigQuery connection when initializing a remote_function ( 89b9503 )
-
Check that types are specified in read_gbq_function ( a32b747 )
-
Don’t use query cache for Session construction ( a32b747 )
-
Include survey link in abstract NotImplementedError exception messages ( 89b9503 )
-
Label temp table creation jobs with source=bigquery-dataframes-temp label ( 89b9503 )
-
Make X_train argument names consistent across methods ( 8fab755 )
-
Raise AttributeError for unimplemented pandas methods ( 89b9503 )
-
Raise exception for invalid function in read_gbq_function ( a32b747 )
-
Support spaces in column names in DataFrame initializer ( 89b9503 )
Performance Improvements
-
Add local cache for __repr_*__ methods ( a32b747 )
-
Lazily instantiate client library objects ( 89b9503 )
-
Use row_number() filter for head / tail ( 8fab755 )
Documentation
-
Add ML section under Overview ( a32b747 )
-
Add release status to table of contents ( a32b747 )
-
Add samples and best practices to read_gbq docs ( a32b747 )
-
Correct the return types of DataFrame and Series ( a32b747 )
-
Create subfolders for notebooks ( a32b747 )
-
Fix link to GitHub ( 89b9503 )
-
Highlight bigframes is open-source ( a32b747 )
-
Sample ML Drug Name Generation notebook ( a32b747 )
-
Set options.bigquery.project in sample code ( 89b9503 )
-
Transform remote function user guide into sample code ( a32b747 )
-
Update remote function notebook with read_gbq_function usage ( 8fab755 )
0.2.0 (2023-08-17)
Features
-
Add KMeans.cluster_centers_.
-
Allow column labels to be any type handled by BigQuery DataFrames; column labels can now be integers.
-
Add DataFrameGroupBy.agg().
-
Add Series properties is_monotonic_increasing and is_monotonic_decreasing.
-
Add match, fullmatch, get, pad str methods.
-
Add Series.isin function.
Bug Fixes
-
Update ML package to use sessions for queries.
-
Optimize read_gbq with index_col set to cluster by index_col.
-
Raise ValueError if the location is mismatched.
-
read_gbq no longer uses ‘time travel’ with query inputs.
Documentation
- Add docstring to _uniform_sampling to discourage users from calling it.
0.1.1 (2023-08-14)
Documentation
- Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.
0.1.0 (2023-08-11)
Features
-
Add bigframes.pandas package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
-
Add bigframes.ml package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.
0.0.0 (2023-02-22)
- Empty package to reserve package name.