Changelog

PyPI History

2.28.0 (2025-11-03)

Features

Add bigframes.bigquery.st_simplify ( #2210 ) ( ecee2bc )
Add Series.dt.day_name ( #2218 ) ( 5e006e4 )
Polars engine supports std, var ( #2215 ) ( ef5e83a )
Support INFORMATION_SCHEMA views in read_gbq ( #1895 ) ( d97cafc )
Support some python standard lib callables in apply/combine ( #2187 ) ( 86a2756 )

Bug Fixes

Correct connection normalization in blob system tests ( #2222 ) ( a0e1e50 )
Improve error handling in blob operations ( #2194 ) ( d410046 )
Resolve AttributeError in TableWidget and improve initialization ( #1937 ) ( 4c4c9b1 )

Documentation

Update bq_dataframes_llm_output_schema.ipynb ( #2004 ) ( 316ba9f )

2.27.0 (2025-10-24)

Features

Add __abs__ to dataframe ( #2186 ) ( c331dfe )
Add df.groupby().corr()/cov() support ( #2190 ) ( ccd7c07 )
Add str accessor to index ( #2179 ) ( cd87ce0 )
Add support for np.isnan and np.isfinite ufuncs ( #2188 ) ( 68723bc )
Include local data bytes in the dry run report when available ( #2185 ) ( ee2c40c )
Support len() on Groupby objects ( #2183 ) ( 4191821 )
Support pa.json_(pa.string()) in struct/list if available ( #2180 ) ( 5ec3cc0 )

Documentation

Update AI operators deprecation notice ( #2182 ) ( 2c50310 )

2.26.0 (2025-10-17)

⚠ BREAKING CHANGES

turn Series.struct.dtypes into a property to match pandas ( https://github.com/googleapis/python-bigquery-dataframes/pull/2169 )

Features

Add df.sort_index(axis=1) ( #2173 ) ( ebf95e3 )
Enhanced multimodal error handling with verbose mode for blob image functions ( #2024 ) ( f9e28fe )
Implement cos, sin, and log operations for polars compiler ( #2170 ) ( 5613e44 )
Make all and any compatible with integer columns on Polars session ( #2154 ) ( 6353d6e )

Bug Fixes

blob.display() shows
Turn Series.struct.dtypes into a property to match pandas ( https://github.com/googleapis/python-bigquery-dataframes/pull/2169 ) ( 62f7e9f )

Documentation

Clarify that only NULL values are handled by fillna/isna, not NaN ( #2176 ) ( 8f27e73 )
Remove import bigframes.pandas as bpd boilerplate from many samples ( #2147 ) ( 1a01ab9 )

2.25.0 (2025-10-13)

Features

Add barh, pie plot types ( #2146 ) ( 5cc3c5b )
Add Index.__eq__ for consts, aligned objects ( #2141 ) ( 8514200 )
Add output_schema parameter to ai.generate() ( #2139 ) ( ef0b0b7 )
Create session-scoped cut, DataFrame, MultiIndex, Index, Series, to_datetime, and to_timedelta methods ( #2157 ) ( 5e1e809 )
Replace ML.GENERATE_TEXT with AI.GENERATE for audio transcription ( #2151 ) ( a410d0a )
Support string literal inputs for AI functions ( #2152 ) ( 7600001 )

Bug Fixes

Address typo in error message ( #2142 ) ( cdf2dd5 )
Avoid possible circular imports in global session ( #2115 ) ( 095c0b8 )
Fix too many cluster columns requested by caching ( #2155 ) ( 35c1c33 )
Show progress even in job optional queries ( #2119 ) ( 1f48d3a )
Yield row count from read session if otherwise unknown ( #2148 ) ( 8997d4d )

Documentation

Add a brief intro notebook for bbq AI functions ( #2150 ) ( 1f434fb )
Fix ai function related docs ( #2149 ) ( 93a0749 )
Remove progress bar from getting started template ( #2143 ) ( d13abad )

2.24.0 (2025-10-07)

Features

Add ai.classify() to bigframes.bigquery package ( #2137 ) ( 56e5033 )
Add ai.generate() to bigframes.bigquery module ( #2128 ) ( 3810452 )
Add ai.if_() and ai.score() to bigframes.bigquery package ( #2132 ) ( 32502f4 )

Bug Fixes

Fix internal type errors with temporal accessors ( #2125 ) ( c390da1 )
Fix row count local execution bug ( #2133 ) ( ece0762 )
Join on, how args are now positional ( #2140 ) ( b711815 )
Only show JSON dtype warning when accessing dtypes directly ( #2136 ) ( eca22ee )
Remove noisy AmbiguousWindowWarning from partial ordering mode ( #2129 ) ( 4607f86 )

Performance Improvements

Scale read stream workers to cpu count ( #2135 ) ( 67e46cd )

2.23.0 (2025-09-29)

Features

Add ai.generate_double to bigframes.bigquery package ( #2111 ) ( 6b8154c )

Bug Fixes

Prevent invalid syntax for no-op .replace ops ( #2112 ) ( c311876 )

Documentation

Add timedelta notebook sample ( #2124 ) ( d1a9888 )

2.22.0 (2025-09-25)

Features

Add GroupBy.__iter__ ( #1394 ) ( c56a78c )
Add ai.generate_int to bigframes.bigquery package ( #2109 ) ( af6b862 )
Add Groupby.describe() ( #2088 ) ( 328a765 )
Implement Index.to_list() ( #2106 ) ( 60056ca )
Implement inplace parameter for DataFrame.drop ( #2105 ) ( 3487f13 )
Support callable for series map method ( #2100 ) ( ac25618 )
Support df.info() with null index ( #2094 ) ( fb81eea )

Bug Fixes

Avoid ibis fillna warning in compiler ( #2113 ) ( 7ef667b )
Negative start and stop parameter values in Series.str.slice() ( #2104 ) ( f57a348 )
Throw type error for incomparable join keys ( #2098 ) ( 9dc9695 )
Transformers with non-standard column names throw errors ( #2089 ) ( a2daa3f )

2.21.0 (2025-09-17)

Features

Add bigframes.bigquery.to_json ( #2078 ) ( 0fc795a )
Support average='binary' in precision_score() ( #2080 ) ( 920f381 )
Support pandas series in ai.generate_bool ( #2086 ) ( a3de53f )

Bug Fixes

Allow bigframes.options.bigquery.credentials to be None ( #2092 ) ( 78f4001 )

2.20.0 (2025-09-16)

Features

Add __dataframe__ interchange support ( #2063 ) ( 3b46a0d )
Add ai_generate_bool to the bigframes.bigquery package ( #2060 ) ( 70d6562 )
Add bigframes.bigquery.to_json_string ( #2076 ) ( 41e8f33 )
Add rank(pct=True) support ( #2084 ) ( c1e871d )
Add StreamingDataFrame.to_bigtable and .to_pubsub start_timestamp parameter ( #2066 ) ( a63cbae )
Can call agg with some callables ( #2055 ) ( 17a1ed9 )
Support astype to json ( #2073 ) ( 6bd6738 )
Support pandas.Index as key for DataFrame.__setitem__() ( #2062 ) ( b3cf824 )
Support pd.cut() for array-like type ( #2064 ) ( 21eb213 )
Support to cast struct to json ( #2067 ) ( b0ff718 )

Bug Fixes

Deflake ai_gen_bool multimodal test ( #2085 ) ( 566a37a )
Do not scroll page selector in anywidget repr_mode ( #2082 ) ( 5ce5d63 )
Fix the potential invalid VPC egress configuration ( #2068 ) ( cce4966 )
Return a DataFrame containing query stats for all non-SELECT statements ( #2071 ) ( a52b913 )
Use the remote and managed functions for bigframes results ( #2079 ) ( 49b91e8 )

Performance Improvements

Avoid re-authenticating if credentials have already been fetched ( #2058 ) ( 913de1b )
Improve apply axis=1 performance ( #2077 ) ( 12e4380 )

2.19.0 (2025-09-09)

Features

Add str.join method ( #2054 ) ( 8804ada )
Support display.max_colwidth option ( #2053 ) ( 5229e07 )
Support VPC egress setting in remote function ( #2059 ) ( 5df779d )

Bug Fixes

Fix issue mishandling chunked array while loading data ( #2051 ) ( 873d0ee )
Remove warning for slot_millis_sum ( #2047 ) ( 425a691 )

2.18.0 (2025-09-03)

⚠ BREAKING CHANGES

add allow_large_results option to read_gbq_query, aligning with bpd.options.compute.allow_large_results option ( #1935 )

Features

Add allow_large_results option to read_gbq_query, aligning with bpd.options.compute.allow_large_results option ( #1935 ) ( a7963fe )
Add parameter shuffle for ml.model_selection.train_test_split ( #2030 ) ( 2c72c56 )
Can pivot unordered, unindexed dataframe ( #2040 ) ( 1a0f710 )
Local date accessor execution support ( #2034 ) ( 7ac6fe1 )
Support args in dataframe apply method ( #2026 ) ( 164c481 )
Support args in series apply method ( #2013 ) ( d9d725c )
Support callable for dataframe mask method ( #2020 ) ( 9d4504b )
Support multi-column assignment for DataFrame ( #2028 ) ( ba0d23b )
Support string matching in local executor ( #2032 ) ( c0b54f0 )

Bug Fixes

Fix scalar op lowering tree walk ( #2029 ) ( 935af10 )
Read_csv fails when checking file size for wildcard GCS files ( #2019 ) ( b0d620b )
Resolve the validation issue for other arg in dataframe where method ( #2042 ) ( 8689199 )

Performance Improvements

Improve axis=1 aggregation performance ( #2036 ) ( fbb2094 )
Improve iter_nodes_topo performance using Kahn’s algorithm ( #2038 ) ( 3961637 )
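
The Kahn's algorithm entry above refers to a standard topological-sort technique. The following is a minimal, generic sketch of that technique only; it is not the bigframes implementation, and the function name `topo_order` and the toy `graph` are hypothetical names chosen for illustration.

```python
from collections import deque


def topo_order(children):
    """Order nodes so each node appears before its children (Kahn's algorithm).

    `children` maps every node to the list of nodes it points to.
    This is an illustrative helper, not part of bigframes.
    """
    # Count incoming edges for every node.
    indegree = {node: 0 for node in children}
    for kids in children.values():
        for kid in kids:
            indegree[kid] = indegree.get(kid, 0) + 1

    # Start from nodes that nothing points to.
    ready = deque(node for node, degree in indegree.items() if degree == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        # Removing `node` may free up its children.
        for kid in children.get(node, ()):
            indegree[kid] -= 1
            if indegree[kid] == 0:
                ready.append(kid)

    if len(order) != len(indegree):
        raise ValueError("graph contains a cycle")
    return order


# Hypothetical toy graph: one join node pointing at two scan nodes.
graph = {"join": ["scan_a", "scan_b"], "scan_a": [], "scan_b": []}
```

Each node and edge is visited once, so the traversal runs in time linear in the size of the graph and needs no recursion.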

2.17.0 (2025-08-22)

Features

Add isin local execution impl ( #1993 ) ( 26df6e6 )
Add reset_index names, col_level, col_fill, allow_duplicates args ( #2017 ) ( c02a1b6 )
Support callable for series mask method ( #2014 ) ( 5ac32eb )

2.16.0 (2025-08-20)

Features

Add bigframes.pandas.options.display.precision option ( #1979 ) ( 15e6175 )
Add level, inplace params to reset_index ( #1988 ) ( 3446950 )
Add ML code samples from dbt blog post ( #1978 ) ( ebaa244 )
Add where, coalesce, fillna, casewhen, invert local impl ( #1976 ) ( f7f686c )
Adjust anywidget CSS to prevent overflow ( #1981 ) ( 204f083 )
Format page number in table widget ( #1992 ) ( e83836e )
Or, And, Xor can execute locally ( #1994 ) ( 59c52a5 )
Support callable bigframes function for dataframe where ( #1990 ) ( 44c1ec4 )
Support callable for series where method ( #2005 ) ( 768b82a )
When using repr_mode = "anywidget", numeric values align right ( 15e6175 )

Bug Fixes

Address the packages issue for bigframes function ( #1991 ) ( 68f1d22 )
Correct pypdf dependency specifier for remote PDF functions ( #1980 ) ( 0bd5e1b )
Enable default retries in calls to BQ Storage Read API ( #1985 ) ( f25d7bd )
Fix the copyright year in dbt sample files ( #1996 ) ( fad5722 )

Performance Improvements

Faster session startup by deferring anon dataset fetch ( #1982 ) ( 2720c4c )

Documentation

Add examples of running bigframes in kaggle ( #2002 ) ( 7d89d76 )
Remove preview warning from partial ordering mode sample notebook ( #1986 ) ( 132e0ed )

2.15.0 (2025-08-11)

Features

Add st_buffer, st_centroid, and st_convexhull and their corresponding GeoSeries methods ( #1963 ) ( c4c7fa5 )
Add first, last support to GroupBy ( #1969 ) ( 41dda88 )
Add value_counts to GroupBy classes ( #1974 ) ( 82175a4 )
Allow callable as a conditional or replacement input in DataFrame.where ( #1971 ) ( a8d57d2 )
Can cast locally in hybrid engine ( #1944 ) ( d9bc4a5 )
Df.join lsuffix and rsuffix support ( #1857 ) ( 26515c3 )

Bug Fixes

Add warnings for duplicated or conflicting type hints in bigfram… ( #1956 ) ( d38e42c )
Make remote_function more robust when there are create_function retries ( #1973 ) ( cd954ac )
Make ExecutionMetrics stats tracking more robust to missing stats ( #1977 ) ( feb3ff4 )

Performance Improvements

Remove an unnecessary extra dry_run query from read_gbq_table ( #1972 ) ( d17b711 )

Documentation

Divide BQ DataFrames quickstart code cell ( #1975 ) ( fedb8f2 )

2.14.0 (2025-08-05)

Features

Dynamic table width for better display across devices ( https://github.com/googleapis/python-bigquery-dataframes/issues/1948 ) ( a6d30ae )
Retry AI/ML jobs that fail more often ( #1965 ) ( 25bde9f )
Support series input in managed function ( #1920 ) ( 62a189f )

Bug Fixes

Enhance type error messages for bigframes functions ( #1958 ) ( 770918e )

Performance Improvements

Use promote_offsets for consistent row number generation for index.get_loc ( #1957 ) ( c67a25a )

Documentation

Add code snippet for storing dataframes to a CSV file ( #1943 ) ( a511e09 )
Add code snippet for storing dataframes to a CSV file ( #1953 ) ( a298a02 )

2.13.0 (2025-07-25)

Features

_read_gbq_colab creates hybrid session ( #1901 ) ( 31b17b0 )
Add CSS styling for TableWidget pagination interface ( #1934 ) ( 5b232d7 )
Add row numbering local pushdown in hybrid execution ( #1932 ) ( 92a2377 )
Implement Index.get_loc ( #1921 ) ( bbbcaf3 )

Bug Fixes

Add license header and correct issues in dbt sample ( #1931 ) ( ab01b0a )

Dependencies

Replace google-cloud-iam with grpc-google-iam-v1 ( #1864 ) ( e5ff8f7 )

2.12.0 (2025-07-23)

Features

Add code samples for dbt bigframes integration ( #1898 ) ( 7e03252 )
Add isin local execution to hybrid engine ( #1915 ) ( c0cefd3 )
Add ml.metrics.mean_absolute_error method ( #1910 ) ( 15b8449 )
Allow local arithmetic execution in hybrid engine ( #1906 ) ( ebdcd02 )
Provide day_of_year and day_of_week for dt accessor ( #1911 ) ( 40e7638 )
Support params max_batching_rows, container_cpu, and container_memory for udf ( #1897 ) ( 8baa912 )
Support typed pyarrow.Scalar in assignment ( #1930 ) ( cd28e12 )

Bug Fixes

Correct min field from max() to min() in remote function tests ( #1917 ) ( d5c54fc )
Resolve location reset issue in bigquery options ( #1914 ) ( c15cb8a )
Series.str.isdigit in unicode superscripts and fractions ( #1924 ) ( 8d46c36 )

Documentation

Add code snippets for session and IO public docs ( #1919 ) ( 6e01cbe )
Add snippets for performance optimization doc ( #1923 ) ( 4da309e )

2.11.0 (2025-07-15)

Features

Add __contains__ to Index, Series, DataFrame ( #1899 ) ( 07222bf )
Add thresh param for Dataframe.dropna ( #1885 ) ( 1395a50 )
Add concat pushdown for hybrid engine ( #1891 ) ( 813624d )
Add pagination buttons (prev/next) to anywidget mode for DataFrames ( #1841 ) ( 8eca767 )
Add total_rows property to pandas batches iterator ( #1888 ) ( e3f5e65 )
Hybrid engine local join support ( #1900 ) ( 1aa7950 )
Support date data type for to_datetime() ( #1902 ) ( 24050cb )
Support bpd.Series(json_data, dtype="json") ( #1882 ) ( 05cb7d0 )

Bug Fixes

Bpd.merge on common columns ( #1905 ) ( a1fa112 )
DataFrame string addition respects order ( #1894 ) ( 52c8233 )
Show slot_millis_sum warning only when allow_large_results=False ( #1892 ) ( 25efabc )
Used query row count metadata instead of table metadata ( #1893 ) ( e1ebc53 )

2.10.0 (2025-07-08)

Features

df.to_pandas_batches() returns one empty DataFrame if df is empty ( #1878 ) ( e43d15d )
Add filter pushdown to hybrid engine ( #1871 ) ( 6454aff )
Add simple stats support to hybrid local pushdown ( #1873 ) ( 8715105 )

Bug Fixes

Fix issues where duration type returned as int ( #1875 ) ( f30f750 )

Documentation

Update gsutil commands to gcloud commands ( #1876 ) ( c289f70 )

2.9.0 (2025-06-30)

Features

Add bpd.read_arrow to convert an Arrow object into a bigframes DataFrame ( #1855 ) ( 633bf98 )
Add experimental polars execution ( #1747 ) ( daf0c3b )
Add size op support in local engine ( #1865 ) ( 942e66c )
Create deploy_remote_function and deploy_udf functions to immediately deploy functions to BigQuery ( #1832 ) ( c706759 )
Support index item assign in Series ( #1868 ) ( c5d251a )
Support item assignment in series ( #1859 ) ( 25684ff )
Support local execution of comparison ops ( #1849 ) ( 1c45ccb )

Bug Fixes

Fix bug selecting column repeatedly ( #1858 ) ( cc339e9 )
Fix bug with DataFrame.agg for string values ( #1870 ) ( 81e4d64 )
Generate GoogleSQL instead of legacy SQL data types for dry_run=True from bpd._read_gbq_colab with local pandas DataFrame ( #1867 ) ( fab3c38 )
Revert dict back to protobuf in the iam binding update ( #1838 ) ( 9fb3cb4 )

Documentation

Add data visualization samples for public doc ( #1847 ) ( 15e1277 )
Changed broken logo ( #1866 ) ( e3c06b4 )
Update ai.forecast notebook ( #1844 ) ( 1863538 )

2.8.0 (2025-06-23)

⚠ BREAKING CHANGES

add required param ‘engine’ to multimodal functions ( #1834 )

Features

Add bpd.options.compute.maximum_result_rows option to limit client data download ( #1829 ) ( e22a3f6 )
Add bpd.options.display.repr_mode = "anywidget" to create an interactive display of the results ( #1820 ) ( be0a3cf )
Add DataFrame.ai.forecast() support ( #1828 ) ( 7bc7f36 )
Add describe() method to Series ( #1827 ) ( a4205f8 )
Add required param ‘engine’ to multimodal functions ( #1834 ) ( 37666e4 )

Performance Improvements

Produce simpler sql ( #1836 ) ( cf9c22a )

Documentation

Add ai.forecast notebook ( #1840 ) ( 2430497 )

2.7.0 (2025-06-16)

Features

Add bbq.json_query_array and warn bbq.json_extract_array deprecated ( #1811 ) ( dc9eb27 )
Add bbq.json_value_array and deprecate bbq.json_extract_string_array ( #1818 ) ( 019051e )
Add groupby cumcount ( #1798 ) ( 18f43e8 )
Support custom build service account in remote_function ( #1796 ) ( e586151 )

Bug Fixes

Correct read_csv behaviours with use_cols, names, index_col ( #1804 ) ( 855031a )
Fix single row broadcast with null index ( #1803 ) ( 080eb7b )

Documentation

Document how to use ai.map() for information extraction ( #1808 ) ( b586746 )
Rearrange README.rst to include a short code sample ( #1812 ) ( f6265db )
Use pandas API instead of pandas-like or pandas-compatible ( #1825 ) ( aa32369 )

2.6.0 (2025-06-09)

Features

Add blob.transcribe function ( #1773 ) ( 86159a7 )
Implement ai.classify() ( #1781 ) ( 8af26d0 )
Implement item() for Series and Index ( #1792 ) ( d2154c8 )
Implement ST_ISCLOSED geography function ( #1789 ) ( 36bc179 )
Implement ST_LENGTH geography function ( #1791 ) ( c5b7fda )
Support isin with bigframes.pandas.Index arg ( #1779 ) ( e480d29 )

Bug Fixes

Address read_csv with both index_col and use_cols behavior inconsistency with pandas ( #1785 ) ( ba7c313 )
Allow KMeans model init parameter as k-means++ alias ( #1790 ) ( 0b59cf1 )
Replace function now can handle pd.NA value. ( #1786 ) ( 7269512 )

Documentation

Adjust strip method examples to match latest pandas ( #1797 ) ( 817b0c0 )
Fix docstrings to improve html rendering of code examples ( #1788 ) ( 38d9b73 )

2.5.0 (2025-05-30)

⚠ BREAKING CHANGES

the updated ai.map() parameter list is not backward-compatible

Features

Add bpd.options.bigquery.requests_transport_adapters option ( #1755 ) ( bb45db8 )
Add bbq.json_query and warn bbq.json_extract deprecated ( #1756 ) ( ec81dd2 )
Add bpd.options.reset() method ( #1743 ) ( 36c359d )
Add DataFrame.round method ( #1742 ) ( 3ea6043 )
Add deferred data uploading ( #1720 ) ( 1f6442e )
Add deprecation warning to Gemini-1.5-X, text-embedding-004, and remove legacy models in notebooks and docs ( #1723 ) ( 80aad9a )
Add structured output for ai map, ai filter and ai join ( #1746 ) ( 133ac6b )
Add support for df.loc list, column(s) ( 768a757 )
Include bq schema and query string in dry run results ( #1752 ) ( bb51147 )
Support inplace=True in rename and rename_axis ( #1744 ) ( 734cc65 )
Support unique() for Index ( #1750 ) ( 27fac78 )
Support astype conversions to and from JSON dtypes ( #1716 ) ( 8ef4de1 )
Support dict param for dataframe.agg() ( #1772 ) ( f9c29c8 )
Support dtype parameter in read_csv for bigquery engine ( #1749 ) ( 50dca4c )
Use read api for some peek ops ( #1731 ) ( 108f4d2 )

Bug Fixes

Fix clip int series with float bounds ( #1739 ) ( d451aef )
Fix error with self-merge operations ( #1774 ) ( e5fe143 )
Fix the default value for na_value for numpy conversions ( #1766 ) ( 0629cac )
Include location in Session-based temporary storage manager DDL queries ( #1780 ) ( acba032 )
Prevent creating unnecessary client objects in multithreaded environments ( #1757 ) ( 1cf9f5e )
Reduce bigquery table modification via DML for to_gbq ( #1737 ) ( 545cdca )
Stop ignoring arguments to MatrixFactorization.score(X, y) ( #1726 ) ( 55c07e9 )
Support JSON and STRUCT for bbq.sql_scalar ( #1754 ) ( 190390b )
Support str.replace re.compile with flags ( #1736 ) ( f8d2cd2 )

Performance Improvements

Faster local data comparison using identity ( #1738 ) ( 2858b1e )
Optimize repr for unordered gbq table ( #1778 ) ( 2bc4fbc )
Use JOB_CREATION_OPTIONAL when allow_large_results=False ( #1763 ) ( 15f3f2a )

Dependencies

Avoid gcsfs==2025.5.0 ( #1762 ) ( 68d5e2c )

Documentation

Add llm output_schema notebook ( #1732 ) ( b2261cc )
Add MatrixFactorization to the table of contents ( #1725 ) ( 611e43b )
Fix typo for “population” in the GeminiTextGenerator.predict(..., output_schema={...}) sample notebook ( #1748 ) ( bd07e05 )
Integrations notebook extracts token from bqclient._http.credentials instead of bqclient._credentials ( #1784 ) ( 6e63eca )
Updated multimodal notebook instructions ( #1745 ) ( 1df8ca6 )
Use partial ordering mode in the quickstart sample ( #1734 ) ( 476b7dd )

2.4.0 (2025-05-12)

Features

Add “dayofyear” property for dt accessors ( #1692 ) ( 9d4a59d )
Add .dt.days, .dt.seconds, dt.microseconds, and dt.total_seconds() for timedelta series. ( #1713 ) ( 2b3a45f )
Add DatetimeIndex class ( #1719 ) ( c3c830c )
Add isocalendar() for dt accessor ( #1717 ) ( 0479763 )
Add bigframes.bigquery.json_value ( #1697 ) ( 46a9c53 )
Add blob.exif function support ( #1703 ) ( 3f79528 )
Add inplace arg support to sort methods ( #1710 ) ( d1ccb52 )
Improve error message in Series.apply for direct udfs ( #1673 ) ( 1a658b2 )
Publish bigframes blob(Multimodal) to preview ( #1693 ) ( e4c85ba )
Support () operator between timedeltas ( #1702 ) ( edaac89 )
Support forecast_limit_lower_bound and forecast_limit_upper_bound in ARIMA_PLUS (and ARIMA_PLUS_XREG) models ( #1305 ) ( b16740e )
Support to_strip parameter for str.strip, str.lstrip and str.rstrip ( #1705 ) ( a84ee75 )

Bug Fixes

Fix dayofyear doc test ( #1701 ) ( 9b777a0 )
Fix issues with chunked arrow data ( #1700 ) ( e3289b7 )
Rename columns with protected names such as _TABLE_SUFFIX in to_gbq() ( #1691 ) ( 8ec6079 )

Performance Improvements

Defer query in read_gbq with wildcard tables ( #1661 ) ( 5c125c9 )
Rechunk result pages client side ( #1680 ) ( 67d8760 )

Dependencies

Move bigtable and pubsub to extras ( #1696 ) ( 597d817 )

Documentation

Add snippets for Matrix Factorization tutorials ( #1630 ) ( 24b37ae )
Deprecate bpd.options.bigquery.allow_large_results in favor of bpd.options.compute.allow_large_results ( #1597 ) ( 18780b4 )
Include import statement in the bigframes code snippet ( #1699 ) ( 08d70b6 )
Include the clean-up step in the udf code snippet ( #1698 ) ( 48992e2 )
Move multimodal notebook out of experimental folder ( #1712 ) ( 68b6532 )
Update blob_display option in snippets ( #1714 ) ( 8b30143 )

2.3.0 (2025-05-06)

Features

Add dry_run parameter to read_gbq(), read_gbq_table() and read_gbq_query() ( #1674 ) ( 4c5dee5 )

Bug Fixes

Guarantee guid thread safety across threads ( #1684 ) ( cb0267d )
Support large lists of lists in bpd.Series() constructor ( #1662 ) ( 0f4024c )
Use value equality to check types for unix epoch functions and timestamp diff ( #1690 ) ( 81e8fb8 )

Performance Improvements

to_datetime() now avoids caching inputs unless data is inspected to infer format ( #1667 ) ( dd08857 )

Documentation

Add a visualization notebook to BigFrame samples ( #1675 ) ( ee062bf )
Fix spacing of k-means code snippet ( #1687 ) ( 99f45dd )
Update snippet for Create a k-means model tutorial ( #1664 ) ( 761c364 )

2.2.0 (2025-04-30)

Features

Add gemini-2.0-flash-001 and gemini-2.0-flash-lite-001 to fine tune score endpoints and multimodal endpoints ( #1650 ) ( 4fb54df )
Add GeminiTextGenerator.predict structured output ( #1653 ) ( 6199023 )
DataFrames.__getitem__ support for slice input ( #1668 ) ( 563f0cb )
Print right origin of PreviewWarning for the bpd.udf ( #1629 ) ( 48d10d1 )
Session.bytes_processed_sum will be updated when allow_large_re… ( #1669 ) ( ae312db )
Short circuit query for local scan ( #1618 ) ( e84f232 )
Support names parameter in read_csv for bigquery engine ( #1659 ) ( 3388191 )
Support passing list of values to bigframes.core.sql.simple_literal ( #1641 ) ( 102d363 )
Support write api as loading option ( #1617 ) ( c46ad06 )

Bug Fixes

DataFrame accessors are not populated ( #1639 ) ( 28afa2c )
Prefer remote schema instead of throwing on materialize conflicts ( #1644 ) ( 53fc25b )
Remove itertools.pairwise usage ( #1638 ) ( 9662745 )
Resolve issue where pre-release versions of google-auth are installed ( #1491 ) ( ebb7a5e )
Resolve some of the typo errors ( #1655 ) ( cd7fbde )

Performance Improvements

Fold row count ops when known ( #1656 ) ( c958dbe )
Use flyweight for node fields ( #1654 ) ( 8482bfc )

Dependencies

Support shapely 1.8.5+ again ( #1651 ) ( ae83e61 )

Documentation

Add JSON data types notebook ( #1647 ) ( 9128c4a )
Add sample code snippets for udf ( #1649 ) ( 53caa8d )
Fix bq_dataframes_template notebook to work if partial ordering mode is enabled ( #1665 ) ( f442e7a )
Note that udf is in preview and must be python 3.11 compatible ( #1629 ) ( 48d10d1 )

2.1.0 (2025-04-22)

Features

Add bigframes.bigquery.st_distance function ( #1637 ) ( bf1ae70 )
Enable local json string validations ( #1614 ) ( 233347a )
Enhance read_csv index_col parameter support ( #1631 ) ( f4e5b26 )

Bug Fixes

Add retry for test_clean_up_via_context_manager ( #1627 ) ( 58e7cb0 )
Improve robustness of managed udf code extraction ( #1634 ) ( 8cc56d5 )

Documentation

Add code samples in the udf API docstring ( #1632 ) ( f68b80c )

2.0.0 (2025-04-17)

⚠ BREAKING CHANGES

make dataset and name params mandatory in udf ( #1619 )
Locational endpoints support is not available in BigFrames 2.0.
change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator ( #1558 )
change default ingress setting for remote_function to internal-only ( #1544 )
make remote_function params keyword only ( #1537 )
make remote_function default service account explicit ( #1537 )
set allow_large_results=False by default ( #1541 )

Features

Add on parameter in dataframe.rolling() and dataframe.groupby.rolling() ( #1556 ) ( 45c9d9f )
Add component to manage temporary tables ( #1559 ) ( 0a4e245 )
Add Series.to_pandas_batches() method ( #1592 ) ( 09ce979 )
Add support for creating a Matrix Factorization model ( #1330 ) ( b5297f9 )
Allow input_types, output_type, and dataset to be used positionally in remote_function ( #1560 ) ( bcac8c6 )
Allow pandas.cut ‘labels’ parameter to accept a list of string ( #1549 ) ( af842b1 )
Change default ingress setting for remote_function to internal-only ( #1544 ) ( c848a80 )
Detect duplicate column/index names in read_gbq before send query. ( #1615 ) ( 40d6960 )
Drop support for locational endpoints ( #1542 ) ( 4bf2e43 )
Enable time range rolling for DataFrame, DataFrameGroupBy and SeriesGroupBy ( #1605 ) ( b4b7073 )
Improve local data validation ( #1598 ) ( 815e471 )
Make remote_function default service account explicit ( #1537 ) ( 9eb9089 )
Set allow_large_results=False by default ( #1541 ) ( e9fb712 )
Support bigquery connection in managed function ( #1554 ) ( f6f697a )
Support bq connection path format ( #1550 ) ( e7eb918 )
Support gemini-2.0-X models ( #1558 ) ( 3104fab )
Support inlining small list, struct, json data ( #1589 ) ( 2ce891f )
Support time range rolling on Series. ( #1590 ) ( 6e98a2c )
Use session temp tables for all ephemeral storage ( #1569 ) ( 9711b83 )
Use validated local storage for data uploads ( #1612 ) ( aee4159 )
Warn the deprecated max_download_size, random_state and sampling_method parameters in (DataFrame|Series).to_pandas() ( #1573 ) ( b9623da )

Bug Fixes

to_pandas_batches() respects page_size and max_results again ( #1572 ) ( 27c5905 )
Ensure page_size works correctly in to_pandas_batches when max_results is not set ( #1588 ) ( 570cff3 )
Include role and service account in IAM exception ( #1564 ) ( 8c50755 )
Make dataset and name params mandatory in udf ( #1619 ) ( 637e860 )
Pandas.cut returns labels index for numeric breaks when labels=False ( #1548 ) ( b2375de )
Prevent KeyError in bpd.concat with empty DF and struct/array types DF ( #1568 ) ( b4da1cf )
Read_csv supports tilde local paths and includes index for bigquery_stream write engine ( #1580 ) ( 352e8e4 )
Use dictionaries to avoid problematic google.iam namespace ( #1611 ) ( b03e44f )

Performance Improvements

Directly read gbq table for simple plans ( #1607 ) ( 6ad38e8 )

Dependencies

Remove jellyfish dependency ( #1604 ) ( 1ac0e1e )
Remove parsy dependency ( #1610 ) ( 293f676 )
Remove test dependency on pytest-mock package ( #1622 ) ( 1ba72ea )
Support shapely versions 1.8.5+ ( #1621 ) ( e39ee3b )

Documentation

Add details for bigquery_connection in @bpd.udf docstring ( #1609 ) ( ef63772 )
Add explain forecast snippet to multiple time series tutorial ( #1586 ) ( 40c55a0 )
Add message to remove default model for version 3.0 ( #1563 ) ( 910be2b )
Add samples for ArimaPlus time_series_id_col feature ( #1577 ) ( 1e4cd9c )
Add warning for bigframes 2.0 ( #1557 ) ( 3f0eaa1 )
Deprecate default model in TextEmbeddingGenerator, GeminiTextGenerator, and other bigframes.ml.llm classes ( #1570 ) ( 89ab33e )
Include all licenses for vendored packages in the root LICENSE file ( #1626 ) ( 8116ed0 )
Remove gemini-1.5 deprecation warning for GeminiTextGenerator ( #1562 ) ( 0cc6784 )
Use restructured text to allow publishing to PyPI ( #1565 ) ( d1e9ec2 )

Miscellaneous Chores

Make remote_function params keyword only ( #1537 ) ( 9eb9089 )

1.42.0 (2025-03-27)

Features

Add closed parameter in rolling() ( #1539 ) ( 8bcc89b )
Add GeoSeries.difference() and bigframes.bigquery.st_difference() ( #1471 ) ( e9fe815 )
Add GeoSeries.intersection() and bigframes.bigquery.st_intersection() ( #1529 ) ( 8542bd4 )
Add df.take and series.take ( #1509 ) ( 7d00be6 )
Add Linear_Regression.global_explain() ( #1446 ) ( 7e5b6a8 )
Allow iloc to support lists of negative indices ( #1497 ) ( a9cf215 )
Support dry_run in to_pandas() ( #1436 ) ( 75fc7e0 )
Support window partition by geo column ( #1512 ) ( bdcb1e7 )
Upgrade BQ managed udf to preview ( #1536 ) ( 4a7fe4d )

Bug Fixes

Add deprecation warning to TextEmbeddingGenerator model, especially gemini-1.0-X and gemini-1.5-X ( #1534 ) ( c93e720 )
Change the default value for pdf extract/chunk ( #1517 ) ( a70a607 )
Local data always has sequential index ( #1514 ) ( 014bd33 )
Read_pandas inline returns None when exceeds limit ( #1525 ) ( 578081e )
Temporary fix for StreamingDataFrame not working backend bug ( #1533 ) ( 6ab4ffd )
Tolerate BQ connection service account propagation delay ( #1505 ) ( 6681f1f )

Performance Improvements

Update shape to use query_and_wait ( #1519 ) ( 34ab9b8 )

Documentation

Update GeoSeries.difference() and bigframes.bigquery.st_difference() docs ( #1526 ) ( d553fa2 )

1.41.0 (2025-03-19)

Features

Add support for the ‘right’ parameter in ‘pandas.cut’ ( #1496 ) ( 8aff128 )
Support BQ managed functions through read_gbq_function ( #1476 ) ( 802183d )
Warn when the BigFrames version is more than a year old ( #1455 ) ( 00e0750 )

Bug Fixes

Fix pandas.cut errors with empty bins ( #1499 ) ( 434fb5d )
Fix read_gbq with ORDER BY query and index_col set ( #963 ) ( de46d2f )

Performance Improvements

Eliminate count queries in llm retry ( #1489 ) ( 1c934c2 )

Documentation

Add a sample notebook for vector search ( #1500 ) ( f3bf139 )

1.40.0 (2025-03-11)

⚠ BREAKING CHANGES

reading JSON data as a custom arrow extension type ( #1458 )

Features

Reading JSON data as a custom arrow extension type ( #1458 ) ( e720f41 )
Support list output for managed function ( #1457 ) ( 461e9e0 )

Bug Fixes

Fix list-like indexers in partial ordering mode ( #1456 ) ( fe72ada )
Fix the merge issue between 1424 and 1373 ( #1461 ) ( 7b6e361 )
Use == instead of is for timedelta type equality checks ( #1480 ) ( 0db248b )
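
The `==` versus `is` fix above concerns a general Python pitfall: two type objects can describe the same type without being the same object. A small sketch of the difference follows, assuming a hypothetical `DurationType` class as a stand-in; it is not a bigframes or pyarrow class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DurationType:
    """Hypothetical stand-in for a parameterized dtype object."""

    unit: str


a = DurationType("us")
b = DurationType("us")

# Value equality compares the fields, so equal descriptions match.
same_value = a == b

# Identity only holds when both names refer to one object in memory.
same_object = a is b
```

Here `same_value` is true while `same_object` is false, which is why an identity check can wrongly reject an equivalent type.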

Performance Improvements

Compilation no longer bounded by recursion ( #1464 ) ( 27ab028 )

1.39.0 (2025-03-05)

Features

(Preview) Support diff() for date series ( #1423 ) ( 521e987 )
(Preview) Support aggregations over timedeltas ( #1418 ) ( 1251ded )
(Preview) Support arithmetics between dates and timedeltas ( #1413 ) ( 962b152 )
(Preview) Support automatic load of timedelta from BQ tables. ( #1429 ) ( b2917bb )
Add allow_large_results option to many I/O methods. Set to False to reduce latency ( #1428 ) ( dd2f488 )
Add GeoSeries.boundary() ( #1435 ) ( 32cddfe )
Add allow_large_results to peek ( #1448 ) ( 67487b9 )
Add groupby.rank() ( #1433 ) ( 3a633d5 )
Iloc multiple columns selection. ( #1437 ) ( ddfd02a )
Support interface for BigQuery managed functions ( #1373 ) ( 2bbf53f )
Warn if default ingress_settings is used in remote_functions ( #1419 ) ( dfd891a )

Bug Fixes

Do not compare schema description during schema validation ( #1452 ) ( 03a3a56 )
Remove warnings for null index and partial ordering mode in prep for GA ( #1431 ) ( 6785aee )
Warn if default cloud_function_service_account is used in remote_function ( #1424 ) ( fe7463a )
Window operations over JSON columns ( #1451 ) ( 0070e77 )
Write chunked text instead of dummy text for pdf chunk ( #1444 ) ( 96b0e8a )

Performance Improvements

Speed up DataFrame corr, cov ( #1309 ) ( c598c0a )

Documentation

Add snippet for explaining the linear regression model prediction ( #1427 ) ( 7c37c7d )

1.38.0 (2025-02-24)

Features

(Preview) Support diff aggregation for timestamp series. ( #1405 ) ( abe48d6 )
Add GeoSeries.from_wkt() and GeoSeries.to_wkt() ( #1401 ) ( 2993b28 )
Support DF.__array__(copy=True) ( #1403 ) ( 693ed8c )
Support routines with ARRAY return type in read_gbq_function ( #1412 ) ( 4b60049 )

Bug Fixes

Calling to_timedelta() over timedeltas no longer changes their values ( #1411 ) ( 650a190 )
Replace empty dict with None to avoid mutable default arguments ( #1416 ) ( fa4e3ad )

Performance Improvements

Avoid redundant SQL casts ( #1399 ) ( 6ee48d5 )

Dependencies

Remove scikit-learn and sqlalchemy as required dependencies ( #1296 ) ( fd8bc89 )

Documentation

Add samples using SQL methods via the bigframes.bigquery module ( #1358 ) ( f54e768 )
Add snippets for visualizing a time series and creating a time series model for the Limit forecasted values in time series model tutorial ( #1310 ) ( c6c9120 )

1.37.0 (2025-02-19)

Features

(Preview) Support add, sub, mult, div, and more between timedeltas ( #1396 ) ( ffa63d4 )
(Preview) Support comparison, ordering, and filtering for timedeltas ( #1387 ) ( 34d01b2 )
(Preview) Support subtraction in DATETIME/TIMESTAMP columns with timedelta columns ( #1390 ) ( 50ad3a5 )
JSON dtype support for read_pandas and Series constructor ( #1391 ) ( 44f4137 )

Bug Fixes

Ensure binops with pandas objects returns bigquery dataframes ( #1404 ) ( 3cee24b )

Performance Improvements

Prune projections more aggressively ( #1398 ) ( 7990262 )
Simplify sum aggregate SQL text ( #1395 ) ( 0145656 )
Use simple null constraints to simplify queries ( #1381 ) ( 00611d4 )

Documentation

Add DataFrame.struct docs ( #1348 ) ( 7e9e93a )

1.36.0 (2025-02-11)

Features

(Preview) Support addition between a timestamp and a timedelta ( #1369 ) ( b598aa8 )
(Preview) Support casting floats and list-likes to timedelta series ( #1362 ) ( 65933b6 )
(Preview) Support timestamp subtractions ( #1346 ) ( 86b7e72 )
Add bigframes.bigquery.st_area and suggest it from GeoSeries.area ( #1318 ) ( 8b5ffa8 )
Add GeoSeries.from_xy() ( #1364 ) ( 3c3e14c )

Bug Fixes

Dtype parameter ineffective in Series/DataFrame construction ( #1354 ) ( b9bdca8 )
Translate labels to col ids when copying dataframes ( #1372 ) ( 0c55b07 )

Performance Improvements

Prune unused operations from sql ( #1365 ) ( 923da03 )
Simplify merge join key coalescing ( #1361 ) ( 7ae565d )

1.35.0 (2025-02-04)

Features

(Preview) Support timedeltas for read_pandas() ( #1349 ) ( 866ba9e )
Add Series.keys() ( #1342 ) ( deb015d )
Allow case_when to change dtypes if case list contains the condition (True, some_default_value) ( #1311 ) ( 5c2a2c6 )
Support python type as astype arg ( #1316 ) ( b26e135 )
Support time_series_id_col in ARIMAPlus ( #1282 ) ( 97532c9 )

Bug Fixes

Exclude DataFrame and Series __call__ from unimplemented API metrics ( #1351 ) ( f2d5264 )
Make DataFrame __getattr__ and __setattr__ more robust to subclassing ( #1352 ) ( 417de3a )

Performance Improvements

Fall back to ordering by bq pk when possible ( #1350 ) ( 3c4abf2 )
Improve isin performance ( #1203 ) ( db087b0 )
Prevent inlining of remote ops ( #1347 ) ( 012081a )

Dependencies

Add support for Python 3.13 for everything but remote functions ( #1307 ) ( 533db96 )

Documentation

Add GeoSeries docs ( #1327 ) ( 05f83d1 )
Add link to DataFrames intro to improve SEO ( #1176 ) ( aafb5be )
Add snippet to explain the univariate model’s forecast result in the Forecast a single time series with a univariate model tutorial ( #1272 ) ( c22126b )

1.34.0 (2025-01-27)

⚠ BREAKING CHANGES

Enable reading JSON data with dbjson extension dtype ( #1139 )

Features

(df|s).hist(), (df|s).line(), (df|s).area(), (df|s).bar(), df.scatter() ( #1320 ) ( bd3f584 )
(Preview) Define timedelta type and to_timedelta function ( #1317 ) ( 3901951 )
Add DataFrame.corrwith method ( #1315 ) ( b503355 )
Add DataFrame.mask method ( #1302 ) ( 8b8155f )
Enable reading JSON data with dbjson extension dtype ( #1139 ) ( f672262 )

1.33.0 (2025-01-22)

Features

Add bigframes.bigquery.sql_scalar() to apply SQL syntax on Series objects ( #1293 ) ( aa2f73a )
Add unix_seconds, unix_millis and unix_micros for timestamp series. ( #1297 ) ( e4b0c8d )
DataFrame.join supports Series other ( #1303 ) ( ee37a0a )
Support array output in remote_function ( #1057 ) ( bdee173 )

Bug Fixes

Dataframe sort_values Series input keyerror. ( #1285 ) ( 5a2731b )
Fix read_gbq_function issue in dataframe apply method ( #1174 ) ( 0318764 )
Series sort_index and sort_values now raise when axis!=0 ( #1294 ) ( 94bc2f2 )

Documentation

Add snippet to forecast future time series in the Forecast a single time series with a univariate model tutorial ( #1271 ) ( a687050 )
Update bigframes.pandas.Series docs ( #1273 ) ( 0cac64f )

1.32.0 (2025-01-13)

Features

Add max_retries to TextEmbeddingGenerator and Claude3TextGenerator ( #1259 ) ( 8077ff4 )
bigframes.bigquery.parse_json ( #1265 ) ( 27bbd80 )
Support DataFrame.astype(dict) ( #1262 ) ( 5934f8e )

Bug Fixes

Avoid global mutation in BigQueryOptions.client_endpoints_override ( #1280 ) ( 788f6e9 )
Fix erroneous window bounds removal during compilation ( #1163 ) ( f91756a )

Dependencies

Relax sqlglot upper bound ( #1278 ) ( c71ec09 )

Documentation

Add bq studio links that allow users to generate Jupyter notebooks in bq studio with github contents ( #1266 ) ( 58f13cb )
Add snippet to evaluate ARIMA plus model in the Forecast a single time series with a univariate model tutorial ( #1267 ) ( 3dcae2d )
Add snippet to see the ARIMA coefficients in the Forecast a single time series with a univariate model tutorial ( #1268 ) ( 059a564 )
Update bigframes.pandas.pandas docstrings ( #1247 ) ( c4bffc3 )
Use 002 model for better scalability in text generation ( #1270 ) ( bb7a850 )

1.31.0 (2025-01-05)

Features

Implement confirmation threshold for semantic operators ( #1251 ) ( 5ba4511 )

Bug Fixes

Raise if trying to change ordering_mode after session has started ( #1252 ) ( 8cfaae8 )
Reduce the number of labels added to query jobs ( #1245 ) ( fdcdc18 )

Documentation

Remove bq studio link ( #1258 ) ( dd4fd2e )
Update bigframes.pandas.DatetimeMethods docstrings ( #1246 ) ( 10f08da )
Update semantic_operators.ipynb ( #1260 ) ( a2ed989 )

1.30.0 (2024-12-30)

Features

Add GeoSeries.x and GeoSeries.y ( #1126 ) ( 4c3548f )
Add LinearRegression.predict_explain() to generate ML.EXPLAIN_PREDICT columns ( #1190 ) ( e13eca2 )
Add LogisticRegression.predict_explain() to generate ML.EXPLAIN_PREDICT columns ( #1222 ) ( bcbc732 )
Add write_engine parameter to read_FORMATNAME methods to control how data is written to BigQuery ( #371 ) ( ed47ef1 )
Add client side retry to GeminiTextGenerator ( #1242 ) ( 8193abe )
Add Gemini-pro-1.5 to GeminiTextGenerator Tuning and Support score() method in Gemini-pro-1.5 ( #1208 ) ( 298fc73 )
Add support for LinearRegression.predict_explain and LogisticRegression.predict_explain parameter, top_k_features ( #1228 ) ( 3068e19 )
Support dataframe where method ( #1166 ) ( 71b4053 )

Bug Fixes

Arima model series input. ( #1237 ) ( f7d52d9 )
Json in struct destination type ( #1187 ) ( 200c9bb )
Throw an error message when setting is_row_processor=True to read a multi param function ( #1160 ) ( b2816a5 )

Documentation

Add an “open in BQ Studio” link to all BigFrames sample notebooks ( #1223 ) ( e0a8288 )
Add bq studio link for a new ipynb file called “bq_dataframes_template.ipynb” ( #1239 ) ( 840aaff )
Add example for logistic regression ( #1240 ) ( 4d854fd )
Add examples for ml PCA and SimpleImputer ( #1236 ) ( 0d84459 )
Add KMeans example ( #1234 ) ( d87ab97 )
Add linear model example ( #1235 ) ( 2c3e1fd )
Add ml.model_selection examples ( #1238 ) ( 50648e4 )
Add python snippet for “Create the time series model” section of the Forecast a single time series with a univariate model tutorial ( #1227 ) ( 20f3190 )

1.29.0 (2024-12-12)

Features

Add Gemini 2.0 preview text model support ( #1209 ) ( 1021d57 )

Documentation

Add Gemini 2.0 text gen sample notebook ( #1211 ) ( 9596b66 )
Update bigframes.pandas.index docs return types ( #1191 ) ( c63e7da )

1.28.0 (2024-12-11)

Features

(Series | DataFrame).plot.bar ( #1152 ) ( 0fae2e0 )
bigframes.bigquery.vector_search supports use_brute_force and fraction_lists_to_search parameters ( #1158 ) ( 131edc3 )
Add ARIMAPlus.predict_explain() to generate forecasts with explanation columns ( #1177 ) ( 05f8b4d )
Add client_endpoints_override to bq options ( #1167 ) ( be74b99 )
Add support for temporal types in dataframe’s describe() method ( #1189 ) ( 2d564a6 )
Allow join-free alignment of analytic expressions ( #1168 ) ( daef4f0 )
Series.isin supports bigframes.Series arg ( #1195 ) ( 0d8a16b )
Update llm.TextEmbeddingGenerator to 005 ( #1186 ) ( 3072d38 )

Bug Fixes

Fix error loading local dataframes into bigquery ( #1165 ) ( 5b355ef )
Fix null index join with ‘on’ arg ( #1153 ) ( 9015c33 )
Fix series.isin using local path always ( #1202 ) ( a44eafd )

Performance Improvements

Update df.corr, df.cov to work with more than 30 columns. ( #1161 ) ( 9dcf1aa )

Dependencies

Remove ibis-framework by vendoring a fork of the package to bigframes_vendored. ( #1170 ) ( 421d24d )

Documentation

Add a code sample using bpd.options.bigquery.ordering_mode = "partial" ( #909 ) ( f80d705 )
Add snippet for creating boosted tree model ( #1142 ) ( a972668 )
Add snippet for evaluating a boosted tree model ( #1154 ) ( 9d8970a )
Add snippet for predicting classifications using a boosted tree model ( #1156 ) ( e7b83f1 )
Add third party pandas.Index methods and docstrings ( #1171 ) ( a970294 )
Fix Bigframes.Pandas.General_Function missing docs ( #1164 ) ( de923d0 )
Update bigframes.pandas.Index docstrings ( #1144 ) ( 557ab8d )

1.27.0 (2024-11-16)

Features

Add astype(type, errors='null') to cast safely ( #1122 ) ( b4d17ff )

Bug Fixes

Dataframe fillna with scalar. ( #1132 ) ( 37f8c32 )
Exclude index columns from model fitting processes. ( #1138 ) ( 8d4da15 )
Unordered mode too many labels issue. ( #1148 ) ( 7216b21 )

Documentation

Document groupby.head and groupby.size methods ( #1111 ) ( a61eb4d )

1.26.0 (2024-11-12)

Features

Add basic geopandas functionality ( #962 ) ( 3759c63 )
Support json_extract_string_array in the bigquery module ( #1131 ) ( 4ef8bac )

Bug Fixes

Fix Series.to_frame generating string label instead of int where name is None ( #1118 ) ( 14e32b5 )
Update the API documentation with newly added rep ( #1120 ) ( 72c228b )

Performance Improvements

Reduce CURRENT_TIMESTAMP queries ( #1114 ) ( 32274b1 )
Reduce dry runs from read_gbq with table ( #1129 ) ( f7e4354 )

Documentation

Add file for Classification with a Boosted Tree Model and snippet for preparing sample data ( #1135 ) ( 7ac6639 )
Add snippet for Linear Regression tutorial Predict Outcomes section ( #1101 ) ( 108f4a9 )
Update DataFrame docstrings to include the errors section ( #1127 ) ( a38d4c4 )
Update GroupBy docstrings ( #1103 ) ( 9867a78 )
Update Session docstrings to include exceptions ( #1130 ) ( a870421 )

1.25.0 (2024-10-29)

Features

Add the ground_with_google_search option for GeminiTextGenerator predict ( #1119 ) ( ca02cd4 )
Add warning when user tries to access struct series fields with __getitem__ ( #1082 ) ( 20e5c58 )
Allow fit to take additional eval data in linear and ensemble models ( #1096 ) ( 254875c )
Support context manager for bigframes session ( #1107 ) ( 5f7b8b1 )

Performance Improvements

Improve series.unique performance and replace drop_duplicates i… ( #1108 ) ( 499f24a )

1.24.0 (2024-10-24)

Features

Support series items method ( #1089 ) ( 245a89c )

Documentation

Update docstrings of DataFrame and related files ( #1092 ) ( 15e9fd5 )

1.23.0 (2024-10-23)

Features

Add bigframes.bigquery.create_vector_index to assist in creating vector index on ARRAY<FLOAT64> columns ( #1024 ) ( 863d694 )
Add gemini-1.5-pro-002 and gemini-1.5-flash-002 to known Gemini model list. ( #1105 ) ( 7094c85 )
Add support for pandas series & data frames as inputs for ml models. ( #1088 ) ( 30c8883 )
Cleanup temp resources with session deletion ( #1068 ) ( 1d5373d )
Show possible correct key(s) in .__getitem__ KeyError message ( #1097 ) ( 32fab96 )
Support uploading local geo data ( #1036 ) ( 51cdd33 )

Bug Fixes

Escape ids more consistently in ml module ( #1074 ) ( 103e998 )
Model.fit metric not collected issue. ( #1085 ) ( 06cec00 )
Remove index requirement from some dataframe APIs ( #1073 ) ( 2d16f6d )
Update session metrics in read_gbq_query ( #1084 ) ( dced460 )

Performance Improvements

Speed up tree transforms during sql compile ( #1071 ) ( d73fe9d )
Utilize ORDER BY LIMIT over ROW_NUMBER where possible ( #1077 ) ( 7003d1a )

Documentation

Add ml tutorial for Evaluate the model ( #1038 ) ( a120bae )
Show best practice of closing the session to cleanup resources in sample notebooks ( #1095 ) ( 62a88e8 )
Update docstrings of Session and related files ( #1087 ) ( bf93e80 )

1.22.0 (2024-10-09)

Features

Support regional endpoints for more bigquery locations ( #1061 ) ( 45b672a )
Update LLM generators to warn user about model name instead of raising error. ( #1048 ) ( 650d80d )

Bug Fixes

Access MATERIALIZED_VIEW with read_gbq ( #1070 ) ( 601e984 )
Correct zero row count in DataFrame from table view ( #1062 ) ( b536070 )
Fix generic error message when entering an incorrect column name ( #1031 ) ( 5ac217d )
Make explode respect the index labels ( #1064 ) ( 99ca0df )
Make invalid location warning case-insensitive ( #1044 ) ( b6cd55a )
Remove palm2 test case from llm load test ( #1063 ) ( 575a10a )
Show warning for unknown location set through .ctor ( #1052 ) ( 02c2da7 )

Performance Improvements

Reduce schema tracking overhead ( #1056 ) ( 1c3879d )
Repr generates fewer queries ( #1046 ) ( d204603 )
Speedup internal tree comparisons ( #1060 ) ( 4379438 )

Documentation

Add docstring return type section to BigQueryOptions class ( #964 ) ( 307385f )

1.21.0 (2024-10-02)

Features

Add deprecation warning to PaLM2TextGenerator model ( #1035 ) ( 1183b0f )
Add DeprecationWarning for PaLM2TextEmbeddingGenerator ( #1018 ) ( 4af5bbb )
Add ml.model_selection.cross_validate support ( #1020 ) ( 1a38063 )
Allow access of struct fields with dot operators on Series ( #1019 ) ( ef76f13 )

Bug Fixes

Ensure no double execution for to_pandas ( #1032 ) ( 4992cc2 )
Remove pre-caching of remote function results ( #1028 ) ( 0359bc8 )

Documentation

Add ml cross-validation notebook ( #1037 ) ( 057f3f0 )

1.20.0 (2024-09-25)

Features

Add bigframes.bigquery.approx_top_count ( #1010 ) ( 3263bd7 )
Add bigframes.ml.compose.SQLScalarColumnTransformer to create custom SQL-based transformations ( #955 ) ( 1930b4e )
Allow multiple columns input for llm models ( #998 ) ( 2fe5e48 )

Bug Fixes

Fix __repr__ caching with partial ordering ( #1016 ) ( 208a984 )

Documentation

Limit pypi notebook to 7 days and add more info about differences with partial ordering mode ( #1013 ) ( 3c54399 )
Move and edit existing linear-regression tutorial snippet ( #991 ) ( 4cb62fd )

1.19.0 (2024-09-24)

Features

Add ml.model_selection.KFold class ( #1001 ) ( 952cab9 )
Support bool and bytes types in describe(include='all') ( #994 ) ( cc48f58 )
Support ingress settings in remote_function ( #1011 ) ( 8e9919b )

Bug Fixes

Fix miscasting issues with case_when ( #1003 ) ( 038139d )

Performance Improvements

Join op discards child ordering in unordered mode ( #923 ) ( 1b5b0ee )

Dependencies

Update ibis version in prerelease tests ( #1012 ) ( f89785f )

1.18.0 (2024-09-18)

Features

Add “include” param to describe for string types ( #973 ) ( deac6d2 )
Add subset parameter to DataFrame.dropna to select which columns to consider ( #981 ) ( f7c03dc )

Bug Fixes

DataFrameGroupBy.agg now works with unnamed tuples ( #985 ) ( 0f047b4 )
Fix a bug that raises exception when re-indexing columns with their original order ( #988 ) ( 596b03b )
Make the Series.apply outcome assignable to the original dataframe in partial ordering mode ( #874 ) ( c94ead9 )

Dependencies

Limit ibis-framework version to 9.2.0 ( #989 ) ( 06c1b33 )
Update to ibis-framework 9.x and newer sqlglot ( #827 ) ( 89ea44f )

1.17.0 (2024-09-11)

Features

Add __version__ alias to bigframes.pandas ( #967 ) ( 9ce10b4 )
Add Gemini 1.5 stable models support ( #945 ) ( c1cde19 )
Allow setting table labels in to_gbq ( #941 ) ( cccc6ca )
Define list accessor for bigframes Series ( #946 ) ( 8e8279d )
Enable read_csv() to process other files ( #940 ) ( 3b35860 )
Include the bigframes package version alongside the feedback link in error messages ( #936 ) ( 7b59b6d )

Bug Fixes

Astype Decimal to Int64 conversion. ( #957 ) ( 27764a6 )
Make read_gbq_function work for multi-param functions ( #947 ) ( c750be6 )
Support read_gbq_function for axis=1 application ( #950 ) ( 86e54b1 )

Documentation

Add docstring returns section to Options ( #937 ) ( a2640a2 )
Update title of pypi notebook example to reflect use of the PyPI public dataset ( #952 ) ( cd62e60 )

1.16.0 (2024-09-04)

Features

Add DataFrame.struct.explode to add struct subfields to a DataFrame ( #916 ) ( ad2f75e )
Implement bigframes.bigquery.json_extract_array ( #910 ) ( 575a29e )
Recover struct column from exploded Series ( #904 ) ( 7dd304c )

Bug Fixes

Fix issue with iterating on >10gb dataframes ( #949 ) ( 2b0f0fa )
Improve Series.replace for dict input ( #907 ) ( 4208044 )
NullIndex in ML model.predict error ( #917 ) ( 612271d )
Struct field non-nullable type issue. ( #914 ) ( 149d5ff )
Unordered mode errors in ml train_test_split ( #925 ) ( 85d7c21 )

Performance Improvements

Improve repr performance ( #918 ) ( 46f2dd7 )

Dependencies

Re-introduce support for numpy 1.24.x ( #931 ) ( 3d71913 )
Update minimum support to Pandas 1.5.3 and Pyarrow 10.0.1 ( #903 ) ( 7ed3962 )

Documentation

Add Claude3 ML and RemoteFunc notebooks ( #930 ) ( cfd16c1 )
Create sample notebook to manipulate struct and array data ( #883 ) ( 3031903 )
Update struct examples. ( #953 ) ( d632cd0 )
Use unstack() from BigQuery DataFrames instead of pandas in the PyPI sample notebook ( #890 ) ( d1883cc )

1.15.0 (2024-08-20)

Features

Add llm.TextEmbeddingGenerator to support new embedding models ( #905 ) ( 6bc6a41 )
Add ml.llm.Claude3TextGenerator model ( #901 ) ( 7050038 )

Documentation

Add columns for “requires ordering/index” to supported APIs summary ( #892 ) ( d2fc51a )
Remove duplicate description for kms_key_name ( #898 ) ( 1053d56 )
Update embedding model notebooks ( #906 ) ( d9b8ef5 )

1.14.0 (2024-08-14)

Features

Implement bigframes.bigquery.json_extract ( #868 ) ( 3dbf84b )
Implement Series.str.__getitem__ ( #897 ) ( e027b7e )

Bug Fixes

Fix caching from generating row numbers in partial ordering mode ( #872 ) ( 52b7786 )

Performance Improvements

Generate SQL with fewer CTEs ( #877 ) ( eb60804 )
Speed up compilation by reducing redundant type normalization ( #896 ) ( e0b11bc )

Documentation

Add streaming html docs ( #884 ) ( 171da6c )
Fix the DisplayOptions doc rendering ( #893 ) ( 3eb6a17 )
Update streaming notebook ( #887 ) ( 6e6f9df )

1.13.0 (2024-08-05)

Features

df.apply(axis=1) to support remote function with multiple params ( #851 ) ( 2158818 )
Allow windowing in ‘partial’ ordering mode ( #861 ) ( ca26fe5 )
Create a separate OrderingModePartialPreviewWarning for more fine-grained warning filters ( #879 ) ( 8753bdd )

Bug Fixes

Fix issue with invalid sql generated by ml distance functions ( #865 ) ( 9959fc8 )

Documentation

Create sample notebook using ordering_mode="partial" ( #880 ) ( c415eb9 )
Update streaming notebook ( #875 ) ( e9b0557 )

1.12.0 (2024-07-31)

Features

Add bigframes-mode label to query jobs ( #832 ) ( c9eaff0 )
Add config option to set partial ordering mode ( #855 ) ( 823c0ce )
Add stratify param support to ml.model_selection.train_test_split method ( #815 ) ( 27f8631 )
Add streaming.StreamingDataFrame class ( #864 ) ( a7d7197 )
Allow DataFrame.join for self-join on Null index ( #860 ) ( e950533 )
Support remote function cleanup with session.close ( #818 ) ( ed06436 )
Support to_csv/parquet/json to local files/objects ( #858 ) ( d0ab9cc )

Bug Fixes

Fewer relation joins from df self-operations ( #823 ) ( 0d24f73 )
Fix ‘sql’ property for null index ( #844 ) ( 1b6a556 )
Fix unordered mode using ordered path to print frame ( #839 ) ( 93785cb )
Reduce redundant remote_function deployments ( #856 ) ( cbf2d42 )

Documentation

Add partner attribution steps to integrations sample notebook ( #835 ) ( d7b333f )
Make get_global_session / close_session / reset_session appear in the docs ( #847 ) ( 01d6bbb )

1.11.1 (2024-07-08)

Documentation

Remove session and connection in llm notebook ( #821 ) ( 74170da )
Remove the experimental flask icon from the public docs ( #820 ) ( 067ff17 )

1.11.0 (2024-07-01)

Features

Add .agg support for size ( #792 ) ( 87e6018 )
Add bigframes.bigquery.json_set ( #782 ) ( 1b613e0 )
Add bigframes.streaming.to_pubsub method to create continuous query that writes to Pub/Sub ( #801 ) ( b47f32d )
Add DataFrame.to_arrow to create Arrow Table from DataFrame ( #807 ) ( 1e3feda )
Add PolynomialFeatures support to to_gbq and pipelines ( #805 ) ( 57d98b9 )
Add Series.peek to preview data efficiently ( #727 ) ( 580e1b9 )
Expose gcf memory param in remote_function ( #803 ) ( 014765c )
More informative error when query plan too complex ( #811 ) ( 136dc24 )

Bug Fixes

Include internally required packages in remote_function hash ( #799 ) ( 4b8fc15 )

Documentation

Document dtype limitation on row processing remote_function ( #800 ) ( 487dff6 )

1.10.0 (2024-06-21)

Features

Add dataframe.insert ( #770 ) ( e8bab68 )
Add groupby head API ( #791 ) ( 44202bc )
Add ml.preprocessing.PolynomialFeatures class ( #793 ) ( b4fbb51 )
bigframes.streaming module for continuous queries ( #703 ) ( 0433a1c )
Include index columns in DataFrame.sql if they are named ( #788 ) ( c8d16c0 )

Bug Fixes

Allow __repr__ to work with uninitialized DataFrame/Series/Index ( #778 ) ( e14c7a9 )
df.loc with the 2nd input as bigframes boolean Series ( #789 ) ( a4ac82e )
Ensure numpy version matches in remote_function deployment ( #798 ) ( 324d93c )
Fix temp table creation retries by not throwing if table already exists. ( #787 ) ( 0e57d1f )
Self-join optimization doesn’t needlessly invalidate caching ( #797 ) ( 1b96b80 )

1.9.0 (2024-06-10)

Features

Allow functions returned from bpd.read_gbq_function to execute outside of apply ( #706 ) ( ad7d8ac )
Support bigquery.vector_search() ( #736 ) ( dad66fd )
Support score() in GeminiTextGenerator ( #740 ) ( b2c7d8b )
Support bytes type in remote_function ( #761 ) ( 4915424 )
Support fit() in GeminiTextGenerator ( #758 ) ( d751f5c )

Bug Fixes

ARIMAPlus loads auto_arima_min_order param ( #752 ) ( 39d7013 )
Improve to_pandas_batches for large results ( #746 ) ( 61f18cb )
Resolve issue with unset thread-local options ( #741 ) ( d93dbaf )

Documentation

Fix ML.EVALUATE spelling ( #749 ) ( 7899749 )
Remove LogisticRegression normal_equation strategy ( #753 ) ( ea5d367 )

1.8.0 (2024-05-31)

Features

merge only generates a default index if both inputs already have an index ( #733 ) ( 25d049c )
Add +, - as unary ops, ^ binary op ( #724 ) ( 968d825 )
Add GroupBy.size() to get number of rows in each group ( #479 ) ( 1fca588 )
Add DataFrame ~ operator ( #721 ) ( 354abc1 )
Add GeminiText 1.5 Preview models ( #737 ) ( 56cbd3b )
Add slot_millis and add stats to session object ( #725 ) ( 72e9583 )
Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings ( #731 ) ( f12c906 )
Allow functions decorated with bpd.remote_function() to execute locally ( #704 ) ( d850da6 )
Ensure "bigframes-api" label is always set on jobs, even if the API is unknown ( #722 ) ( 1832778 )
Support ml.SimpleImputer in bigframes ( #708 ) ( 4c4415f )
Support type annotations to supply input and output types to bpd.remote_function() decorator ( #717 ) ( 4a12e3c )
Support type annotations with bpd.remote_function() and axis=1 (a preview feature) ( #730 ) ( e5a2992 )

Bug Fixes

Correct index labels in multiple aggregations for DataFrameGroupBy ( #723 ) ( 6a78c89 )
Fix Null index assign series to column ( #711 ) ( ffb4b57 )
Set bpd.remote_function()’s input_types and output_types default to None to allow omitting them when type annotations are present ( #729 ) ( 0e25a3b )
Warn and disable time travel for linked datasets ( #712 ) ( 085fa9d )

Performance Improvements

Optimize dataframe-series alignment on axis=1 ( #732 ) ( 3d39221 )

Documentation

Add examples to DataFrameGroupBy and SeriesGroupBy ( #701 ) ( e7da0f0 )

1.7.0 (2024-05-20)

Features

read_gbq_query supports filters ( 9386373 )
read_gbq suggests a correct column name when one is not found ( 9386373 )
Add DefaultIndexKind.NULL to use as index_col in read_gbq*, creating an indexless DataFrame/Series ( #662 ) ( 29e4886 )
bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupBy) ( #663 ) ( 412f28b )
to_datetime supports utc=False for string inputs ( #579 ) ( adf9889 )

Bug Fixes

read_gbq_table respects primary keys even when filters are set ( #689 ) ( 9386373 )
Fix type error in test_cluster ( #698 ) ( 14d81c1 )
Improve escaping of literals and identifiers ( #682 ) ( da9b136 )
Properly identify non-unique index in tables without primary keys ( #699 ) ( 6e0f4d8 )
Remove a usage of the resource package when not available, such as on Windows ( #681 ) ( 96243f2 )
The imported samples error and use peek() ( #688 ) ( 1a0b744 )

Performance Improvements

Don’t run query immediately from read_gbq_table if filters is set ( 9386373 )
Use a LIMIT clause when max_results is set ( 9386373 )

Documentation

Add code snippets for imported onnx tutorials ( #684 ) ( cb36e46 )
Add code snippets for imported tensorflow model ( #679 ) ( b02c401 )
Use class_weight="balanced" in the logistic regression prediction tutorial ( #678 ) ( b951549 )

1.6.0 (2024-05-13)

Features

Add DataFrame.__delitem__ ( #673 ) ( 2218c21 )
Add Series.case_when() ( #673 ) ( 2218c21 )
Add strategy="quantile" in KBinsDiscretizer ( #654 ) ( c6c487f )
Add Series.combine ( #680 ) ( 2fd1b81 )
Series.str.split ( #675 ) ( 6eb19a7 )
Suggest correct options in bpd.options.bigquery.location ( #666 ) ( 57ccabc )
Support axis=1 in df.apply for scalar outputs ( #629 ) ( f6bdc4a )
Support gcf vpc connector in remote_function ( #677 ) ( 9ca92d0 )
Warn with a more specific DefaultLocationWarning category when no location can be detected ( #648 ) ( e084e54 )

Bug Fixes

Include index_col when selecting columns and filters in read_gbq_table ( #648 ) ( e084e54 )

Dependencies

Add jellyfish as a dependency for spelling correction ( 57ccabc )

Documentation

Add code snippets for llm text generation ( #669 ) ( 93416ed )
Add logistic regression samples ( #673 ) ( 2218c21 )
Address lint errors in code samples ( #665 ) ( 4fc8964 )
Document inlining of small data in read_* APIs ( #670 ) ( 306953a )

1.5.0 (2024-05-07)

Features

bigframes.options and bigframes.option_context now use thread-local variables to prevent context managers in separate threads from affecting each other ( #652 ) ( 651fd7d )
Add ARIMAPlus.coef_ property exposing ML.ARIMA_COEFFICIENTS functionality ( #585 ) ( 81d1262 )
Add a unique session_id to Session and allow cleaning up sessions ( #553 ) ( c8d4e23 )
Add the bigframes.bigquery sub-package with a bigframes.bigquery.array_length function ( #630 ) ( 9963f85 )
Always do a query dry run when option.repr_mode == "deferred" ( #652 ) ( 651fd7d )
Custom query labels for compute options ( #638 ) ( f561799 )
Warn with DefaultIndexWarning from read_gbq on clustered/partitioned tables with no index_col or filters set ( #631 , #658 ) ( 2715d2b , 73064dd )
Support index_col=False in read_csv and engine="bigquery" ( 73064dd )
Support gcf max instance count in remote_function ( #657 ) ( 36578ab )

Bug Fixes

Don’t raise UnknownLocationWarning for US or EU multi-regions ( #653 ) ( 8e4616b )
Fix bug with na in the column labels in stack ( #659 ) ( 4a34293 )
Use explicit session in PaLM2TextGenerator ( #651 ) ( e4f13c3 )

Documentation

Add python code sample for multiple forecasting time series ( #531 ) ( 16866d2 )
Fix the Palm2TextGenerator output token size ( #649 ) ( c67e501 )

1.4.0 (2024-04-29)

Features

Add .cache() method to persist intermediate dataframe ( #626 ) ( a5c94ec )
Add transpose support for small homogeneously typed DataFrames. ( #621 ) ( 054075d )
Allow single input type in remote_function ( #641 ) ( 3aa643f )
Expose gcf max timeout in remote_function ( #639 ) ( dfeaad0 )
Series binary ops compatible with more types ( #618 ) ( 518d315 )
Support the score method for PaLM2TextGenerator ( #634 ) ( 3ffc1d2 )

Bug Fixes

Allow to_pandas to download more than 10GB ( #637 ) ( ce56495 )
Extend row hash to 128 bits to guarantee unique row id ( #632 ) ( 9005c6e )
Llm fine tuning tests ( #627 ) ( 4724a1a )
Llm palm score tests ( #643 ) ( cf4ec3a )

Performance Improvements

Automatically condense internal expression representation ( #516 ) ( 03c1b0d )
Cache transpose to allow performant retranspose ( #635 ) ( 44b738d )

Documentation

Add supported pandas apis on the main page ( #628 ) ( 8d2a51c )
Add the first sample for the Single time-series forecasting from Google Analytics data tutorial ( #623 ) ( 2b84c4f )
Address more technical writers’ feedback ( #640 ) ( 1e7793c )

1.3.0 (2024-04-22)

Features

Add Series.struct.dtypes property ( #599 ) ( d924ec2 )
Add fine tuning fit() for Palm2TextGenerator ( #616 ) ( 9c106bd )
Add quantile statistic ( #613 ) ( bc82804 )
Expose max_batching_rows in remote_function ( #622 ) ( 240a1ac )
Support primary key(s) in read_gbq by using as the index_col by default ( #625 ) ( 75bb240 )
Warn if location is set to unknown location ( #609 ) ( 3706b4f )

Bug Fixes

Address technical writers’ feedback ( #611 ) ( 9f8f181 )
Infer narrowest numeric type when combining numeric columns ( #602 ) ( 8f9ece6 )
Use exact median implementation by default ( #619 ) ( 9d205ae )

Documentation

Fix rendering of examples for multiple apis ( #620 ) ( 9665e39 )
Set index_cols in read_gbq as a best practice ( #624 ) ( 70015b7 )

1.2.0 (2024-04-15)

Features

Add hasnans, combine_first, update to Series ( #600 ) ( 86e0f38 )
Add MultiIndex subclass. ( #596 ) ( 5d0f149 )
Add pivot_table for DataFrame. ( #473 ) ( 5f1d670 )
Add Series.autocorr ( #605 ) ( 4ec8034 )
Support list of numerics in pandas.cut ( #580 ) ( 290f95d )

Bug Fixes

Address more technical writers feedback ( #581 ) ( 4b08d92 )
Error for object dtype on read_pandas ( #570 ) ( 8702dcf )
Inverting int now does bitwise inversion rather than sign flip ( #574 ) ( 5f1db8b )
Loc setitem dtype issue. ( #603 ) ( b94bae9 )
Toc menu missing plotting name ( #591 ) ( eed12c1 )

Documentation

(Series|Dataframe).dtypes ( #598 ) ( edef48f )
Add code samples for str accessor methods ( #594 ) ( a557ea2 )
Add docs for DataFrame and Series dunder methods ( #562 ) ( 8fc26c4 )
Add examples for at/iat ( #582 ) ( 3be4a2e )

1.1.0 (2024-04-04)

Features

(Series|DataFrame).explode ( #556 ) ( 9e32f57 )
Add DataFrame.eval and DataFrame.query ( #361 ) ( 5e28ebd )
Add ColumnTransformer save/load ( #541 ) ( 9d8cf67 )
Add ml.metrics.mean_squared_error ( #559 ) ( 853c25e )
Add support for numpy expm1, log1p, floor, ceil, arctan2 ops ( #505 ) ( e8e66cf )
Add transformers save/load ( #552 ) ( d805241 )
Allow DataFrame binary ops to align on either axis and with loc… ( #544 ) ( 6d8f3af )
Expose DataFrame.bqclient to assist in integrations ( #519 ) ( 0be8911 )
read_pandas accepts pandas Series and Index objects ( #573 ) ( f8821fe )
Support ML.GENERATE_EMBEDDING in PaLM2TextEmbeddingGenerator ( #539 ) ( 1156c1e )
Support max_columns in repr and make repr more efficient ( #515 ) ( 54e49cf )

Bug Fixes

Assign NaN scalar to column error. ( #513 ) ( 0a4153c )
Don’t download 100gb onto local python machine in load test ( #537 ) ( 082c58b )
Exclude list-like s parameter in plot.scatter ( #568 ) ( 1caac27 )
Fix case where df.peek would fail to execute even with force=True ( #511 ) ( 8eca99a )
Fix error in Series.drop(0) ( #575 ) ( 75dd786 )
Include all names in MultiIndex repr ( #564 ) ( b188146 )
Plot.scatter s parameter cannot accept float-like column ( #563 ) ( 8d39187 )
Product operation produces float result for all input types ( #501 ) ( 6873b30 )
Reloaded transformer .transform error ( #569 ) ( 39fe474 )
Rename PaLM2TextEmbeddingGenerator.predict output columns to be backward compatible ( #561 ) ( 4995c00 )
Respect hard stack size limit and swallow limit change exception. ( #558 ) ( 4833908 )
Restore string to date/time type coercion ( #565 ) ( 4ae0262 )
Sync the notebook with embedding changes ( #550 ) ( 347f2dd )
Use bytes limit on frame inlining rather than element count ( #576 ) ( 659a161 )

Performance Improvements

Add multi-query execution capability for complex dataframes ( #427 ) ( d2d7e33 )

Dependencies

Include pyarrow as a dependency ( #529 ) ( 9b1525a )

Documentation

bigframes.options.bigquery.project and location are optional in some circumstances ( #548 ) ( 90bcec5 )
Add “Supported pandas APIs” reference to the documentation ( #542 ) ( 74c3915 )
Add General Availability banner to README ( #507 ) ( 262ff59 )
Add operations in API docs ( #557 ) ( ea95761 )
Add progress_bar code sample ( #508 ) ( 92a1af3 )
Add the code samples for metrics{auc, roc_auc_score, roc_curve} ( #520 ) ( 5f37b09 )
Address more comments from technical writers to meet legal purposes ( #571 ) ( 9084df3 )
Fix docs of ARIMAPlus.predict ( #512 ) ( 3b80f95 )
Include Index in table-of-contents ( #564 ) ( b188146 )
Mark Gemini model as Pre-GA ( #543 ) ( 769868b )
Migrate the overview page to Bigframes official landing page ( #536 ) ( a0fb8bb )

1.0.0 (2024-03-25)

⚠ BREAKING CHANGES

rename model parameter min_rel_progress to tol
early_stop setting no longer supported, always uses True
rename model parameter n_parallell_trees to n_estimators
rename class_weights to class_weight
rename learn_rate to learning_rate
PCA n_components supports float value and None, default to None
rename various ml model parameters for consistency with sklearn ( https://github.com/googleapis/python-bigquery-dataframes/pull/491 )
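
As a rough migration aid for the renames listed above, the sketch below maps old keyword names to new ones. The old and new names are copied verbatim from the entries above. The helper `migrate_kwargs` is hypothetical and is not part of bigframes, and the changelog does not say which model classes each rename applies to, so check the model's own signature before relying on it.

```python
# Old -> new parameter names, copied from the 1.0.0 breaking-change entries.
RENAMED_PARAMS = {
    "min_rel_progress": "tol",
    "n_parallell_trees": "n_estimators",  # spelled as in the changelog entry
    "class_weights": "class_weight",
    "learn_rate": "learning_rate",
}

# early_stop is no longer supported; per the entry above it always uses True.
REMOVED_PARAMS = {"early_stop"}


def migrate_kwargs(old_kwargs: dict) -> dict:
    """Hypothetical helper: translate pre-1.0.0 keyword arguments."""
    new_kwargs = {}
    for name, value in old_kwargs.items():
        if name in REMOVED_PARAMS:
            continue  # drop settings that can no longer be passed
        new_kwargs[RENAMED_PARAMS.get(name, name)] = value
    return new_kwargs


migrated = migrate_kwargs(
    {"learn_rate": 0.1, "early_stop": False, "max_iterations": 20}
)
print(migrated)  # {'learning_rate': 0.1, 'max_iterations': 20}
```

Names that are not in either table pass through unchanged, so the helper can be applied to a whole keyword dictionary.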

Features

Add configuration option to read_gbq ( #401 ) ( 85cede2 )
Add ml ARIMAPlus model params ( #488 ) ( 352cb85 )
Add ml KMeans model params ( #477 ) ( 23a8d9a )
Add ml LogisticRegression model params ( #481 ) ( f959b65 )
Add ml PCA model params ( #474 ) ( fb5d83b )
Add params for LinearRegression model ( #464 ) ( 21b2188 )
Add support for Python 3.12 ( #231 ) ( df2976f )
Allow assigning directly to Series.name property ( #495 ) ( ad0e99e )
Ensure Series.str.len() can get length of array columns ( #497 ) ( 10c0446 )
Option to use bq connection without check ( #460 ) ( 0b3f8e5 )
PCA n_components supports float value and None, default to None ( 65c6f47 )
Rename class_weights to class_weight ( 65c6f47 )
Rename learn_rate to learning_rate ( 65c6f47 )
Rename model parameter min_rel_progress to tol ( 65c6f47 )
Rename model parameter n_parallell_trees to n_estimators ( 65c6f47 )
Rename various ml model parameters for consistency with sklearn ( https://github.com/googleapis/python-bigquery-dataframes/pull/491 ) ( 65c6f47 )
Support BQ regional endpoints for europe-west9, europe-west3, us-east4, and us-west1 ( #504 ) ( fbada4a )
Support dataframe.cov ( #498 ) ( c4beafd )
Support Series.dt.floor ( #493 ) ( 2dd01c2 )
Support Series.dt.normalize ( #483 ) ( 0bf1e91 )
Update plot sample to 1000 rows ( #458 ) ( 60d4a7b )

Bug Fixes

early_stop setting no longer supported, always uses True ( 65c6f47 )
Fix -1 offset lookups failing ( #463 ) ( 2dfb9c2 )
Plot.scatter c argument functionalities ( #494 ) ( d6ee994 )
Properly support format param for numerical input. ( #486 ) ( ae20c35 )
Re-enable to_csv and to_json related tests ( #468 ) ( 2b9a01d )
Sampling plot cannot preserve ordering if index is not ordered ( #475 ) ( a5345fe )
Use actual BigQuery types rather than ibis types in to_pandas ( #500 ) ( 82b4f91 )

Dependencies

Support pandas 2.2 ( #492 ) ( e2cf50e )

Documentation

Add code samples for metrics.{accuracy_score, confusion_matrix} ( #478 ) ( 3e3329a )
Add code samples for metrics.{recall_score, precision_score, f1_score} ( #502 ) ( 370fe90 )
Improve API documentation ( #489 ) ( 751266e )
Update bigquery connection documentation ( #499 ) ( 4bfe094 )
Update LLM + K-means notebook to handle partial failures ( #496 ) ( 97afad9 )

0.26.0 (2024-03-20)

⚠ BREAKING CHANGES

exclude remote models for .register() ( #465 )

Features

(Series|DataFrame).plot ( #438 ) ( 1c3e668 )
read_gbq_table supports LIKE as an operator in filters ( #454 ) ( d2d425a )
Add DataFrame.pipe() method ( #421 ) ( 95f5a6e )
Set force=True by default in DataFrame.peek() ( #469 ) ( 4e8e97d )
Support datetime related casting in (Series|DataFrame|Index).astype ( #442 ) ( fde339b )
Support Series.dt.strftime ( #453 ) ( 8f6e955 )

Bug Fixes

Any() on empty set now correctly returns False ( #471 ) ( f55680c )
DataFrame.dropna preserves column dtypes ( #457 ) ( 3bab1a9 )
Disable to_json and to_csv related tests ( #462 ) ( 874026d )
Exclude remote models for .register() ( #465 ) ( 73fe0f8 )
Fix broken link in covid notebook ( #450 ) ( adadb06 )
Fix broken multiindex loc cases ( #467 ) ( b519197 )
Fix grouping series on multiple other series ( #455 ) ( 3971bd2 )
Groupby aggregates no longer check if grouping keys are numeric ( #472 ) ( 4fbf938 )
Raise ValueError when read_pandas() receives a bigframes DataFrame ( #447 ) ( b28f9fd )
Series.(to_csv|to_json) leverages bq export ( #452 ) ( 718a00c )
Warn when read_gbq / read_gbq_table uses the snapshot time cache ( #441 ) ( e16a8c0 )

Documentation

Add code samples for ml.metrics.r2_score ( #459 ) ( 85fefa2 )
Add the docs for loc and iloc indexers ( #446 ) ( 14ab8d8 )
Add the pages for at and iat indexers ( #456 ) ( 340f0b5 )
Add version information to bug template ( #437 ) ( 91bd39e )
Indicate that project and location are optional in example notebooks ( #451 ) ( 1df0140 )

0.25.0 (2024-03-14)

Features

(Series|DataFrame).plot.(line|area|scatter) ( #431 ) ( 0772510 )
Support CMEK for remote_function cloud functions ( #430 ) ( 2fd69f4 )

0.24.0 (2024-03-12)

⚠ BREAKING CHANGES

read_parquet uses a “pandas” engine to parse files by default. Use engine="bigquery" for the previous behavior

Features

(Series|Dataframe).plot.hist() ( #420 ) ( 4aadff4 )
Add detect_anomalies to ml ARIMAPlus and KMeans models ( #426 ) ( 6df28ed )
Add engine parameter to read_parquet ( #413 ) ( 31325a1 )
Add ml PCA.detect_anomalies method ( #422 ) ( 8d82945 )
Support BYOSA in remote_function ( #407 ) ( d92ced2 )
Support CMEK for BQ tables ( #403 ) ( 9a678e3 )

Bug Fixes

Move third_party.bigframes_vendored to bigframes_vendored ( #424 ) ( 763edeb )
Only do row identity based joins when joining by index ( #356 ) ( 76b252f )
Read_pandas inline respects location ( #412 ) ( ae0e3ea )

Documentation

Add predict sample to samples/snippets/bqml_getting_started_test.py ( #388 ) ( 6a3b0cc )
Document minimum IAM requirement ( #416 ) ( 36173b0 )
Fix the note rendering for DataFrames methods: nlargest, nsmallest ( #417 ) ( 38bd2ba )

0.23.0 (2024-03-05)

Features

Add ml.metrics.pairwise.euclidean_distance ( #397 ) ( 1726588 )
Add TextEmbedding model version support ( #394 ) ( e0f1ab0 )

Bug Fixes

Code exception in remote_function now prevents retry and surfaces in the client ( #387 ) ( dd3643d )
Docs link for metrics.pairwise ( #400 ) ( a60aba7 )

Dependencies

Update ibis to version 8.0.0 and refactor remote_function to use ibis UDF method ( #277 ) ( 350499b )

Documentation

Update README to point to new summary pages ( #402 ) ( bfe2b23 )

0.22.0 (2024-02-27)

⚠ BREAKING CHANGES

rename cosine_similarity to paired_cosine_distances ( #393 )
move model optional args to kwargs ( #381 )

Features

Add DataFrame.corr() method ( #379 ) ( 67fd434 )
Add ml.metrics.pairwise.manhattan_distance ( #392 ) ( 9d31865 )
Enable regional endpoints for me-central2 ( #386 ) ( 469674d )

Bug Fixes

Avoid ibis warning for “database” table() method argument ( #390 ) ( a0490a4 )
Correct the numeric literal dtype ( #365 ) ( 93b02cd )
Rename cosine_similarity to paired_cosine_distances ( #393 ) ( 81ece46 )

Performance Improvements

Inline read_pandas for small data ( #383 ) ( 59b446b )

Dependencies

Add minimum version constraint for sqlglot to 19.9.0 ( #389 ) ( 8b62d77 )

Documentation

Add a code sample for creating a kmeans model ( #267 ) ( 4291d65 )
Fix bigframes.pandas.concat documentation ( #382 ) ( 234b61c )

Miscellaneous Chores

Release 0.22.0 ( #396 ) ( 8f73d9e )

Code Refactoring

Move model optional args to kwargs ( #381 ) ( 4037992 )

0.21.0 (2024-02-13)

Features

Add Series.cov method ( #368 ) ( 443db22 )
Add ml.llm.GeminiTextGenerator model ( #370 ) ( de1e0a4 )
Add ml.metrics.pairwise.cosine_similarity function ( #374 ) ( 126f566 )
Add XGBoostModel ( #363 ) ( d5518b2 )
Limited support of lambdas in Series.apply ( #345 ) ( 208e081 )
Support bigframes.pandas.to_datetime for scalars, iterables and series. ( #372 ) ( ffb0d15 )
Support read_gbq wildcard table path ( #377 ) ( 90caf86 )

Bug Fixes

Error message fix. ( #375 ) ( 930cf6b )

Documentation

Clarify ADC pre-auth in a non-interactive environment ( #348 ) ( 99a9e6e )

0.20.1 (2024-02-06)

Performance Improvements

Make repr cache the block where appropriate ( #350 ) ( 068879f )

Documentation

Add a sample to demonstrate the evaluation results ( #364 ) ( cff0919 )
Fix the DataFrame.apply code sample ( #366 ) ( 1866a26 )

0.20.0 (2024-01-30)

Features

Add DataFrame.peek() as an efficient alternative to head() results preview ( #318 ) ( 9c34d83 )
Add ARIMA_EVALUATE options in forecasting models ( #336 ) ( 73e997b )
Add Index constructor, repr, copy, get_level_values, to_series ( #334 ) ( e5d054e )
Improve error message for drive based BQ table reads ( #344 ) ( 0794788 )
Update cut to work without labels = False and show intervals as dict ( #335 ) ( 4ff53db )

Bug Fixes

Change default connection name in getting_started.ipynb ( #347 ) ( 677f014 )
Series iteration correctly returns values instead of index ( #339 ) ( 2c6af9b )

Documentation

Add code samples for Series.{between, cumprod} ( #353 ) ( 09a52fd )

0.19.2 (2024-01-22)

Bug Fixes

Read_gbq large response issue ( #332 ) ( b8178b9 )
Use object dtype for ARRAY columns in to_pandas() with pandas 1.x ( #329 ) ( 374ddb5 )

Documentation

Add DataFrame.applymap documentation ( #326 ) ( bd531a1 )
Add code samples for series methods ( #323 ) ( 32cc6fa )
Add remote model requirements ( #333 ) ( c91f70c )

0.19.1 (2024-01-17)

Bug Fixes

Handle multi-level columns for df aggregates properly ( #305 ) ( 5bb45ba )
Update max_output_token limitation. ( #308 ) ( 5cccd36 )

Documentation

Add code samples for Series.corr ( #316 ) ( 9150c16 )

0.19.0 (2024-01-09)

Features

Add ‘columns’ as an alias for ‘col_order’ ( #298 ) ( a01b271 )
Add Series dt.tz and dt.unit properties ( #303 ) ( 2e1a403 )
Add to_gbq() method for LLM models ( #299 ) ( dafbc1b )
Allow manually setting clustering_columns in dataframe.to_gbq ( #302 ) ( 9c21323 )
Support assigning to columns like a property ( #304 ) ( f645c56 )
Support upcasting numeric columns in concat ( #294 ) ( e3a056a )

Bug Fixes

DF.drop tuple input as multi-index ( #301 ) ( 21391a9 )
Fix bug converting non-string labels to sql ids ( #296 ) ( a61c5fe )

Documentation

Add code samples for Series.ffill and DataFrame.ffill ( #307 ) ( 1c63b45 )

0.18.0 (2024-01-02)

Features

Add dataframe.to_html ( #259 ) ( 2cd6489 )
Add IntervalIndex support to bigframes.pandas.cut ( #254 ) ( 6c1969a )
Add replace method to DataFrame ( #261 ) ( 5092215 )
Specific pyarrow mappings for decimal, bytes types ( #283 ) ( a1c0631 )

Bug Fixes

Dataframes to_gbq now creates dataset if it doesn’t exist ( #222 ) ( bac62f7 )
Exclude pandas 2.2.0rc0 to unblock prerelease tests ( #292 ) ( ac1a745 )
Fix DataFrameGroupby.agg() issue with as_index=False ( #273 ) ( ab49350 )
Make Series.str.replace work for simple strings ( #285 ) ( ad67465 )
Update dataframe.to_gbq to dedup column names. ( #286 ) ( 746115d )
Use setuptools.find_namespace_packages ( #246 ) ( 9ec352a )

Dependencies

Migrate to ibis-framework >= "7.1.0" ( #53 ) ( 9798a2b )

Documentation

Add code snippets for explore query result page ( #278 ) ( 7cbbb7d )
Code samples for astype common to DataFrame and Series ( #280 ) ( 95b673a )
Code samples for DataFrame.copy and Series.copy ( #290 ) ( 7cbc2b0 )
Code samples for drop and fillna ( #284 ) ( 9c5012e )
Code samples for isna, isnull, dropna, isin ( #289 ) ( ad51035 )
Code samples for rename, size ( #293 ) ( eb69f60 )
Code samples for reset_index and sort_values ( #282 ) ( acc0eb7 )
Code samples for sample, get, Series.round ( #295 ) ( c2b1892 )
Code samples for Series.{add, replace, unique, T, transpose} ( #287 ) ( 0e1bbfc )
Code samples for Series.{map, to_list, count} ( #290 ) ( 7cbc2b0 )
Code samples for Series.{name, std, agg} ( #293 ) ( eb69f60 )
Code samples for Series.groupby and Series.{sum,mean,min,max} ( #280 ) ( 95b673a )
Code samples for DataFrame set_index, items ( #295 ) ( c2b1892 )
Fix the rendering for get_dummies ( #291 ) ( 252f3a2 )

0.17.0 (2023-12-14)

Features

Add filters argument to read_gbq for enhanced data querying ( #198 ) ( 034f71f )
Add module/class level api tracking ( #272 ) ( 4f3db3d )
Deprecate use_regional_endpoints ( #199 ) ( 319a1f2 )

Bug Fixes

Increase recursion limit, cache compilation tree hashes ( #184 ) ( b54791c )
Replaced raise NotImplementedError with return NotImplemented ( #258 ) ( a133822 )

Documentation

Add code samples for values and value_counts ( #249 ) ( f247d95 )
Add sample for getting started with BQML ( #141 ) ( fb14f54 )

0.16.0 (2023-12-12)

Features

Add ARIMAPlus.predict parameters ( #264 ) ( 99598c7 )
Add DataFrame from_dict and from_records methods ( #244 ) ( 8d81e24 )
Add DataFrame.select_dtypes method ( #242 ) ( 1737acc )
Add nunique method to Series/DataFrameGroupby ( #256 ) ( c8ec245 )
Support dataframe.loc with conditional columns selection ( #233 ) ( 3febea9 )

Bug Fixes

Enforce pandas version requirement <2.1.4 ( #265 ) ( 9dd63f6 )
Exclude pandas 2.1.4 from prerelease tests to unblock e2e tests ( b02fc2c )
Fix value_counts column label for normalize=True ( #245 ) ( d3fa6f2 )
Migrate e2e tests to bigframes-load-testing project ( 8766ac6 )
Ml.sql logic ( #262 ) ( 68c6fdf )
Update the llm_kmeans notebook ( #247 ) ( 66d1839 )

Documentation

Add code samples for shape and head ( #257 ) ( 5bdcc65 )
Add example for dataframe.melt, dataframe.pivot, dataframe.stac… ( #252 ) ( 8c63697 )
Add example to dataframe.nlargest, dataframe.nsmallest, datafra… ( #234 ) ( e735412 )
Add examples for dataframe.cummin, dataframe.cummax, dataframe.cumsum, dataframe.cumprod ( #243 ) ( 0523a31 )
Add examples for dataframe.nunique, dataframe.diff, dataframe.a… ( #251 ) ( 77074ec )
Correct the docs for option_context ( #263 ) ( d21c6dd )
Correct the params rendering for ml.remote and ml.ensemble modules ( #248 ) ( c2829e3 )
Fix return annotation in API docstrings ( #253 ) ( 89a1c67 )

0.15.0 (2023-11-29)

⚠ BREAKING CHANGES

model.predict returns all the columns ( #204 )

Features

Add info and memory_usage methods to dataframe ( #219 ) ( 9d6613d )
Add remote vertex model support ( #237 ) ( 0bfc4fb )
Add the recent api method for ML component ( #225 ) ( ed8876d )
Model.predict returns all the columns ( #204 ) ( 416171a )
Send warnings on LLM prediction partial failures ( #216 ) ( 81125f9 )

Bug Fixes

Add df snapshots lookup for read_gbq ( #229 ) ( d0d9b84 )
Avoid unnecessary row_number() on sort key for io ( #211 ) ( a18d40e )
Dedup special character ( #209 ) ( dd78acb )
Invalid JSON type of the notebook ( #215 ) ( a729831 )
Make to_pandas override enable_downsampling when sampling_method is manually set. ( #200 ) ( ae03756 )
Polish the llm+kmeans notebook ( #208 ) ( e8532b1 )
Update the llm+kmeans notebook with recent change ( #236 ) ( f8917ab )
Use anonymous dataset to create remote_function ( #205 ) ( 69b016e )

Documentation

Add code samples for index and column properties ( #212 ) ( c88d38e )
Add code samples for df reshaping, function, merge, and join methods ( #203 ) ( 010486c )
Add examples for dataframe.kurt, dataframe.std, dataframe.count ( #232 ) ( f9c6e72 )
Add examples for dataframe.mean, dataframe.median, dataframe.va… ( #228 ) ( edd0522 )
Add examples for dataframe.min, dataframe.max and dataframe.sum ( #227 ) ( 3a375e8 )
Code samples for Series.dot and DataFrame.dot ( #226 ) ( b62a07a )
Code samples for Series.where and Series.mask ( #217 ) ( 52dfad2 )
Code samples for dataframe.any, dataframe.all and dataframe.prod ( #223 ) ( d7957fa )
Make the code samples reflect default bq connection usage ( #206 ) ( 71844b0 )

Miscellaneous Chores

Release 0.15.0 ( #241 ) ( 6c899be )

0.14.1 (2023-11-16)

Bug Fixes

Correctly handle null values when initializing fingerprint ordering ( #210 ) ( 8324f13 )

Documentation

Add an example notebook about line graphs ( #197 ) ( f957b27 )

0.14.0 (2023-11-14)

Features

Add ‘cross’ join support ( #176 ) ( 765446a )
Add ‘index’, ‘pad’, ‘nearest’ interpolate methods ( #162 ) ( 6a28403 )
Add series.sample (identical to existing dataframe.sample) ( #187 ) ( 37914a4 )
Add unordered sql compilation ( #156 ) ( 58f420c )
Log most recent API calls as recent-bigframes-api-xx labels on BigQuery jobs ( #145 ) ( 4ea33b7 )
Read_gbq creates order deterministically without table copy ( #191 ) ( 8ab81de )
Support date_series.astype("string[pyarrow]") to cast DATE to STRING ( #186 ) ( aee0e8e )
Support series.at[row_label] = scalar ( #173 ) ( 0c8bd33 )
Temporary resources no longer use BigQuery Sessions ( #194 ) ( 4a02cac )

Bug Fixes

All sort operations are now stable ( #195 ) ( 3a2761f )
Default to 7 days expiration for read_csv, read_json, read_parquet ( #193 ) ( 03606cd )
Deprecate the remote_service_type in llm model ( #180 ) ( a8a409a )
For reset_index on unnamed multiindex, always use level_[n] label ( #182 ) ( f95000d )
Match pandas behavior when assigning listlike to empty dfs ( #172 ) ( c1d1f42 )
Use anonymous dataset instead of session dataset for temp tables ( #181 ) ( 800d44e )
Use random table for read_pandas ( #192 ) ( 741c75e )
Use random table when loading data for read_csv, read_json, read_parquet ( #175 ) ( 9d2e6dc )
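
For readers unfamiliar with the term used in the sort-stability fix above ( #195 ), a stable sort keeps rows with equal sort keys in their original relative order. The sketch below shows the property with Python's built-in `sorted`, which is documented as stable. It illustrates the concept only and does not show how bigframes implements it.

```python
# Rows of (label, key); two pairs of rows share the same key.
rows = [("b", 2), ("a", 1), ("c", 2), ("d", 1)]

# Sort by the key only. Ties keep their input order because sorted() is stable:
# "a" stays ahead of "d", and "b" stays ahead of "c".
by_value = sorted(rows, key=lambda row: row[1])

print(by_value)  # [('a', 1), ('d', 1), ('b', 2), ('c', 2)]
```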

Documentation

Add code samples for read_gbq_function using community UDFs ( #188 ) ( 7506eab )
Add docstring code samples for Series.apply and DataFrame.map ( #185 ) ( c816d84 )
Add llm kmeans notebook as an included example ( #177 ) ( d49ae42 )
Use head() to get top n results, not to preview results ( #190 ) ( 87f84c9 )

0.13.0 (2023-11-07)

Features

to_gbq without a destination table writes to a temporary table ( #158 ) ( e1817c9 )
Add DataFrame.__iter__, DataFrame.iterrows, DataFrame.itertuples, and DataFrame.keys methods ( #164 ) ( c065071 )
Add Series.__iter__ method ( #164 ) ( c065071 )
Add interpolate() to series and dataframe ( #157 ) ( b9cb55c )
Support 32k text-generation and multilingual embedding models ( #161 ) ( 5f0ea37 )

Bug Fixes

Update default temp table expiration to 7 days ( #174 ) ( 4ff26cd )

0.12.0 (2023-11-01)

Features

Add DataFrame.melt ( #113 ) ( 4e4409c )
Add DataFrame.to_pandas_batches() to download large DataFrame objects ( #136 ) ( 3afd4a3 )
Add bigframes.options.compute.maximum_bytes_billed option that sets maximum bytes billed on query jobs ( #133 ) ( 63c7919 )
Add pandas.qcut ( #104 ) ( 8e44518 )
Add pd.get_dummies ( #149 ) ( d8baad5 )
Add unstack to series, add level param ( #115 ) ( 5edcd19 )
Implement operator @ for DataFrame.dot ( #139 ) ( 79a638e )
Populate ibis version in user agent ( #140 ) ( c639a36 )

Bug Fixes

Don’t override the global logging config ( #138 ) ( 2ddbf74 )
Fix bug with column names under repeated column assignment ( #150 ) ( 29032d0 )
Resolve plotly rendering issue by using ipython html for job pro… ( #134 ) ( 39df43e )
Use indexee’s session for loc listlike cases ( #152 ) ( 27c5725 )

Documentation

Add arithmetic df sample code ( #153 ) ( ac44ccd )
Fix indentation on read_gbq_function code sample ( #163 ) ( 0801d96 )
Link to ML.EVALUATE BQML page for score() methods ( #137 ) ( 45c617f )

0.11.0 (2023-10-26)

Features

Add back reset_session as an alias for close_session ( #124 ) ( 694a85a )
Change query parameter to query_or_table in read_gbq ( #127 ) ( f9bb3c4 )

Bug Fixes

Expose bigframes.pandas.reset_session as a public API ( #128 ) ( b17e1f4 )
Use series’s own session in series.reindex listlike case ( #135 ) ( 95bff3f )

Documentation

Add runnable code samples for DataFrames I/O methods and property ( #129 ) ( 6fea8ef )
Add runnable code samples for reading methods ( #125 ) ( a669919 )

0.10.0 (2023-10-19)

Features

Implement DataFrame.dot for matrix multiplication ( #67 ) ( 29dd414 )

0.9.0 (2023-10-18)

⚠ BREAKING CHANGES

rename bigframes.pandas.reset_session to close_session ( #101 )

Features

Add bigframes.options.bigquery.application_name for partner attribution ( #117 ) ( 52d64ff )
Add AtIndexer getitems ( #107 ) ( 752b01f )
Rename bigframes.pandas.reset_session to close_session ( #101 ) ( 36693bf )
Send BigQuery cancel request when canceling bigframes process ( #103 ) ( e325fbb )
Support external packages in remote_function ( #98 ) ( ec10c4a )
Use ArrowDtype for STRUCT columns in to_pandas ( #85 ) ( 9238fad )

Bug Fixes

Support multiindex for three loc getitem overloads ( #113 ) ( 68e3cd3 )

Performance Improvements

If primary keys are defined, read_gbq avoids copying table data ( #112 ) ( e6c0cd1 )

Documentation

Add documentation for Series.struct.field and Series.struct.explode ( #114 ) ( a6dab9c )
Add open-source link in API doc ( #106 ) ( db51fe3 )
Update ML overview API doc ( #105 ) ( 1b3f3a5 )

0.8.0 (2023-10-12)

⚠ BREAKING CHANGES

The default behavior of to_parquet is changing from no compression to 'snappy' compression.

Features

Support compression in to_parquet ( a8c286f )

Bug Fixes

Create session dataset for remote functions only when needed ( #94 ) ( 1d385be )

0.7.0 (2023-10-11)

Features

Add aliases for several series properties ( #80 ) ( c0efec8 )
Add equals methods to series/dataframe ( #76 ) ( 636a209 )
Add iat and iloc accessing by tuples of integers ( #90 ) ( 228aeba )
Add level param to DataFrame.stack ( #88 ) ( 97b8bec )
Allow df.drop to take an index object ( #68 ) ( 740c451 )
Use default session connection ( #87 ) ( 4ae4ef9 )

Bug Fixes

Change the invalid url in docs ( #93 ) ( 969800d )

Documentation

Add more preprocessing models into the docs menu. ( #97 ) ( 1592315 )

0.6.0 (2023-10-04)

Features

Add df.unstack ( #63 ) ( 4a84714 )
Add idxmin, idxmax to series, dataframe ( #74 ) ( 781307e )
Add ml.preprocessing.KBinsDiscretizer ( #81 ) ( 24c6256 )
Add multi-column dataframe merge ( #73 ) ( c9fa85c )
Add update and align methods to dataframe ( #57 ) ( bf050cf )
Support STRUCT data type with Series.struct.field to extract child fields ( #71 ) ( 17afac9 )

Bug Fixes

Avoid 403 response too large to return error with read_gbq and large query results ( #77 ) ( 8f3b5b2 )
Change return type of Series.loc[scalar] ( #40 ) ( fff3d45 )
Fix df/series.iloc by list with multiindex ( #79 ) ( 971d091 )

0.5.0 (2023-09-28)

Features

Add DataFrame.kurtosis / DF.kurt method ( c1900c2 )
Add DataFrame.rolling and DataFrame.expanding methods ( c1900c2 )
Add items, apply methods to DataFrame. ( #43 ) ( 3adc1b3 )
Add axis param to simple df aggregations ( #52 ) ( 9cf9972 )
Add index dtype, astype, drop, fillna, aggregate attributes. ( #38 ) ( 1a254a4 )
Add ml.preprocessing.LabelEncoder ( #50 ) ( 2510461 )
Add ml.preprocessing.MaxAbsScaler ( #56 ) ( 14b262b )
Add ml.preprocessing.MinMaxScaler ( #64 ) ( 392113b )
Add more index methods ( #54 ) ( a6e32aa )
Support calculate_p_values parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support class_weights="balanced" in LogisticRegression model ( c1900c2 )
Support df[column_name] = df_only_one_column ( c1900c2 )
Support early_stop parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support enable_global_explain parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support l2_reg parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support learn_rate_strategy parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support ls_init_learn_rate parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support max_iterations parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support min_rel_progress parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support optimize_strategy parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support casting string to integer or float ( #59 ) ( 3502f83 )

Bug Fixes

Fix header skipping logic in read_csv ( #49 ) ( d56258c )
Generate unique ids on join to avoid id collisions ( #65 ) ( 7ab65e8 )
LabelEncoder params consistent with Sklearn ( #60 ) ( 632caec )
Loosen filter items tests to accommodate shifting pandas impl ( #41 ) ( edabdbb )

Performance Improvements

Add ability to cache dataframe and series to session table ( #51 ) ( 416d7cb )
Inline small Series and DataFrames in query text ( #45 ) ( 5e199ec )
Reimplement unpivot to use cross join rather than union ( #47 ) ( f9a93ce )
Simplify join order to use multiple order keys instead of string. ( #36 ) ( 5056da6 )

Documentation

Link to Remote Functions code samples from README and API reference ( c1900c2 )

0.4.0 (2023-09-16)

Features

Add axis parameter to droplevel and reorder_levels ( 7c6b0dd )
Add bfill and ffill to DataFrame and Series ( 7c6b0dd )
Add DataFrame.combine and DataFrame.combine_first ( #27 ) ( 7c6b0dd )
Add DataFrame.nlargest, nsmallest ( 7c6b0dd )
Add DataFrame.pct_change and Series.pct_change ( 7c6b0dd )
Add DataFrame.skew and GroupBy.skew ( 7c6b0dd )
Add DataFrame.to_dict, to_excel, to_latex, to_records, to_string, to_markdown, to_pickle, to_orc ( 7c6b0dd )
Add diff method to DataFrame and GroupBy ( 7c6b0dd )
Add filter and reindex to Series and DataFrame ( 7c6b0dd )
Add reindex_like to DataFrame and Series ( 7c6b0dd )
Add swaplevel to DataFrame and Series ( 7c6b0dd )
Add partial support for Series.replace ( 7c6b0dd )
Support DataFrame.loc[bool_series, column] = scalar ( 7c6b0dd )
Support a persistent name in remote_function ( 7c6b0dd )

Bug Fixes

remote_function uses same credentials as other APIs ( 7c6b0dd )
Add type hints to models ( 7c6b0dd )
Raise error when ARIMAPlus is used with Pipeline ( 7c6b0dd )
Remove transforms parameter in model.fit (breaking change) ( 7c6b0dd )
Support column joins with “None indexer” ( 7c6b0dd )
Use Int64Dtype for literals in cut ( 7c6b0dd )
Use lowercase strings for parameter literals in bigframes.ml (breaking change) ( 7c6b0dd )

Performance Improvements

Add bigframes-api label to I/O query jobs ( 7c6b0dd )

Documentation

Document possible parameter values for PaLM2TextGenerator ( 7c6b0dd )
Document region logic in README ( 7c6b0dd )
Fix OneHotEncoder sample ( 7c6b0dd )

0.3.2 (2023-09-06)

Bug Fixes

Make release.sh script for PyPI upload executable ( #20 ) ( 9951610 )

0.3.1 (2023-09-05)

Bug Fixes

release: Use correct directory name for release build config ( #17 ) ( 3dd25b3 )

0.3.0 (2023-09-02)

Features

Add bigframes.get_global_session() and bigframes.reset_session() aliases ( a32b747 )
Add bigframes.pandas.read_pickle function ( a32b747 )
Add components_, explained_variance_, and explained_variance_ratio_ properties to bigframes.ml.decomposition.PCA ( 89b9503 )
Add fit_transform to bigquery.ml transformers ( a32b747 )
Add Series.dropna and DataFrame.fillna ( 8fab755 )
Add Series.str methods isalpha, isdigit, isdecimal, isalnum, isspace, islower, isupper, zfill, center ( a32b747 )
Support bigframes.pandas.merge() ( 8fab755 )
Support DataFrame.isin with list and dict inputs ( 8fab755 )
Support DataFrame.pivot ( a32b747 )
Support DataFrame.stack ( 89b9503 )
Support DataFrame - DataFrame binary operations ( 8fab755 )
Support df[my_column] = [a python list] ( 89b9503 )
Support Index.is_monotonic ( 8fab755 )
Support np.arcsin, np.arccos, np.arctan, np.sinh, np.cosh, np.tanh, np.arcsinh, np.arccosh, np.arctanh, np.exp with Series argument ( 89b9503 )
Support np.sin, np.cos, np.tan, np.log, np.log10, np.sqrt, np.abs with Series argument ( 89b9503 )
Support pow() and power operator in DataFrame and Series ( 8fab755 )
Support read_json with engine=bigquery for newline-delimited JSON files ( 89b9503 )
Support Series.corr ( 89b9503 )
Support Series.map ( 8fab755 )
Support for np.add, np.subtract, np.multiply, np.divide, np.power ( 8fab755 )
Support MultiIndex for DataFrame columns ( a32b747 )
Use pandas.Index for column labels ( a32b747 )
Use default session and connection in ml.llm and ml.imported ( 8fab755 )

Bug Fixes

Add error message to set_index ( a32b747 )
Align column names with pandas in DataFrame.agg results ( 89b9503 )
Allow (but still not recommended) ORDER BY in read_gbq input when an index_col is defined ( 89b9503 )
Check for IAM role on the BigQuery connection when initializing a remote_function ( 89b9503 )
Check that types are specified in read_gbq_function ( a32b747 )
Don’t use query cache for Session construction ( a32b747 )
Include survey link in abstract NotImplementedError exception messages ( 89b9503 )
Label temp table creation jobs with source=bigquery-dataframes-temp label ( 89b9503 )
Make X_train argument names consistent across methods ( 8fab755 )
Raise AttributeError for unimplemented pandas methods ( 89b9503 )
Raise exception for invalid function in read_gbq_function ( a32b747 )
Support spaces in column names in DataFrame initializer ( 89b9503 )

Performance Improvements

Add local cache for __repr_*__ methods ( a32b747 )
Lazily instantiate client library objects ( 89b9503 )
Use row_number() filter for head / tail ( 8fab755 )

Documentation

Add ML section under Overview ( a32b747 )
Add release status to table of contents ( a32b747 )
Add samples and best practices to read_gbq docs ( a32b747 )
Correct the return types of Dataframe and Series ( a32b747 )
Create subfolders for notebooks ( a32b747 )
Fix link to GitHub ( 89b9503 )
Highlight bigframes is open-source ( a32b747 )
Sample ML Drug Name Generation notebook ( a32b747 )
Set options.bigquery.project in sample code ( 89b9503 )
Transform remote function user guide into sample code ( a32b747 )
Update remote function notebook with read_gbq_function usage ( 8fab755 )

0.2.0 (2023-08-17)

Features

Add KMeans.cluster_centers_.
Allow column labels to be any type handled by bq df; column labels can now be integers.
Add dataframegroupby.agg().
Add Series Property is_monotonic_increasing and is_monotonic_decreasing.
Add match, fullmatch, get, pad str methods.
Add series isin function.

Bug Fixes

Update ML package to use sessions for queries.
Optimize read_gbq with index_col set to cluster by index_col.
Raise ValueError if the location does not match.
read_gbq no longer uses ‘time travel’ with query inputs.

Documentation

Add docstring to _uniform_sampling to discourage users from calling it.

0.1.1 (2023-08-14)

Documentation

Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.

0.1.0 (2023-08-11)

Features

Add bigframes.pandas package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
Add bigframes.ml package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.

0.0.0 (2023-02-22)

Empty package to reserve package name.