Changelog

2.17.0 (2025-08-22)

Features

  • Add isin local execution impl ( #1993 ) ( 26df6e6 )

  • Add reset_index names, col_level, col_fill, allow_duplicates args ( #2017 ) ( c02a1b6 )

  • Support callable for series mask method ( #2014 ) ( 5ac32eb )
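To illustrate the callable form of mask listed above, here is a minimal sketch of the pandas-style semantics using plain Python lists. The `mask` helper and its signature are hypothetical stand-ins, not the bigframes implementation; the assumption is that a callable condition is evaluated against the data and returns one boolean per element, as in pandas.

```python
from typing import Callable, Sequence, Union

# A condition is either a list of booleans or a callable producing one.
Condition = Union[Sequence[bool], Callable[[Sequence], Sequence[bool]]]


def mask(values: Sequence, cond: Condition, other=None) -> list:
    """Replace entries where the condition is True (list stand-in for Series.mask)."""
    # A callable condition is evaluated against the data itself.
    flags = cond(values) if callable(cond) else cond
    if len(flags) != len(values):
        raise ValueError("condition must yield one boolean per value")
    return [other if flag else value for value, flag in zip(values, flags)]


data = [1, 5, 10, 20]
# Callable condition: replace every value greater than 5 with 0.
masked = mask(data, lambda vals: [v > 5 for v in vals], other=0)
print(masked)
```

The where method is the mirror image: it keeps values where the condition is True and replaces the rest.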

2.16.0 (2025-08-20)

Features

  • Add bigframes.pandas.options.display.precision option ( #1979 ) ( 15e6175 )

  • Add level, inplace params to reset_index ( #1988 ) ( 3446950 )

  • Add ML code samples from dbt blog post ( #1978 ) ( ebaa244 )

  • Add where, coalesce, fillna, casewhen, invert local impl ( #1976 ) ( f7f686c )

  • Adjust anywidget CSS to prevent overflow ( #1981 ) ( 204f083 )

  • Format page number in table widget ( #1992 ) ( e83836e )

  • Or, And, Xor can execute locally ( #1994 ) ( 59c52a5 )

  • Support callable bigframes function for dataframe where ( #1990 ) ( 44c1ec4 )

  • Support callable for series where method ( #2005 ) ( 768b82a )

  • When using repr_mode = "anywidget", numeric values align right ( 15e6175 )

Bug Fixes

  • Address the packages issue for bigframes function ( #1991 ) ( 68f1d22 )

  • Correct pypdf dependency specifier for remote PDF functions ( #1980 ) ( 0bd5e1b )

  • Enable default retries in calls to BQ Storage Read API ( #1985 ) ( f25d7bd )

  • Fix the copyright year in dbt sample files ( #1996 ) ( fad5722 )

Performance Improvements

  • Faster session startup by deferring anon dataset fetch ( #1982 ) ( 2720c4c )

Documentation

  • Add examples of running bigframes in kaggle ( #2002 ) ( 7d89d76 )

  • Remove preview warning from partial ordering mode sample notebook ( #1986 ) ( 132e0ed )

2.15.0 (2025-08-11)

Features

  • Add st_buffer, st_centroid, and st_convexhull and their corresponding GeoSeries methods ( #1963 ) ( c4c7fa5 )

  • Add first, last support to GroupBy ( #1969 ) ( 41dda88 )

  • Add value_counts to GroupBy classes ( #1974 ) ( 82175a4 )

  • Allow callable as a conditional or replacement input in DataFrame.where ( #1971 ) ( a8d57d2 )

  • Can cast locally in hybrid engine ( #1944 ) ( d9bc4a5 )

  • Df.join lsuffix and rsuffix support ( #1857 ) ( 26515c3 )

Bug Fixes

  • Add warnings for duplicated or conflicting type hints in bigfram… ( #1956 ) ( d38e42c )

  • Make remote_function more robust when there are create_function retries ( #1973 ) ( cd954ac )

  • Make ExecutionMetrics stats tracking more robust to missing stats ( #1977 ) ( feb3ff4 )

Performance Improvements

  • Remove an unnecessary extra dry_run query from read_gbq_table ( #1972 ) ( d17b711 )

Documentation

  • Divide BQ DataFrames quickstart code cell ( #1975 ) ( fedb8f2 )

2.14.0 (2025-08-05)

Bug Fixes

  • Enhance type error messages for bigframes functions ( #1958 ) ( 770918e )

Performance Improvements

  • Use promote_offsets for consistent row number generation for index.get_loc ( #1957 ) ( c67a25a )

Documentation

  • Add code snippet for storing dataframes to a CSV file ( #1943 ) ( a511e09 )

  • Add code snippet for storing dataframes to a CSV file ( #1953 ) ( a298a02 )

2.13.0 (2025-07-25)

Features

  • _read_gbq_colab creates hybrid session ( #1901 ) ( 31b17b0 )

  • Add CSS styling for TableWidget pagination interface ( #1934 ) ( 5b232d7 )

  • Add row numbering local pushdown in hybrid execution ( #1932 ) ( 92a2377 )

  • Implement Index.get_loc ( #1921 ) ( bbbcaf3 )

Bug Fixes

  • Add license header and correct issues in dbt sample ( #1931 ) ( ab01b0a )

Dependencies

  • Replace google-cloud-iam with grpc-google-iam-v1 ( #1864 ) ( e5ff8f7 )

2.12.0 (2025-07-23)

Features

  • Add code samples for dbt bigframes integration ( #1898 ) ( 7e03252 )

  • Add isin local execution to hybrid engine ( #1915 ) ( c0cefd3 )

  • Add ml.metrics.mean_absolute_error method ( #1910 ) ( 15b8449 )

  • Allow local arithmetic execution in hybrid engine ( #1906 ) ( ebdcd02 )

  • Provide day_of_year and day_of_week for dt accessor ( #1911 ) ( 40e7638 )

  • Support params max_batching_rows, container_cpu, and container_memory for udf ( #1897 ) ( 8baa912 )

  • Support typed pyarrow.Scalar in assignment ( #1930 ) ( cd28e12 )

Bug Fixes

  • Correct min field from max() to min() in remote function tests ( #1917 ) ( d5c54fc )

  • Resolve location reset issue in bigquery options ( #1914 ) ( c15cb8a )

  • Series.str.isdigit in unicode superscripts and fractions ( #1924 ) ( 8d46c36 )

Documentation

  • Add code snippets for session and IO public docs ( #1919 ) ( 6e01cbe )

  • Add snippets for performance optimization doc ( #1923 ) ( 4da309e )

2.11.0 (2025-07-15)

Features

  • Add __contains__ to Index, Series, DataFrame ( #1899 ) ( 07222bf )

  • Add thresh param for Dataframe.dropna ( #1885 ) ( 1395a50 )

  • Add concat pushdown for hybrid engine ( #1891 ) ( 813624d )

  • Add pagination buttons (prev/next) to anywidget mode for DataFrames ( #1841 ) ( 8eca767 )

  • Add total_rows property to pandas batches iterator ( #1888 ) ( e3f5e65 )

  • Hybrid engine local join support ( #1900 ) ( 1aa7950 )

  • Support date data type for to_datetime() ( #1902 ) ( 24050cb )

  • Support bpd.Series(json_data, dtype="json") ( #1882 ) ( 05cb7d0 )
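As a rough illustration of the thresh parameter for dropna listed above, the sketch below applies the pandas-style rule (keep a row only if it has at least thresh non-missing values) to plain Python lists. The helper names are hypothetical and this is not the bigframes implementation.

```python
import math
from typing import Any, List, Sequence


def is_missing(value: Any) -> bool:
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def dropna_thresh(rows: Sequence[Sequence[Any]], thresh: int) -> List[list]:
    """Keep rows holding at least `thresh` non-missing values."""
    kept = []
    for row in rows:
        present = sum(1 for value in row if not is_missing(value))
        if present >= thresh:
            kept.append(list(row))
    return kept


rows = [
    [1, 2.0, "a"],
    [None, 2.0, "b"],
    [None, float("nan"), "c"],
]
# The last row has only one non-missing value, so thresh=2 drops it.
result = dropna_thresh(rows, thresh=2)
print(result)
```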

Bug Fixes

  • Bpd.merge on common columns ( #1905 ) ( a1fa112 )

  • DataFrame string addition respects order ( #1894 ) ( 52c8233 )

  • Show slot_millis_sum warning only when allow_large_results=False ( #1892 ) ( 25efabc )

  • Use query row count metadata instead of table metadata ( #1893 ) ( e1ebc53 )

2.10.0 (2025-07-08)

Features

  • df.to_pandas_batches() returns one empty DataFrame if df is empty ( #1878 ) ( e43d15d )

  • Add filter pushdown to hybrid engine ( #1871 ) ( 6454aff )

  • Add simple stats support to hybrid local pushdown ( #1873 ) ( 8715105 )

Bug Fixes

  • Fix issues where duration type returned as int ( #1875 ) ( f30f750 )

Documentation

  • Update gsutil commands to gcloud commands ( #1876 ) ( c289f70 )

2.9.0 (2025-06-30)

Features

  • Add bpd.read_arrow to convert an Arrow object into a bigframes DataFrame ( #1855 ) ( 633bf98 )

  • Add experimental polars execution ( #1747 ) ( daf0c3b )

  • Add size op support in local engine ( #1865 ) ( 942e66c )

  • Create deploy_remote_function and deploy_udf functions to immediately deploy functions to BigQuery ( #1832 ) ( c706759 )

  • Support index item assign in Series ( #1868 ) ( c5d251a )

  • Support item assignment in series ( #1859 ) ( 25684ff )

  • Support local execution of comparison ops ( #1849 ) ( 1c45ccb )

Bug Fixes

  • Fix bug selecting column repeatedly ( #1858 ) ( cc339e9 )

  • Fix bug with DataFrame.agg for string values ( #1870 ) ( 81e4d64 )

  • Generate GoogleSQL instead of legacy SQL data types for dry_run=True from bpd._read_gbq_colab with local pandas DataFrame ( #1867 ) ( fab3c38 )

  • Revert dict back to protobuf in the iam binding update ( #1838 ) ( 9fb3cb4 )

2.8.0 (2025-06-23)

⚠ BREAKING CHANGES

  • add required param ‘engine’ to multimodal functions ( #1834 )

Features

  • Add bpd.options.compute.maximum_result_rows option to limit client data download ( #1829 ) ( e22a3f6 )

  • Add bpd.options.display.repr_mode = "anywidget" to create an interactive display of the results ( #1820 ) ( be0a3cf )

  • Add DataFrame.ai.forecast() support ( #1828 ) ( 7bc7f36 )

  • Add describe() method to Series ( #1827 ) ( a4205f8 )

  • Add required param ‘engine’ to multimodal functions ( #1834 ) ( 37666e4 )

2.7.0 (2025-06-16)

Features

  • Add bbq.json_query_array and warn bbq.json_extract_array deprecated ( #1811 ) ( dc9eb27 )

  • Add bbq.json_value_array and deprecate bbq.json_extract_string_array ( #1818 ) ( 019051e )

  • Add groupby cumcount ( #1798 ) ( 18f43e8 )

  • Support custom build service account in remote_function ( #1796 ) ( e586151 )

Bug Fixes

  • Correct read_csv behaviours with use_cols, names, index_col ( #1804 ) ( 855031a )

  • Fix single row broadcast with null index ( #1803 ) ( 080eb7b )

Documentation

  • Document how to use ai.map() for information extraction ( #1808 ) ( b586746 )

  • Rearrange README.rst to include a short code sample ( #1812 ) ( f6265db )

  • Use pandas API instead of pandas-like or pandas-compatible ( #1825 ) ( aa32369 )

2.6.0 (2025-06-09)

Bug Fixes

  • Address read_csv with both index_col and use_cols behavior inconsistency with pandas ( #1785 ) ( ba7c313 )

  • Allow KMeans model init parameter as k-means++ alias ( #1790 ) ( 0b59cf1 )

  • Replace function can now handle bpd.NA values ( #1786 ) ( 7269512 )

Documentation

  • Adjust strip method examples to match latest pandas ( #1797 ) ( 817b0c0 )

  • Fix docstrings to improve html rendering of code examples ( #1788 ) ( 38d9b73 )

2.5.0 (2025-05-30)

⚠ BREAKING CHANGES

  • the updated ai.map() parameter list is not backward-compatible

Features

  • Add bpd.options.bigquery.requests_transport_adapters option ( #1755 ) ( bb45db8 )

  • Add bbq.json_query and warn bbq.json_extract deprecated ( #1756 ) ( ec81dd2 )

  • Add bpd.options.reset() method ( #1743 ) ( 36c359d )

  • Add DataFrame.round method ( #1742 ) ( 3ea6043 )

  • Add deferred data uploading ( #1720 ) ( 1f6442e )

  • Add deprecation warning to Gemini-1.5-X, text-embedding-004, and remove legacy models in notebooks and docs ( #1723 ) ( 80aad9a )

  • Add structured output for ai map, ai filter and ai join ( #1746 ) ( 133ac6b )

  • Add support for df.loc list, column(s) ( 768a757 )

  • Include bq schema and query string in dry run results ( #1752 ) ( bb51147 )

  • Support inplace=True in rename and rename_axis ( #1744 ) ( 734cc65 )

  • Support unique() for Index ( #1750 ) ( 27fac78 )

  • Support astype conversions to and from JSON dtypes ( #1716 ) ( 8ef4de1 )

  • Support dict param for dataframe.agg() ( #1772 ) ( f9c29c8 )

  • Support dtype parameter in read_csv for bigquery engine ( #1749 ) ( 50dca4c )

  • Use read api for some peek ops ( #1731 ) ( 108f4d2 )

Bug Fixes

  • Fix clip int series with float bounds ( #1739 ) ( d451aef )

  • Fix error with self-merge operations ( #1774 ) ( e5fe143 )

  • Fix the default value for na_value for numpy conversions ( #1766 ) ( 0629cac )

  • Include location in Session-based temporary storage manager DDL queries ( #1780 ) ( acba032 )

  • Prevent creating unnecessary client objects in multithreaded environments ( #1757 ) ( 1cf9f5e )

  • Reduce bigquery table modification via DML for to_gbq ( #1737 ) ( 545cdca )

  • Stop ignoring arguments to MatrixFactorization.score(X, y) ( #1726 ) ( 55c07e9 )

  • Support JSON and STRUCT for bbq.sql_scalar ( #1754 ) ( 190390b )

  • Support str.replace re.compile with flags ( #1736 ) ( f8d2cd2 )

Performance Improvements

  • Faster local data comparison using identity ( #1738 ) ( 2858b1e )

  • Optimize repr for unordered gbq table ( #1778 ) ( 2bc4fbc )

  • Use JOB_CREATION_OPTIONAL when allow_large_results=False ( #1763 ) ( 15f3f2a )

Documentation

  • Add llm output_schema notebook ( #1732 ) ( b2261cc )

  • Add MatrixFactorization to the table of contents ( #1725 ) ( 611e43b )

  • Fix typo for “population” in the GeminiTextGenerator.predict(..., output_schema={...}) sample notebook ( #1748 ) ( bd07e05 )

  • Integrations notebook extracts token from bqclient._http.credentials instead of bqclient._credentials ( #1784 ) ( 6e63eca )

  • Updated multimodal notebook instructions ( #1745 ) ( 1df8ca6 )

  • Use partial ordering mode in the quickstart sample ( #1734 ) ( 476b7dd )

2.4.0 (2025-05-12)

Features

  • Add “dayofyear” property for dt accessors ( #1692 ) ( 9d4a59d )

  • Add .dt.days, .dt.seconds, .dt.microseconds, and .dt.total_seconds() for timedelta series. ( #1713 ) ( 2b3a45f )

  • Add DatetimeIndex class ( #1719 ) ( c3c830c )

  • Add isocalendar() for dt accessor ( #1717 ) ( 0479763 )

  • Add bigframes.bigquery.json_value ( #1697 ) ( 46a9c53 )

  • Add blob.exif function support ( #1703 ) ( 3f79528 )

  • Add inplace arg support to sort methods ( #1710 ) ( d1ccb52 )

  • Improve error message in Series.apply for direct udfs ( #1673 ) ( 1a658b2 )

  • Publish bigframes blob(Multimodal) to preview ( #1693 ) ( e4c85ba )

  • Support () operator between timedeltas ( #1702 ) ( edaac89 )

  • Support forecast_limit_lower_bound and forecast_limit_upper_bound in ARIMA_PLUS (and ARIMA_PLUS_XREG) models ( #1305 ) ( b16740e )

  • Support to_strip parameter for str.strip, str.lstrip and str.rstrip ( #1705 ) ( a84ee75 )

Bug Fixes

  • Fix dayofyear doc test ( #1701 ) ( 9b777a0 )

  • Fix issues with chunked arrow data ( #1700 ) ( e3289b7 )

  • Rename columns with protected names such as _TABLE_SUFFIX in to_gbq() ( #1691 ) ( 8ec6079 )

Performance Improvements

  • Defer query in read_gbq with wildcard tables ( #1661 ) ( 5c125c9 )

  • Rechunk result pages client side ( #1680 ) ( 67d8760 )

Documentation

  • Add snippets for Matrix Factorization tutorials ( #1630 ) ( 24b37ae )

  • Deprecate bpd.options.bigquery.allow_large_results in favor of bpd.options.compute.allow_large_results ( #1597 ) ( 18780b4 )

  • Include import statement in the bigframes code snippet ( #1699 ) ( 08d70b6 )

  • Include the clean-up step in the udf code snippet ( #1698 ) ( 48992e2 )

  • Move multimodal notebook out of experimental folder ( #1712 ) ( 68b6532 )

  • Update blob_display option in snippets ( #1714 ) ( 8b30143 )

2.3.0 (2025-05-06)

Features

  • Add dry_run parameter to read_gbq(), read_gbq_table() and read_gbq_query() ( #1674 ) ( 4c5dee5 )

Bug Fixes

  • Guarantee guid thread safety across threads ( #1684 ) ( cb0267d )

  • Support large lists of lists in bpd.Series() constructor ( #1662 ) ( 0f4024c )

  • Use value equality to check types for unix epoch functions and timestamp diff ( #1690 ) ( 81e8fb8 )

Performance Improvements

  • to_datetime() now avoids caching inputs unless data is inspected to infer format ( #1667 ) ( dd08857 )

Documentation

  • Add a visualization notebook to BigFrame samples ( #1675 ) ( ee062bf )

  • Fix spacing of k-means code snippet ( #1687 ) ( 99f45dd )

  • Update snippet for Create a k-means model tutorial ( #1664 ) ( 761c364 )

2.2.0 (2025-04-30)

Features

  • Add gemini-2.0-flash-001 and gemini-2.0-flash-lite-001 to fine tune score endpoints and multimodal endpoints ( #1650 ) ( 4fb54df )

  • Add GeminiTextGenerator.predict structured output ( #1653 ) ( 6199023 )

  • DataFrames.__getitem__ support for slice input ( #1668 ) ( 563f0cb )

  • Print right origin of PreviewWarning for the bpd.udf ( #1629 ) ( 48d10d1 )

  • Session.bytes_processed_sum will be updated when allow_large_re… ( #1669 ) ( ae312db )

  • Short circuit query for local scan ( #1618 ) ( e84f232 )

  • Support names parameter in read_csv for bigquery engine ( #1659 ) ( 3388191 )

  • Support passing list of values to bigframes.core.sql.simple_literal ( #1641 ) ( 102d363 )

  • Support write api as loading option ( #1617 ) ( c46ad06 )

Bug Fixes

  • DataFrame accessors are not populated ( #1639 ) ( 28afa2c )

  • Prefer remote schema instead of throwing on materialize conflicts ( #1644 ) ( 53fc25b )

  • Remove itertools.pairwise usage ( #1638 ) ( 9662745 )

  • Resolve issue where pre-release versions of google-auth are installed ( #1491 ) ( ebb7a5e )

  • Resolve some of the typo errors ( #1655 ) ( cd7fbde )
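The itertools.pairwise removal listed above is a common portability fix: itertools.pairwise only exists in Python 3.10 and later. The changelog does not state the motivation or the replacement used, so the sketch below is only one conventional stand-in, following the well-known tee-and-zip recipe.

```python
from itertools import tee
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def pairwise(iterable: Iterable[T]) -> Iterator[Tuple[T, T]]:
    """Yield overlapping pairs: (s0, s1), (s1, s2), ..."""
    first, second = tee(iterable)
    # Advance the second iterator by one so the two are offset.
    next(second, None)
    return zip(first, second)


pairs = list(pairwise([1, 2, 3, 4]))
print(pairs)
```

Inputs with fewer than two elements yield no pairs, matching the behavior of the standard-library function.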

Documentation

  • Add JSON data types notebook ( #1647 ) ( 9128c4a )

  • Add sample code snippets for udf ( #1649 ) ( 53caa8d )

  • Fix bq_dataframes_template notebook to work if partial ordering mode is enabled ( #1665 ) ( f442e7a )

  • Note that udf is in preview and must be python 3.11 compatible ( #1629 ) ( 48d10d1 )

2.1.0 (2025-04-22)

Features

  • Add bigframes.bigquery.st_distance function ( #1637 ) ( bf1ae70 )

  • Enable local json string validations ( #1614 ) ( 233347a )

  • Enhance read_csv index_col parameter support ( #1631 ) ( f4e5b26 )

Bug Fixes

  • Add retry for test_clean_up_via_context_manager ( #1627 ) ( 58e7cb0 )

  • Improve robustness of managed udf code extraction ( #1634 ) ( 8cc56d5 )

Documentation

  • Add code samples in the udf API docstring ( #1632 ) ( f68b80c )

2.0.0 (2025-04-17)

⚠ BREAKING CHANGES

  • make dataset and name params mandatory in udf ( #1619 )

  • Locational endpoints support is not available in BigFrames 2.0.

  • change default LLM model to gemini-2.0-flash-001, drop PaLM2TextGenerator and PaLM2TextEmbeddingGenerator ( #1558 )

  • change default ingress setting for remote_function to internal-only ( #1544 )

  • make remote_function params keyword only ( #1537 )

  • make remote_function default service account explicit ( #1537 )

  • set allow_large_results=False by default ( #1541 )

Features

  • Add on parameter in dataframe.rolling() and dataframe.groupby.rolling() ( #1556 ) ( 45c9d9f )

  • Add component to manage temporary tables ( #1559 ) ( 0a4e245 )

  • Add Series.to_pandas_batches() method ( #1592 ) ( 09ce979 )

  • Add support for creating a Matrix Factorization model ( #1330 ) ( b5297f9 )

  • Allow input_types, output_type, and dataset to be used positionally in remote_function ( #1560 ) ( bcac8c6 )

  • Allow pandas.cut ‘labels’ parameter to accept a list of string ( #1549 ) ( af842b1 )

  • Change default ingress setting for remote_function to internal-only ( #1544 ) ( c848a80 )

  • Detect duplicate column/index names in read_gbq before sending the query. ( #1615 ) ( 40d6960 )

  • Drop support for locational endpoints ( #1542 ) ( 4bf2e43 )

  • Enable time range rolling for DataFrame, DataFrameGroupBy and SeriesGroupBy ( #1605 ) ( b4b7073 )

  • Improve local data validation ( #1598 ) ( 815e471 )

  • Make remote_function default service account explicit ( #1537 ) ( 9eb9089 )

  • Set allow_large_results=False by default ( #1541 ) ( e9fb712 )

  • Support bigquery connection in managed function ( #1554 ) ( f6f697a )

  • Support bq connection path format ( #1550 ) ( e7eb918 )

  • Support gemini-2.0-X models ( #1558 ) ( 3104fab )

  • Support inlining small list, struct, json data ( #1589 ) ( 2ce891f )

  • Support time range rolling on Series. ( #1590 ) ( 6e98a2c )

  • Use session temp tables for all ephemeral storage ( #1569 ) ( 9711b83 )

  • Use validated local storage for data uploads ( #1612 ) ( aee4159 )

  • Warn the deprecated max_download_size, random_state and sampling_method parameters in (DataFrame|Series).to_pandas() ( #1573 ) ( b9623da )

Bug Fixes

  • to_pandas_batches() respects page_size and max_results again ( #1572 ) ( 27c5905 )

  • Ensure page_size works correctly in to_pandas_batches when max_results is not set ( #1588 ) ( 570cff3 )

  • Include role and service account in IAM exception ( #1564 ) ( 8c50755 )

  • Make dataset and name params mandatory in udf ( #1619 ) ( 637e860 )

  • Pandas.cut returns labels index for numeric breaks when labels=False ( #1548 ) ( b2375de )

  • Prevent KeyError in bpd.concat with empty DF and struct/array types DF ( #1568 ) ( b4da1cf )

  • Read_csv supports tilde local paths and includes index for bigquery_stream write engine ( #1580 ) ( 352e8e4 )

  • Use dictionaries to avoid problematic google.iam namespace ( #1611 ) ( b03e44f )

Performance Improvements

  • Directly read gbq table for simple plans ( #1607 ) ( 6ad38e8 )

Documentation

  • Add details for bigquery_connection in @bpd.udf docstring ( #1609 ) ( ef63772 )

  • Add explain forecast snippet to multiple time series tutorial ( #1586 ) ( 40c55a0 )

  • Add message to remove default model for version 3.0 ( #1563 ) ( 910be2b )

  • Add samples for ArimaPlus time_series_id_col feature ( #1577 ) ( 1e4cd9c )

  • Add warning for bigframes 2.0 ( #1557 ) ( 3f0eaa1 )

  • Deprecate default model in TextEmbeddingGenerator, GeminiTextGenerator, and other bigframes.ml.llm classes ( #1570 ) ( 89ab33e )

  • Include all licenses for vendored packages in the root LICENSE file ( #1626 ) ( 8116ed0 )

  • Remove gemini-1.5 deprecation warning for GeminiTextGenerator ( #1562 ) ( 0cc6784 )

  • Use restructured text to allow publishing to PyPI ( #1565 ) ( d1e9ec2 )

Miscellaneous Chores

  • Make remote_function params keyword only ( #1537 ) ( 9eb9089 )

1.42.0 (2025-03-27)

Features

  • Add closed parameter in rolling() ( #1539 ) ( 8bcc89b )

  • Add GeoSeries.difference() and bigframes.bigquery.st_difference() ( #1471 ) ( e9fe815 )

  • Add GeoSeries.intersection() and bigframes.bigquery.st_intersection() ( #1529 ) ( 8542bd4 )

  • Add df.take and series.take ( #1509 ) ( 7d00be6 )

  • Add Linear_Regression.global_explain() ( #1446 ) ( 7e5b6a8 )

  • Allow iloc to support lists of negative indices ( #1497 ) ( a9cf215 )

  • Support dry_run in to_pandas() ( #1436 ) ( 75fc7e0 )

  • Support window partition by geo column ( #1512 ) ( bdcb1e7 )

  • Upgrade BQ managed udf to preview ( #1536 ) ( 4a7fe4d )

Bug Fixes

  • Add deprecation warning to TextEmbeddingGenerator model, especially gemini-1.0-X and gemini-1.5-X ( #1534 ) ( c93e720 )

  • Change the default value for pdf extract/chunk ( #1517 ) ( a70a607 )

  • Local data always has sequential index ( #1514 ) ( 014bd33 )

  • Read_pandas inline returns None when exceeds limit ( #1525 ) ( 578081e )

  • Temporary fix for StreamingDataFrame not working backend bug ( #1533 ) ( 6ab4ffd )

  • Tolerate BQ connection service account propagation delay ( #1505 ) ( 6681f1f )

Documentation

  • Update GeoSeries.difference() and bigframes.bigquery.st_difference() docs ( #1526 ) ( d553fa2 )

1.41.0 (2025-03-19)

Features

  • Add support for the ‘right’ parameter in ‘pandas.cut’ ( #1496 ) ( 8aff128 )

  • Support BQ managed functions through read_gbq_function ( #1476 ) ( 802183d )

  • Warn when the BigFrames version is more than a year old ( #1455 ) ( 00e0750 )
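A version-age warning like the one listed above can be sketched with the standard library alone. The function name, the 365-day cutoff, and the message text below are hypothetical; the changelog only says that a warning is raised when the installed version is more than a year old.

```python
import datetime
import warnings
from typing import Optional

# Hypothetical cutoff: treat "more than a year" as more than 365 days.
MAX_AGE = datetime.timedelta(days=365)


def warn_if_stale(release_date: datetime.date,
                  today: Optional[datetime.date] = None) -> bool:
    """Emit a warning and return True when the release is older than MAX_AGE."""
    today = today or datetime.date.today()
    if today - release_date > MAX_AGE:
        warnings.warn(
            f"This release is from {release_date.isoformat()} and is more "
            "than a year old; consider upgrading.",
            stacklevel=2,
        )
        return True
    return False


# Capture warnings so the example can inspect what was emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    stale = warn_if_stale(datetime.date(2025, 3, 19), today=datetime.date(2026, 6, 1))
    fresh = warn_if_stale(datetime.date(2025, 3, 19), today=datetime.date(2025, 8, 22))
```

Passing `today` explicitly keeps the check deterministic in tests; in normal use it defaults to the current date.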

Bug Fixes

  • Fix pandas.cut errors with empty bins ( #1499 ) ( 434fb5d )

  • Fix read_gbq with ORDER BY query and index_col set ( #963 ) ( de46d2f )

Performance Improvements

  • Eliminate count queries in llm retry ( #1489 ) ( 1c934c2 )

Documentation

  • Add a sample notebook for vector search ( #1500 ) ( f3bf139 )

1.40.0 (2025-03-11)

⚠ BREAKING CHANGES

  • reading JSON data as a custom arrow extension type ( #1458 )

Features

  • Reading JSON data as a custom arrow extension type ( #1458 ) ( e720f41 )

  • Support list output for managed function ( #1457 ) ( 461e9e0 )

Bug Fixes

  • Fix list-like indexers in partial ordering mode ( #1456 ) ( fe72ada )

  • Fix the merge issue between 1424 and 1373 ( #1461 ) ( 7b6e361 )

  • Use == instead of is for timedelta type equality checks ( #1480 ) ( 0db248b )

Performance Improvements

  • Compilation no longer bounded by recursion ( #1464 ) ( 27ab028 )

1.39.0 (2025-03-05)

Features

  • (Preview) Support diff() for date series ( #1423 ) ( 521e987 )

  • (Preview) Support aggregations over timedeltas ( #1418 ) ( 1251ded )

  • (Preview) Support arithmetics between dates and timedeltas ( #1413 ) ( 962b152 )

  • (Preview) Support automatic load of timedelta from BQ tables. ( #1429 ) ( b2917bb )

  • Add allow_large_results option to many I/O methods. Set to False to reduce latency ( #1428 ) ( dd2f488 )

  • Add GeoSeries.boundary() ( #1435 ) ( 32cddfe )

  • Add allow_large_results to peek ( #1448 ) ( 67487b9 )

  • Add groupby.rank() ( #1433 ) ( 3a633d5 )

  • Iloc multiple columns selection. ( #1437 ) ( ddfd02a )

  • Support interface for BigQuery managed functions ( #1373 ) ( 2bbf53f )

  • Warn if default ingress_settings is used in remote_functions ( #1419 ) ( dfd891a )

Bug Fixes

  • Do not compare schema description during schema validation ( #1452 ) ( 03a3a56 )

  • Remove warnings for null index and partial ordering mode in prep for GA ( #1431 ) ( 6785aee )

  • Warn if default cloud_function_service_account is used in remote_function ( #1424 ) ( fe7463a )

  • Window operations over JSON columns ( #1451 ) ( 0070e77 )

  • Write chunked text instead of dummy text for pdf chunk ( #1444 ) ( 96b0e8a )

Documentation

  • Add snippet for explaining the linear regression model prediction ( #1427 ) ( 7c37c7d )

1.38.0 (2025-02-24)

Features

  • (Preview) Support diff aggregation for timestamp series. ( #1405 ) ( abe48d6 )

  • Add GeoSeries.from_wkt() and GeoSeries.to_wkt() ( #1401 ) ( 2993b28 )

  • Support DF.__array__(copy=True) ( #1403 ) ( 693ed8c )

  • Support routines with ARRAY return type in read_gbq_function ( #1412 ) ( 4b60049 )

Bug Fixes

  • Calling to_timedelta() over timedeltas no longer changes their values ( #1411 ) ( 650a190 )

  • Replace empty dict with None to avoid mutable default arguments ( #1416 ) ( fa4e3ad )
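The mutable-default fix listed above addresses a general Python pitfall: a default such as `{}` is created once at function definition time and shared by every call. The sketch below uses hypothetical function names to show the problem and the usual None-based fix; it is not the code changed in that pull request.

```python
from typing import Dict, Optional


def tag_buggy(name: str, labels: Dict[str, str] = {}) -> Dict[str, str]:
    """Anti-pattern: the default dict is created once and shared across calls."""
    labels[name] = "seen"
    return labels


def tag_fixed(name: str, labels: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Fix: default to None and build a fresh dict on each call."""
    if labels is None:
        labels = {}
    labels[name] = "seen"
    return labels


tag_buggy("a")
leaked = tag_buggy("b")  # still carries the "a" entry from the first call
tag_fixed("a")
clean = tag_fixed("b")   # contains only the "b" entry
print(leaked, clean)
```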

Dependencies

  • Remove scikit-learn and sqlalchemy as required dependencies ( #1296 ) ( fd8bc89 )

Documentation

  • Add samples using SQL methods via the bigframes.bigquery module ( #1358 ) ( f54e768 )

  • Add snippets for visualizing a time series and creating a time series model for the Limit forecasted values in time series model tutorial ( #1310 ) ( c6c9120 )

1.37.0 (2025-02-19)

Features

  • (Preview) Support add, sub, mult, div, and more between timedeltas ( #1396 ) ( ffa63d4 )

  • (Preview) Support comparison, ordering, and filtering for timedeltas ( #1387 ) ( 34d01b2 )

  • (Preview) Support subtraction in DATETIME/TIMESTAMP columns with timedelta columns ( #1390 ) ( 50ad3a5 )

  • JSON dtype support for read_pandas and Series constructor ( #1391 ) ( 44f4137 )

Bug Fixes

  • Ensure binops with pandas objects returns bigquery dataframes ( #1404 ) ( 3cee24b )

Performance Improvements

  • Prune projections more aggressively ( #1398 ) ( 7990262 )

  • Simplify sum aggregate SQL text ( #1395 ) ( 0145656 )

  • Use simple null constraints to simplify queries ( #1381 ) ( 00611d4 )

1.36.0 (2025-02-11)

Features

  • (Preview) Support addition between a timestamp and a timedelta ( #1369 ) ( b598aa8 )

  • (Preview) Support casting floats and list-likes to timedelta series ( #1362 ) ( 65933b6 )

  • (Preview) Support timestamp subtractions ( #1346 ) ( 86b7e72 )

  • Add bigframes.bigquery.st_area and suggest it from GeoSeries.area ( #1318 ) ( 8b5ffa8 )

  • Add GeoSeries.from_xy() ( #1364 ) ( 3c3e14c )

Bug Fixes

  • Dtype parameter ineffective in Series/DataFrame construction ( #1354 ) ( b9bdca8 )

  • Translate labels to col ids when copying dataframes ( #1372 ) ( 0c55b07 )

1.35.0 (2025-02-04)

Features

  • (Preview) Support timedeltas for read_pandas() ( #1349 ) ( 866ba9e )

  • Add Series.keys() ( #1342 ) ( deb015d )

  • Allow case_when to change dtypes if case list contains the condition (True, some_default_value) ( #1311 ) ( 5c2a2c6 )

  • Support python type as astype arg ( #1316 ) ( b26e135 )

  • Support time_series_id_col in ARIMAPlus ( #1282 ) ( 97532c9 )

Bug Fixes

  • Exclude DataFrame and Series __call__ from unimplemented API metrics ( #1351 ) ( f2d5264 )

  • Make DataFrame __getattr__ and __setattr__ more robust to subclassing ( #1352 ) ( 417de3a )

Dependencies

  • Add support for Python 3.13 for everything but remote functions ( #1307 ) ( 533db96 )

Documentation

  • Add GeoSeries docs ( #1327 ) ( 05f83d1 )

  • Add link to DataFrames intro to improve SEO ( #1176 ) ( aafb5be )

  • Add snippet to explain the univariate model’s forecast result in the Forecast a single time series with a univariate model tutorial ( #1272 ) ( c22126b )

1.34.0 (2025-01-27)

⚠ BREAKING CHANGES

  • Enable reading JSON data with dbjson extension dtype ( #1139 )

Features

  • (df|s).hist(), (df|s).line(), (df|s).area(), (df|s).bar(), df.scatter() ( #1320 ) ( bd3f584 )

  • (Preview) Define timedelta type and to_timedelta function ( #1317 ) ( 3901951 )

  • Add DataFrame.corrwith method ( #1315 ) ( b503355 )

  • Add DataFrame.mask method ( #1302 ) ( 8b8155f )

  • Enable reading JSON data with dbjson extension dtype ( #1139 ) ( f672262 )

1.33.0 (2025-01-22)

Features

  • Add bigframes.bigquery.sql_scalar() to apply SQL syntax on Series objects ( #1293 ) ( aa2f73a )

  • Add unix_seconds, unix_millis and unix_micros for timestamp series. ( #1297 ) ( e4b0c8d )

  • DataFrame.join supports Series other ( #1303 ) ( ee37a0a )

  • Support array output in remote_function ( #1057 ) ( bdee173 )

Bug Fixes

  • Dataframe sort_values Series input keyerror. ( #1285 ) ( 5a2731b )

  • Fix read_gbq_function issue in dataframe apply method ( #1174 ) ( 0318764 )

  • Series sort_index and sort_values now raise when axis!=0 ( #1294 ) ( 94bc2f2 )

Documentation

  • Add snippet to forecast future time series in the Forecast a single time series with a univariate model tutorial ( #1271 ) ( a687050 )

  • Update bigframes.pandas.Series docs ( #1273 ) ( 0cac64f )

1.32.0 (2025-01-13)

Features

  • Add max_retries to TextEmbeddingGenerator and Claude3TextGenerator ( #1259 ) ( 8077ff4 )

  • Bigframes.bigquery.parse_json ( #1265 ) ( 27bbd80 )

  • Support DataFrame.astype(dict) ( #1262 ) ( 5934f8e )

Bug Fixes

  • Avoid global mutation in BigQueryOptions.client_endpoints_override ( #1280 ) ( 788f6e9 )

  • Fix erroneous window bounds removal during compilation ( #1163 ) ( f91756a )

Documentation

  • Add bq studio links that allow users to generate Jupyter notebooks in bq studio with github contents ( #1266 ) ( 58f13cb )

  • Add snippet to evaluate ARIMA plus model in the Forecast a single time series with a univariate model tutorial ( #1267 ) ( 3dcae2d )

  • Add snippet to see the ARIMA coefficients in the Forecast a single time series with a univariate model tutorial ( #1268 ) ( 059a564 )

  • Update bigframes.pandas.pandas docstrings ( #1247 ) ( c4bffc3 )

  • Use 002 model for better scalability in text generation ( #1270 ) ( bb7a850 )

1.31.0 (2025-01-05)

Features

  • Implement confirmation threshold for semantic operators ( #1251 ) ( 5ba4511 )

Bug Fixes

  • Raise if trying to change ordering_mode after session has started ( #1252 ) ( 8cfaae8 )

  • Reduce the number of labels added to query jobs ( #1245 ) ( fdcdc18 )

1.30.0 (2024-12-30)

Features

  • Add GeoSeries.x and GeoSeries.y ( #1126 ) ( 4c3548f )

  • Add LinearRegression.predict_explain() to generate ML.EXPLAIN_PREDICT columns ( #1190 ) ( e13eca2 )

  • Add LogisticRegression.predict_explain() to generate ML.EXPLAIN_PREDICT columns ( #1222 ) ( bcbc732 )

  • Add write_engine parameter to read_FORMATNAME methods to control how data is written to BigQuery ( #371 ) ( ed47ef1 )

  • Add client side retry to GeminiTextGenerator ( #1242 ) ( 8193abe )

  • Add Gemini-pro-1.5 to GeminiTextGenerator Tuning and Support score() method in Gemini-pro-1.5 ( #1208 ) ( 298fc73 )

  • Add support for LinearRegression.predict_explain and LogisticRegression.predict_explain parameter, top_k_features ( #1228 ) ( 3068e19 )

  • Support dataframe where method ( #1166 ) ( 71b4053 )

Bug Fixes

  • Arima model series input. ( #1237 ) ( f7d52d9 )

  • Json in struct destination type ( #1187 ) ( 200c9bb )

  • Throw an error message when setting is_row_processor=True to read a multi param function ( #1160 ) ( b2816a5 )

Documentation

  • Add an “open in BQ Studio” link to all BigFrames sample notebooks ( #1223 ) ( e0a8288 )

  • Add bq studio link for a new ipynb file called “bq_dataframes_template.ipynb” ( #1239 ) ( 840aaff )

  • Add example for logistic regression ( #1240 ) ( 4d854fd )

  • Add examples for ml PCA and SimpleImputer ( #1236 ) ( 0d84459 )

  • Add KMeans example ( #1234 ) ( d87ab97 )

  • Add linear model example ( #1235 ) ( 2c3e1fd )

  • Add ml.model_selection examples ( #1238 ) ( 50648e4 )

  • Add python snippet for “Create the time series model” section of the Forecast a single time series with a univariate model tutorial ( #1227 ) ( 20f3190 )

1.29.0 (2024-12-12)

Features

  • Add Gemini 2.0 preview text model support ( #1209 ) ( 1021d57 )

Documentation

  • Add Gemini 2.0 text gen sample notebook ( #1211 ) ( 9596b66 )

  • Update bigframes.pandas.index docs return types ( #1191 ) ( c63e7da )

1.28.0 (2024-12-11)

Features

  • (Series | DataFrame).plot.bar ( #1152 ) ( 0fae2e0 )

  • bigframes.bigquery.vector_search supports use_brute_force and fraction_lists_to_search parameters ( #1158 ) ( 131edc3 )

  • Add ARIMAPlus.predict_explain() to generate forecasts with explanation columns ( #1177 ) ( 05f8b4d )

  • Add client_endpoints_override to bq options ( #1167 ) ( be74b99 )

  • Add support for temporal types in dataframe’s describe() method ( #1189 ) ( 2d564a6 )

  • Allow join-free alignment of analytic expressions ( #1168 ) ( daef4f0 )

  • Series.isin supports bigframes.Series arg ( #1195 ) ( 0d8a16b )

  • Update llm.TextEmbeddingGenerator to 005 ( #1186 ) ( 3072d38 )

Bug Fixes

  • Fix error loading local dataframes into bigquery ( #1165 ) ( 5b355ef )

  • Fix null index join with ‘on’ arg ( #1153 ) ( 9015c33 )

  • Fix series.isin using local path always ( #1202 ) ( a44eafd )

Performance Improvements

  • Update df.corr, df.cov to be used with more than 30 columns case. ( #1161 ) ( 9dcf1aa )

Dependencies

  • Remove ibis-framework by vendoring a fork of the package to bigframes_vendored. ( #1170 ) ( 421d24d )

Documentation

  • Add a code sample using bpd.options.bigquery.ordering_mode = "partial" ( #909 ) ( f80d705 )

  • Add snippet for creating boosted tree model ( #1142 ) ( a972668 )

  • Add snippet for evaluating a boosted tree model ( #1154 ) ( 9d8970a )

  • Add snippet for predicting classifications using a boosted tree model ( #1156 ) ( e7b83f1 )

  • Add third party pandas.Index methods and docstrings ( #1171 ) ( a970294 )

  • Fix Bigframes.Pandas.General_Function missing docs ( #1164 ) ( de923d0 )

  • Update bigframes.pandas.Index docstrings ( #1144 ) ( 557ab8d )

1.27.0 (2024-11-16)

Features

  • Add astype(type, errors='null') to cast safely ( #1122 ) ( b4d17ff )

Bug Fixes

  • Dataframe fillna with scalar. ( #1132 ) ( 37f8c32 )

  • Exclude index columns from model fitting processes. ( #1138 ) ( 8d4da15 )

  • Unordered mode too many labels issue. ( #1148 ) ( 7216b21 )

Documentation

  • Document groupby.head and groupby.size methods ( #1111 ) ( a61eb4d )

1.26.0 (2024-11-12)

Features

  • Add basic geopandas functionality ( #962 ) ( 3759c63 )

  • Support json_extract_string_array in the bigquery module ( #1131 ) ( 4ef8bac )

Bug Fixes

  • Fix Series.to_frame generating string label instead of int where name is None ( #1118 ) ( 14e32b5 )

  • Update the API documentation with newly added rep ( #1120 ) ( 72c228b )

Performance Improvements

Documentation

  • Add file for Classification with a Boosted Tree Model and snippet for preparing sample data ( #1135 ) ( 7ac6639 )

  • Add snippet for Linear Regression tutorial Predict Outcomes section ( #1101 ) ( 108f4a9 )

  • Update DataFrame docstrings to include the errors section ( #1127 ) ( a38d4c4 )

  • Update GroupBy docstrings ( #1103 ) ( 9867a78 )

  • Update Session docstrings to include exceptions ( #1130 ) ( a870421 )

1.25.0 (2024-10-29)

Features

  • Add the ground_with_google_search option for GeminiTextGenerator predict ( #1119 ) ( ca02cd4 )

  • Add warning when user tries to access struct series fields with __getitem__ ( #1082 ) ( 20e5c58 )

  • Allow fit to take additional eval data in linear and ensemble models ( #1096 ) ( 254875c )

  • Support context manager for bigframes session ( #1107 ) ( 5f7b8b1 )

Performance Improvements

  • Improve series.unique performance and replace drop_duplicates i… ( #1108 ) ( 499f24a )

1.24.0 (2024-10-24)

Features

Documentation

  • Update docstrings of DataFrame and related files ( #1092 ) ( 15e9fd5 )

1.23.0 (2024-10-23)

Features

  • Add bigframes.bigquery.create_vector_index to assist in creating vector index on ARRAY<FLOAT64> columns ( #1024 ) ( 863d694 )

  • Add gemini-1.5-pro-002 and gemini-1.5-flash-002 to known Gemini model list. ( #1105 ) ( 7094c85 )

  • Add support for pandas series & data frames as inputs for ml models. ( #1088 ) ( 30c8883 )

  • Cleanup temp resources with session deletion ( #1068 ) ( 1d5373d )

  • Show possible correct key(s) in .__getitem__ KeyError message ( #1097 ) ( 32fab96 )

  • Support uploading local geo data ( #1036 ) ( 51cdd33 )

Bug Fixes

  • Escape ids more consistently in ml module ( #1074 ) ( 103e998 )

  • Model.fit metric not collected issue. ( #1085 ) ( 06cec00 )

  • Remove index requirement from some dataframe APIs ( #1073 ) ( 2d16f6d )

  • Update session metrics in read_gbq_query ( #1084 ) ( dced460 )

Performance Improvements

  • Speed up tree transforms during sql compile ( #1071 ) ( d73fe9d )

  • Utilize ORDER BY LIMIT over ROW_NUMBER where possible ( #1077 ) ( 7003d1a )

Documentation

  • Add ml tutorial for Evaluate the model ( #1038 ) ( a120bae )

  • Show best practice of closing the session to cleanup resources in sample notebooks ( #1095 ) ( 62a88e8 )

  • Update docstrings of Session and related files ( #1087 ) ( bf93e80 )

1.22.0 (2024-10-09)

Features

  • Support regional endpoints for more bigquery locations ( #1061 ) ( 45b672a )

  • Update LLM generators to warn user about model name instead of raising error. ( #1048 ) ( 650d80d )

Bug Fixes

  • Access MATERIALIZED_VIEW with read_gbq ( #1070 ) ( 601e984 )

  • Correct zero row count in DataFrame from table view ( #1062 ) ( b536070 )

  • Fix generic error message when entering an incorrect column name ( #1031 ) ( 5ac217d )

  • Make explode respect the index labels ( #1064 ) ( 99ca0df )

  • Make invalid location warning case-insensitive ( #1044 ) ( b6cd55a )

  • Remove palm2 test case from llm load test ( #1063 ) ( 575a10a )

  • Show warning for unknown location set through .ctor ( #1052 ) ( 02c2da7 )

Performance Improvements

Documentation

  • Add docstring return type section to BigQueryOptions class ( #964 ) ( 307385f )

1.21.0 (2024-10-02)

Features

  • Add deprecation warning to PaLM2TextGenerator model ( #1035 ) ( 1183b0f )

  • Add DeprecationWarning for PaLM2TextEmbeddingGenerator ( #1018 ) ( 4af5bbb )

  • Add ml.model_selection.cross_validate support ( #1020 ) ( 1a38063 )

  • Allow access of struct fields with dot operators on Series ( #1019 ) ( ef76f13 )

Bug Fixes

  • Ensure no double execution for to_pandas ( #1032 ) ( 4992cc2 )

  • Remove pre-caching of remote function results ( #1028 ) ( 0359bc8 )

Documentation

1.20.0 (2024-09-25)

Features

  • Add bigframes.bigquery.approx_top_count ( #1010 ) ( 3263bd7 )

  • Add bigframes.ml.compose.SQLScalarColumnTransformer to create custom SQL-based transformations ( #955 ) ( 1930b4e )

  • Allow multiple columns input for llm models ( #998 ) ( 2fe5e48 )

Bug Fixes

  • Fix repr caching with partial ordering ( #1016 ) ( 208a984 )

Documentation

  • Limit pypi notebook to 7 days and add more info about differences with partial ordering mode ( #1013 ) ( 3c54399 )

  • Move and edit existing linear-regression tutorial snippet ( #991 ) ( 4cb62fd )

1.19.0 (2024-09-24)

Features

  • Add ml.model_selection.KFold class ( #1001 ) ( 952cab9 )

  • Support bool and bytes types in describe(include='all') ( #994 ) ( cc48f58 )

  • Support ingress settings in remote_function ( #1011 ) ( 8e9919b )

Bug Fixes

  • Fix miscasting issues with case_when ( #1003 ) ( 038139d )

Performance Improvements

  • Join op discards child ordering in unordered mode ( #923 ) ( 1b5b0ee )

Dependencies

  • Update ibis version in prerelease tests ( #1012 ) ( f89785f )

1.18.0 (2024-09-18)

Features

  • Add “include” param to describe for string types ( #973 ) ( deac6d2 )

  • Add subset parameter to DataFrame.dropna to select which columns to consider ( #981 ) ( f7c03dc )

Bug Fixes

  • DataFrameGroupby.agg now works with unnamed tuples ( #985 ) ( 0f047b4 )

  • Fix a bug that raises exception when re-indexing columns with their original order ( #988 ) ( 596b03b )

  • Make the Series.apply outcome assignable to the original dataframe in partial ordering mode ( #874 ) ( c94ead9 )

Dependencies

  • Limit ibis-framework version to 9.2.0 ( #989 ) ( 06c1b33 )

  • Update to ibis-framework 9.x and newer sqlglot ( #827 ) ( 89ea44f )

1.17.0 (2024-09-11)

Features

  • Add __version__ alias to bigframes.pandas ( #967 ) ( 9ce10b4 )

  • Add Gemini 1.5 stable models support ( #945 ) ( c1cde19 )

  • Allow setting table labels in to_gbq ( #941 ) ( cccc6ca )

  • Define list accessor for bigframes Series ( #946 ) ( 8e8279d )

  • Enable read_csv() to process other files ( #940 ) ( 3b35860 )

  • Include the bigframes package version alongside the feedback link in error messages ( #936 ) ( 7b59b6d )

Bug Fixes

  • Astype Decimal to Int64 conversion. ( #957 ) ( 27764a6 )

  • Make read_gbq_function work for multi-param functions ( #947 ) ( c750be6 )

  • Support read_gbq_function for axis=1 application ( #950 ) ( 86e54b1 )

Documentation

  • Add docstring returns section to Options ( #937 ) ( a2640a2 )

  • Update title of pypi notebook example to reflect use of the PyPI public dataset ( #952 ) ( cd62e60 )

1.16.0 (2024-09-04)

Features

  • Add DataFrame.struct.explode to add struct subfields to a DataFrame ( #916 ) ( ad2f75e )

  • Implement bigframes.bigquery.json_extract_array ( #910 ) ( 575a29e )

  • Recover struct column from exploded Series ( #904 ) ( 7dd304c )

Bug Fixes

  • Fix issue with iterating on >10gb dataframes ( #949 ) ( 2b0f0fa )

  • Improve Series.replace for dict input ( #907 ) ( 4208044 )

  • NullIndex in ML model.predict error ( #917 ) ( 612271d )

  • Struct field non-nullable type issue. ( #914 ) ( 149d5ff )

  • Unordered mode errors in ml train_test_split ( #925 ) ( 85d7c21 )

Performance Improvements

Dependencies

  • Re-introduce support for numpy 1.24.x ( #931 ) ( 3d71913 )

  • Update minimum support to Pandas 1.5.3 and Pyarrow 10.0.1 ( #903 ) ( 7ed3962 )

Documentation

  • Add Claude3 ML and RemoteFunc notebooks ( #930 ) ( cfd16c1 )

  • Create sample notebook to manipulate struct and array data ( #883 ) ( 3031903 )

  • Update struct examples. ( #953 ) ( d632cd0 )

  • Use unstack() from BigQuery DataFrames instead of pandas in the PyPI sample notebook ( #890 ) ( d1883cc )

1.15.0 (2024-08-20)

Features

  • Add llm.TextEmbeddingGenerator to support new embedding models ( #905 ) ( 6bc6a41 )

  • Add ml.llm.Claude3TextGenerator model ( #901 ) ( 7050038 )

Documentation

  • Add columns for “requires ordering/index” to supported APIs summary ( #892 ) ( d2fc51a )

  • Remove duplicate description for kms_key_name ( #898 ) ( 1053d56 )

  • Update embedding model notebooks ( #906 ) ( d9b8ef5 )

1.14.0 (2024-08-14)

Features

  • Implement bigframes.bigquery.json_extract ( #868 ) ( 3dbf84b )

  • Implement Series.str.__getitem__ ( #897 ) ( e027b7e )

Bug Fixes

  • Fix caching from generating row numbers in partial ordering mode ( #872 ) ( 52b7786 )

Performance Improvements

  • Generate SQL with fewer CTEs ( #877 ) ( eb60804 )

  • Speed up compilation by reducing redundant type normalization ( #896 ) ( e0b11bc )

Documentation

1.13.0 (2024-08-05)

Features

  • df.apply(axis=1) to support remote function with multiple params ( #851 ) ( 2158818 )

  • Allow windowing in ‘partial’ ordering mode ( #861 ) ( ca26fe5 )

  • Create a separate OrderingModePartialPreviewWarning for more fine-grained warning filters ( #879 ) ( 8753bdd )

Bug Fixes

  • Fix issue with invalid sql generated by ml distance functions ( #865 ) ( 9959fc8 )

Documentation

  • Create sample notebook using ordering_mode="partial" ( #880 ) ( c415eb9 )

  • Update streaming notebook ( #875 ) ( e9b0557 )

1.12.0 (2024-07-31)

Features

  • Add bigframes-mode label to query jobs ( #832 ) ( c9eaff0 )

  • Add config option to set partial ordering mode ( #855 ) ( 823c0ce )

  • Add stratify param support to ml.model_selection.train_test_split method ( #815 ) ( 27f8631 )

  • Add streaming.StreamingDataFrame class ( #864 ) ( a7d7197 )

  • Allow DataFrame.join for self-join on Null index ( #860 ) ( e950533 )

  • Support remote function cleanup with session.close ( #818 ) ( ed06436 )

  • Support to_csv/parquet/json to local files/objects ( #858 ) ( d0ab9cc )

Bug Fixes

  • Fewer relation joins from df self-operations ( #823 ) ( 0d24f73 )

  • Fix ‘sql’ property for null index ( #844 ) ( 1b6a556 )

  • Fix unordered mode using ordered path to print frame ( #839 ) ( 93785cb )

  • Reduce redundant remote_function deployments ( #856 ) ( cbf2d42 )

Documentation

  • Add partner attribution steps to integrations sample notebook ( #835 ) ( d7b333f )

  • Make get_global_session / close_session / reset_session appear in the docs ( #847 ) ( 01d6bbb )

1.11.1 (2024-07-08)

Documentation

  • Remove session and connection in llm notebook ( #821 ) ( 74170da )

  • Remove the experimental flask icon from the public docs ( #820 ) ( 067ff17 )

1.11.0 (2024-07-01)

Features

  • Add .agg support for size ( #792 ) ( 87e6018 )

  • Add bigframes.bigquery.json_set ( #782 ) ( 1b613e0 )

  • Add bigframes.streaming.to_pubsub method to create continuous query that writes to Pub/Sub ( #801 ) ( b47f32d )

  • Add DataFrame.to_arrow to create Arrow Table from DataFrame ( #807 ) ( 1e3feda )

  • Add PolynomialFeatures support to to_gbq and pipelines ( #805 ) ( 57d98b9 )

  • Add Series.peek to preview data efficiently ( #727 ) ( 580e1b9 )

  • Expose gcf memory param in remote_function ( #803 ) ( 014765c )

  • More informative error when query plan too complex ( #811 ) ( 136dc24 )

Bug Fixes

  • Include internally required packages in remote_function hash ( #799 ) ( 4b8fc15 )

Documentation

  • Document dtype limitation on row processing remote_function ( #800 ) ( 487dff6 )

1.10.0 (2024-06-21)

Features

  • Add dataframe.insert ( #770 ) ( e8bab68 )

  • Add groupby head API ( #791 ) ( 44202bc )

  • Add ml.preprocessing.PolynomialFeatures class ( #793 ) ( b4fbb51 )

  • Bigframes.streaming module for continuous queries ( #703 ) ( 0433a1c )

  • Include index columns in DataFrame.sql if they are named ( #788 ) ( c8d16c0 )

Bug Fixes

  • Allow __repr__ to work with uninitialized DataFrame/Series/Index ( #778 ) ( e14c7a9 )

  • Df.loc with the 2nd input as bigframes boolean Series ( #789 ) ( a4ac82e )

  • Ensure numpy version matches in remote_function deployment ( #798 ) ( 324d93c )

  • Fix temp table creation retries by now throwing if table already exists. ( #787 ) ( 0e57d1f )

  • Self-join optimization doesn’t needlessly invalidate caching ( #797 ) ( 1b96b80 )

1.9.0 (2024-06-10)

Features

  • Allow functions returned from bpd.read_gbq_function to execute outside of apply ( #706 ) ( ad7d8ac )

  • Support bigquery.vector_search() ( #736 ) ( dad66fd )

  • Support score() in GeminiTextGenerator ( #740 ) ( b2c7d8b )

  • Support bytes type in remote_function ( #761 ) ( 4915424 )

  • Support fit() in GeminiTextGenerator ( #758 ) ( d751f5c )

Bug Fixes

  • ARIMAPlus loads auto_arima_min_order param ( #752 ) ( 39d7013 )

  • Improve to_pandas_batches for large results ( #746 ) ( 61f18cb )

  • Resolve issue with unset thread-local options ( #741 ) ( d93dbaf )

Documentation

  • Fix ML.EVALUATE spelling ( #749 ) ( 7899749 )

  • Remove LogisticRegression normal_equation strategy ( #753 ) ( ea5d367 )

1.8.0 (2024-05-31)

Features

  • merge only generates a default index if both inputs already have an index ( #733 ) ( 25d049c )

  • Add + , - as unary ops, ^ binary op ( #724 ) ( 968d825 )

  • Add GroupBy.size() to get number of rows in each group ( #479 ) ( 1fca588 )

  • Add DataFrame ~ operator ( #721 ) ( 354abc1 )

  • Add GeminiText 1.5 Preview models ( #737 ) ( 56cbd3b )

  • Add slot_millis and add stats to session object ( #725 ) ( 72e9583 )

  • Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings ( #731 ) ( f12c906 )

  • Allow functions decorated with bpd.remote_function() to execute locally ( #704 ) ( d850da6 )

  • Ensure "bigframes-api" label is always set on jobs, even if the API is unknown ( #722 ) ( 1832778 )

  • Support ml.SimpleImputer in bigframes ( #708 ) ( 4c4415f )

  • Support type annotations to supply input and output types to bpd.remote_function() decorator ( #717 ) ( 4a12e3c )

  • Support type annotations with bpd.remote_function() and axis=1 (a preview feature) ( #730 ) ( e5a2992 )

Bug Fixes

  • Correct index labels in multiple aggregations for DataFrameGroupBy ( #723 ) ( 6a78c89 )

  • Fix Null index assign series to column ( #711 ) ( ffb4b57 )

  • Set bpd.remote_function()'s input_types and output_types default to None to allow omitting them when type annotations are present ( #729 ) ( 0e25a3b )

  • Warn and disable time travel for linked datasets ( #712 ) ( 085fa9d )

Performance Improvements

  • Optimize dataframe-series alignment on axis=1 ( #732 ) ( 3d39221 )

Documentation

  • Add examples to DataFrameGroupBy and SeriesGroupBy ( #701 ) ( e7da0f0 )

1.7.0 (2024-05-20)

Features

  • read_gbq_query supports filters ( 9386373 )

  • read_gbq suggests a correct column name when one is not found ( 9386373 )

  • Add DefaultIndexKind.NULL to use as index_col in read_gbq*, creating an indexless DataFrame/Series ( #662 ) ( 29e4886 )

  • Bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupby) ( #663 ) ( 412f28b )

  • To_datetime supports utc=False for string inputs ( #579 ) ( adf9889 )

Bug Fixes

  • read_gbq_table respects primary keys even when filters are set ( #689 ) ( 9386373 )

  • Fix type error in test_cluster ( #698 ) ( 14d81c1 )

  • Improve escaping of literals and identifiers ( #682 ) ( da9b136 )

  • Properly identify non-unique index in tables without primary keys ( #699 ) ( 6e0f4d8 )

  • Remove a usage of the resource package when not available, such as on Windows ( #681 ) ( 96243f2 )

  • The imported samples error and use peek() ( #688 ) ( 1a0b744 )

Performance Improvements

  • Don’t run query immediately from read_gbq_table if filters is set ( 9386373 )

  • Use a LIMIT clause when max_results is set ( 9386373 )

Documentation

  • Add code snippets for imported onnx tutorials ( #684 ) ( cb36e46 )

  • Add code snippets for imported tensorflow model ( #679 ) ( b02c401 )

  • Use class_weight="balanced" in the logistic regression prediction tutorial ( #678 ) ( b951549 )

1.6.0 (2024-05-13)

Features

  • Add DataFrame.__delitem__ ( #673 ) ( 2218c21 )

  • Add Series.case_when() ( #673 ) ( 2218c21 )

  • Add strategy="quantile" in KBinsDiscretizer ( #654 ) ( c6c487f )

  • Add Series.combine ( #680 ) ( 2fd1b81 )

  • Series.str.split ( #675 ) ( 6eb19a7 )

  • Suggest correct options in bpd.options.bigquery.location ( #666 ) ( 57ccabc )

  • Support axis=1 in df.apply for scalar outputs ( #629 ) ( f6bdc4a )

  • Support gcf vpc connector in remote_function ( #677 ) ( 9ca92d0 )

  • Warn with a more specific DefaultLocationWarning category when no location can be detected ( #648 ) ( e084e54 )

Bug Fixes

  • Include index_col when selecting columns and filters in read_gbq_table ( #648 ) ( e084e54 )

Dependencies

  • Add jellyfish as a dependency for spelling correction ( 57ccabc )

Documentation

  • Add code snippets for llm text generation ( #669 ) ( 93416ed )

  • Add logistic regression samples ( #673 ) ( 2218c21 )

  • Address lint errors in code samples ( #665 ) ( 4fc8964 )

  • Document inlining of small data in read_* APIs ( #670 ) ( 306953a )

1.5.0 (2024-05-07)

Features

  • bigframes.options and bigframes.option_context now use thread-local variables to prevent context managers in separate threads from affecting each other ( #652 ) ( 651fd7d )

  • Add ARIMAPlus.coef_ property exposing ML.ARIMA_COEFFICIENTS functionality ( #585 ) ( 81d1262 )

  • Add a unique session_id to Session and allow cleaning up sessions ( #553 ) ( c8d4e23 )

  • Add the bigframes.bigquery sub-package with a bigframes.bigquery.array_length function ( #630 ) ( 9963f85 )

  • Always do a query dry run when option.repr_mode == "deferred" ( #652 ) ( 651fd7d )

  • Custom query labels for compute options ( #638 ) ( f561799 )

  • Warn with DefaultIndexWarning from read_gbq on clustered/partitioned tables with no index_col or filters set ( #631 , #658 ) ( 2715d2b , 73064dd )

  • Support index_col=False in read_csv and engine="bigquery" ( 73064dd )

  • Support gcf max instance count in remote_function ( #657 ) ( 36578ab )

Bug Fixes

  • Don’t raise UnknownLocationWarning for US or EU multi-regions ( #653 ) ( 8e4616b )

  • Fix bug with na in the column labels in stack ( #659 ) ( 4a34293 )

  • Use explicit session in PaLM2TextGenerator ( #651 ) ( e4f13c3 )

Documentation

  • Add python code sample for multiple forecasting time series ( #531 ) ( 16866d2 )

  • Fix the Palm2TextGenerator output token size ( #649 ) ( c67e501 )

1.4.0 (2024-04-29)

Features

  • Add .cache() method to persist intermediate dataframe ( #626 ) ( a5c94ec )

  • Add transpose support for small homogeneously typed DataFrames. ( #621 ) ( 054075d )

  • Allow single input type in remote_function ( #641 ) ( 3aa643f )

  • Expose gcf max timeout in remote_function ( #639 ) ( dfeaad0 )

  • Series binary ops compatible with more types ( #618 ) ( 518d315 )

  • Support the score method for PaLM2TextGenerator ( #634 ) ( 3ffc1d2 )

Bug Fixes

  • Allow to_pandas to download more than 10GB ( #637 ) ( ce56495 )

  • Extend row hash to 128 bits to guarantee unique row id ( #632 ) ( 9005c6e )

  • Llm fine tuning tests ( #627 ) ( 4724a1a )

  • Llm palm score tests ( #643 ) ( cf4ec3a )

Performance Improvements

  • Automatically condense internal expression representation ( #516 ) ( 03c1b0d )

  • Cache transpose to allow performant retranspose ( #635 ) ( 44b738d )

Documentation

  • Add supported pandas apis on the main page ( #628 ) ( 8d2a51c )

  • Add the first sample for the Single time-series forecasting from Google Analytics data tutorial ( #623 ) ( 2b84c4f )

  • Address more technical writers’ feedback ( #640 ) ( 1e7793c )

1.3.0 (2024-04-22)

Features

  • Add Series.struct.dtypes property ( #599 ) ( d924ec2 )

  • Add fine tuning fit() for Palm2TextGenerator ( #616 ) ( 9c106bd )

  • Add quantile statistic ( #613 ) ( bc82804 )

  • Expose max_batching_rows in remote_function ( #622 ) ( 240a1ac )

  • Support primary key(s) in read_gbq by using as the index_col by default ( #625 ) ( 75bb240 )

  • Warn if location is set to unknown location ( #609 ) ( 3706b4f )

Bug Fixes

  • Address technical writers’ feedback ( #611 ) ( 9f8f181 )

  • Infer narrowest numeric type when combining numeric columns ( #602 ) ( 8f9ece6 )

  • Use exact median implementation by default ( #619 ) ( 9d205ae )

Documentation

  • Fix rendering of examples for multiple apis ( #620 ) ( 9665e39 )

  • Set index_cols in read_gbq as a best practice ( #624 ) ( 70015b7 )

1.2.0 (2024-04-15)

Features

Bug Fixes

  • Address more technical writers’ feedback ( #581 ) ( 4b08d92 )

  • Error for object dtype on read_pandas ( #570 ) ( 8702dcf )

  • Inverting int now does bitwise inversion rather than sign flip ( #574 ) ( 5f1db8b )

  • Loc setitem dtype issue. ( #603 ) ( b94bae9 )

  • Toc menu missing plotting name ( #591 ) ( eed12c1 )

Documentation

  • (Series|Dataframe).dtypes ( #598 ) ( edef48f )

  • Add code samples for str accessor methods ( #594 ) ( a557ea2 )

  • Add docs for DataFrame and Series dunder methods ( #562 ) ( 8fc26c4 )

  • Add examples for at/iat ( #582 ) ( 3be4a2e )

1.1.0 (2024-04-04)

Features

  • (Series|DataFrame).explode ( #556 ) ( 9e32f57 )

  • Add DataFrame.eval and DataFrame.query ( #361 ) ( 5e28ebd )

  • Add ColumnTransformer save/load ( #541 ) ( 9d8cf67 )

  • Add ml.metrics.mean_squared_error ( #559 ) ( 853c25e )

  • Add support for numpy expm1, log1p, floor, ceil, arctan2 ops ( #505 ) ( e8e66cf )

  • Add transformers save/load ( #552 ) ( d805241 )

  • Allow DataFrame binary ops to align on either axis and with loc… ( #544 ) ( 6d8f3af )

  • Expose DataFrame.bqclient to assist in integrations ( #519 ) ( 0be8911 )

  • Read_pandas accepts pandas Series and Index objects ( #573 ) ( f8821fe )

  • Support ML.GENERATE_EMBEDDING in PaLM2TextEmbeddingGenerator ( #539 ) ( 1156c1e )

  • Support max_columns in repr and make repr more efficient ( #515 ) ( 54e49cf )

Bug Fixes

  • Assign NaN scalar to column error. ( #513 ) ( 0a4153c )

  • Don’t download 100gb onto local python machine in load test ( #537 ) ( 082c58b )

  • Exclude list-like s parameter in plot.scatter ( #568 ) ( 1caac27 )

  • Fix case where df.peek would fail to execute even with force=True ( #511 ) ( 8eca99a )

  • Fix error in Series.drop(0) ( #575 ) ( 75dd786 )

  • Include all names in MultiIndex repr ( #564 ) ( b188146 )

  • Plot.scatter s parameter cannot accept float-like column ( #563 ) ( 8d39187 )

  • Product operation produces float result for all input types ( #501 ) ( 6873b30 )

  • Reloaded transformer .transform error ( #569 ) ( 39fe474 )

  • Rename PaLM2TextEmbeddingGenerator.predict output columns to be backward compatible ( #561 ) ( 4995c00 )

  • Respect hard stack size limit and swallow limit change exception. ( #558 ) ( 4833908 )

  • Restore string to date/time type coercion ( #565 ) ( 4ae0262 )

  • Sync the notebook with embedding changes ( #550 ) ( 347f2dd )

  • Use bytes limit on frame inlining rather than element count ( #576 ) ( 659a161 )

Performance Improvements

  • Add multi-query execution capability for complex dataframes ( #427 ) ( d2d7e33 )

Dependencies

Documentation

  • bigframes.options.bigquery.project and location are optional in some circumstances ( #548 ) ( 90bcec5 )

  • Add “Supported pandas APIs” reference to the documentation ( #542 ) ( 74c3915 )

  • Add General Availability banner to README ( #507 ) ( 262ff59 )

  • Add operations in API docs ( #557 ) ( ea95761 )

  • Add progress_bar code sample ( #508 ) ( 92a1af3 )

  • Add the code samples for metrics{auc, roc_auc_score, roc_curve} ( #520 ) ( 5f37b09 )

  • Address more comments from technical writers to meet legal purposes ( #571 ) ( 9084df3 )

  • Fix docs of ARIMAPlus.predict ( #512 ) ( 3b80f95 )

  • Include Index in table-of-contents ( #564 ) ( b188146 )

  • Mark Gemini model as Pre-GA ( #543 ) ( 769868b )

  • Migrate the overview page to Bigframes official landing page ( #536 ) ( a0fb8bb )

1.0.0 (2024-03-25)

⚠ BREAKING CHANGES

  • rename model parameter min_rel_progress to tol

  • early_stop setting no longer supported, always uses True

  • rename model parameter n_parallell_trees to n_estimators

  • rename class_weights to class_weight

  • rename learn_rate to learning_rate

  • PCA n_components supports float value and None, defaults to None

  • rename various ml model parameters for consistency with sklearn ( https://github.com/googleapis/python-bigquery-dataframes/pull/491 )
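The renames above can be applied mechanically when updating call sites. The following is a minimal sketch, not part of the library: `migrate_model_kwargs` is a hypothetical helper, and the old names are spelled exactly as they appear in the entries above. It only rewrites a plain dict of keyword arguments and does not import bigframes.

```python
# Old-to-new keyword names, copied from the 1.0.0 breaking changes above.
RENAMED_PARAMS = {
    "min_rel_progress": "tol",
    "n_parallell_trees": "n_estimators",  # spelled as in the changelog entry
    "class_weights": "class_weight",
    "learn_rate": "learning_rate",
}

# early_stop is no longer supported (always treated as True), so it is dropped.
REMOVED_PARAMS = {"early_stop"}


def migrate_model_kwargs(kwargs):
    """Return a copy of kwargs that uses the 1.0.0 parameter names.

    Hypothetical helper for illustration only; it is not a bigframes API.
    """
    migrated = {}
    for name, value in kwargs.items():
        if name in REMOVED_PARAMS:
            continue  # the setting no longer exists, so discard it
        new_name = RENAMED_PARAMS.get(name, name)
        if new_name in migrated:
            # Both the old and the new spelling were supplied.
            raise ValueError(f"duplicate value for {new_name!r}")
        migrated[new_name] = value
    return migrated
```

For example, `migrate_model_kwargs({"learn_rate": 0.1, "early_stop": True})` keeps only `learning_rate`, which can then be passed to a model constructor as `**kwargs`.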

Features

Bug Fixes

  • early_stop setting no longer supported, always uses True ( 65c6f47 )

  • Fix -1 offset lookups failing ( #463 ) ( 2dfb9c2 )

  • Plot.scatter c argument functionalities ( #494 ) ( d6ee994 )

  • Properly support format param for numerical input. ( #486 ) ( ae20c35 )

  • Re-enable to_csv and to_json related tests ( #468 ) ( 2b9a01d )

  • Sampling plot cannot preserve ordering if index is not ordered ( #475 ) ( a5345fe )

  • Use actual BigQuery types rather than ibis types in to_pandas ( #500 ) ( 82b4f91 )

Dependencies

Documentation

  • Add code samples for metrics.{accuracy_score, confusion_matrix} ( #478 ) ( 3e3329a )

  • Add code samples for metrics.{recall_score, precision_score, f1_score} ( #502 ) ( 370fe90 )

  • Improve API documentation ( #489 ) ( 751266e )

  • Update bigquery connection documentation ( #499 ) ( 4bfe094 )

  • Update LLM + K-means notebook to handle partial failures ( #496 ) ( 97afad9 )

0.26.0 (2024-03-20)

⚠ BREAKING CHANGES

  • exclude remote models for .register() ( #465 )

Features

  • (Series|DataFrame).plot ( #438 ) ( 1c3e668 )

  • read_gbq_table supports LIKE as an operator in filters ( #454 ) ( d2d425a )

  • Add DataFrame.pipe() method ( #421 ) ( 95f5a6e )

  • Set force=True by default in DataFrame.peek() ( #469 ) ( 4e8e97d )

  • Support datetime related casting in (Series|DataFrame|Index).astype ( #442 ) ( fde339b )

  • Support Series.dt.strftime ( #453 ) ( 8f6e955 )

Bug Fixes

  • Any() on empty set now correctly returns False ( #471 ) ( f55680c )

  • Df.drop_na preserves columns dtype ( #457 ) ( 3bab1a9 )

  • Disable to_json and to_csv related tests ( #462 ) ( 874026d )

  • Exclude remote models for .register() ( #465 ) ( 73fe0f8 )

  • Fix broken link in covid notebook ( #450 ) ( adadb06 )

  • Fix broken multiindex loc cases ( #467 ) ( b519197 )

  • Fix grouping series on multiple other series ( #455 ) ( 3971bd2 )

  • Groupby aggregates no longer check if grouping keys are numeric ( #472 ) ( 4fbf938 )

  • Raise ValueError when read_pandas() receives a bigframes DataFrame ( #447 ) ( b28f9fd )

  • Series.(to_csv|to_json) leverages bq export ( #452 ) ( 718a00c )

  • Warn when read_gbq / read_gbq_table uses the snapshot time cache ( #441 ) ( e16a8c0 )

Documentation

  • Add code samples for ml.metrics.r2_score ( #459 ) ( 85fefa2 )

  • Add the docs for loc and iloc indexers ( #446 ) ( 14ab8d8 )

  • Add the pages for at and iat indexers ( #456 ) ( 340f0b5 )

  • Add version information to bug template ( #437 ) ( 91bd39e )

  • Indicate that project and location are optional in example notebooks ( #451 ) ( 1df0140 )

0.25.0 (2024-03-14)

Features

  • (Series|DataFrame).plot.(line|area|scatter) ( #431 ) ( 0772510 )

  • Support CMEK for remote_function cloud functions ( #430 ) ( 2fd69f4 )

0.24.0 (2024-03-12)

⚠ BREAKING CHANGES

  • read_parquet uses a “pandas” engine to parse files by default. Use engine="bigquery" for the previous behavior

Features

  • (Series|Dataframe).plot.hist() ( #420 ) ( 4aadff4 )

  • Add detect_anomalies to ml ARIMAPlus and KMeans models ( #426 ) ( 6df28ed )

  • Add engine parameter to read_parquet ( #413 ) ( 31325a1 )

  • Add ml PCA.detect_anomalies method ( #422 ) ( 8d82945 )

  • Support BYOSA in remote_function ( #407 ) ( d92ced2 )

  • Support CMEK for BQ tables ( #403 ) ( 9a678e3 )

Bug Fixes

  • Move third_party.bigframes_vendored to bigframes_vendored ( #424 ) ( 763edeb )

  • Only do row identity based joins when joining by index ( #356 ) ( 76b252f )

  • Read_pandas inline respects location ( #412 ) ( ae0e3ea )

Documentation

  • Add predict sample to samples/snippets/bqml_getting_started_test.py ( #388 ) ( 6a3b0cc )

  • Document minimum IAM requirement ( #416 ) ( 36173b0 )

  • Fix the note rendering for DataFrames methods: nlargest, nsmallest ( #417 ) ( 38bd2ba )

0.23.0 (2024-03-05)

Features

  • Add ml.metrics.pairwise.euclidean_distance ( #397 ) ( 1726588 )

  • Add TextEmbedding model version support ( #394 ) ( e0f1ab0 )

Bug Fixes

  • Code exception in remote_function now prevents retry and surfaces in the client ( #387 ) ( dd3643d )

  • Docs link for metrics.pairwise ( #400 ) ( a60aba7 )

Dependencies

  • Update ibis to version 8.0.0 and refactor remote_function to use ibis UDF method ( #277 ) ( 350499b )

Documentation

  • Update README to point to new summary pages ( #402 ) ( bfe2b23 )

0.22.0 (2024-02-27)

⚠ BREAKING CHANGES

  • rename cosine_similarity to paired_cosine_distances ( #393 )

  • move model optional args to kwargs ( #381 )
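The new name in the rename above suggests two things: the function returns distances rather than similarities, and it compares rows pairwise (row i of one input against row i of the other). The sketch below illustrates that reading in plain Python. It assumes the scikit-learn convention of distance = 1 - cosine similarity, and it is not the library's implementation.

```python
import math


def paired_cosine_distances(xs, ys):
    """Cosine distance between xs[i] and ys[i] for each i (illustrative only).

    Assumes distance = 1 - cosine similarity. Zero-length vectors are not
    handled and would raise ZeroDivisionError.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of rows")
    distances = []
    for x, y in zip(xs, ys):
        dot = sum(a * b for a, b in zip(x, y))
        norm_x = math.sqrt(sum(a * a for a in x))
        norm_y = math.sqrt(sum(b * b for b in y))
        distances.append(1.0 - dot / (norm_x * norm_y))
    return distances
```

Under this convention, orthogonal rows have a distance close to 1 and parallel rows a distance close to 0, so code that previously treated the result as a similarity would need its comparisons inverted.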

Features

  • Add DataFrames.corr() method ( #379 ) ( 67fd434 )

  • Add ml.metrics.pairwise.manhattan_distance ( #392 ) ( 9d31865 )

  • Enable regional endpoints for me-central2 ( #386 ) ( 469674d )

Bug Fixes

  • Avoid ibis warning for “database” table() method argument ( #390 ) ( a0490a4 )

  • Correct the numeric literal dtype ( #365 ) ( 93b02cd )

  • Rename cosine_similarity to paired_cosine_distances ( #393 ) ( 81ece46 )

Performance Improvements

  • Inline read_pandas for small data ( #383 ) ( 59b446b )

Dependencies

  • Add minimum version constraint for sqlglot to 19.9.0 ( #389 ) ( 8b62d77 )

Documentation

  • Add a code sample for creating a kmeans model ( #267 ) ( 4291d65 )

  • Fix bigframes.pandas.concat documentation ( #382 ) ( 234b61c )

Code Refactoring

  • Move model optional args to kwargs ( #381 ) ( 4037992 )

0.21.0 (2024-02-13)

Features

  • Add Series.cov method ( #368 ) ( 443db22 )

  • Add ml.llm.GeminiTextGenerator model ( #370 ) ( de1e0a4 )

  • Add ml.metrics.pairwise.cosine_similarity function ( #374 ) ( 126f566 )

  • Add XGBoostModel ( #363 ) ( d5518b2 )

  • Limited support of lambdas in Series.apply ( #345 ) ( 208e081 )

  • Support bigframes.pandas.to_datetime for scalars, iterables and series. ( #372 ) ( ffb0d15 )

  • Support read_gbq wildcard table path ( #377 ) ( 90caf86 )

Documentation

  • Clarify ADC pre-auth in a non-interactive environment ( #348 ) ( 99a9e6e )

0.20.1 (2024-02-06)

Performance Improvements

  • Make repr cache the block where appropriate ( #350 ) ( 068879f )

Documentation

  • Add a sample to demonstrate the evaluation results ( #364 ) ( cff0919 )

  • Fix the DataFrame.apply code sample ( #366 ) ( 1866a26 )

0.20.0 (2024-01-30)

Features

  • Add DataFrame.peek() as an efficient alternative to head() results preview ( #318 ) ( 9c34d83 )

  • Add ARIMA_EVALUATE options in forecasting models ( #336 ) ( 73e997b )

  • Add Index constructor, repr, copy, get_level_values, to_series ( #334 ) ( e5d054e )

  • Improve error message for drive based BQ table reads ( #344 ) ( 0794788 )

  • Update cut to work without labels = False and show intervals as dict ( #335 ) ( 4ff53db )

Bug Fixes

  • Change default connection name in getting_started.ipynb ( #347 ) ( 677f014 )

  • Series iteration correctly returns values instead of index ( #339 ) ( 2c6af9b )

Documentation

  • Add code samples for Series.{between, cumprod} ( #353 ) ( 09a52fd )

0.19.2 (2024-01-22)

Bug Fixes

  • read_gbq large response issue ( #332 ) ( b8178b9 )

  • Use object dtype for ARRAY columns in to_pandas() with pandas 1.x ( #329 ) ( 374ddb5 )

Documentation

  • Add DataFrame.applymap documentation ( #326 ) ( bd531a1 )

  • Add code samples for series methods ( #323 ) ( 32cc6fa )

  • Add remote model requirements ( #333 ) ( c91f70c )

0.19.1 (2024-01-17)

Bug Fixes

  • Handle multi-level columns for df aggregates properly ( #305 ) ( 5bb45ba )

  • Update max_output_token limitation. ( #308 ) ( 5cccd36 )

Documentation

  • Add code samples for Series.corr ( #316 ) ( 9150c16 )

0.19.0 (2024-01-09)

Features

  • Add ‘columns’ as an alias for ‘col_order’ ( #298 ) ( a01b271 )

  • Add Series dt.tz and dt.unit properties ( #303 ) ( 2e1a403 )

  • Add to_gbq() method for LLM models ( #299 ) ( dafbc1b )

  • Allow manually set clustering_columns in dataframe.to_gbq ( #302 ) ( 9c21323 )

  • Support assigning to columns like a property ( #304 ) ( f645c56 )

  • Support upcasting numeric columns in concat ( #294 ) ( e3a056a )

Bug Fixes

  • DF.drop tuple input as multi-index ( #301 ) ( 21391a9 )

  • Fix bug converting non-string labels to sql ids ( #296 ) ( a61c5fe )

Documentation

  • Add code samples for Series.ffill and DataFrame.ffill ( #307 ) ( 1c63b45 )

0.18.0 (2024-01-02)

Features

  • Add dataframe.to_html ( #259 ) ( 2cd6489 )

  • Add IntervalIndex support to bigframes.pandas.cut ( #254 ) ( 6c1969a )

  • Add replace method to DataFrame ( #261 ) ( 5092215 )

  • Specific pyarrow mappings for decimal, bytes types ( #283 ) ( a1c0631 )

Bug Fixes

  • Dataframes to_gbq now creates dataset if it doesn’t exist ( #222 ) ( bac62f7 )

  • Exclude pandas 2.2.0rc0 to unblock prerelease tests ( #292 ) ( ac1a745 )

  • Fix DataFrameGroupby.agg() issue with as_index=False ( #273 ) ( ab49350 )

  • Make Series.str.replace work for simple strings ( #285 ) ( ad67465 )

  • Update dataframe.to_gbq to dedup column names. ( #286 ) ( 746115d )

  • Use setuptools.find_namespace_packages ( #246 ) ( 9ec352a )

Dependencies

  • Migrate to ibis-framework >= "7.1.0" ( #53 ) ( 9798a2b )

Documentation

  • Add code snippets for explore query result page ( #278 ) ( 7cbbb7d )

  • Code samples for astype common to DataFrame and Series ( #280 ) ( 95b673a )

  • Code samples for DataFrame.copy and Series.copy ( #290 ) ( 7cbc2b0 )

  • Code samples for drop and fillna ( #284 ) ( 9c5012e )

  • Code samples for isna , isnull , dropna , isin ( #289 ) ( ad51035 )

  • Code samples for rename , size ( #293 ) ( eb69f60 )

  • Code samples for reset_index and sort_values ( #282 ) ( acc0eb7 )

  • Code samples for sample , get , Series.round ( #295 ) ( c2b1892 )

  • Code samples for Series.{add, replace, unique, T, transpose} ( #287 ) ( 0e1bbfc )

  • Code samples for Series.{map, to_list, count} ( #290 ) ( 7cbc2b0 )

  • Code samples for Series.{name, std, agg} ( #293 ) ( eb69f60 )

  • Code samples for Series.groupby and Series.{sum,mean,min,max} ( #280 ) ( 95b673a )

  • Code samples for DataFrame set_index , items ( #295 ) ( c2b1892 )

  • Fix the rendering for get_dummies ( #291 ) ( 252f3a2 )

0.17.0 (2023-12-14)

Features

  • Add filters argument to read_gbq for enhanced data querying ( #198 ) ( 034f71f )

  • Add module/class level api tracking ( #272 ) ( 4f3db3d )

  • Deprecate use_regional_endpoints ( #199 ) ( 319a1f2 )

Bug Fixes

  • Increase recursion limit, cache compilation tree hashes ( #184 ) ( b54791c )

  • Replaced raise NotImplementedError with return NotImplemented ( #258 ) ( a133822 )

Documentation

  • Add code samples for values and value_counts ( #249 ) ( f247d95 )

  • Add sample for getting started with BQML ( #141 ) ( fb14f54 )

0.16.0 (2023-12-12)

Features

  • Add ARIMAPlus.predict parameters ( #264 ) ( 99598c7 )

  • Add DataFrame from_dict and from_records methods ( #244 ) ( 8d81e24 )

  • Add DataFrame.select_dtypes method ( #242 ) ( 1737acc )

  • Add nunique method to Series/DataFrameGroupby ( #256 ) ( c8ec245 )

  • Support dataframe.loc with conditional columns selection ( #233 ) ( 3febea9 )

Bug Fixes

  • Enforce pandas version requirement <2.1.4 ( #265 ) ( 9dd63f6 )

  • Exclude pandas 2.1.4 from prerelease tests to unblock e2e tests ( b02fc2c )

  • Fix value_counts column label for normalize=True ( #245 ) ( d3fa6f2 )

  • Migrate e2e tests to bigframes-load-testing project ( 8766ac6 )

  • ml.sql logic ( #262 ) ( 68c6fdf )

  • Update the llm_kmeans notebook ( #247 ) ( 66d1839 )

Documentation

  • Add code samples for shape and head ( #257 ) ( 5bdcc65 )

  • Add example for dataframe.melt, dataframe.pivot, dataframe.stac… ( #252 ) ( 8c63697 )

  • Add example to dataframe.nlargest, dataframe.nsmallest, datafra… ( #234 ) ( e735412 )

  • Add examples for dataframe.cummin, dataframe.cummax, dataframe.cumsum, dataframe.cumprod ( #243 ) ( 0523a31 )

  • Add examples for dataframe.nunique, dataframe.diff, dataframe.a… ( #251 ) ( 77074ec )

  • Correct the docs for option_context ( #263 ) ( d21c6dd )

  • Correct the params rendering for ml.remote and ml.ensemble modules ( #248 ) ( c2829e3 )

  • Fix return annotation in API docstrings ( #253 ) ( 89a1c67 )

0.15.0 (2023-11-29)

⚠ BREAKING CHANGES

  • model.predict returns all the columns ( #204 )

Features

  • Add info and memory_usage methods to dataframe ( #219 ) ( 9d6613d )

  • Add remote vertex model support ( #237 ) ( 0bfc4fb )

  • Add the recent api method for ML component ( #225 ) ( ed8876d )

  • Model.predict returns all the columns ( #204 ) ( 416171a )

  • Send warnings on LLM prediction partial failures ( #216 ) ( 81125f9 )

Bug Fixes

  • Add df snapshots lookup for read_gbq ( #229 ) ( d0d9b84 )

  • Avoid unnecessary row_number() on sort key for io ( #211 ) ( a18d40e )

  • Dedup special character ( #209 ) ( dd78acb )

  • Invalid JSON type of the notebook ( #215 ) ( a729831 )

  • Make to_pandas override enable_downsampling when sampling_method is manually set. ( #200 ) ( ae03756 )

  • Polish the llm+kmeans notebook ( #208 ) ( e8532b1 )

  • Update the llm+kmeans notebook with recent change ( #236 ) ( f8917ab )

  • Use anonymous dataset to create remote_function ( #205 ) ( 69b016e )

Documentation

  • Add code samples for index and column properties ( #212 ) ( c88d38e )

  • Add code samples for df reshaping, function, merge, and join methods ( #203 ) ( 010486c )

  • Add examples for dataframe.kurt, dataframe.std, dataframe.count ( #232 ) ( f9c6e72 )

  • Add examples for dataframe.mean, dataframe.median, dataframe.va… ( #228 ) ( edd0522 )

  • Add examples for dataframe.min, dataframe.max and dataframe.sum ( #227 ) ( 3a375e8 )

  • Code samples for Series.dot and DataFrame.dot ( #226 ) ( b62a07a )

  • Code samples for Series.where and Series.mask ( #217 ) ( 52dfad2 )

  • Code samples for dataframe.any, dataframe.all and dataframe.prod ( #223 ) ( d7957fa )

  • Make the code samples reflect default bq connection usage ( #206 ) ( 71844b0 )

0.14.1 (2023-11-16)

Bug Fixes

  • Correctly handle null values when initializing fingerprint ordering ( #210 ) ( 8324f13 )

Documentation

  • Add an example notebook about line graphs ( #197 ) ( f957b27 )

0.14.0 (2023-11-14)

Features

  • Add ‘cross’ join support ( #176 ) ( 765446a )

  • Add ‘index’, ‘pad’, ‘nearest’ interpolate methods ( #162 ) ( 6a28403 )

  • Add series.sample (identical to existing dataframe.sample) ( #187 ) ( 37914a4 )

  • Add unordered sql compilation ( #156 ) ( 58f420c )

  • Log most recent API calls as recent-bigframes-api-xx labels on BigQuery jobs ( #145 ) ( 4ea33b7 )

  • read_gbq creates order deterministically without table copy ( #191 ) ( 8ab81de )

  • Support date_series.astype("string[pyarrow]") to cast DATE to STRING ( #186 ) ( aee0e8e )

  • Support series.at[row_label] = scalar ( #173 ) ( 0c8bd33 )

  • Temporary resources no longer use BigQuery Sessions ( #194 ) ( 4a02cac )

Bug Fixes

  • All sort operations are now stable ( #195 ) ( 3a2761f )

  • Default to 7 days expiration for read_csv , read_json , read_parquet ( #193 ) ( 03606cd )

  • Deprecate the remote_service_type in llm model ( #180 ) ( a8a409a )

  • For reset_index on unnamed multiindex, always use level_[n] label ( #182 ) ( f95000d )

  • Match pandas behavior when assigning listlike to empty dfs ( #172 ) ( c1d1f42 )

  • Use anonymous dataset instead of session dataset for temp tables ( #181 ) ( 800d44e )

  • Use random table for read_pandas ( #192 ) ( 741c75e )

  • Use random table when loading data for read_csv , read_json , read_parquet ( #175 ) ( 9d2e6dc )

Documentation

  • Add code samples for read_gbq_function using community UDFs ( #188 ) ( 7506eab )

  • Add docstring code samples for Series.apply and DataFrame.map ( #185 ) ( c816d84 )

  • Add llm kmeans notebook as an included example ( #177 ) ( d49ae42 )

  • Use head() to get top n results, not to preview results ( #190 ) ( 87f84c9 )

0.13.0 (2023-11-07)

Features

  • to_gbq without a destination table writes to a temporary table ( #158 ) ( e1817c9 )

  • Add DataFrame.__iter__ , DataFrame.iterrows , DataFrame.itertuples , and DataFrame.keys methods ( #164 ) ( c065071 )

  • Add Series.__iter__ method ( #164 ) ( c065071 )

  • Add interpolate() to series and dataframe ( #157 ) ( b9cb55c )

  • Support 32k text-generation and multilingual embedding models ( #161 ) ( 5f0ea37 )

Bug Fixes

  • Update default temp table expiration to 7 days ( #174 ) ( 4ff26cd )

0.12.0 (2023-11-01)

Features

  • Add DataFrame.melt ( #113 ) ( 4e4409c )

  • Add DataFrame.to_pandas_batches() to download large DataFrame objects ( #136 ) ( 3afd4a3 )

  • Add bigframes.options.compute.maximum_bytes_billed option that sets maximum bytes billed on query jobs ( #133 ) ( 63c7919 )

  • Add pandas.qcut ( #104 ) ( 8e44518 )

  • Add pd.get_dummies ( #149 ) ( d8baad5 )

  • Add unstack to series, add level param ( #115 ) ( 5edcd19 )

  • Implement operator @ for DataFrame.dot ( #139 ) ( 79a638e )

  • Populate ibis version in user agent ( #140 ) ( c639a36 )

Bug Fixes

  • Don’t override the global logging config ( #138 ) ( 2ddbf74 )

  • Fix bug with column names under repeated column assignment ( #150 ) ( 29032d0 )

  • Resolve plotly rendering issue by using ipython html for job pro… ( #134 ) ( 39df43e )

  • Use indexee’s session for loc listlike cases ( #152 ) ( 27c5725 )

Documentation

  • Add arithmetic df sample code ( #153 ) ( ac44ccd )

  • Fix indentation on read_gbq_function code sample ( #163 ) ( 0801d96 )

  • Link to ML.EVALUATE BQML page for score() methods ( #137 ) ( 45c617f )

0.11.0 (2023-10-26)

Features

  • Add back reset_session as an alias for close_session ( #124 ) ( 694a85a )

  • Change query parameter to query_or_table in read_gbq ( #127 ) ( f9bb3c4 )

Bug Fixes

  • Expose bigframes.pandas.reset_session as a public API ( #128 ) ( b17e1f4 )

  • Use series’s own session in series.reindex listlike case ( #135 ) ( 95bff3f )

Documentation

  • Add runnable code samples for DataFrames I/O methods and property ( #129 ) ( 6fea8ef )

  • Add runnable code samples for reading methods ( #125 ) ( a669919 )

0.10.0 (2023-10-19)

Features

  • Implement DataFrame.dot for matrix multiplication ( #67 ) ( 29dd414 )

0.9.0 (2023-10-18)

⚠ BREAKING CHANGES

  • rename bigframes.pandas.reset_session to close_session ( #101 )

Features

  • Add bigframes.options.bigquery.application_name for partner attribution ( #117 ) ( 52d64ff )

  • Add AtIndexer getitems ( #107 ) ( 752b01f )

  • Rename bigframes.pandas.reset_session to close_session ( #101 ) ( 36693bf )

  • Send BigQuery cancel request when canceling bigframes process ( #103 ) ( e325fbb )

  • Support external packages in remote_function ( #98 ) ( ec10c4a )

  • Use ArrowDtype for STRUCT columns in to_pandas ( #85 ) ( 9238fad )

Bug Fixes

  • Support multiindex for three loc getitem overloads ( #113 ) ( 68e3cd3 )

Performance Improvements

  • If primary keys are defined, read_gbq avoids copying table data ( #112 ) ( e6c0cd1 )

Documentation

  • Add documentation for Series.struct.field and Series.struct.explode ( #114 ) ( a6dab9c )

  • Add open-source link in API doc ( #106 ) ( db51fe3 )

  • Update ML overview API doc ( #105 ) ( 1b3f3a5 )

0.8.0 (2023-10-12)

⚠ BREAKING CHANGES

  • The default behavior of to_parquet is changing from no compression to 'snappy' compression.

Features

  • Support compression in to_parquet ( a8c286f )

Bug Fixes

  • Create session dataset for remote functions only when needed ( #94 ) ( 1d385be )

0.7.0 (2023-10-11)

Features

  • Add aliases for several series properties ( #80 ) ( c0efec8 )

  • Add equals methods to series/dataframe ( #76 ) ( 636a209 )

  • Add iat and iloc accessing by tuples of integers ( #90 ) ( 228aeba )

  • Add level param to DataFrame.stack ( #88 ) ( 97b8bec )

  • Allow df.drop to take an index object ( #68 ) ( 740c451 )

  • Use default session connection ( #87 ) ( 4ae4ef9 )

Bug Fixes

  • Change the invalid url in docs ( #93 ) ( 969800d )

Documentation

  • Add more preprocessing models into the docs menu. ( #97 ) ( 1592315 )

0.6.0 (2023-10-04)

Features

  • Add df.unstack ( #63 ) ( 4a84714 )

  • Add idxmin, idxmax to series, dataframe ( #74 ) ( 781307e )

  • Add ml.preprocessing.KBinsDiscretizer ( #81 ) ( 24c6256 )

  • Add multi-column dataframe merge ( #73 ) ( c9fa85c )

  • Add update and align methods to dataframe ( #57 ) ( bf050cf )

  • Support STRUCT data type with Series.struct.field to extract child fields ( #71 ) ( 17afac9 )

Bug Fixes

  • Avoid 403 response too large to return error with read_gbq and large query results ( #77 ) ( 8f3b5b2 )

  • Change return type of Series.loc[scalar] ( #40 ) ( fff3d45 )

  • Fix df/series.iloc by list with multiindex ( #79 ) ( 971d091 )

0.5.0 (2023-09-28)

Features

  • Add DataFrame.kurtosis / DF.kurt method ( c1900c2 )

  • Add DataFrame.rolling and DataFrame.expanding methods ( c1900c2 )

  • Add items , apply methods to DataFrame . ( #43 ) ( 3adc1b3 )

  • Add axis param to simple df aggregations ( #52 ) ( 9cf9972 )

  • Add index dtype , astype , drop , fillna , aggregate attributes. ( #38 ) ( 1a254a4 )

  • Add ml.preprocessing.LabelEncoder ( #50 ) ( 2510461 )

  • Add ml.preprocessing.MaxAbsScaler ( #56 ) ( 14b262b )

  • Add ml.preprocessing.MinMaxScaler ( #64 ) ( 392113b )

  • Add more index methods ( #54 ) ( a6e32aa )

  • Support calculate_p_values parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support class_weights="balanced" in LogisticRegression model ( c1900c2 )

  • Support df[column_name] = df_only_one_column ( c1900c2 )

  • Support early_stop parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support enable_global_explain parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support l2_reg parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support learn_rate_strategy parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support ls_init_learn_rate parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support max_iterations parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support min_rel_progress parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support optimize_strategy parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support casting string to integer or float ( #59 ) ( 3502f83 )

Bug Fixes

  • Fix header skipping logic in read_csv ( #49 ) ( d56258c )

  • Generate unique ids on join to avoid id collisions ( #65 ) ( 7ab65e8 )

  • LabelEncoder params consistent with Sklearn ( #60 ) ( 632caec )

  • Loosen filter items tests to accommodate shifting pandas impl ( #41 ) ( edabdbb )

Performance Improvements

  • Add ability to cache dataframe and series to session table ( #51 ) ( 416d7cb )

  • Inline small Series and DataFrames in query text ( #45 ) ( 5e199ec )

  • Reimplement unpivot to use cross join rather than union ( #47 ) ( f9a93ce )

  • Simplify join order to use multiple order keys instead of string. ( #36 ) ( 5056da6 )

Documentation

  • Link to Remote Functions code samples from README and API reference ( c1900c2 )

0.4.0 (2023-09-16)

Features

  • Add axis parameter to droplevel and reorder_levels ( 7c6b0dd )

  • Add bfill and ffill to DataFrame and Series ( 7c6b0dd )

  • Add DataFrame.combine and DataFrame.combine_first ( #27 ) ( 7c6b0dd )

  • Add DataFrame.nlargest , nsmallest ( 7c6b0dd )

  • Add DataFrame.pct_change and Series.pct_change ( 7c6b0dd )

  • Add DataFrame.skew and GroupBy.skew ( 7c6b0dd )

  • Add DataFrame.to_dict , to_excel , to_latex , to_records , to_string , to_markdown , to_pickle , to_orc ( 7c6b0dd )

  • Add diff method to DataFrame and GroupBy ( 7c6b0dd )

  • Add filter and reindex to Series and DataFrame ( 7c6b0dd )

  • Add reindex_like to DataFrame and Series ( 7c6b0dd )

  • Add swaplevel to DataFrame and Series ( 7c6b0dd )

  • Add partial support for Series.replace ( 7c6b0dd )

  • Support DataFrame.loc[bool_series, column] = scalar ( 7c6b0dd )

  • Support a persistent name in remote_function ( 7c6b0dd )

Bug Fixes

  • remote_function uses same credentials as other APIs ( 7c6b0dd )

  • Add type hints to models ( 7c6b0dd )

  • Raise error when ARIMAPlus is used with Pipeline ( 7c6b0dd )

  • Remove transforms parameter in model.fit (breaking change) ( 7c6b0dd )

  • Support column joins with “None indexer” ( 7c6b0dd )

  • Use Int64Dtype for literals in cut ( 7c6b0dd )

  • Use lowercase strings for parameter literals in bigframes.ml (breaking change) ( 7c6b0dd )

Performance Improvements

  • Add bigframes-api label to I/O query jobs ( 7c6b0dd )

Documentation

  • Document possible parameter values for PaLM2TextGenerator ( 7c6b0dd )

  • Document region logic in README ( 7c6b0dd )

  • Fix OneHotEncoder sample ( 7c6b0dd )

0.3.2 (2023-09-06)

Bug Fixes

  • Make release.sh script for PyPI upload executable ( #20 ) ( 9951610 )

0.3.1 (2023-09-05)

Bug Fixes

  • release: Use correct directory name for release build config ( #17 ) ( 3dd25b3 )

0.3.0 (2023-09-02)

Features

  • Add bigframes.get_global_session() and bigframes.reset_session() aliases ( a32b747 )

  • Add bigframes.pandas.read_pickle function ( a32b747 )

  • Add components_ , explained_variance_ , and explained_variance_ratio_ properties to bigframes.ml.decomposition.PCA ( 89b9503 )

  • Add fit_transform to bigquery.ml transformers ( a32b747 )

  • Add Series.dropna and DataFrame.fillna ( 8fab755 )

  • Add Series.str methods isalpha , isdigit , isdecimal , isalnum , isspace , islower , isupper , zfill , center ( a32b747 )

  • Support bigframes.pandas.merge() ( 8fab755 )

  • Support DataFrame.isin with list and dict inputs ( 8fab755 )

  • Support DataFrame.pivot ( a32b747 )

  • Support DataFrame.stack ( 89b9503 )

  • Support DataFrame - DataFrame binary operations ( 8fab755 )

  • Support df[my_column] = [a python list] ( 89b9503 )

  • Support Index.is_monotonic ( 8fab755 )

  • Support np.arcsin , np.arccos , np.arctan , np.sinh , np.cosh , np.tanh , np.arcsinh , np.arccosh , np.arctanh , np.exp with Series argument ( 89b9503 )

  • Support np.sin , np.cos , np.tan , np.log , np.log10 , np.sqrt , np.abs with Series argument ( 89b9503 )

  • Support pow() and power operator in DataFrame and Series ( 8fab755 )

  • Support read_json with engine=bigquery for newline-delimited JSON files ( 89b9503 )

  • Support Series.corr ( 89b9503 )

  • Support Series.map ( 8fab755 )

  • Support for np.add , np.subtract , np.multiply , np.divide , np.power ( 8fab755 )

  • Support MultiIndex for DataFrame columns ( a32b747 )

  • Use pandas.Index for column labels ( a32b747 )

  • Use default session and connection in ml.llm and ml.imported ( 8fab755 )

Bug Fixes

  • Add error message to set_index ( a32b747 )

  • Align column names with pandas in DataFrame.agg results ( 89b9503 )

  • Allow (but still not recommended) ORDER BY in read_gbq input when an index_col is defined ( 89b9503 )

  • Check for IAM role on the BigQuery connection when initializing a remote_function ( 89b9503 )

  • Check that types are specified in read_gbq_function ( a32b747 )

  • Don’t use query cache for Session construction ( a32b747 )

  • Include survey link in abstract NotImplementedError exception messages ( 89b9503 )

  • Label temp table creation jobs with source=bigquery-dataframes-temp label ( 89b9503 )

  • Make X_train argument names consistent across methods ( 8fab755 )

  • Raise AttributeError for unimplemented pandas methods ( 89b9503 )

  • Raise exception for invalid function in read_gbq_function ( a32b747 )

  • Support spaces in column names in DataFrame initializer ( 89b9503 )

Performance Improvements

  • Add local cache for __repr_*__ methods ( a32b747 )

  • Lazily instantiate client library objects ( 89b9503 )

  • Use row_number() filter for head / tail ( 8fab755 )

Documentation

  • Add ML section under Overview ( a32b747 )

  • Add release status to table of contents ( a32b747 )

  • Add samples and best practices to read_gbq docs ( a32b747 )

  • Correct the return types of Dataframe and Series ( a32b747 )

  • Create subfolders for notebooks ( a32b747 )

  • Fix link to GitHub ( 89b9503 )

  • Highlight bigframes is open-source ( a32b747 )

  • Sample ML Drug Name Generation notebook ( a32b747 )

  • Set options.bigquery.project in sample code ( 89b9503 )

  • Transform remote function user guide into sample code ( a32b747 )

  • Update remote function notebook with read_gbq_function usage ( 8fab755 )

0.2.0 (2023-08-17)

Features

  • Add KMeans.cluster_centers_.

  • Allow column labels to be any type handled by bq df; column labels can be integers now.

  • Add dataframegroupby.agg().

  • Add Series Property is_monotonic_increasing and is_monotonic_decreasing.

  • Add match, fullmatch, get, pad str methods.

  • Add series isin function.

Bug Fixes

  • Update ML package to use sessions for queries.

  • Optimize read_gbq with index_col set to cluster by index_col .

  • Raise ValueError if the location mismatched.

  • read_gbq no longer uses ‘time travel’ with query inputs.

Documentation

  • Add docstring to _uniform_sampling to discourage users from calling it.

0.1.1 (2023-08-14)

Documentation

  • Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.

0.1.0 (2023-08-11)

Features

  • Add bigframes.pandas package with an API compatible with pandas . Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.

  • Add bigframes.ml package with an API inspired by scikit-learn . Train machine learning models and run batch prediction, powered by BigQuery ML .

0.0.0 (2023-02-22)

  • Empty package to reserve package name.