Changelog

PyPI History

1.18.0 (2024-09-18)

Features

Add “include” param to describe for string types ( #973 ) ( deac6d2 )
Add subset parameter to DataFrame.dropna to select which columns to consider ( #981 ) ( f7c03dc )

Bug Fixes

DataFrameGroupby.agg now works with unnamed tuples ( #985 ) ( 0f047b4 )
Fix a bug that raises exception when re-indexing columns with their original order ( #988 ) ( 596b03b )
Make the Series.apply outcome assignable to the original dataframe in partial ordering mode ( #874 ) ( c94ead9 )

Dependencies

Limit ibis-framework version to 9.2.0 ( #989 ) ( 06c1b33 )
Update to ibis-framework 9.x and newer sqlglot ( #827 ) ( 89ea44f )

1.17.0 (2024-09-11)

Features

Add __version__ alias to bigframes.pandas ( #967 ) ( 9ce10b4 )
Add Gemini 1.5 stable models support ( #945 ) ( c1cde19 )
Allow setting table labels in to_gbq ( #941 ) ( cccc6ca )
Define list accessor for bigframes Series ( #946 ) ( 8e8279d )
Enable read_csv() to process other files ( #940 ) ( 3b35860 )
Include the bigframes package version alongside the feedback link in error messages ( #936 ) ( 7b59b6d )

Bug Fixes

Astype Decimal to Int64 conversion. ( #957 ) ( 27764a6 )
Make read_gbq_function work for multi-param functions ( #947 ) ( c750be6 )
Support read_gbq_function for axis=1 application ( #950 ) ( 86e54b1 )

Documentation

Add docstring returns section to Options ( #937 ) ( a2640a2 )
Update title of pypi notebook example to reflect use of the PyPI public dataset ( #952 ) ( cd62e60 )

1.16.0 (2024-09-04)

Features

Add DataFrame.struct.explode to add struct subfields to a DataFrame ( #916 ) ( ad2f75e )
Implement bigframes.bigquery.json_extract_array ( #910 ) ( 575a29e )
Recover struct column from exploded Series ( #904 ) ( 7dd304c )

Bug Fixes

Fix issue with iterating on >10gb dataframes ( #949 ) ( 2b0f0fa )
Improve Series.replace for dict input ( #907 ) ( 4208044 )
NullIndex in ML model.predict error ( #917 ) ( 612271d )
Struct field non-nullable type issue. ( #914 ) ( 149d5ff )
Unordered mode errors in ml train_test_split ( #925 ) ( 85d7c21 )

Performance Improvements

Improve repr performance ( #918 ) ( 46f2dd7 )

Dependencies

Re-introduce support for numpy 1.24.x ( #931 ) ( 3d71913 )
Update minimum support to Pandas 1.5.3 and Pyarrow 10.0.1 ( #903 ) ( 7ed3962 )

Documentation

Add Claude3 ML and RemoteFunc notebooks ( #930 ) ( cfd16c1 )
Create sample notebook to manipulate struct and array data ( #883 ) ( 3031903 )
Update struct examples. ( #953 ) ( d632cd0 )
Use unstack() from BigQuery DataFrames instead of pandas in the PyPI sample notebook ( #890 ) ( d1883cc )

1.15.0 (2024-08-20)

Features

Add llm.TextEmbeddingGenerator to support new embedding models ( #905 ) ( 6bc6a41 )
Add ml.llm.Claude3TextGenerator model ( #901 ) ( 7050038 )

Documentation

Add columns for “requires ordering/index” to supported APIs summary ( #892 ) ( d2fc51a )
Remove duplicate description for kms_key_name ( #898 ) ( 1053d56 )
Update embedding model notebooks ( #906 ) ( d9b8ef5 )

1.14.0 (2024-08-14)

Features

Implement bigframes.bigquery.json_extract ( #868 ) ( 3dbf84b )
Implement Series.str.__getitem__ ( #897 ) ( e027b7e )

Bug Fixes

Fix caching from generating row numbers in partial ordering mode ( #872 ) ( 52b7786 )

Performance Improvements

Generate SQL with fewer CTEs ( #877 ) ( eb60804 )
Speed up compilation by reducing redundant type normalization ( #896 ) ( e0b11bc )

Documentation

Add streaming html docs ( #884 ) ( 171da6c )
Fix the DisplayOptions doc rendering ( #893 ) ( 3eb6a17 )
Update streaming notebook ( #887 ) ( 6e6f9df )

1.13.0 (2024-08-05)

Features

df.apply(axis=1) to support remote function with multiple params ( #851 ) ( 2158818 )
Allow windowing in ‘partial’ ordering mode ( #861 ) ( ca26fe5 )
Create a separate OrderingModePartialPreviewWarning for more fine-grained warning filters ( #879 ) ( 8753bdd )

Bug Fixes

Fix issue with invalid sql generated by ml distance functions ( #865 ) ( 9959fc8 )

Documentation

Create sample notebook using ordering_mode="partial" ( #880 ) ( c415eb9 )
Update streaming notebook ( #875 ) ( e9b0557 )

1.12.0 (2024-07-31)

Features

Add bigframes-mode label to query jobs ( #832 ) ( c9eaff0 )
Add config option to set partial ordering mode ( #855 ) ( 823c0ce )
Add stratify param support to ml.model_selection.train_test_split method ( #815 ) ( 27f8631 )
Add streaming.StreamingDataFrame class ( #864 ) ( a7d7197 )
Allow DataFrame.join for self-join on Null index ( #860 ) ( e950533 )
Support remote function cleanup with session.close ( #818 ) ( ed06436 )
Support to_csv/parquet/json to local files/objects ( #858 ) ( d0ab9cc )

Bug Fixes

Fewer relation joins from df self-operations ( #823 ) ( 0d24f73 )
Fix ‘sql’ property for null index ( #844 ) ( 1b6a556 )
Fix unordered mode using ordered path to print frame ( #839 ) ( 93785cb )
Reduce redundant remote_function deployments ( #856 ) ( cbf2d42 )

Documentation

Add partner attribution steps to integrations sample notebook ( #835 ) ( d7b333f )
Make get_global_session / close_session / reset_session appear in the docs ( #847 ) ( 01d6bbb )

1.11.1 (2024-07-08)

Documentation

Remove session and connection in llm notebook ( #821 ) ( 74170da )
Remove the experimental flask icon from the public docs ( #820 ) ( 067ff17 )

1.11.0 (2024-07-01)

Features

Add .agg support for size ( #792 ) ( 87e6018 )
Add bigframes.bigquery.json_set ( #782 ) ( 1b613e0 )
Add bigframes.streaming.to_pubsub method to create continuous query that writes to Pub/Sub ( #801 ) ( b47f32d )
Add DataFrame.to_arrow to create Arrow Table from DataFrame ( #807 ) ( 1e3feda )
Add PolynomialFeatures support to to_gbq and pipelines ( #805 ) ( 57d98b9 )
Add Series.peek to preview data efficiently ( #727 ) ( 580e1b9 )
Expose gcf memory param in remote_function ( #803 ) ( 014765c )
More informative error when query plan too complex ( #811 ) ( 136dc24 )

Bug Fixes

Include internally required packages in remote_function hash ( #799 ) ( 4b8fc15 )

Documentation

Document dtype limitation on row processing remote_function ( #800 ) ( 487dff6 )

1.10.0 (2024-06-21)

Features

Add dataframe.insert ( #770 ) ( e8bab68 )
Add groupby head API ( #791 ) ( 44202bc )
Add ml.preprocessing.PolynomialFeatures class ( #793 ) ( b4fbb51 )
bigframes.streaming module for continuous queries ( #703 ) ( 0433a1c )
Include index columns in DataFrame.sql if they are named ( #788 ) ( c8d16c0 )

Bug Fixes

Allow __repr__ to work with uninitialized DataFrame/Series/Index ( #778 ) ( e14c7a9 )
df.loc with the 2nd input as bigframes boolean Series ( #789 ) ( a4ac82e )
Ensure numpy version matches in remote_function deployment ( #798 ) ( 324d93c )
Fix temp table creation retries by not throwing if table already exists. ( #787 ) ( 0e57d1f )
Self-join optimization doesn’t needlessly invalidate caching ( #797 ) ( 1b96b80 )

1.9.0 (2024-06-10)

Features

Allow functions returned from bpd.read_gbq_function to execute outside of apply ( #706 ) ( ad7d8ac )
Support bigquery.vector_search() ( #736 ) ( dad66fd )
Support score() in GeminiTextGenerator ( #740 ) ( b2c7d8b )
Support bytes type in remote_function ( #761 ) ( 4915424 )
Support fit() in GeminiTextGenerator ( #758 ) ( d751f5c )

Bug Fixes

ARIMAPlus loads auto_arima_min_order param ( #752 ) ( 39d7013 )
Improve to_pandas_batches for large results ( #746 ) ( 61f18cb )
Resolve issue with unset thread-local options ( #741 ) ( d93dbaf )

Documentation

Fix ML.EVALUATE spelling ( #749 ) ( 7899749 )
Remove LogisticRegression normal_equation strategy ( #753 ) ( ea5d367 )

1.8.0 (2024-05-31)

Features

merge only generates a default index if both inputs already have an index ( #733 ) ( 25d049c )
Add +, - as unary ops, ^ binary op ( #724 ) ( 968d825 )
Add GroupBy.size() to get number of rows in each group ( #479 ) ( 1fca588 )
Add DataFrame ~ operator ( #721 ) ( 354abc1 )
Add GeminiText 1.5 Preview models ( #737 ) ( 56cbd3b )
Add slot_millis and add stats to session object ( #725 ) ( 72e9583 )
Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings ( #731 ) ( f12c906 )
Allow functions decorated with bpd.remote_function() to execute locally ( #704 ) ( d850da6 )
Ensure "bigframes-api" label is always set on jobs, even if the API is unknown ( #722 ) ( 1832778 )
Support ml.SimpleImputer in bigframes ( #708 ) ( 4c4415f )
Support type annotations to supply input and output types to bpd.remote_function() decorator ( #717 ) ( 4a12e3c )
Support type annotations with bpd.remote_function() and axis=1 (a preview feature) ( #730 ) ( e5a2992 )

Bug Fixes

Correct index labels in multiple aggregations for DataFrameGroupBy ( #723 ) ( 6a78c89 )
Fix Null index assign series to column ( #711 ) ( ffb4b57 )
Set bpd.remote_function()'s input_types and output_types default to None to allow omitting them when type annotations are present ( #729 ) ( 0e25a3b )
Warn and disable time travel for linked datasets ( #712 ) ( 085fa9d )

Performance Improvements

Optimize dataframe-series alignment on axis=1 ( #732 ) ( 3d39221 )

Documentation

Add examples to DataFrameGroupBy and SeriesGroupBy ( #701 ) ( e7da0f0 )

1.7.0 (2024-05-20)

Features

read_gbq_query supports filters ( 9386373 )
read_gbq suggests a correct column name when one is not found ( 9386373 )
Add DefaultIndexKind.NULL to use as index_col in read_gbq*, creating an indexless DataFrame/Series ( #662 ) ( 29e4886 )
bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupBy) ( #663 ) ( 412f28b )
to_datetime supports utc=False for string inputs ( #579 ) ( adf9889 )

Bug Fixes

read_gbq_table respects primary keys even when filters are set ( #689 ) ( 9386373 )
Fix type error in test_cluster ( #698 ) ( 14d81c1 )
Improve escaping of literals and identifiers ( #682 ) ( da9b136 )
Properly identify non-unique index in tables without primary keys ( #699 ) ( 6e0f4d8 )
Remove a usage of the resource package when not available, such as on Windows ( #681 ) ( 96243f2 )
The imported samples error and use peek() ( #688 ) ( 1a0b744 )

Performance Improvements

Don’t run query immediately from read_gbq_table if filters is set ( 9386373 )
Use a LIMIT clause when max_results is set ( 9386373 )

Documentation

Add code snippets for imported onnx tutorials ( #684 ) ( cb36e46 )
Add code snippets for imported tensorflow model ( #679 ) ( b02c401 )
Use class_weight="balanced" in the logistic regression prediction tutorial ( #678 ) ( b951549 )

1.6.0 (2024-05-13)

Features

Add DataFrame.__delitem__ ( #673 ) ( 2218c21 )
Add Series.case_when() ( #673 ) ( 2218c21 )
Add strategy="quantile" in KBinsDiscretizer ( #654 ) ( c6c487f )
Add Series.combine ( #680 ) ( 2fd1b81 )
Series.str.split ( #675 ) ( 6eb19a7 )
Suggest correct options in bpd.options.bigquery.location ( #666 ) ( 57ccabc )
Support axis=1 in df.apply for scalar outputs ( #629 ) ( f6bdc4a )
Support gcf vpc connector in remote_function ( #677 ) ( 9ca92d0 )
Warn with a more specific DefaultLocationWarning category when no location can be detected ( #648 ) ( e084e54 )

Bug Fixes

Include index_col when selecting columns and filters in read_gbq_table ( #648 ) ( e084e54 )

Dependencies

Add jellyfish as a dependency for spelling correction ( 57ccabc )

Documentation

Add code snippets for llm text generation ( #669 ) ( 93416ed )
Add logistic regression samples ( #673 ) ( 2218c21 )
Address lint errors in code samples ( #665 ) ( 4fc8964 )
Document inlining of small data in read_* APIs ( #670 ) ( 306953a )

1.5.0 (2024-05-07)

Features

bigframes.options and bigframes.option_context now use thread-local variables to prevent context managers in separate threads from affecting each other ( #652 ) ( 651fd7d )
Add ARIMAPlus.coef_ property exposing ML.ARIMA_COEFFICIENTS functionality ( #585 ) ( 81d1262 )
Add a unique session_id to Session and allow cleaning up sessions ( #553 ) ( c8d4e23 )
Add the bigframes.bigquery sub-package with a bigframes.bigquery.array_length function ( #630 ) ( 9963f85 )
Always do a query dry run when option.repr_mode == "deferred" ( #652 ) ( 651fd7d )
Custom query labels for compute options ( #638 ) ( f561799 )
Warn with DefaultIndexWarning from read_gbq on clustered/partitioned tables with no index_col or filters set ( #631 , #658 ) ( 2715d2b , 73064dd )
Support index_col=False in read_csv and engine="bigquery" ( 73064dd )
Support gcf max instance count in remote_function ( #657 ) ( 36578ab )
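
The thread-local behavior noted for bigframes.options and bigframes.option_context in the list above can be illustrated with the standard library alone. The sketch below is not the bigframes implementation; get_option, option_context, and observe_from_other_thread are hypothetical names used only to show why a value set inside a context manager in one thread is not visible from another thread.

```python
import threading
from contextlib import contextmanager

# Illustrative stand-in for an options store; not the bigframes implementation.
_state = threading.local()


def get_option(name, default=None):
    """Read an option as seen by the calling thread."""
    return getattr(_state, name, default)


@contextmanager
def option_context(name, value):
    """Temporarily set an option for the current thread only."""
    missing = object()
    previous = getattr(_state, name, missing)
    setattr(_state, name, value)
    try:
        yield
    finally:
        # Restore whatever this thread had before entering the context.
        if previous is missing:
            delattr(_state, name)
        else:
            setattr(_state, name, previous)


def observe_from_other_thread(name):
    """Read an option from a fresh thread and return what it saw."""
    seen = []
    worker = threading.Thread(target=lambda: seen.append(get_option(name, "unset")))
    worker.start()
    worker.join()
    return seen[0]


with option_context("repr_mode", "deferred"):
    inside = get_option("repr_mode")                 # set in this thread
    other = observe_from_other_thread("repr_mode")   # not set in the worker thread
after = get_option("repr_mode", "unset")             # restored on exit
```

Because each thread has its own attribute namespace on the threading.local object, the worker thread never sees the value set by the main thread's context manager.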

Bug Fixes

Don’t raise UnknownLocationWarning for US or EU multi-regions ( #653 ) ( 8e4616b )
Fix bug with na in the column labels in stack ( #659 ) ( 4a34293 )
Use explicit session in PaLM2TextGenerator ( #651 ) ( e4f13c3 )

Documentation

Add python code sample for multiple forecasting time series ( #531 ) ( 16866d2 )
Fix the Palm2TextGenerator output token size ( #649 ) ( c67e501 )

1.4.0 (2024-04-29)

Features

Add .cache() method to persist intermediate dataframe ( #626 ) ( a5c94ec )
Add transpose support for small homogeneously typed DataFrames. ( #621 ) ( 054075d )
Allow single input type in remote_function ( #641 ) ( 3aa643f )
Expose gcf max timeout in remote_function ( #639 ) ( dfeaad0 )
Series binary ops compatible with more types ( #618 ) ( 518d315 )
Support the score method for PaLM2TextGenerator ( #634 ) ( 3ffc1d2 )

Bug Fixes

Allow to_pandas to download more than 10GB ( #637 ) ( ce56495 )
Extend row hash to 128 bits to guarantee unique row id ( #632 ) ( 9005c6e )
Llm fine tuning tests ( #627 ) ( 4724a1a )
Llm palm score tests ( #643 ) ( cf4ec3a )
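
As a rough illustration of why a wider row hash helps with the row-id entry above, the birthday-bound approximation can be compared for two hash widths. This is a back-of-the-envelope sketch: the 64-bit width is only a comparison point (the entry does not state the earlier width), the row count is arbitrary, and approx_collision_probability is a hypothetical helper.

```python
def approx_collision_probability(num_rows: int, hash_bits: int) -> float:
    """Birthday-bound approximation p ~= n(n-1) / 2**(bits+1).

    Only meaningful while the result is much smaller than 1.
    """
    return num_rows * (num_rows - 1) / 2 ** (hash_bits + 1)


rows = 10**9  # arbitrary example size
p64 = approx_collision_probability(rows, 64)    # comparison width (assumption)
p128 = approx_collision_probability(rows, 128)  # width named in the entry above
```

Under these assumptions a collision among a billion rows is a few percent likely at 64 bits and vanishingly unlikely at 128 bits, so uniqueness holds in practice rather than in a strict mathematical sense.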

Performance Improvements

Automatically condense internal expression representation ( #516 ) ( 03c1b0d )
Cache transpose to allow performant retranspose ( #635 ) ( 44b738d )

Documentation

Add supported pandas apis on the main page ( #628 ) ( 8d2a51c )
Add the first sample for the Single time-series forecasting from Google Analytics data tutorial ( #623 ) ( 2b84c4f )
Address more technical writers’ feedback ( #640 ) ( 1e7793c )

1.3.0 (2024-04-22)

Features

Add Series.struct.dtypes property ( #599 ) ( d924ec2 )
Add fine tuning fit() for Palm2TextGenerator ( #616 ) ( 9c106bd )
Add quantile statistic ( #613 ) ( bc82804 )
Expose max_batching_rows in remote_function ( #622 ) ( 240a1ac )
Support primary key(s) in read_gbq by using as the index_col by default ( #625 ) ( 75bb240 )
Warn if location is set to unknown location ( #609 ) ( 3706b4f )

Bug Fixes

Address technical writers' feedback ( #611 ) ( 9f8f181 )
Infer narrowest numeric type when combining numeric columns ( #602 ) ( 8f9ece6 )
Use exact median implementation by default ( #619 ) ( 9d205ae )

Documentation

Fix rendering of examples for multiple apis ( #620 ) ( 9665e39 )
Set index_cols in read_gbq as a best practice ( #624 ) ( 70015b7 )

1.2.0 (2024-04-15)

Features

Add hasnans, combine_first, update to Series ( #600 ) ( 86e0f38 )
Add MultiIndex subclass. ( #596 ) ( 5d0f149 )
Add pivot_table for DataFrame. ( #473 ) ( 5f1d670 )
Add Series.autocorr ( #605 ) ( 4ec8034 )
Support list of numerics in pandas.cut ( #580 ) ( 290f95d )

Bug Fixes

Address more technical writers' feedback ( #581 ) ( 4b08d92 )
Error for object dtype on read_pandas ( #570 ) ( 8702dcf )
Inverting int now does bitwise inversion rather than sign flip ( #574 ) ( 5f1db8b )
Loc setitem dtype issue. ( #603 ) ( b94bae9 )
Toc menu missing plotting name ( #591 ) ( eed12c1 )
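
The difference between bitwise inversion and a sign flip, as in the integer inversion fix listed above, can be seen with plain Python integers. This is only an illustration of the two operations, not bigframes code.

```python
values = [0, 1, 5, -3]

inverted = [~v for v in values]  # bitwise NOT: flips every bit
negated = [-v for v in values]   # sign flip: arithmetic negation

# For two's-complement integers, ~v is always -v - 1, never -v.
assert all(~v == -v - 1 for v in values)
```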

Documentation

(Series|Dataframe).dtypes ( #598 ) ( edef48f )
Add code samples for str accessor methods ( #594 ) ( a557ea2 )
Add docs for DataFrame and Series dunder methods ( #562 ) ( 8fc26c4 )
Add examples for at/iat ( #582 ) ( 3be4a2e )

1.1.0 (2024-04-04)

Features

(Series|DataFrame).explode ( #556 ) ( 9e32f57 )
Add DataFrame.eval and DataFrame.query ( #361 ) ( 5e28ebd )
Add ColumnTransformer save/load ( #541 ) ( 9d8cf67 )
Add ml.metrics.mean_squared_error ( #559 ) ( 853c25e )
Add support for numpy expm1, log1p, floor, ceil, arctan2 ops ( #505 ) ( e8e66cf )
Add transformers save/load ( #552 ) ( d805241 )
Allow DataFrame binary ops to align on either axis and with loc… ( #544 ) ( 6d8f3af )
Expose DataFrame.bqclient to assist in integrations ( #519 ) ( 0be8911 )
read_pandas accepts pandas Series and Index objects ( #573 ) ( f8821fe )
Support ML.GENERATE_EMBEDDING in PaLM2TextEmbeddingGenerator ( #539 ) ( 1156c1e )
Support max_columns in repr and make repr more efficient ( #515 ) ( 54e49cf )

Bug Fixes

Assign NaN scalar to column error. ( #513 ) ( 0a4153c )
Don’t download 100gb onto local python machine in load test ( #537 ) ( 082c58b )
Exclude list-like s parameter in plot.scatter ( #568 ) ( 1caac27 )
Fix case where df.peek would fail to execute even with force=True ( #511 ) ( 8eca99a )
Fix error in Series.drop(0) ( #575 ) ( 75dd786 )
Include all names in MultiIndex repr ( #564 ) ( b188146 )
Plot.scatter s parameter cannot accept float-like column ( #563 ) ( 8d39187 )
Product operation produces float result for all input types ( #501 ) ( 6873b30 )
Reloaded transformer .transform error ( #569 ) ( 39fe474 )
Rename PaLM2TextEmbeddingGenerator.predict output columns to be backward compatible ( #561 ) ( 4995c00 )
Respect hard stack size limit and swallow limit change exception. ( #558 ) ( 4833908 )
Restore string to date/time type coercion ( #565 ) ( 4ae0262 )
Sync the notebook with embedding changes ( #550 ) ( 347f2dd )
Use bytes limit on frame inlining rather than element count ( #576 ) ( 659a161 )

Performance Improvements

Add multi-query execution capability for complex dataframes ( #427 ) ( d2d7e33 )

Dependencies

Include pyarrow as a dependency ( #529 ) ( 9b1525a )

Documentation

bigframes.options.bigquery.project and location are optional in some circumstances ( #548 ) ( 90bcec5 )
Add “Supported pandas APIs” reference to the documentation ( #542 ) ( 74c3915 )
Add General Availability banner to README ( #507 ) ( 262ff59 )
Add operations in API docs ( #557 ) ( ea95761 )
Add progress_bar code sample ( #508 ) ( 92a1af3 )
Add the code samples for metrics{auc, roc_auc_score, roc_curve} ( #520 ) ( 5f37b09 )
Address more comments from technical writers to meet legal purposes ( #571 ) ( 9084df3 )
Fix docs of ARIMAPlus.predict ( #512 ) ( 3b80f95 )
Include Index in table-of-contents ( #564 ) ( b188146 )
Mark Gemini model as Pre-GA ( #543 ) ( 769868b )
Migrate the overview page to Bigframes official landing page ( #536 ) ( a0fb8bb )

1.0.0 (2024-03-25)

⚠ BREAKING CHANGES

rename model parameter min_rel_progress to tol
early_stop setting no longer supported, always uses True
rename model parameter n_parallell_trees to n_estimators
rename class_weights to class_weight
rename learn_rate to learning_rate
PCA n_components supports float value and None, default to None
rename various ml model parameters for consistency with sklearn ( https://github.com/googleapis/python-bigquery-dataframes/pull/491 )
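
For code written against pre-1.0 releases, the renames listed above amount to a keyword-argument mapping. The helper below is a hypothetical sketch, not part of bigframes: it uses the parameter names exactly as spelled in this list, and which model classes accept each parameter is not covered here.

```python
# Hypothetical helper, not part of bigframes: translate pre-1.0 keyword
# arguments to the names listed under BREAKING CHANGES above.
RENAMED = {
    "min_rel_progress": "tol",
    "n_parallell_trees": "n_estimators",  # spelled as in the list above
    "class_weights": "class_weight",
    "learn_rate": "learning_rate",
}
REMOVED = {"early_stop"}  # no longer supported; always behaves as True


def migrate_model_kwargs(old_kwargs):
    """Return a copy of old_kwargs using the 1.0.0 parameter names."""
    new_kwargs = {}
    for name, value in old_kwargs.items():
        if name in REMOVED:
            continue  # drop settings that can no longer be configured
        new_name = RENAMED.get(name, name)
        if new_name in new_kwargs:
            raise ValueError(f"duplicate value for {new_name!r}")
        new_kwargs[new_name] = value
    return new_kwargs


migrated = migrate_model_kwargs(
    {"learn_rate": 0.1, "early_stop": False, "max_iterations": 20}
)
```

Unknown names such as max_iterations pass through unchanged, so the helper can be applied to any keyword dictionary before constructing a model.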

Features

Add configuration option to read_gbq ( #401 ) ( 85cede2 )
Add ml ARIMAPlus model params ( #488 ) ( 352cb85 )
Add ml KMeans model params ( #477 ) ( 23a8d9a )
Add ml LogisticRegression model params ( #481 ) ( f959b65 )
Add ml PCA model params ( #474 ) ( fb5d83b )
Add params for LinearRegression model ( #464 ) ( 21b2188 )
Add support for Python 3.12 ( #231 ) ( df2976f )
Allow assigning directly to Series.name property ( #495 ) ( ad0e99e )
Ensure Series.str.len() can get length of array columns ( #497 ) ( 10c0446 )
Option to use bq connection without check ( #460 ) ( 0b3f8e5 )
PCA n_components supports float value and None, default to None ( 65c6f47 )
Rename class_weights to class_weight ( 65c6f47 )
Rename learn_rate to learning_rate ( 65c6f47 )
Rename model parameter min_rel_progress to tol ( 65c6f47 )
Rename model parameter n_parallell_trees to n_estimators ( 65c6f47 )
Rename various ml model parameters for consistency with sklearn ( https://github.com/googleapis/python-bigquery-dataframes/pull/491 ) ( 65c6f47 )
Support BQ regional endpoints for europe-west9, europe-west3, us-east4, and us-west1 ( #504 ) ( fbada4a )
Support dataframe.cov ( #498 ) ( c4beafd )
Support Series.dt.floor ( #493 ) ( 2dd01c2 )
Support Series.dt.normalize ( #483 ) ( 0bf1e91 )
Update plot sample to 1000 rows ( #458 ) ( 60d4a7b )

Bug Fixes

early_stop setting no longer supported, always uses True ( 65c6f47 )
Fix -1 offset lookups failing ( #463 ) ( 2dfb9c2 )
Plot.scatter c argument functionalities ( #494 ) ( d6ee994 )
Properly support format param for numerical input. ( #486 ) ( ae20c35 )
Re-enable to_csv and to_json related tests ( #468 ) ( 2b9a01d )
Sampling plot cannot preserve ordering if index is not ordered ( #475 ) ( a5345fe )
Use actual BigQuery types rather than ibis types in to_pandas ( #500 ) ( 82b4f91 )

Dependencies

Support pandas 2.2 ( #492 ) ( e2cf50e )

Documentation

Add code samples for metrics.{accuracy_score, confusion_matrix} ( #478 ) ( 3e3329a )
Add code samples for metrics.{recall_score, precision_score, f1_score} ( #502 ) ( 370fe90 )
Improve API documentation ( #489 ) ( 751266e )
Update bigquery connection documentation ( #499 ) ( 4bfe094 )
Update LLM + K-means notebook to handle partial failures ( #496 ) ( 97afad9 )

0.26.0 (2024-03-20)

⚠ BREAKING CHANGES

exclude remote models for .register() ( #465 )

Features

(Series|DataFrame).plot ( #438 ) ( 1c3e668 )
read_gbq_table supports LIKE as an operator in filters ( #454 ) ( d2d425a )
Add DataFrame.pipe() method ( #421 ) ( 95f5a6e )
Set force=True by default in DataFrame.peek() ( #469 ) ( 4e8e97d )
Support datetime related casting in (Series|DataFrame|Index).astype ( #442 ) ( fde339b )
Support Series.dt.strftime ( #453 ) ( 8f6e955 )

Bug Fixes

Any() on empty set now correctly returns False ( #471 ) ( f55680c )
Df.drop_na preserves columns dtype ( #457 ) ( 3bab1a9 )
Disable to_json and to_csv related tests ( #462 ) ( 874026d )
Exclude remote models for .register() ( #465 ) ( 73fe0f8 )
Fix broken link in covid notebook ( #450 ) ( adadb06 )
Fix broken multiindex loc cases ( #467 ) ( b519197 )
Fix grouping series on multiple other series ( #455 ) ( 3971bd2 )
Groupby aggregates no longer check if grouping keys are numeric ( #472 ) ( 4fbf938 )
Raise ValueError when read_pandas() receives a bigframes DataFrame ( #447 ) ( b28f9fd )
Series.(to_csv|to_json) leverages bq export ( #452 ) ( 718a00c )
Warn when read_gbq / read_gbq_table uses the snapshot time cache ( #441 ) ( e16a8c0 )

Documentation

Add code samples for ml.metrics.r2_score ( #459 ) ( 85fefa2 )
Add the docs for loc and iloc indexers ( #446 ) ( 14ab8d8 )
Add the pages for at and iat indexers ( #456 ) ( 340f0b5 )
Add version information to bug template ( #437 ) ( 91bd39e )
Indicate that project and location are optional in example notebooks ( #451 ) ( 1df0140 )

0.25.0 (2024-03-14)

Features

(Series|DataFrame).plot.(line|area|scatter) ( #431 ) ( 0772510 )
Support CMEK for remote_function cloud functions ( #430 ) ( 2fd69f4 )

0.24.0 (2024-03-12)

⚠ BREAKING CHANGES

read_parquet uses a “pandas” engine to parse files by default. Use engine="bigquery" for the previous behavior

Features

(Series|Dataframe).plot.hist() ( #420 ) ( 4aadff4 )
Add detect_anomalies to ml ARIMAPlus and KMeans models ( #426 ) ( 6df28ed )
Add engine parameter to read_parquet ( #413 ) ( 31325a1 )
Add ml PCA.detect_anomalies method ( #422 ) ( 8d82945 )
Support BYOSA in remote_function ( #407 ) ( d92ced2 )
Support CMEK for BQ tables ( #403 ) ( 9a678e3 )

Bug Fixes

Move third_party.bigframes_vendored to bigframes_vendored ( #424 ) ( 763edeb )
Only do row identity based joins when joining by index ( #356 ) ( 76b252f )
read_pandas inline respects location ( #412 ) ( ae0e3ea )

Documentation

Add predict sample to samples/snippets/bqml_getting_started_test.py ( #388 ) ( 6a3b0cc )
Document minimum IAM requirement ( #416 ) ( 36173b0 )
Fix the note rendering for DataFrames methods: nlargest, nsmallest ( #417 ) ( 38bd2ba )

0.23.0 (2024-03-05)

Features

Add ml.metrics.pairwise.euclidean_distance ( #397 ) ( 1726588 )
Add TextEmbedding model version support ( #394 ) ( e0f1ab0 )

Bug Fixes

Code exception in remote_function now prevents retry and surfaces in the client ( #387 ) ( dd3643d )
Docs link for metrics.pairwise ( #400 ) ( a60aba7 )

Dependencies

Update ibis to version 8.0.0 and refactor remote_function to use ibis UDF method ( #277 ) ( 350499b )

Documentation

Update README to point to new summary pages ( #402 ) ( bfe2b23 )

0.22.0 (2024-02-27)

⚠ BREAKING CHANGES

rename cosine_similarity to paired_cosine_distances ( #393 )
move model optional args to kwargs ( #381 )

Features

Add DataFrame.corr() method ( #379 ) ( 67fd434 )
Add ml.metrics.pairwise.manhattan_distance ( #392 ) ( 9d31865 )
Enable regional endpoints for me-central2 ( #386 ) ( 469674d )

Bug Fixes

Avoid ibis warning for “database” table() method argument ( #390 ) ( a0490a4 )
Correct the numeric literal dtype ( #365 ) ( 93b02cd )
Rename cosine_similarity to paired_cosine_distances ( #393 ) ( 81ece46 )

Performance Improvements

Inline read_pandas for small data ( #383 ) ( 59b446b )

Dependencies

Add minimum version constraint for sqlglot to 19.9.0 ( #389 ) ( 8b62d77 )

Documentation

Add a code sample for creating a kmeans model ( #267 ) ( 4291d65 )
Fix bigframes.pandas.concat documentation ( #382 ) ( 234b61c )

Miscellaneous Chores

Release 0.22.0 ( #396 ) ( 8f73d9e )

Code Refactoring

Move model optional args to kwargs ( #381 ) ( 4037992 )

0.21.0 (2024-02-13)

Features

Add Series.cov method ( #368 ) ( 443db22 )
Add ml.llm.GeminiTextGenerator model ( #370 ) ( de1e0a4 )
Add ml.metrics.pairwise.cosine_similarity function ( #374 ) ( 126f566 )
Add XGBoostModel ( #363 ) ( d5518b2 )
Limited support of lambdas in Series.apply ( #345 ) ( 208e081 )
Support bigframes.pandas.to_datetime for scalars, iterables and series. ( #372 ) ( ffb0d15 )
Support read_gbq wildcard table path ( #377 ) ( 90caf86 )

Bug Fixes

Error message fix. ( #375 ) ( 930cf6b )

Documentation

Clarify ADC pre-auth in a non-interactive environment ( #348 ) ( 99a9e6e )

0.20.1 (2024-02-06)

Performance Improvements

Make repr cache the block where appropriate ( #350 ) ( 068879f )

Documentation

Add a sample to demonstrate the evaluation results ( #364 ) ( cff0919 )
Fix the DataFrame.apply code sample ( #366 ) ( 1866a26 )

0.20.0 (2024-01-30)

Features

Add DataFrame.peek() as an efficient alternative to head() results preview ( #318 ) ( 9c34d83 )
Add ARIMA_EVALUATE options in forecasting models ( #336 ) ( 73e997b )
Add Index constructor, repr, copy, get_level_values, to_series ( #334 ) ( e5d054e )
Improve error message for drive based BQ table reads ( #344 ) ( 0794788 )
Update cut to work without labels = False and show intervals as dict ( #335 ) ( 4ff53db )

Bug Fixes

Change default connection name in getting_started.ipynb ( #347 ) ( 677f014 )
Series iteration correctly returns values instead of index ( #339 ) ( 2c6af9b )

Documentation

Add code samples for Series.{between, cumprod} ( #353 ) ( 09a52fd )

0.19.2 (2024-01-22)

Bug Fixes

read_gbq large response issue ( #332 ) ( b8178b9 )
Use object dtype for ARRAY columns in to_pandas() with pandas 1.x ( #329 ) ( 374ddb5 )

Documentation

Add DataFrame.applymap documentation ( #326 ) ( bd531a1 )
Add code samples for series methods ( #323 ) ( 32cc6fa )
Add remote model requirements ( #333 ) ( c91f70c )

0.19.1 (2024-01-17)

Bug Fixes

Handle multi-level columns for df aggregates properly ( #305 ) ( 5bb45ba )
Update max_output_token limitation. ( #308 ) ( 5cccd36 )

Documentation

Add code samples for Series.corr ( #316 ) ( 9150c16 )

0.19.0 (2024-01-09)

Features

Add ‘columns’ as an alias for ‘col_order’ ( #298 ) ( a01b271 )
Add Series dt.tz and dt.unit properties ( #303 ) ( 2e1a403 )
Add to_gbq() method for LLM models ( #299 ) ( dafbc1b )
Allow manually setting clustering_columns in dataframe.to_gbq ( #302 ) ( 9c21323 )
Support assigning to columns like a property ( #304 ) ( f645c56 )
Support upcasting numeric columns in concat ( #294 ) ( e3a056a )

Bug Fixes

DF.drop tuple input as multi-index ( #301 ) ( 21391a9 )
Fix bug converting non-string labels to sql ids ( #296 ) ( a61c5fe )

Documentation

Add code samples for Series.ffill and DataFrame.ffill ( #307 ) ( 1c63b45 )

0.18.0 (2024-01-02)

Features

Add dataframe.to_html ( #259 ) ( 2cd6489 )
Add IntervalIndex support to bigframes.pandas.cut ( #254 ) ( 6c1969a )
Add replace method to DataFrame ( #261 ) ( 5092215 )
Specific pyarrow mappings for decimal, bytes types ( #283 ) ( a1c0631 )

Bug Fixes

Dataframes to_gbq now creates dataset if it doesn’t exist ( #222 ) ( bac62f7 )
Exclude pandas 2.2.0rc0 to unblock prerelease tests ( #292 ) ( ac1a745 )
Fix DataFrameGroupby.agg() issue with as_index=False ( #273 ) ( ab49350 )
Make Series.str.replace work for simple strings ( #285 ) ( ad67465 )
Update dataframe.to_gbq to dedup column names. ( #286 ) ( 746115d )
Use setuptools.find_namespace_packages ( #246 ) ( 9ec352a )

Dependencies

Migrate to ibis-framework >= "7.1.0" ( #53 ) ( 9798a2b )

Documentation

Add code snippets for explore query result page ( #278 ) ( 7cbbb7d )
Code samples for astype common to DataFrame and Series ( #280 ) ( 95b673a )
Code samples for DataFrame.copy and Series.copy ( #290 ) ( 7cbc2b0 )
Code samples for drop and fillna ( #284 ) ( 9c5012e )
Code samples for isna, isnull, dropna, isin ( #289 ) ( ad51035 )
Code samples for rename, size ( #293 ) ( eb69f60 )
Code samples for reset_index and sort_values ( #282 ) ( acc0eb7 )
Code samples for sample, get, Series.round ( #295 ) ( c2b1892 )
Code samples for Series.{add, replace, unique, T, transpose} ( #287 ) ( 0e1bbfc )
Code samples for Series.{map, to_list, count} ( #290 ) ( 7cbc2b0 )
Code samples for Series.{name, std, agg} ( #293 ) ( eb69f60 )
Code samples for Series.groupby and Series.{sum,mean,min,max} ( #280 ) ( 95b673a )
Code samples for DataFrame set_index, items ( #295 ) ( c2b1892 )
Fix the rendering for get_dummies ( #291 ) ( 252f3a2 )

0.17.0 (2023-12-14)

Features

Add filters argument to read_gbq for enhanced data querying ( #198 ) ( 034f71f )
Add module/class level api tracking ( #272 ) ( 4f3db3d )
Deprecate use_regional_endpoints ( #199 ) ( 319a1f2 )

Bug Fixes

Increase recursion limit, cache compilation tree hashes ( #184 ) ( b54791c )
Replaced raise NotImplementedError with return NotImplemented ( #258 ) ( a133822 )

Documentation

Add code samples for values and value_counts ( #249 ) ( f247d95 )
Add sample for getting started with BQML ( #141 ) ( fb14f54 )

0.16.0 (2023-12-12)

Features

Add ARIMAPlus.predict parameters ( #264 ) ( 99598c7 )
Add DataFrame from_dict and from_records methods ( #244 ) ( 8d81e24 )
Add DataFrame.select_dtypes method ( #242 ) ( 1737acc )
Add nunique method to Series/DataFrameGroupby ( #256 ) ( c8ec245 )
Support dataframe.loc with conditional columns selection ( #233 ) ( 3febea9 )

Bug Fixes

Enforce pandas version requirement <2.1.4 ( #265 ) ( 9dd63f6 )
Exclude pandas 2.1.4 from prerelease tests to unblock e2e tests ( b02fc2c )
Fix value_counts column label for normalize=True ( #245 ) ( d3fa6f2 )
Migrate e2e tests to bigframes-load-testing project ( 8766ac6 )
Ml.sql logic ( #262 ) ( 68c6fdf )
Update the llm_kmeans notebook ( #247 ) ( 66d1839 )

Documentation

Add code samples for shape and head ( #257 ) ( 5bdcc65 )
Add example for dataframe.melt, dataframe.pivot, dataframe.stac… ( #252 ) ( 8c63697 )
Add example to dataframe.nlargest, dataframe.nsmallest, datafra… ( #234 ) ( e735412 )
Add examples for dataframe.cummin, dataframe.cummax, dataframe.cumsum, dataframe.cumprod ( #243 ) ( 0523a31 )
Add examples for dataframe.nunique, dataframe.diff, dataframe.a… ( #251 ) ( 77074ec )
Correct the docs for option_context ( #263 ) ( d21c6dd )
Correct the params rendering for ml.remote and ml.ensemble modules ( #248 ) ( c2829e3 )
Fix return annotation in API docstrings ( #253 ) ( 89a1c67 )

0.15.0 (2023-11-29)

⚠ BREAKING CHANGES

model.predict returns all the columns ( #204 )

Features

Add info and memory_usage methods to dataframe ( #219 ) ( 9d6613d )
Add remote vertex model support ( #237 ) ( 0bfc4fb )
Add the recent api method for ML component ( #225 ) ( ed8876d )
Model.predict returns all the columns ( #204 ) ( 416171a )
Send warnings on LLM prediction partial failures ( #216 ) ( 81125f9 )

Bug Fixes

Add df snapshots lookup for read_gbq ( #229 ) ( d0d9b84 )
Avoid unnecessary row_number() on sort key for io ( #211 ) ( a18d40e )
Dedup special character ( #209 ) ( dd78acb )
Invalid JSON type of the notebook ( #215 ) ( a729831 )
Make to_pandas override enable_downsampling when sampling_method is manually set. ( #200 ) ( ae03756 )
Polish the llm+kmeans notebook ( #208 ) ( e8532b1 )
Update the llm+kmeans notebook with recent change ( #236 ) ( f8917ab )
Use anonymous dataset to create remote_function ( #205 ) ( 69b016e )

Documentation

Add code samples for index and column properties ( #212 ) ( c88d38e )
Add code samples for df reshaping, function, merge, and join methods ( #203 ) ( 010486c )
Add examples for dataframe.kurt, dataframe.std, dataframe.count ( #232 ) ( f9c6e72 )
Add examples for dataframe.mean, dataframe.median, dataframe.va… ( #228 ) ( edd0522 )
Add examples for dataframe.min, dataframe.max and dataframe.sum ( #227 ) ( 3a375e8 )
Code samples for Series.dot and DataFrame.dot ( #226 ) ( b62a07a )
Code samples for Series.where and Series.mask ( #217 ) ( 52dfad2 )
Code samples for dataframe.any, dataframe.all and dataframe.prod ( #223 ) ( d7957fa )
Make the code samples reflect default bq connection usage ( #206 ) ( 71844b0 )

Miscellaneous Chores

Release 0.15.0 ( #241 ) ( 6c899be )

0.14.1 (2023-11-16)

Bug Fixes

Correctly handle null values when initializing fingerprint ordering ( #210 ) ( 8324f13 )

Documentation

Add an example notebook about line graphs ( #197 ) ( f957b27 )

0.14.0 (2023-11-14)

Features

Add ‘cross’ join support ( #176 ) ( 765446a )
Add ‘index’, ‘pad’, ‘nearest’ interpolate methods ( #162 ) ( 6a28403 )
Add series.sample (identical to existing dataframe.sample) ( #187 ) ( 37914a4 )
Add unordered sql compilation ( #156 ) ( 58f420c )
Log most recent API calls as recent-bigframes-api-xx labels on BigQuery jobs ( #145 ) ( 4ea33b7 )
Read_gbq creates order deterministically without table copy ( #191 ) ( 8ab81de )
Support date_series.astype("string[pyarrow]") to cast DATE to STRING ( #186 ) ( aee0e8e )
Support series.at[row_label] = scalar ( #173 ) ( 0c8bd33 )
Temporary resources no longer use BigQuery Sessions ( #194 ) ( 4a02cac )

Bug Fixes

All sort operations are now stable ( #195 ) ( 3a2761f )
Default to 7 days expiration for read_csv , read_json , read_parquet ( #193 ) ( 03606cd )
Deprecate the remote_service_type in llm model ( #180 ) ( a8a409a )
For reset_index on unnamed multiindex, always use level_[n] label ( #182 ) ( f95000d )
Match pandas behavior when assigning listlike to empty dfs ( #172 ) ( c1d1f42 )
Use anonymous dataset instead of session dataset for temp tables ( #181 ) ( 800d44e )
Use random table for read_pandas ( #192 ) ( 741c75e )
Use random table when loading data for read_csv , read_json , read_parquet ( #175 ) ( 9d2e6dc )

Documentation

Add code samples for read_gbq_function using community UDFs ( #188 ) ( 7506eab )
Add docstring code samples for Series.apply and DataFrame.map ( #185 ) ( c816d84 )
Add llm kmeans notebook as an included example ( #177 ) ( d49ae42 )
Use head() to get top n results, not to preview results ( #190 ) ( 87f84c9 )

0.13.0 (2023-11-07)

Features

to_gbq without a destination table writes to a temporary table ( #158 ) ( e1817c9 )
Add DataFrame.__iter__ , DataFrame.iterrows , DataFrame.itertuples , and DataFrame.keys methods ( #164 ) ( c065071 )
Add Series.__iter__ method ( #164 ) ( c065071 )
Add interpolate() to series and dataframe ( #157 ) ( b9cb55c )
Support 32k text-generation and multilingual embedding models ( #161 ) ( 5f0ea37 )

Bug Fixes

Update default temp table expiration to 7 days ( #174 ) ( 4ff26cd )

0.12.0 (2023-11-01)

Features

Add DataFrame.melt ( #113 ) ( 4e4409c )
Add DataFrame.to_pandas_batches() to download large DataFrame objects ( #136 ) ( 3afd4a3 )
Add bigframes.options.compute.maximum_bytes_billed option that sets maximum bytes billed on query jobs ( #133 ) ( 63c7919 )
Add pandas.qcut ( #104 ) ( 8e44518 )
Add pd.get_dummies ( #149 ) ( d8baad5 )
Add unstack to series, add level param ( #115 ) ( 5edcd19 )
Implement operator @ for DataFrame.dot ( #139 ) ( 79a638e )
Populate ibis version in user agent ( #140 ) ( c639a36 )

Bug Fixes

Don’t override the global logging config ( #138 ) ( 2ddbf74 )
Fix bug with column names under repeated column assignment ( #150 ) ( 29032d0 )
Resolve plotly rendering issue by using ipython html for job pro… ( #134 ) ( 39df43e )
Use indexee’s session for loc listlike cases ( #152 ) ( 27c5725 )

Documentation

Add arithmetic df sample code ( #153 ) ( ac44ccd )
Fix indentation on read_gbq_function code sample ( #163 ) ( 0801d96 )
Link to ML.EVALUATE BQML page for score() methods ( #137 ) ( 45c617f )

0.11.0 (2023-10-26)

Features

Add back reset_session as an alias for close_session ( #124 ) ( 694a85a )
Change query parameter to query_or_table in read_gbq ( #127 ) ( f9bb3c4 )

Bug Fixes

Expose bigframes.pandas.reset_session as a public API ( #128 ) ( b17e1f4 )
Use series’s own session in series.reindex listlike case ( #135 ) ( 95bff3f )

Documentation

Add runnable code samples for DataFrames I/O methods and property ( #129 ) ( 6fea8ef )
Add runnable code samples for reading methods ( #125 ) ( a669919 )

0.10.0 (2023-10-19)

Features

Implement DataFrame.dot for matrix multiplication ( #67 ) ( 29dd414 )

0.9.0 (2023-10-18)

⚠ BREAKING CHANGES

rename bigframes.pandas.reset_session to close_session ( #101 )

Features

Add bigframes.options.bigquery.application_name for partner attribution ( #117 ) ( 52d64ff )
Add AtIndexer getitems ( #107 ) ( 752b01f )
Rename bigframes.pandas.reset_session to close_session ( #101 ) ( 36693bf )
Send BigQuery cancel request when canceling bigframes process ( #103 ) ( e325fbb )
Support external packages in remote_function ( #98 ) ( ec10c4a )
Use ArrowDtype for STRUCT columns in to_pandas ( #85 ) ( 9238fad )

Bug Fixes

Support multiindex for three loc getitem overloads ( #113 ) ( 68e3cd3 )

Performance Improvements

If primary keys are defined, read_gbq avoids copying table data ( #112 ) ( e6c0cd1 )

Documentation

Add documentation for Series.struct.field and Series.struct.explode ( #114 ) ( a6dab9c )
Add open-source link in API doc ( #106 ) ( db51fe3 )
Update ML overview API doc ( #105 ) ( 1b3f3a5 )

0.8.0 (2023-10-12)

⚠ BREAKING CHANGES

The default behavior of to_parquet is changing from no compression to 'snappy' compression.

Features

Support compression in to_parquet ( a8c286f )

Bug Fixes

Create session dataset for remote functions only when needed ( #94 ) ( 1d385be )

0.7.0 (2023-10-11)

Features

Add aliases for several series properties ( #80 ) ( c0efec8 )
Add equals methods to series/dataframe ( #76 ) ( 636a209 )
Add iat and iloc accessing by tuples of integers ( #90 ) ( 228aeba )
Add level param to DataFrame.stack ( #88 ) ( 97b8bec )
Allow df.drop to take an index object ( #68 ) ( 740c451 )
Use default session connection ( #87 ) ( 4ae4ef9 )

Bug Fixes

Change the invalid url in docs ( #93 ) ( 969800d )

Documentation

Add more preprocessing models into the docs menu. ( #97 ) ( 1592315 )

0.6.0 (2023-10-04)

Features

Add df.unstack ( #63 ) ( 4a84714 )
Add idxmin, idxmax to series, dataframe ( #74 ) ( 781307e )
Add ml.preprocessing.KBinsDiscretizer ( #81 ) ( 24c6256 )
Add multi-column dataframe merge ( #73 ) ( c9fa85c )
Add update and align methods to dataframe ( #57 ) ( bf050cf )
Support STRUCT data type with Series.struct.field to extract child fields ( #71 ) ( 17afac9 )

Bug Fixes

Avoid 403 response too large to return error with read_gbq and large query results ( #77 ) ( 8f3b5b2 )
Change return type of Series.loc[scalar] ( #40 ) ( fff3d45 )
Fix df/series.iloc by list with multiindex ( #79 ) ( 971d091 )

0.5.0 (2023-09-28)

Features

Add DataFrame.kurtosis / DF.kurt method ( c1900c2 )
Add DataFrame.rolling and DataFrame.expanding methods ( c1900c2 )
Add items , apply methods to DataFrame . ( #43 ) ( 3adc1b3 )
Add axis param to simple df aggregations ( #52 ) ( 9cf9972 )
Add index dtype , astype , drop , fillna , aggregate attributes. ( #38 ) ( 1a254a4 )
Add ml.preprocessing.LabelEncoder ( #50 ) ( 2510461 )
Add ml.preprocessing.MaxAbsScaler ( #56 ) ( 14b262b )
Add ml.preprocessing.MinMaxScaler ( #64 ) ( 392113b )
Add more index methods ( #54 ) ( a6e32aa )
Support calculate_p_values parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support class_weights="balanced" in LogisticRegression model ( c1900c2 )
Support df[column_name] = df_only_one_column ( c1900c2 )
Support early_stop parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support enable_global_explain parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support l2_reg parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support learn_rate_strategy parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support ls_init_learn_rate parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support max_iterations parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support min_rel_progress parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support optimize_strategy parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
Support casting string to integer or float ( #59 ) ( 3502f83 )

Bug Fixes

Fix header skipping logic in read_csv ( #49 ) ( d56258c )
Generate unique ids on join to avoid id collisions ( #65 ) ( 7ab65e8 )
LabelEncoder params consistent with Sklearn ( #60 ) ( 632caec )
Loosen filter items tests to accommodate shifting pandas impl ( #41 ) ( edabdbb )

Performance Improvements

Add ability to cache dataframe and series to session table ( #51 ) ( 416d7cb )
Inline small Series and DataFrames in query text ( #45 ) ( 5e199ec )
Reimplement unpivot to use cross join rather than union ( #47 ) ( f9a93ce )
Simplify join order to use multiple order keys instead of string. ( #36 ) ( 5056da6 )

Documentation

Link to Remote Functions code samples from README and API reference ( c1900c2 )

0.4.0 (2023-09-16)

Features

Add axis parameter to droplevel and reorder_levels ( 7c6b0dd )
Add bfill and ffill to DataFrame and Series ( 7c6b0dd )
Add DataFrame.combine and DataFrame.combine_first ( #27 ) ( 7c6b0dd )
Add DataFrame.nlargest , nsmallest ( 7c6b0dd )
Add DataFrame.pct_change and Series.pct_change ( 7c6b0dd )
Add DataFrame.skew and GroupBy.skew ( 7c6b0dd )
Add DataFrame.to_dict , to_excel , to_latex , to_records , to_string , to_markdown , to_pickle , to_orc ( 7c6b0dd )
Add diff method to DataFrame and GroupBy ( 7c6b0dd )
Add filter and reindex to Series and DataFrame ( 7c6b0dd )
Add reindex_like to DataFrame and Series ( 7c6b0dd )
Add swaplevel to DataFrame and Series ( 7c6b0dd )
Add partial support for Series.replace ( 7c6b0dd )
Support DataFrame.loc[bool_series, column] = scalar ( 7c6b0dd )
Support a persistent name in remote_function ( 7c6b0dd )

Bug Fixes

remote_function uses same credentials as other APIs ( 7c6b0dd )
Add type hints to models ( 7c6b0dd )
Raise error when ARIMAPlus is used with Pipeline ( 7c6b0dd )
Remove transforms parameter in model.fit (breaking change) ( 7c6b0dd )
Support column joins with “None indexer” ( 7c6b0dd )
Use Int64Dtype for literals in cut ( 7c6b0dd )
Use lowercase strings for parameter literals in bigframes.ml (breaking change) ( 7c6b0dd )

Performance Improvements

Add bigframes-api label to I/O query jobs ( 7c6b0dd )

Documentation

Document possible parameter values for PaLM2TextGenerator ( 7c6b0dd )
Document region logic in README ( 7c6b0dd )
Fix OneHotEncoder sample ( 7c6b0dd )

0.3.2 (2023-09-06)

Bug Fixes

Make release.sh script for PyPI upload executable ( #20 ) ( 9951610 )

0.3.1 (2023-09-05)

Bug Fixes

release: Use correct directory name for release build config ( #17 ) ( 3dd25b3 )

0.3.0 (2023-09-02)

Features

Add bigframes.get_global_session() and bigframes.reset_session() aliases ( a32b747 )
Add bigframes.pandas.read_pickle function ( a32b747 )
Add components_ , explained_variance_ , and explained_variance_ratio_ properties to bigframes.ml.decomposition.PCA ( 89b9503 )
Add fit_transform to bigquery.ml transformers ( a32b747 )
Add Series.dropna and DataFrame.fillna ( 8fab755 )
Add Series.str methods isalpha , isdigit , isdecimal , isalnum , isspace , islower , isupper , zfill , center ( a32b747 )
Support bigframes.pandas.merge() ( 8fab755 )
Support DataFrame.isin with list and dict inputs ( 8fab755 )
Support DataFrame.pivot ( a32b747 )
Support DataFrame.stack ( 89b9503 )
Support DataFrame - DataFrame binary operations ( 8fab755 )
Support df[my_column] = [a python list] ( 89b9503 )
Support Index.is_monotonic ( 8fab755 )
Support np.arcsin , np.arccos , np.arctan , np.sinh , np.cosh , np.tanh , np.arcsinh , np.arccosh , np.arctanh , np.exp with Series argument ( 89b9503 )
Support np.sin , np.cos , np.tan , np.log , np.log10 , np.sqrt , np.abs with Series argument ( 89b9503 )
Support pow() and power operator in DataFrame and Series ( 8fab755 )
Support read_json with engine=bigquery for newline-delimited JSON files ( 89b9503 )
Support Series.corr ( 89b9503 )
Support Series.map ( 8fab755 )
Support for np.add , np.subtract , np.multiply , np.divide , np.power ( 8fab755 )
Support MultiIndex for DataFrame columns ( a32b747 )
Use pandas.Index for column labels ( a32b747 )
Use default session and connection in ml.llm and ml.imported ( 8fab755 )

Bug Fixes

Add error message to set_index ( a32b747 )
Align column names with pandas in DataFrame.agg results ( 89b9503 )
Allow (but still not recommended) ORDER BY in read_gbq input when an index_col is defined ( 89b9503 )
Check for IAM role on the BigQuery connection when initializing a remote_function ( 89b9503 )
Check that types are specified in read_gbq_function ( a32b747 )
Don’t use query cache for Session construction ( a32b747 )
Include survey link in abstract NotImplementedError exception messages ( 89b9503 )
Label temp table creation jobs with source=bigquery-dataframes-temp label ( 89b9503 )
Make X_train argument names consistent across methods ( 8fab755 )
Raise AttributeError for unimplemented pandas methods ( 89b9503 )
Raise exception for invalid function in read_gbq_function ( a32b747 )
Support spaces in column names in DataFrame initializer ( 89b9503 )

Performance Improvements

Add local cache for __repr_*__ methods ( a32b747 )
Lazily instantiate client library objects ( 89b9503 )
Use row_number() filter for head / tail ( 8fab755 )

Documentation

Add ML section under Overview ( a32b747 )
Add release status to table of contents ( a32b747 )
Add samples and best practices to read_gbq docs ( a32b747 )
Correct the return types of Dataframe and Series ( a32b747 )
Create subfolders for notebooks ( a32b747 )
Fix link to GitHub ( 89b9503 )
Highlight bigframes is open-source ( a32b747 )
Sample ML Drug Name Generation notebook ( a32b747 )
Set options.bigquery.project in sample code ( 89b9503 )
Transform remote function user guide into sample code ( a32b747 )
Update remote function notebook with read_gbq_function usage ( 8fab755 )

0.2.0 (2023-08-17)

Features

Add KMeans.cluster_centers_.
Allow column labels to be any type handled by bq df; column labels can now be integers.
Add dataframegroupby.agg().
Add Series Property is_monotonic_increasing and is_monotonic_decreasing.
Add match, fullmatch, get, pad str methods.
Add series isin function.

Bug Fixes

Update ML package to use sessions for queries.
Optimize read_gbq with index_col set to cluster by index_col .
Raise ValueError if the location is mismatched.
read_gbq no longer uses ‘time travel’ with query inputs.

Documentation

Add docstring to _uniform_sampling to discourage users from calling it.

0.1.1 (2023-08-14)

Documentation

Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.

0.1.0 (2023-08-11)

Features

Add bigframes.pandas package with an API compatible with pandas . Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
Add bigframes.ml package with an API inspired by scikit-learn . Train machine learning models and run batch prediction, powered by BigQuery ML .

0.0.0 (2023-02-22)

Empty package to reserve package name.