Changelog
1.18.0 (2024-09-18)
Features
- Add “include” param to describe for string types ( #973 ) ( deac6d2 )
- Add `subset` parameter to `DataFrame.dropna` to select which columns to consider ( #981 ) ( f7c03dc )
Bug Fixes
- DataFrameGroupby.agg now works with unnamed tuples ( #985 ) ( 0f047b4 )
- Fix a bug that raises exception when re-indexing columns with their original order ( #988 ) ( 596b03b )
- Make the `Series.apply` outcome assignable to the original dataframe in partial ordering mode ( #874 ) ( c94ead9 )
Dependencies
1.17.0 (2024-09-11)
Features
- Add `__version__` alias to bigframes.pandas ( #967 ) ( 9ce10b4 )
- Define list accessor for bigframes Series ( #946 ) ( 8e8279d )
- Enable read_csv() to process other files ( #940 ) ( 3b35860 )
- Include the bigframes package version alongside the feedback link in error messages ( #936 ) ( 7b59b6d )
Bug Fixes
- Make `read_gbq_function` work for multi-param functions ( #947 ) ( c750be6 )
- Support `read_gbq_function` for axis=1 application ( #950 ) ( 86e54b1 )
Documentation
- Add docstring returns section to Options ( #937 ) ( a2640a2 )
- Update title of pypi notebook example to reflect use of the PyPI public dataset ( #952 ) ( cd62e60 )
1.16.0 (2024-09-04)
Features
- Add `DataFrame.struct.explode` to add struct subfields to a DataFrame ( #916 ) ( ad2f75e )
- Implement `bigframes.bigquery.json_extract_array` ( #910 ) ( 575a29e )
- Recover struct column from exploded Series ( #904 ) ( 7dd304c )
Bug Fixes
- Fix issue with iterating on >10gb dataframes ( #949 ) ( 2b0f0fa )
- Unordered mode errors in ml train_test_split ( #925 ) ( 85d7c21 )
Performance Improvements
Dependencies
Documentation
- Add Claude3 ML and RemoteFunc notebooks ( #930 ) ( cfd16c1 )
- Create sample notebook to manipulate struct and array data ( #883 ) ( 3031903 )
- Use unstack() from BigQuery DataFrames instead of pandas in the PyPI sample notebook ( #890 ) ( d1883cc )
1.15.0 (2024-08-20)
Features
Documentation
- Add columns for “requires ordering/index” to supported APIs summary ( #892 ) ( d2fc51a )
- Remove duplicate description for `kms_key_name` ( #898 ) ( 1053d56 )
1.14.0 (2024-08-14)
Features
Bug Fixes
Performance Improvements
Documentation
1.13.0 (2024-08-05)
Features
- `df.apply(axis=1)` to support remote function with multiple params ( #851 ) ( 2158818 )
- Allow windowing in ‘partial’ ordering mode ( #861 ) ( ca26fe5 )
- Create a separate OrderingModePartialPreviewWarning for more fine-grained warning filters ( #879 ) ( 8753bdd )
Bug Fixes
Documentation
1.12.0 (2024-07-31)
Features
- Add config option to set partial ordering mode ( #855 ) ( 823c0ce )
- Add stratify param support to ml.model_selection.train_test_split method ( #815 ) ( 27f8631 )
- Allow DataFrame.join for self-join on Null index ( #860 ) ( e950533 )
- Support remote function cleanup with `session.close` ( #818 ) ( ed06436 )
- Support to_csv/parquet/json to local files/objects ( #858 ) ( d0ab9cc )
Bug Fixes
- Fewer relation joins from df self-operations ( #823 ) ( 0d24f73 )
- Fix unordered mode using ordered path to print frame ( #839 ) ( 93785cb )
- Reduce redundant `remote_function` deployments ( #856 ) ( cbf2d42 )
Documentation
- Add partner attribution steps to integrations sample notebook ( #835 ) ( d7b333f )
- Make `get_global_session`/`close_session`/`reset_session` appear in the docs ( #847 ) ( 01d6bbb )
1.11.1 (2024-07-08)
Documentation
- Remove session and connection in llm notebook ( #821 ) ( 74170da )
- Remove the experimental flask icon from the public docs ( #820 ) ( 067ff17 )
1.11.0 (2024-07-01)
Features
- Add `bigframes.streaming.to_pubsub` method to create continuous query that writes to Pub/Sub ( #801 ) ( b47f32d )
- Add `DataFrame.to_arrow` to create Arrow Table from DataFrame ( #807 ) ( 1e3feda )
- Add `PolynomialFeatures` support to `to_gbq` and pipelines ( #805 ) ( 57d98b9 )
- Add Series.peek to preview data efficiently ( #727 ) ( 580e1b9 )
- Expose gcf memory param in `remote_function` ( #803 ) ( 014765c )
- More informative error when query plan too complex ( #811 ) ( 136dc24 )
Bug Fixes
Documentation
1.10.0 (2024-06-21)
Features
- Add ml.preprocessing.PolynomialFeatures class ( #793 ) ( b4fbb51 )
- bigframes.streaming module for continuous queries ( #703 ) ( 0433a1c )
- Include index columns in DataFrame.sql if they are named ( #788 ) ( c8d16c0 )
Bug Fixes
- Allow `__repr__` to work with uninitialized DataFrame/Series/Index ( #778 ) ( e14c7a9 )
- df.loc with the 2nd input as bigframes boolean Series ( #789 ) ( a4ac82e )
- Ensure numpy version matches in `remote_function` deployment ( #798 ) ( 324d93c )
- Fix temp table creation retries by now throwing if table already exists. ( #787 ) ( 0e57d1f )
- Self-join optimization doesn’t needlessly invalidate caching ( #797 ) ( 1b96b80 )
1.9.0 (2024-06-10)
Features
- Allow functions returned from `bpd.read_gbq_function` to execute outside of `apply` ( #706 ) ( ad7d8ac )
Bug Fixes
- ARIMAPlus loads auto_arima_min_order param ( #752 ) ( 39d7013 )
- Improve to_pandas_batches for large results ( #746 ) ( 61f18cb )
- Resolve issue with unset thread-local options ( #741 ) ( d93dbaf )
Documentation
1.8.0 (2024-05-31)
Features
- `merge` only generates a default index if both inputs already have an index ( #733 ) ( 25d049c )
- Add `GroupBy.size()` to get number of rows in each group ( #479 ) ( 1fca588 )
- Add slot_millis and add stats to session object ( #725 ) ( 72e9583 )
- Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings ( #731 ) ( f12c906 )
- Allow functions decorated with `bpd.remote_function()` to execute locally ( #704 ) ( d850da6 )
- Ensure `"bigframes-api"` label is always set on jobs, even if the API is unknown ( #722 ) ( 1832778 )
- Support type annotations to supply input and output types to `bpd.remote_function()` decorator ( #717 ) ( 4a12e3c )
- Support type annotations with `bpd.remote_function()` and `axis=1` (a preview feature) ( #730 ) ( e5a2992 )
Bug Fixes
- Correct index labels in multiple aggregations for DataFrameGroupBy ( #723 ) ( 6a78c89 )
- Set `bpd.remote_function()`'s `input_types` and `output_types` default to `None` to allow omitting them when type annotations are present ( #729 ) ( 0e25a3b )
- Warn and disable time travel for linked datasets ( #712 ) ( 085fa9d )
Performance Improvements
Documentation
1.7.0 (2024-05-20)
Features
- `read_gbq_query` supports `filters` ( 9386373 )
- `read_gbq` suggests a correct column name when one is not found ( 9386373 )
- Add `DefaultIndexKind.NULL` to use as `index_col` in `read_gbq*`, creating an indexless DataFrame/Series ( #662 ) ( 29e4886 )
- bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupby) ( #663 ) ( 412f28b )
- to_datetime supports utc=False for string inputs ( #579 ) ( adf9889 )
Bug Fixes
- `read_gbq_table` respects primary keys even when `filters` are set ( #689 ) ( 9386373 )
- Improve escaping of literals and identifiers ( #682 ) ( da9b136 )
- Properly identify non-unique index in tables without primary keys ( #699 ) ( 6e0f4d8 )
- Remove a usage of the `resource` package when not available, such as on Windows ( #681 ) ( 96243f2 )
- The imported samples error and use peek() ( #688 ) ( 1a0b744 )
Performance Improvements
- Don’t run query immediately from `read_gbq_table` if `filters` is set ( 9386373 )
- Use a `LIMIT` clause when `max_results` is set ( 9386373 )
Documentation
- Add code snippets for imported onnx tutorials ( #684 ) ( cb36e46 )
- Add code snippets for imported tensorflow model ( #679 ) ( b02c401 )
- Use `class_weight="balanced"` in the logistic regression prediction tutorial ( #678 ) ( b951549 )
1.6.0 (2024-05-13)
Features
- Add `strategy="quantile"` in KBinsDiscretizer ( #654 ) ( c6c487f )
- Suggest correct options in bpd.options.bigquery.location ( #666 ) ( 57ccabc )
- Support `axis=1` in `df.apply` for scalar outputs ( #629 ) ( f6bdc4a )
- Support gcf vpc connector in `remote_function` ( #677 ) ( 9ca92d0 )
- Warn with a more specific `DefaultLocationWarning` category when no location can be detected ( #648 ) ( e084e54 )
Bug Fixes
Dependencies
- Add jellyfish as a dependency for spelling correction ( 57ccabc )
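The 1.6.0 entries pair a "suggest correct options" feature with a new spelling-correction dependency. As a rough illustration of the idea only, the sketch below picks the closest known value for a mistyped location. It uses `difflib` from the Python standard library rather than jellyfish, and the location list and the `suggest_location` helper are hypothetical; none of this reflects how bigframes implements the feature.

```python
import difflib

# Hypothetical subset of location names, used only for illustration.
KNOWN_LOCATIONS = ["us-central1", "us-east4", "europe-west3", "asia-northeast1"]


def suggest_location(value, known=KNOWN_LOCATIONS):
    """Return the closest known location to `value`, or None if nothing is close."""
    # get_close_matches ranks candidates by similarity ratio and drops
    # anything scoring below the cutoff.
    matches = difflib.get_close_matches(value.lower(), known, n=1, cutoff=0.6)
    return matches[0] if matches else None


print(suggest_location("us-centrall1"))  # → us-central1
```

A string with no characters in common with any known location, such as `"zzz"`, falls below the cutoff and yields `None`, so a caller can decide whether to warn with or without a suggestion.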
Documentation
- Add code snippets for llm text generation ( #669 ) ( 93416ed )
- Document inlining of small data in `read_*` APIs ( #670 ) ( 306953a )
1.5.0 (2024-05-07)
Features
- `bigframes.options` and `bigframes.option_context` now use thread-local variables to prevent context managers in separate threads from affecting each other ( #652 ) ( 651fd7d )
- Add `ARIMAPlus.coef_` property exposing `ML.ARIMA_COEFFICIENTS` functionality ( #585 ) ( 81d1262 )
- Add a unique session_id to Session and allow cleaning up sessions ( #553 ) ( c8d4e23 )
- Add the `bigframes.bigquery` sub-package with a `bigframes.bigquery.array_length` function ( #630 ) ( 9963f85 )
- Always do a query dry run when `option.repr_mode == "deferred"` ( #652 ) ( 651fd7d )
- Custom query labels for compute options ( #638 ) ( f561799 )
- Warn with `DefaultIndexWarning` from `read_gbq` on clustered/partitioned tables with no `index_col` or `filters` set ( #631 , #658 ) ( 2715d2b , 73064dd )
- Support `index_col=False` in `read_csv` and `engine="bigquery"` ( 73064dd )
- Support gcf max instance count in `remote_function` ( #657 ) ( 36578ab )
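The thread-local options change in this release can be pictured with a small standalone sketch. It is not the bigframes implementation: `get_option`, `option_context`, and the `"repr_mode"` values here are hypothetical stand-ins built on `threading.local`, meant only to show why a context manager entered in one thread does not leak into another.

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-in for an options store; each thread sees its own attributes.
_local = threading.local()


def get_option(name, default=None):
    return getattr(_local, name, default)


@contextmanager
def option_context(name, value):
    """Temporarily set an option for the current thread only."""
    missing = object()
    previous = getattr(_local, name, missing)
    setattr(_local, name, value)
    try:
        yield
    finally:
        # Restore the previous state, including "never set".
        if previous is missing:
            delattr(_local, name)
        else:
            setattr(_local, name, previous)


def worker(results):
    # Runs in a second thread while the main thread is inside option_context.
    results["worker"] = get_option("repr_mode", "head")


def demo():
    results = {}
    with option_context("repr_mode", "deferred"):
        thread = threading.Thread(target=worker, args=(results,))
        thread.start()
        thread.join()
        results["main"] = get_option("repr_mode", "head")
    results["after"] = get_option("repr_mode", "head")
    return results


print(demo())  # → {'worker': 'head', 'main': 'deferred', 'after': 'head'}
```

With a plain module-level variable instead of `threading.local`, the worker would have observed `"deferred"`, which is the cross-thread interference the changelog entry describes.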
Bug Fixes
- Don’t raise UnknownLocationWarning for US or EU multi-regions ( #653 ) ( 8e4616b )
- Fix bug with na in the column labels in stack ( #659 ) ( 4a34293 )
- Use explicit session in `PaLM2TextGenerator` ( #651 ) ( e4f13c3 )
Documentation
- Add python code sample for multiple forecasting time series ( #531 ) ( 16866d2 )
- Fix the Palm2TextGenerator output token size ( #649 ) ( c67e501 )
1.4.0 (2024-04-29)
Features
- Add .cache() method to persist intermediate dataframe ( #626 ) ( a5c94ec )
- Add transpose support for small homogeneously typed DataFrames. ( #621 ) ( 054075d )
- Allow single input type in `remote_function` ( #641 ) ( 3aa643f )
- Expose gcf max timeout in `remote_function` ( #639 ) ( dfeaad0 )
- Series binary ops compatible with more types ( #618 ) ( 518d315 )
- Support the `score` method for `PaLM2TextGenerator` ( #634 ) ( 3ffc1d2 )
Bug Fixes
- Allow to_pandas to download more than 10GB ( #637 ) ( ce56495 )
- Extend row hash to 128 bits to guarantee unique row id ( #632 ) ( 9005c6e )
Performance Improvements
- Automatically condense internal expression representation ( #516 ) ( 03c1b0d )
- Cache transpose to allow performant retranspose ( #635 ) ( 44b738d )
Documentation
- Add supported pandas apis on the main page ( #628 ) ( 8d2a51c )
- Add the first sample for the Single time-series forecasting from Google Analytics data tutorial ( #623 ) ( 2b84c4f )
- Address more technical writers’ feedback ( #640 ) ( 1e7793c )
1.3.0 (2024-04-22)
Features
- Add fine tuning `fit()` for Palm2TextGenerator ( #616 ) ( 9c106bd )
- Expose `max_batching_rows` in `remote_function` ( #622 ) ( 240a1ac )
- Support primary key(s) in `read_gbq` by using as the `index_col` by default ( #625 ) ( 75bb240 )
- Warn if location is set to unknown location ( #609 ) ( 3706b4f )
Bug Fixes
- Infer narrowest numeric type when combining numeric columns ( #602 ) ( 8f9ece6 )
- Use exact median implementation by default ( #619 ) ( 9d205ae )
Documentation
- Fix rendering of examples for multiple apis ( #620 ) ( 9665e39 )
- Set `index_cols` in `read_gbq` as a best practice ( #624 ) ( 70015b7 )
1.2.0 (2024-04-15)
Features
Bug Fixes
- Address more technical writers feedback ( #581 ) ( 4b08d92 )
- Inverting int now does bitwise inversion rather than sign flip ( #574 ) ( 5f1db8b )
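The difference between the two behaviors named in the inversion fix is easy to see in plain Python, which this sketch uses as a stand-in; how bigframes compiles the operation for BigQuery is not shown here, and the helper names are illustrative only.

```python
def bitwise_invert(x: int) -> int:
    """Bitwise NOT on a two's-complement integer, as Python's ~ operator does."""
    return ~x


def sign_flip(x: int) -> int:
    """Arithmetic negation, the behavior the fix moved away from."""
    return -x


for value in (0, 5, -3):
    # ~x equals -x - 1 for any integer, so the two results always differ by one.
    assert bitwise_invert(value) == sign_flip(value) - 1

print(bitwise_invert(5), sign_flip(5))  # → -6 -5
```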
Documentation
- Add code samples for `str` accessor methods ( #594 ) ( a557ea2 )
- Add docs for `DataFrame` and `Series` dunder methods ( #562 ) ( 8fc26c4 )
1.1.0 (2024-04-04)
Features
- Add support for numpy expm1, log1p, floor, ceil, arctan2 ops ( #505 ) ( e8e66cf )
- Allow DataFrame binary ops to align on either axis and with loc… ( #544 ) ( 6d8f3af )
- Expose `DataFrame.bqclient` to assist in integrations ( #519 ) ( 0be8911 )
- read_pandas accepts pandas Series and Index objects ( #573 ) ( f8821fe )
- Support `ML.GENERATE_EMBEDDING` in `PaLM2TextEmbeddingGenerator` ( #539 ) ( 1156c1e )
- Support max_columns in repr and make repr more efficient ( #515 ) ( 54e49cf )
Bug Fixes
- Don’t download 100gb onto local python machine in load test ( #537 ) ( 082c58b )
- Exclude list-like s parameter in plot.scatter ( #568 ) ( 1caac27 )
- Fix case where df.peek would fail to execute even with force=True ( #511 ) ( 8eca99a )
- plot.scatter s parameter cannot accept float-like column ( #563 ) ( 8d39187 )
- Product operation produces float result for all input types ( #501 ) ( 6873b30 )
- Rename PaLM2TextEmbeddingGenerator.predict output columns to be backward compatible ( #561 ) ( 4995c00 )
- Respect hard stack size limit and swallow limit change exception. ( #558 ) ( 4833908 )
- Restore string to date/time type coercion ( #565 ) ( 4ae0262 )
- Sync the notebook with embedding changes ( #550 ) ( 347f2dd )
- Use bytes limit on frame inlining rather than element count ( #576 ) ( 659a161 )
Performance Improvements
Dependencies
Documentation
- `bigframes.options.bigquery.project` and `location` are optional in some circumstances ( #548 ) ( 90bcec5 )
- Add “Supported pandas APIs” reference to the documentation ( #542 ) ( 74c3915 )
- Add General Availability banner to README ( #507 ) ( 262ff59 )
- Add the code samples for metrics.{auc, roc_auc_score, roc_curve} ( #520 ) ( 5f37b09 )
- Address more comments from technical writers to meet legal purposes ( #571 ) ( 9084df3 )
- Migrate the overview page to Bigframes official landing page ( #536 ) ( a0fb8bb )
1.0.0 (2024-03-25)
⚠ BREAKING CHANGES
- rename model parameter `min_rel_progress` to `tol`
- `early_stop` setting no longer supported, always uses `True`
- rename model parameter `n_parallell_trees` to `n_estimators`
- rename `class_weights` to `class_weight`
- rename `learn_rate` to `learning_rate`
- PCA `n_components` supports float value and `None`, default to `None`
- rename various ml model parameters for consistency with sklearn ( https://github.com/googleapis/python-bigquery-dataframes/pull/491 )
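For code written against pre-1.0.0 parameter names, a small adapter can translate keyword arguments before they reach a model constructor. The sketch below is a hypothetical helper, not part of bigframes: `migrate_kwargs` and `RENAMED_PARAMS` are invented for illustration, and the old and new names are copied verbatim from the entries above, so they should be checked against the actual model signatures before use.

```python
# Old-to-new names copied from the breaking-change entries above.
RENAMED_PARAMS = {
    "min_rel_progress": "tol",
    "n_parallell_trees": "n_estimators",
    "class_weights": "class_weight",
    "learn_rate": "learning_rate",
}


def migrate_kwargs(kwargs):
    """Return a copy of `kwargs` with pre-1.0.0 parameter names renamed."""
    migrated = {}
    for name, value in kwargs.items():
        if name == "early_stop":
            # The changelog says this setting is no longer supported, so drop it.
            continue
        new_name = RENAMED_PARAMS.get(name, name)
        if new_name in migrated:
            raise ValueError(f"both old and new spellings given for {new_name!r}")
        migrated[new_name] = value
    return migrated


old_style = {"learn_rate": 0.1, "early_stop": True, "max_iterations": 20}
print(migrate_kwargs(old_style))  # → {'learning_rate': 0.1, 'max_iterations': 20}
```

Raising when both spellings are present avoids silently choosing one value over the other.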
Features
- Allow assigning directly to Series.name property ( #495 ) ( ad0e99e )
- Ensure `Series.str.len()` can get length of array columns ( #497 ) ( 10c0446 )
- Option to use bq connection without check ( #460 ) ( 0b3f8e5 )
- PCA `n_components` supports float value and `None`, default to `None` ( 65c6f47 )
- Rename `class_weights` to `class_weight` ( 65c6f47 )
- Rename `learn_rate` to `learning_rate` ( 65c6f47 )
- Rename model parameter `min_rel_progress` to `tol` ( 65c6f47 )
- Rename model parameter `n_parallell_trees` to `n_estimators` ( 65c6f47 )
- Rename various ml model parameters for consistency with sklearn ( https://github.com/googleapis/python-bigquery-dataframes/pull/491 ) ( 65c6f47 )
- Support BQ regional endpoints for europe-west9, europe-west3, us-east4, and us-west1 ( #504 ) ( fbada4a )
Bug Fixes
- `early_stop` setting no longer supported, always uses `True` ( 65c6f47 )
- plot.scatter `c` argument functionalities ( #494 ) ( d6ee994 )
- Properly support format param for numerical input. ( #486 ) ( ae20c35 )
- Re-enable to_csv and to_json related tests ( #468 ) ( 2b9a01d )
- Sampling plot cannot preserve ordering if index is not ordered ( #475 ) ( a5345fe )
- Use actual BigQuery types rather than ibis types in to_pandas ( #500 ) ( 82b4f91 )
Dependencies
Documentation
- Add code samples for metrics.{accuracy_score, confusion_matrix} ( #478 ) ( 3e3329a )
- Add code samples for metrics.{recall_score, precision_score, f1_score} ( #502 ) ( 370fe90 )
- Update bigquery connection documentation ( #499 ) ( 4bfe094 )
- Update LLM + K-means notebook to handle partial failures ( #496 ) ( 97afad9 )
0.26.0 (2024-03-20)
⚠ BREAKING CHANGES
- exclude remote models for .register() ( #465 )
Features
- `read_gbq_table` supports `LIKE` as an operator in `filters` ( #454 ) ( d2d425a )
- Set `force=True` by default in `DataFrame.peek()` ( #469 ) ( 4e8e97d )
- Support datetime related casting in (Series|DataFrame|Index).astype ( #442 ) ( fde339b )
Bug Fixes
- any() on empty set now correctly returns False ( #471 ) ( f55680c )
- Disable to_json and to_csv related tests ( #462 ) ( 874026d )
- Fix grouping series on multiple other series ( #455 ) ( 3971bd2 )
- Groupby aggregates no longer check if grouping keys are numeric ( #472 ) ( 4fbf938 )
- Raise `ValueError` when `read_pandas()` receives a bigframes `DataFrame` ( #447 ) ( b28f9fd )
- Series.(to_csv|to_json) leverages bq export ( #452 ) ( 718a00c )
- Warn when `read_gbq`/`read_gbq_table` uses the snapshot time cache ( #441 ) ( e16a8c0 )
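The empty-input result in the `any()` fix follows the usual convention that a reduction over nothing returns the identity of its operator. The sketch below shows that convention in plain Python; `any_of` and `all_of` are illustrative helpers, not bigframes functions.

```python
def any_of(values):
    """Logical OR over values; the empty case returns the OR identity, False."""
    result = False
    for value in values:
        result = result or bool(value)
    return result


def all_of(values):
    """Logical AND over values; the empty case returns the AND identity, True."""
    result = True
    for value in values:
        result = result and bool(value)
    return result


# Both helpers agree with Python's built-ins, including on empty input.
for sample in ([], [False], [False, True], [True, True]):
    assert any_of(sample) == any(sample)
    assert all_of(sample) == all(sample)

print(any_of([]), all_of([]))  # → False True
```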
Documentation
- Add code samples for `ml.metrics.r2_score` ( #459 ) ( 85fefa2 )
- Add version information to bug template ( #437 ) ( 91bd39e )
- Indicate that project and location are optional in example notebooks ( #451 ) ( 1df0140 )
0.25.0 (2024-03-14)
Features
- (Series|DataFrame).plot.(line|area|scatter) ( #431 ) ( 0772510 )
- Support CMEK for `remote_function` cloud functions ( #430 ) ( 2fd69f4 )
0.24.0 (2024-03-12)
⚠ BREAKING CHANGES
- `read_parquet` uses a “pandas” engine to parse files by default. Use `engine="bigquery"` for the previous behavior
Features
Bug Fixes
- Move `third_party.bigframes_vendored` to `bigframes_vendored` ( #424 ) ( 763edeb )
- Only do row identity based joins when joining by index ( #356 ) ( 76b252f )
Documentation
- Add predict sample to samples/snippets/bqml_getting_started_test.py ( #388 ) ( 6a3b0cc )
- Fix the note rendering for DataFrames methods: nlargest, nsmallest ( #417 ) ( 38bd2ba )
0.23.0 (2024-03-05)
Features
- Add ml.metrics.pairwise.euclidean_distance ( #397 ) ( 1726588 )
- Add TextEmbedding model version support ( #394 ) ( e0f1ab0 )
Bug Fixes
- Code exception in `remote_function` now prevents retry and surfaces in the client ( #387 ) ( dd3643d )
Dependencies
- Update ibis to version 8.0.0 and refactor `remote_function` to use ibis UDF method ( #277 ) ( 350499b )
Documentation
0.22.0 (2024-02-27)
⚠ BREAKING CHANGES
- rename cosine_similarity to paired_cosine_distances ( #393 )
- move model optional args to kwargs ( #381 )
Features
- Add ml.metrics.pairwise.manhattan_distance ( #392 ) ( 9d31865 )
- Enable regional endpoints for me-central2 ( #386 ) ( 469674d )
Bug Fixes
- Avoid ibis warning for “database” table() method argument ( #390 ) ( a0490a4 )
- Rename cosine_similarity to paired_cosine_distances ( #393 ) ( 81ece46 )
Performance Improvements
Dependencies
Documentation
- Add a code sample for creating a kmeans model ( #267 ) ( 4291d65 )
- Fix `bigframes.pandas.concat` documentation ( #382 ) ( 234b61c )
Miscellaneous Chores
Code Refactoring
0.21.0 (2024-02-13)
Features
- Add ml.metrics.pairwise.cosine_similarity function ( #374 ) ( 126f566 )
- Limited support of lambdas in `Series.apply` ( #345 ) ( 208e081 )
- Support bigframes.pandas.to_datetime for scalars, iterables and series. ( #372 ) ( ffb0d15 )
Bug Fixes
Documentation
0.20.1 (2024-02-06)
Performance Improvements
Documentation
0.20.0 (2024-01-30)
Features
- Add `DataFrame.peek()` as an efficient alternative to `head()` results preview ( #318 ) ( 9c34d83 )
- Add ARIMA_EVALUATE options in forecasting models ( #336 ) ( 73e997b )
- Add Index constructor, repr, copy, get_level_values, to_series ( #334 ) ( e5d054e )
- Improve error message for drive based BQ table reads ( #344 ) ( 0794788 )
- Update cut to work without labels = False and show intervals as dict ( #335 ) ( 4ff53db )
Bug Fixes
- Change default connection name in getting_started.ipynb ( #347 ) ( 677f014 )
- Series iteration correctly returns values instead of index ( #339 ) ( 2c6af9b )
Documentation
0.19.2 (2024-01-22)
Bug Fixes
Documentation
0.19.1 (2024-01-17)
Bug Fixes
Documentation
0.19.0 (2024-01-09)
Features
- Add ‘columns’ as an alias for ‘col_order’ ( #298 ) ( a01b271 )
- Add Series dt.tz and dt.unit properties ( #303 ) ( 2e1a403 )
- Allow manually set clustering_columns in dataframe.to_gbq ( #302 ) ( 9c21323 )
- Support assigning to columns like a property ( #304 ) ( f645c56 )
- Support upcasting numeric columns in concat ( #294 ) ( e3a056a )
Bug Fixes
Documentation
0.18.0 (2024-01-02)
Features
- Add IntervalIndex support to bigframes.pandas.cut ( #254 ) ( 6c1969a )
- Specific pyarrow mappings for decimal, bytes types ( #283 ) ( a1c0631 )
Bug Fixes
- Dataframes to_gbq now creates dataset if it doesn’t exist ( #222 ) ( bac62f7 )
- Exclude pandas 2.2.0rc0 to unblock prerelease tests ( #292 ) ( ac1a745 )
- Fix DataFrameGroupby.agg() issue with as_index=False ( #273 ) ( ab49350 )
- Make `Series.str.replace` work for simple strings ( #285 ) ( ad67465 )
- Update dataframe.to_gbq to dedup column names. ( #286 ) ( 746115d )
Dependencies
Documentation
- Add code snippets for explore query result page ( #278 ) ( 7cbbb7d )
- Code samples for `astype` common to DataFrame and Series ( #280 ) ( 95b673a )
- Code samples for `DataFrame.copy` and `Series.copy` ( #290 ) ( 7cbc2b0 )
- Code samples for `isna`, `isnull`, `dropna`, `isin` ( #289 ) ( ad51035 )
- Code samples for `reset_index` and `sort_values` ( #282 ) ( acc0eb7 )
- Code samples for `sample`, `get`, `Series.round` ( #295 ) ( c2b1892 )
- Code samples for `Series.{add, replace, unique, T, transpose}` ( #287 ) ( 0e1bbfc )
- Code samples for `Series.{map, to_list, count}` ( #290 ) ( 7cbc2b0 )
- Code samples for `Series.{name, std, agg}` ( #293 ) ( eb69f60 )
- Code samples for `Series.groupby` and `Series.{sum,mean,min,max}` ( #280 ) ( 95b673a )
- Code samples for DataFrame `set_index`, `items` ( #295 ) ( c2b1892 )
0.17.0 (2023-12-14)
Features
Bug Fixes
- Increase recursion limit, cache compilation tree hashes ( #184 ) ( b54791c )
- Replaced raise `NotImplementedError` with return `NotImplemented` ( #258 ) ( a133822 )
Documentation
- Add code samples for `values` and `value_counts` ( #249 ) ( f247d95 )
- Add sample for getting started with BQML ( #141 ) ( fb14f54 )
0.16.0 (2023-12-12)
Features
- Add DataFrame from_dict and from_records methods ( #244 ) ( 8d81e24 )
- Add nunique method to Series/DataFrameGroupby ( #256 ) ( c8ec245 )
- Support dataframe.loc with conditional columns selection ( #233 ) ( 3febea9 )
Bug Fixes
- Enforce pandas version requirement <2.1.4 ( #265 ) ( 9dd63f6 )
- Exclude pandas 2.1.4 from prerelease tests to unblock e2e tests ( b02fc2c )
- Fix value_counts column label for normalize=True ( #245 ) ( d3fa6f2 )
- Migrate e2e tests to bigframes-load-testing project ( 8766ac6 )
Documentation
- Add example for dataframe.melt, dataframe.pivot, dataframe.stac… ( #252 ) ( 8c63697 )
- Add example to dataframe.nlargest, dataframe.nsmallest, datafra… ( #234 ) ( e735412 )
- Add examples for dataframe.cummin, dataframe.cummax, dataframe.cumsum, dataframe.cumprod ( #243 ) ( 0523a31 )
- Add examples for dataframe.nunique, dataframe.diff, dataframe.a… ( #251 ) ( 77074ec )
- Correct the params rendering for `ml.remote` and `ml.ensemble` modules ( #248 ) ( c2829e3 )
- Fix return annotation in API docstrings ( #253 ) ( 89a1c67 )
0.15.0 (2023-11-29)
⚠ BREAKING CHANGES
- model.predict returns all the columns ( #204 )
Features
- Add info and memory_usage methods to dataframe ( #219 ) ( 9d6613d )
- Add the recent api method for ML component ( #225 ) ( ed8876d )
- Send warnings on LLM prediction partial failures ( #216 ) ( 81125f9 )
Bug Fixes
- Avoid unnecessary row_number() on sort key for io ( #211 ) ( a18d40e )
- Make to_pandas override enable_downsampling when sampling_method is manually set. ( #200 ) ( ae03756 )
- Update the llm+kmeans notebook with recent change ( #236 ) ( f8917ab )
- Use anonymous dataset to create `remote_function` ( #205 ) ( 69b016e )
Documentation
- Add code samples for `index` and `column` properties ( #212 ) ( c88d38e )
- Add code samples for df reshaping, function, merge, and join methods ( #203 ) ( 010486c )
- Add examples for dataframe.kurt, dataframe.std, dataframe.count ( #232 ) ( f9c6e72 )
- Add examples for dataframe.mean, dataframe.median, dataframe.va… ( #228 ) ( edd0522 )
- Add examples for dataframe.min, dataframe.max and dataframe.sum ( #227 ) ( 3a375e8 )
- Code samples for `Series.dot` and `DataFrame.dot` ( #226 ) ( b62a07a )
- Code samples for `Series.where` and `Series.mask` ( #217 ) ( 52dfad2 )
- Code samples for dataframe.any, dataframe.all and dataframe.prod ( #223 ) ( d7957fa )
- Make the code samples reflect default bq connection usage ( #206 ) ( 71844b0 )
Miscellaneous Chores
0.14.1 (2023-11-16)
Bug Fixes
Documentation
0.14.0 (2023-11-14)
Features
- Add ‘index’, ‘pad’, ‘nearest’ interpolate methods ( #162 ) ( 6a28403 )
- Add series.sample (identical to existing dataframe.sample) ( #187 ) ( 37914a4 )
- Log most recent API calls as `recent-bigframes-api-xx` labels on BigQuery jobs ( #145 ) ( 4ea33b7 )
- read_gbq creates order deterministically without table copy ( #191 ) ( 8ab81de )
- Support `date_series.astype("string[pyarrow]")` to cast DATE to STRING ( #186 ) ( aee0e8e )
- Temporary resources no longer use BigQuery Sessions ( #194 ) ( 4a02cac )
Bug Fixes
- Default to 7 days expiration for `read_csv`, `read_json`, `read_parquet` ( #193 ) ( 03606cd )
- Deprecate the `remote_service_type` in llm model ( #180 ) ( a8a409a )
- For reset_index on unnamed multiindex, always use level_[n] label ( #182 ) ( f95000d )
- Match pandas behavior when assigning listlike to empty dfs ( #172 ) ( c1d1f42 )
- Use anonymous dataset instead of session dataset for temp tables ( #181 ) ( 800d44e )
- Use random table when loading data for `read_csv`, `read_json`, `read_parquet` ( #175 ) ( 9d2e6dc )
Documentation
- Add code samples for `read_gbq_function` using community UDFs ( #188 ) ( 7506eab )
- Add docstring code samples for `Series.apply` and `DataFrame.map` ( #185 ) ( c816d84 )
- Add llm kmeans notebook as an included example ( #177 ) ( d49ae42 )
- Use `head()` to get top `n` results, not to preview results ( #190 ) ( 87f84c9 )
0.13.0 (2023-11-07)
Features
- `to_gbq` without a destination table writes to a temporary table ( #158 ) ( e1817c9 )
- Add `DataFrame.__iter__`, `DataFrame.iterrows`, `DataFrame.itertuples`, and `DataFrame.keys` methods ( #164 ) ( c065071 )
- Add interpolate() to series and dataframe ( #157 ) ( b9cb55c )
- Support 32k text-generation and multilingual embedding models ( #161 ) ( 5f0ea37 )
Bug Fixes
0.12.0 (2023-11-01)
Features
- Add `DataFrame.to_pandas_batches()` to download large `DataFrame` objects ( #136 ) ( 3afd4a3 )
- Add bigframes.options.compute.maximum_bytes_billed option that sets maximum bytes billed on query jobs ( #133 ) ( 63c7919 )
Bug Fixes
- Don’t override the global logging config ( #138 ) ( 2ddbf74 )
- Fix bug with column names under repeated column assignment ( #150 ) ( 29032d0 )
- Resolve plotly rendering issue by using ipython html for job pro… ( #134 ) ( 39df43e )
- Use indexee’s session for loc listlike cases ( #152 ) ( 27c5725 )
Documentation
- Fix indentation on `read_gbq_function` code sample ( #163 ) ( 0801d96 )
- Link to ML.EVALUATE BQML page for score() methods ( #137 ) ( 45c617f )
0.11.0 (2023-10-26)
Features
- Add back `reset_session` as an alias for `close_session` ( #124 ) ( 694a85a )
- Change `query` parameter to `query_or_table` in `read_gbq` ( #127 ) ( f9bb3c4 )
Bug Fixes
- Expose `bigframes.pandas.reset_session` as a public API ( #128 ) ( b17e1f4 )
- Use series’s own session in series.reindex listlike case ( #135 ) ( 95bff3f )
Documentation
- Add runnable code samples for DataFrames I/O methods and property ( #129 ) ( 6fea8ef )
- Add runnable code samples for reading methods ( #125 ) ( a669919 )
0.10.0 (2023-10-19)
Features
0.9.0 (2023-10-18)
⚠ BREAKING CHANGES
- rename `bigframes.pandas.reset_session` to `close_session` ( #101 )
Features
- Add `bigframes.options.bigquery.application_name` for partner attribution ( #117 ) ( 52d64ff )
- Rename `bigframes.pandas.reset_session` to `close_session` ( #101 ) ( 36693bf )
- Send BigQuery cancel request when canceling bigframes process ( #103 ) ( e325fbb )
- Support external packages in `remote_function` ( #98 ) ( ec10c4a )
- Use ArrowDtype for STRUCT columns in `to_pandas` ( #85 ) ( 9238fad )
Bug Fixes
Performance Improvements
Documentation
0.8.0 (2023-10-12)
⚠ BREAKING CHANGES
- The default behavior of `to_parquet` is changing from no compression to `'snappy'` compression.
Features
- Support compression in `to_parquet` ( a8c286f )
Bug Fixes
0.7.0 (2023-10-11)
Features
- Add aliases for several series properties ( #80 ) ( c0efec8 )
- Add iat and iloc accessing by tuples of integers ( #90 ) ( 228aeba )
Bug Fixes
Documentation
0.6.0 (2023-10-04)
Features
- Add update and align methods to dataframe ( #57 ) ( bf050cf )
- Support STRUCT data type with `Series.struct.field` to extract child fields ( #71 ) ( 17afac9 )
Bug Fixes
- Avoid `403 response too large to return` error with `read_gbq` and large query results ( #77 ) ( 8f3b5b2 )
- Change return type of `Series.loc[scalar]` ( #40 ) ( fff3d45 )
- Fix df/series.iloc by list with multiindex ( #79 ) ( 971d091 )
0.5.0 (2023-09-28)
Features
- Add `DataFrame.kurtosis` / `DF.kurt` method ( c1900c2 )
- Add `DataFrame.rolling` and `DataFrame.expanding` methods ( c1900c2 )
- Add `items`, `apply` methods to `DataFrame`. ( #43 ) ( 3adc1b3 )
- Add axis param to simple df aggregations ( #52 ) ( 9cf9972 )
- Add index `dtype`, `astype`, `drop`, `fillna`, aggregate attributes. ( #38 ) ( 1a254a4 )
- Support `calculate_p_values` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
- Support `class_weights="balanced"` in `LogisticRegression` model ( c1900c2 )
- Support `df[column_name] = df_only_one_column` ( c1900c2 )
- Support `early_stop` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
- Support `enable_global_explain` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
- Support `l2_reg` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
- Support `learn_rate_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
- Support `ls_init_learn_rate` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
- Support `max_iterations` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
- Support `min_rel_progress` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
- Support `optimize_strategy` parameter in `bigframes.ml.linear_model.LinearRegression` ( c1900c2 )
- Support casting string to integer or float ( #59 ) ( 3502f83 )
Bug Fixes
- Generate unique ids on join to avoid id collisions ( #65 ) ( 7ab65e8 )
- LabelEncoder params consistent with Sklearn ( #60 ) ( 632caec )
- Loosen filter items tests to accommodate shifting pandas impl ( #41 ) ( edabdbb )
Performance Improvements
- Add ability to cache dataframe and series to session table ( #51 ) ( 416d7cb )
- Inline small `Series` and `DataFrames` in query text ( #45 ) ( 5e199ec )
- Reimplement unpivot to use cross join rather than union ( #47 ) ( f9a93ce )
- Simplify join order to use multiple order keys instead of string. ( #36 ) ( 5056da6 )
Documentation
- Link to Remote Functions code samples from README and API reference ( c1900c2 )
0.4.0 (2023-09-16)
Features
- Add axis parameter to droplevel and reorder_levels ( 7c6b0dd )
- Add bfill and ffill to DataFrame and Series ( 7c6b0dd )
- Add DataFrame.combine and DataFrame.combine_first ( #27 ) ( 7c6b0dd )
- Add DataFrame.nlargest, nsmallest ( 7c6b0dd )
- Add DataFrame.pct_change and Series.pct_change ( 7c6b0dd )
- Add DataFrame.skew and GroupBy.skew ( 7c6b0dd )
- Add DataFrame.to_dict, to_excel, to_latex, to_records, to_string, to_markdown, to_pickle, to_orc ( 7c6b0dd )
- Add diff method to DataFrame and GroupBy ( 7c6b0dd )
- Add filter and reindex to Series and DataFrame ( 7c6b0dd )
- Add reindex_like to DataFrame and Series ( 7c6b0dd )
- Add swaplevel to DataFrame and Series ( 7c6b0dd )
- Add partial support for Series.replace ( 7c6b0dd )
- Support DataFrame.loc[bool_series, column] = scalar ( 7c6b0dd )
- Support a persistent name in remote_function ( 7c6b0dd )
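For readers unfamiliar with the pandas methods named above, the following plain-Python sketch illustrates the usual pandas semantics of ffill, bfill, and pct_change on a single column. It is an illustration only and does not use bigframes. The helper functions are hypothetical stand-ins, and `None` stands in for a missing value where pandas would use NaN or NA.

```python
def ffill(values):
    """Forward fill: replace each missing value with the most recent non-missing one."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def bfill(values):
    """Backward fill: replace each missing value with the next non-missing one."""
    return ffill(values[::-1])[::-1]


def pct_change(values):
    """Fractional change from the previous element; the first element has no predecessor."""
    result = [None]
    for previous, current in zip(values, values[1:]):
        result.append((current - previous) / previous)
    return result


series = [None, 2.0, None, 4.0, None]
forward = ffill(series)    # leading gap stays missing
backward = bfill(series)   # trailing gap stays missing
changes = pct_change([2.0, 3.0, 6.0])
```

A leading gap cannot be forward-filled and a trailing gap cannot be backward-filled, so those positions remain missing.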
Bug Fixes
- remote_function uses same credentials as other APIs ( 7c6b0dd )
- Add type hints to models ( 7c6b0dd )
- Raise error when ARIMAPlus is used with Pipeline ( 7c6b0dd )
- Remove transforms parameter in model.fit (breaking change) ( 7c6b0dd )
- Support column joins with “None indexer” ( 7c6b0dd )
- Use Int64Dtype for literals in cut ( 7c6b0dd )
- Use lowercase strings for parameter literals in bigframes.ml (breaking change) ( 7c6b0dd )
Performance Improvements
- bigframes-api label to I/O query jobs ( 7c6b0dd )
Documentation
- Document possible parameter values for PaLM2TextGenerator ( 7c6b0dd )
- Document region logic in README ( 7c6b0dd )
- Fix OneHotEncoder sample ( 7c6b0dd )
0.3.2 (2023-09-06)
Bug Fixes
0.3.1 (2023-09-05)
Bug Fixes
0.3.0 (2023-09-02)
Features
- Add bigframes.get_global_session() and bigframes.reset_session() aliases ( a32b747 )
- Add bigframes.pandas.read_pickle function ( a32b747 )
- Add components_, explained_variance_, and explained_variance_ratio_ properties to bigframes.ml.decomposition.PCA ( 89b9503 )
- Add fit_transform to bigquery.ml transformers ( a32b747 )
- Add Series.dropna and DataFrame.fillna ( 8fab755 )
- Add Series.str methods isalpha, isdigit, isdecimal, isalnum, isspace, islower, isupper, zfill, center ( a32b747 )
- Support bigframes.pandas.merge() ( 8fab755 )
- Support DataFrame.isin with list and dict inputs ( 8fab755 )
- Support DataFrame.pivot ( a32b747 )
- Support DataFrame.stack ( 89b9503 )
- Support DataFrame-DataFrame binary operations ( 8fab755 )
- Support df[my_column] = [a python list] ( 89b9503 )
- Support Index.is_monotonic ( 8fab755 )
- Support np.arcsin, np.arccos, np.arctan, np.sinh, np.cosh, np.tanh, np.arcsinh, np.arccosh, np.arctanh, np.exp with Series argument ( 89b9503 )
- Support np.sin, np.cos, np.tan, np.log, np.log10, np.sqrt, np.abs with Series argument ( 89b9503 )
- Support pow() and power operator in DataFrame and Series ( 8fab755 )
- Support read_json with engine=bigquery for newline-delimited JSON files ( 89b9503 )
- Support Series.corr ( 89b9503 )
- Support Series.map ( 8fab755 )
- Support for np.add, np.subtract, np.multiply, np.divide, np.power ( 8fab755 )
- Support MultiIndex for DataFrame columns ( a32b747 )
- Use pandas.Index for column labels ( a32b747 )
- Use default session and connection in ml.llm and ml.imported ( 8fab755 )
Bug Fixes
- Add error message to set_index ( a32b747 )
- Align column names with pandas in DataFrame.agg results ( 89b9503 )
- Allow (but still not recommended) ORDER BY in read_gbq input when an index_col is defined ( 89b9503 )
- Check for IAM role on the BigQuery connection when initializing a remote_function ( 89b9503 )
- Check that types are specified in read_gbq_function ( a32b747 )
- Don’t use query cache for Session construction ( a32b747 )
- Include survey link in abstract NotImplementedError exception messages ( 89b9503 )
- Label temp table creation jobs with source=bigquery-dataframes-temp label ( 89b9503 )
- Make X_train argument names consistent across methods ( 8fab755 )
- Raise AttributeError for unimplemented pandas methods ( 89b9503 )
- Raise exception for invalid function in read_gbq_function ( a32b747 )
- Support spaces in column names in DataFrame initializer ( 89b9503 )
Performance Improvements
- Add local cache for __repr_*__ methods ( a32b747 )
- Lazily instantiate client library objects ( 89b9503 )
- Use row_number() filter for head/tail ( 8fab755 )
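The row_number() entry above can be pictured with a plain-Python sketch. It is an assumption-laden illustration, not the SQL that bigframes generates: the helpers `with_row_numbers`, `head`, and `tail` are hypothetical, and sorting a list stands in for `ROW_NUMBER() OVER (ORDER BY ...)`. The idea is that once each row carries a 1-based position, head and tail reduce to a simple filter on that position.

```python
def with_row_numbers(rows, order_key):
    """Attach a 1-based row number following the given ordering."""
    ordered = sorted(rows, key=order_key)
    return list(enumerate(ordered, start=1))


def head(rows, n, order_key):
    """Keep the rows whose row number is at most n."""
    return [row for number, row in with_row_numbers(rows, order_key) if number <= n]


def tail(rows, n, order_key):
    """Keep the rows whose row number falls within the last n positions."""
    numbered = with_row_numbers(rows, order_key)
    cutoff = len(numbered) - n
    return [row for number, row in numbered if number > cutoff]


# Hypothetical sample data, deliberately out of order.
data = [{"k": 3}, {"k": 1}, {"k": 2}, {"k": 5}, {"k": 4}]
first_two = head(data, 2, order_key=lambda row: row["k"])
last_two = tail(data, 2, order_key=lambda row: row["k"])
```

Asking for more rows than exist returns every row, because the filter condition then holds for all positions.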
Documentation
- Add ML section under Overview ( a32b747 )
- Add release status to table of contents ( a32b747 )
- Add samples and best practices to read_gbq docs ( a32b747 )
- Correct the return types of DataFrame and Series ( a32b747 )
- Create subfolders for notebooks ( a32b747 )
- Fix link to GitHub ( 89b9503 )
- Highlight that bigframes is open-source ( a32b747 )
- Sample ML Drug Name Generation notebook ( a32b747 )
- Set options.bigquery.project in sample code ( 89b9503 )
- Transform remote function user guide into sample code ( a32b747 )
- Update remote function notebook with read_gbq_function usage ( 8fab755 )
0.2.0 (2023-08-17)
Features
- Add KMeans.cluster_centers_.
- Allow column labels to be any type handled by BigQuery DataFrames; column labels can now be integers.
- Add DataFrameGroupBy.agg().
- Add Series properties is_monotonic_increasing and is_monotonic_decreasing.
- Add match, fullmatch, get, pad str methods.
- Add Series.isin function.
Bug Fixes
- Update ML package to use sessions for queries.
- Optimize read_gbq with index_col set to cluster by index_col.
- Raise ValueError if the location is mismatched.
- read_gbq no longer uses ‘time travel’ with query inputs.
Documentation
- Add docstring to _uniform_sampling to discourage users from calling it.
0.1.1 (2023-08-14)
Documentation
- Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.
0.1.0 (2023-08-11)
Features
- Add bigframes.pandas package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and Cloud Storage), Parquet (local and Cloud Storage), and more.
- Add bigframes.ml package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.
0.0.0 (2023-02-22)
- Empty package to reserve package name.