Changelog
0.16.0 (2023-12-12)
Features
-
Add DataFrame from_dict and from_records methods ( #244 ) ( 8d81e24 )
-
Add nunique method to Series/DataFrameGroupby ( #256 ) ( c8ec245 )
-
Support dataframe.loc with conditional columns selection ( #233 ) ( 3febea9 )
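As a rough illustration of the three features above, the sketch below uses plain pandas so that it runs without a BigQuery project. It assumes the bigframes methods mirror the pandas methods of the same name; the sample data and column names are made up, and reading "conditional columns selection" as a boolean row mask combined with a column list is also an assumption.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
from_dict_df = pd.DataFrame.from_dict({"team": ["a", "a", "b"], "score": [1, 1, 3]})
from_records_df = pd.DataFrame.from_records(
    [("a", 1), ("a", 1), ("b", 3)], columns=["team", "score"]
)

# Number of distinct scores within each team.
distinct_scores = from_dict_df.groupby("team")["score"].nunique()

# Boolean row mask combined with a column list in .loc.
high_scores = from_dict_df.loc[from_dict_df["score"] > 1, ["team"]]
```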
Bug Fixes
-
Enforce pandas version requirement <2.1.4 ( #265 ) ( 9dd63f6 )
-
Exclude pandas 2.1.4 from prerelease tests to unblock e2e tests ( b02fc2c )
-
Fix value_counts column label for normalize=True ( #245 ) ( d3fa6f2 )
-
Migrate e2e tests to bigframes-load-testing project ( 8766ac6 )
Documentation
-
Add example for dataframe.melt, dataframe.pivot, dataframe.stac… ( #252 ) ( 8c63697 )
-
Add example to dataframe.nlargest, dataframe.nsmallest, datafra… ( #234 ) ( e735412 )
-
Add examples for dataframe.cummin, dataframe.cummax, dataframe.cumsum, dataframe.cumprod ( #243 ) ( 0523a31 )
-
Add examples for dataframe.nunique, dataframe.diff, dataframe.a… ( #251 ) ( 77074ec )
-
Correct the params rendering for ml.remote and ml.ensemble modules ( #248 ) ( c2829e3 )
-
Fix return annotation in API docstrings ( #253 ) ( 89a1c67 )
0.15.0 (2023-11-29)
⚠ BREAKING CHANGES
- model.predict returns all the columns ( #204 )
Features
-
Add info and memory_usage methods to dataframe ( #219 ) ( 9d6613d )
-
Add the recent api method for ML component ( #225 ) ( ed8876d )
-
Send warnings on LLM prediction partial failures ( #216 ) ( 81125f9 )
Bug Fixes
-
Avoid unnecessary row_number() on sort key for io ( #211 ) ( a18d40e )
-
Make to_pandas override enable_downsampling when sampling_method is manually set. ( #200 ) ( ae03756 )
-
Update the llm+kmeans notebook with recent change ( #236 ) ( f8917ab )
-
Use anonymous dataset to create remote_function ( #205 ) ( 69b016e )
Documentation
-
Add code samples for index and column properties ( #212 ) ( c88d38e )
-
Add code samples for df reshaping, function, merge, and join methods ( #203 ) ( 010486c )
-
Add examples for dataframe.kurt, dataframe.std, dataframe.count ( #232 ) ( f9c6e72 )
-
Add examples for dataframe.mean, dataframe.median, dataframe.va… ( #228 ) ( edd0522 )
-
Add examples for dataframe.min, dataframe.max and dataframe.sum ( #227 ) ( 3a375e8 )
-
Code samples for Series.dot and DataFrame.dot ( #226 ) ( b62a07a )
-
Code samples for Series.where and Series.mask ( #217 ) ( 52dfad2 )
-
Code samples for dataframe.any, dataframe.all and dataframe.prod ( #223 ) ( d7957fa )
-
Make the code samples reflect default bq connection usage ( #206 ) ( 71844b0 )
Miscellaneous Chores
0.14.1 (2023-11-16)
Bug Fixes
Documentation
0.14.0 (2023-11-14)
Features
-
Add ‘index’, ‘pad’, ‘nearest’ interpolate methods ( #162 ) ( 6a28403 )
-
Add series.sample (identical to existing dataframe.sample) ( #187 ) ( 37914a4 )
-
Log most recent API calls as recent-bigframes-api-xx labels on BigQuery jobs ( #145 ) ( 4ea33b7 )
-
Read_gbq creates order deterministically without table copy ( #191 ) ( 8ab81de )
-
Support date_series.astype("string[pyarrow]") to cast DATE to STRING ( #186 ) ( aee0e8e )
-
Temporary resources no longer use BigQuery Sessions ( #194 ) ( 4a02cac )
Bug Fixes
-
Default to 7 days expiration for read_csv, read_json, read_parquet ( #193 ) ( 03606cd )
-
Deprecate the remote_service_type in llm model ( #180 ) ( a8a409a )
-
For reset_index on unnamed multiindex, always use level_[n] label ( #182 ) ( f95000d )
-
Match pandas behavior when assigning listlike to empty dfs ( #172 ) ( c1d1f42 )
-
Use anonymous dataset instead of session dataset for temp tables ( #181 ) ( 800d44e )
-
Use random table when loading data for read_csv, read_json, read_parquet ( #175 ) ( 9d2e6dc )
Documentation
-
Add code samples for read_gbq_function using community UDFs ( #188 ) ( 7506eab )
-
Add docstring code samples for Series.apply and DataFrame.map ( #185 ) ( c816d84 )
-
Add llm kmeans notebook as an included example ( #177 ) ( d49ae42 )
-
Use head() to get top n results, not to preview results ( #190 ) ( 87f84c9 )
0.13.0 (2023-11-07)
Features
-
to_gbq without a destination table writes to a temporary table ( #158 ) ( e1817c9 )
-
Add DataFrame.__iter__, DataFrame.iterrows, DataFrame.itertuples, and DataFrame.keys methods ( #164 ) ( c065071 )
-
Add interpolate() to series and dataframe ( #157 ) ( b9cb55c )
-
Support 32k text-generation and multilingual embedding models ( #161 ) ( 5f0ea37 )
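The iteration helpers and interpolate() listed above can be sketched with plain pandas, on the assumption that the bigframes versions follow the pandas semantics; the frame below is invented for illustration and nothing here touches BigQuery.

```python
import pandas as pd

# Hypothetical data with one missing value in column "x".
df = pd.DataFrame({"x": [1.0, None, 3.0], "y": [10, 20, 30]})

column_labels = list(df.keys())      # keys() returns the column labels
iterated_labels = list(df)           # iterating a DataFrame yields the same labels
rows = [(idx, row["y"]) for idx, row in df.iterrows()]   # (index, Series) pairs
tuples = [(t.Index, t.y) for t in df.itertuples()]       # namedtuples per row

# Default linear interpolation fills the gap between 1.0 and 3.0.
filled = df["x"].interpolate()
```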
Bug Fixes
0.12.0 (2023-11-01)
Features
-
Add DataFrame.to_pandas_batches() to download large DataFrame objects ( #136 ) ( 3afd4a3 )
-
Add bigframes.options.compute.maximum_bytes_billed option that sets maximum bytes billed on query jobs ( #133 ) ( 63c7919 )
Bug Fixes
-
Don’t override the global logging config ( #138 ) ( 2ddbf74 )
-
Fix bug with column names under repeated column assignment ( #150 ) ( 29032d0 )
-
Resolve plotly rendering issue by using ipython html for job pro… ( #134 ) ( 39df43e )
-
Use indexee’s session for loc listlike cases ( #152 ) ( 27c5725 )
Documentation
-
Fix indentation on read_gbq_function code sample ( #163 ) ( 0801d96 )
-
Link to ML.EVALUATE BQML page for score() methods ( #137 ) ( 45c617f )
0.11.0 (2023-10-26)
Features
-
Add back reset_session as an alias for close_session ( #124 ) ( 694a85a )
-
Change query parameter to query_or_table in read_gbq ( #127 ) ( f9bb3c4 )
Bug Fixes
-
Expose bigframes.pandas.reset_session as a public API ( #128 ) ( b17e1f4 )
-
Use series’s own session in series.reindex listlike case ( #135 ) ( 95bff3f )
Documentation
-
Add runnable code samples for DataFrames I/O methods and property ( #129 ) ( 6fea8ef )
-
Add runnable code samples for reading methods ( #125 ) ( a669919 )
0.10.0 (2023-10-19)
Features
0.9.0 (2023-10-18)
⚠ BREAKING CHANGES
- Rename bigframes.pandas.reset_session to close_session ( #101 )
Features
-
Add bigframes.options.bigquery.application_name for partner attribution ( #117 ) ( 52d64ff )
-
Rename bigframes.pandas.reset_session to close_session ( #101 ) ( 36693bf )
-
Send BigQuery cancel request when canceling bigframes process ( #103 ) ( e325fbb )
-
Support external packages in remote_function ( #98 ) ( ec10c4a )
-
Use ArrowDtype for STRUCT columns in to_pandas ( #85 ) ( 9238fad )
Bug Fixes
Performance Improvements
Documentation
0.8.0 (2023-10-12)
⚠ BREAKING CHANGES
- The default behavior of to_parquet is changing from no compression to 'snappy' compression.
Features
- Support compression in to_parquet ( a8c286f )
Bug Fixes
0.7.0 (2023-10-11)
Features
-
Add aliases for several series properties ( #80 ) ( c0efec8 )
-
Add iat and iloc accessing by tuples of integers ( #90 ) ( 228aeba )
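A minimal pandas sketch of what "accessing by tuples of integers" looks like, assuming bigframes follows the pandas convention of (row position, column position); the frame is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# A (row, column) pair of integer positions selects a single cell.
cell_via_iat = df.iat[1, 0]     # second row, first column
cell_via_iloc = df.iloc[1, 0]   # same cell through iloc
```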
Bug Fixes
Documentation
0.6.0 (2023-10-04)
Features
-
Add update and align methods to dataframe ( #57 ) ( bf050cf )
-
Support STRUCT data type with Series.struct.field to extract child fields ( #71 ) ( 17afac9 )
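For the update and align methods, the following pandas sketch shows the behavior these names have in pandas, which bigframes is assumed to follow here. The frames are invented, and the STRUCT accessor is left out because it depends on BigQuery types.

```python
import pandas as pd

left = pd.DataFrame({"a": [1.0, 2.0]}, index=[0, 1])
right = pd.DataFrame({"a": [20.0]}, index=[1])

# align: reindex both frames onto the union of their row indexes.
left_aligned, right_aligned = left.align(right, join="outer", axis=0)

# update: overwrite values in place with the non-NA values of another frame.
updated = left.copy()
updated.update(right)
```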
Bug Fixes
-
Avoid 403 response too large to return error with read_gbq and large query results ( #77 ) ( 8f3b5b2 )
-
Change return type of Series.loc[scalar] ( #40 ) ( fff3d45 )
-
Fix df/series.iloc by list with multiindex ( #79 ) ( 971d091 )
0.5.0 (2023-09-28)
Features
-
Add DataFrame.kurtosis / DF.kurt method ( c1900c2 )
-
Add DataFrame.rolling and DataFrame.expanding methods ( c1900c2 )
-
Add items, apply methods to DataFrame. ( #43 ) ( 3adc1b3 )
-
Add axis param to simple df aggregations ( #52 ) ( 9cf9972 )
-
Add index dtype, astype, drop, fillna, aggregate attributes. ( #38 ) ( 1a254a4 )
-
Support calculate_p_values parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
-
Support class_weights="balanced" in LogisticRegression model ( c1900c2 )
-
Support df[column_name] = df_only_one_column ( c1900c2 )
-
Support early_stop parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
-
Support enable_global_explain parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
-
Support l2_reg parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
-
Support learn_rate_strategy parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
-
Support ls_init_learn_rate parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
-
Support max_iterations parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
-
Support min_rel_progress parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
-
Support optimize_strategy parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )
-
Support casting string to integer or float ( #59 ) ( 3502f83 )
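A few of the DataFrame features above, sketched in plain pandas under the assumption that bigframes matches pandas behavior for these calls. The data is hypothetical, and the LinearRegression parameters are not shown because they require a BigQuery ML session.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

rolling_sum = df.rolling(window=2).sum()   # first row is NaN: the window is not full yet
expanding_max = df.expanding().max()       # running maximum over all rows seen so far
row_totals = df.sum(axis=1)                # axis=1 aggregates across columns, per row

# Cast a string column to integers.
as_int = pd.Series(["1", "2"]).astype("int64")
```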
Bug Fixes
-
Generate unique ids on join to avoid id collisions ( #65 ) ( 7ab65e8 )
-
LabelEncoder params consistent with Sklearn ( #60 ) ( 632caec )
-
Loosen filter items tests to accommodate shifting pandas impl ( #41 ) ( edabdbb )
Performance Improvements
-
Add ability to cache dataframe and series to session table ( #51 ) ( 416d7cb )
-
Inline small Series and DataFrames in query text ( #45 ) ( 5e199ec )
-
Reimplement unpivot to use cross join rather than union ( #47 ) ( f9a93ce )
-
Simplify join order to use multiple order keys instead of string. ( #36 ) ( 5056da6 )
Documentation
- Link to Remote Functions code samples from README and API reference ( c1900c2 )
0.4.0 (2023-09-16)
Features
-
Add axis parameter to droplevel and reorder_levels ( 7c6b0dd )
-
Add bfill and ffill to DataFrame and Series ( 7c6b0dd )
-
Add DataFrame.combine and DataFrame.combine_first ( #27 ) ( 7c6b0dd )
-
Add DataFrame.nlargest, nsmallest ( 7c6b0dd )
-
Add DataFrame.pct_change and Series.pct_change ( 7c6b0dd )
-
Add DataFrame.skew and GroupBy.skew ( 7c6b0dd )
-
Add DataFrame.to_dict, to_excel, to_latex, to_records, to_string, to_markdown, to_pickle, to_orc ( 7c6b0dd )
-
Add diff method to DataFrame and GroupBy ( 7c6b0dd )
-
Add filter and reindex to Series and DataFrame ( 7c6b0dd )
-
Add reindex_like to DataFrame and Series ( 7c6b0dd )
-
Add swaplevel to DataFrame and Series ( 7c6b0dd )
-
Add partial support for Series.replace ( 7c6b0dd )
-
Support DataFrame.loc[bool_series, column] = scalar ( 7c6b0dd )
-
Support a persistent name in remote_function ( 7c6b0dd )
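Several of the methods added in this release have direct pandas counterparts. The sketch below runs against plain pandas and assumes the bigframes versions behave the same way; all values are made up.

```python
import pandas as pd

s = pd.Series([1.0, None, 4.0])
forward = s.ffill()     # carry the last valid value forward
backward = s.bfill()    # pull the next valid value backward

df = pd.DataFrame({"v": [10.0, 20.0, 15.0]})
top_two = df.nlargest(2, "v")    # rows with the two largest values of "v"
change = df["v"].diff()          # difference from the previous row
pct = df["v"].pct_change()       # relative change from the previous row

# combine_first: keep values from the caller, fall back to `other` where missing.
other = pd.DataFrame({"v": [9.0, 9.0, 9.0]})
patched = pd.DataFrame({"v": s}).combine_first(other)
```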
Bug Fixes
-
remote_function uses same credentials as other APIs ( 7c6b0dd )
-
Add type hints to models ( 7c6b0dd )
-
Raise error when ARIMAPlus is used with Pipeline ( 7c6b0dd )
-
Remove transforms parameter in model.fit (breaking change) ( 7c6b0dd )
-
Support column joins with “None indexer” ( 7c6b0dd )
-
Use Int64Dtype for literals in cut ( 7c6b0dd )
-
Use lowercase strings for parameter literals in bigframes.ml (breaking change) ( 7c6b0dd )
Performance Improvements
-
bigframes-api label to I/O query jobs ( 7c6b0dd )
Documentation
-
Document possible parameter values for PaLM2TextGenerator ( 7c6b0dd )
-
Document region logic in README ( 7c6b0dd )
-
Fix OneHotEncoder sample ( 7c6b0dd )
0.3.2 (2023-09-06)
Bug Fixes
0.3.1 (2023-09-05)
Bug Fixes
0.3.0 (2023-09-02)
Features
-
Add bigframes.get_global_session() and bigframes.reset_session() aliases ( a32b747 )
-
Add bigframes.pandas.read_pickle function ( a32b747 )
-
Add components_, explained_variance_, and explained_variance_ratio_ properties to bigframes.ml.decomposition.PCA ( 89b9503 )
-
Add fit_transform to bigquery.ml transformers ( a32b747 )
-
Add Series.dropna and DataFrame.fillna ( 8fab755 )
-
Add Series.str methods isalpha, isdigit, isdecimal, isalnum, isspace, islower, isupper, zfill, center ( a32b747 )
-
Support bigframes.pandas.merge() ( 8fab755 )
-
Support DataFrame.isin with list and dict inputs ( 8fab755 )
-
Support DataFrame.pivot ( a32b747 )
-
Support DataFrame.stack ( 89b9503 )
-
Support DataFrame-DataFrame binary operations ( 8fab755 )
-
Support df[my_column] = [a python list] ( 89b9503 )
-
Support Index.is_monotonic ( 8fab755 )
-
Support np.arcsin, np.arccos, np.arctan, np.sinh, np.cosh, np.tanh, np.arcsinh, np.arccosh, np.arctanh, np.exp with Series argument ( 89b9503 )
-
Support np.sin, np.cos, np.tan, np.log, np.log10, np.sqrt, np.abs with Series argument ( 89b9503 )
-
Support pow() and power operator in DataFrame and Series ( 8fab755 )
-
Support read_json with engine=bigquery for newline-delimited JSON files ( 89b9503 )
-
Support Series.corr ( 89b9503 )
-
Support Series.map ( 8fab755 )
-
Support for np.add, np.subtract, np.multiply, np.divide, np.power ( 8fab755 )
-
Support MultiIndex for DataFrame columns ( a32b747 )
-
Use pandas.Index for column labels ( a32b747 )
-
Use default session and connection in ml.llm and ml.imported ( 8fab755 )
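To give a feel for the NumPy, merge, and isin support listed above, here is a small sketch written against plain pandas and NumPy. It assumes bigframes Series and DataFrame objects accept the same calls; the inputs are invented for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 4.0, 9.0])
roots = np.sqrt(s)       # a NumPy ufunc applied to a Series returns a Series
summed = np.add(s, s)    # binary ufunc with two Series arguments
squared = s ** 2         # power operator; s.pow(2) is equivalent

left = pd.DataFrame({"key": [1, 2], "l": ["a", "b"]})
right = pd.DataFrame({"key": [2, 3], "r": ["x", "y"]})
merged = pd.merge(left, right, on="key", how="inner")   # only key 2 is in both

# isin with a dict input tests membership per column.
mask = left.isin({"key": [1]})
```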
Bug Fixes
-
Add error message to set_index ( a32b747 )
-
Align column names with pandas in DataFrame.agg results ( 89b9503 )
-
Allow (but still not recommended) ORDER BY in read_gbq input when an index_col is defined ( 89b9503 )
-
Check for IAM role on the BigQuery connection when initializing a remote_function ( 89b9503 )
-
Check that types are specified in read_gbq_function ( a32b747 )
-
Don’t use query cache for Session construction ( a32b747 )
-
Include survey link in abstract NotImplementedError exception messages ( 89b9503 )
-
Label temp table creation jobs with source=bigquery-dataframes-temp label ( 89b9503 )
-
Make X_train argument names consistent across methods ( 8fab755 )
-
Raise AttributeError for unimplemented pandas methods ( 89b9503 )
-
Raise exception for invalid function in read_gbq_function ( a32b747 )
-
Support spaces in column names in DataFrame initializer ( 89b9503 )
Performance Improvements
-
Add local cache for __repr_*__ methods ( a32b747 )
-
Lazily instantiate client library objects ( 89b9503 )
-
Use row_number() filter for head / tail ( 8fab755 )
Documentation
-
Add ML section under Overview ( a32b747 )
-
Add release status to table of contents ( a32b747 )
-
Add samples and best practices to read_gbq docs ( a32b747 )
-
Correct the return types of Dataframe and Series ( a32b747 )
-
Create subfolders for notebooks ( a32b747 )
-
Fix link to GitHub ( 89b9503 )
-
Highlight bigframes is open-source ( a32b747 )
-
Sample ML Drug Name Generation notebook ( a32b747 )
-
Set options.bigquery.project in sample code ( 89b9503 )
-
Transform remote function user guide into sample code ( a32b747 )
-
Update remote function notebook with read_gbq_function usage ( 8fab755 )
0.2.0 (2023-08-17)
Features
-
Add KMeans.cluster_centers_.
-
Allow column labels to be any type handled by bq df, column labels can be integers now.
-
Add dataframegroupby.agg().
-
Add Series Property is_monotonic_increasing and is_monotonic_decreasing.
-
Add match, fullmatch, get, pad str methods.
-
Add series isin function.
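The Series additions in this release can be illustrated with their pandas equivalents. This is a sketch that assumes the bigframes behavior matches pandas, using made-up values.

```python
import pandas as pd

numbers = pd.Series([1, 2, 2, 5])
increasing = numbers.is_monotonic_increasing   # non-strict: repeated values are allowed
decreasing = numbers.is_monotonic_decreasing
members = numbers.isin([2, 5])                 # element-wise membership test

words = pd.Series(["cat", "catalog"])
starts_with_cat = words.str.match("cat")       # pattern anchored at the start only
exactly_cat = words.str.fullmatch("cat")       # pattern must match the whole string
third_char = words.str.get(2)                  # character at position 2
padded = words.str.pad(5, side="left", fillchar="*")   # pad to a minimum width of 5
```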
Bug Fixes
-
Update ML package to use sessions for queries.
-
Optimize read_gbq with index_col set to cluster by index_col.
-
Raise ValueError if the location mismatched.
-
read_gbq no longer uses ‘time travel’ with query inputs.
Documentation
- Add docstring to _uniform_sampling to discourage users from calling it.
0.1.1 (2023-08-14)
Documentation
- Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.
0.1.0 (2023-08-11)
Features
-
Add bigframes.pandas package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.
-
Add bigframes.ml package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.
0.0.0 (2023-02-22)
- Empty package to reserve package name.