Changelog

PyPI History

0.22.0 (2024-02-27)

⚠ BREAKING CHANGES

  • rename cosine_similarity to paired_cosine_distances ( #393 )

  • move model optional args to kwargs ( #381 )
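
For anyone migrating across the cosine_similarity rename listed above: a cosine distance is conventionally one minus the cosine similarity, so the two quantities are not interchangeable. The sketch below is plain Python, not the bigframes implementation; both helper functions are hypothetical and only illustrate the row-wise ("paired") relationship.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors (illustrative helper)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def paired_cosine_distances(xs, ys):
    """Distance for each pair (xs[i], ys[i]): one minus the cosine similarity."""
    return [1.0 - cosine_similarity(x, y) for x, y in zip(xs, ys)]


xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
ys = [[0.0, 1.0], [2.0, 2.0], [-1.0, 0.0]]

# Orthogonal pair -> about 1, parallel pair -> about 0, opposite pair -> about 2.
distances = paired_cosine_distances(xs, ys)
```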

Features

  • Add DataFrame.corr() method ( #379 ) ( 67fd434 )

  • Add ml.metrics.pairwise.manhattan_distance ( #392 ) ( 9d31865 )

  • Enable regional endpoints for me-central2 ( #386 ) ( 469674d )
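
As background for the manhattan_distance entry above, the Manhattan (L1) distance between two vectors is the sum of the absolute differences of their coordinates. The following is a minimal plain-Python sketch of that definition applied pair by pair; the function names are hypothetical and this is not the bigframes code.

```python
def manhattan_distance(x, y):
    """Sum of absolute coordinate differences between two vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y))


def paired_manhattan(xs, ys):
    """Apply manhattan_distance to each pair (xs[i], ys[i])."""
    return [manhattan_distance(x, y) for x, y in zip(xs, ys)]


# |1-4| + |2-6| = 7 and |0-(-3)| + |0-3| = 6
result = paired_manhattan([[1, 2], [0, 0]], [[4, 6], [-3, 3]])
```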

Bug Fixes

  • Avoid ibis warning for “database” table() method argument ( #390 ) ( a0490a4 )

  • Correct the numeric literal dtype ( #365 ) ( 93b02cd )

  • Rename cosine_similarity to paired_cosine_distances ( #393 ) ( 81ece46 )

Performance Improvements

  • Inline read_pandas for small data ( #383 ) ( 59b446b )

Dependencies

  • Add minimum version constraint for sqlglot to 19.9.0 ( #389 ) ( 8b62d77 )

Documentation

  • Add a code sample for creating a kmeans model ( #267 ) ( 4291d65 )

  • Fix bigframes.pandas.concat documentation ( #382 ) ( 234b61c )

Code Refactoring

  • Move model optional args to kwargs ( #381 ) ( 4037992 )
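
One plausible reading of the "move model optional args to kwargs" entry is that optional constructor arguments became keyword-only, which would make positional call sites fail. The sketch below uses a hypothetical ExampleModel class (not a bigframes class, and with made-up parameter names) to show how keyword-only parameters behave in Python.

```python
class ExampleModel:
    """Hypothetical model whose optional arguments are keyword-only."""

    def __init__(self, *, max_iterations=20, learn_rate=0.1):
        # Parameters after the bare `*` can only be passed by keyword.
        self.max_iterations = max_iterations
        self.learn_rate = learn_rate


# Passing the option by name works.
model = ExampleModel(max_iterations=50)

# Passing it positionally raises TypeError.
try:
    ExampleModel(50)
    positional_ok = True
except TypeError:
    positional_ok = False
```

Call sites that pass optional arguments by name are unaffected by a change of this kind.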

0.21.0 (2024-02-13)

Features

  • Add Series.cov method ( #368 ) ( 443db22 )

  • Add ml.llm.GeminiTextGenerator model ( #370 ) ( de1e0a4 )

  • Add ml.metrics.pairwise.cosine_similarity function ( #374 ) ( 126f566 )

  • Add XGBoostModel ( #363 ) ( d5518b2 )

  • Limited support of lambdas in Series.apply ( #345 ) ( 208e081 )

  • Support bigframes.pandas.to_datetime for scalars, iterables and series. ( #372 ) ( ffb0d15 )

  • Support read_gbq wildcard table path ( #377 ) ( 90caf86 )

Documentation

  • Clarify ADC pre-auth in a non-interactive environment ( #348 ) ( 99a9e6e )

0.20.1 (2024-02-06)

Performance Improvements

  • Make repr cache the block where appropriate ( #350 ) ( 068879f )

Documentation

  • Add a sample to demonstrate the evaluation results ( #364 ) ( cff0919 )

  • Fix the DataFrame.apply code sample ( #366 ) ( 1866a26 )

0.20.0 (2024-01-30)

Features

  • Add DataFrame.peek() as an efficient alternative to head() results preview ( #318 ) ( 9c34d83 )

  • Add ARIMA_EVALUATE options in forecasting models ( #336 ) ( 73e997b )

  • Add Index constructor, repr, copy, get_level_values, to_series ( #334 ) ( e5d054e )

  • Improve error message for drive based BQ table reads ( #344 ) ( 0794788 )

  • Update cut to work without labels = False and show intervals as dict ( #335 ) ( 4ff53db )

Bug Fixes

  • Change default connection name in getting_started.ipynb ( #347 ) ( 677f014 )

  • Series iteration correctly returns values instead of index ( #339 ) ( 2c6af9b )

Documentation

  • Add code samples for Series.{between, cumprod} ( #353 ) ( 09a52fd )

0.19.2 (2024-01-22)

Bug Fixes

  • Read_gbq large response issue ( #332 ) ( b8178b9 )

  • Use object dtype for ARRAY columns in to_pandas() with pandas 1.x ( #329 ) ( 374ddb5 )

Documentation

  • Add DataFrame.applymap documentation ( #326 ) ( bd531a1 )

  • Add code samples for series methods ( #323 ) ( 32cc6fa )

  • Add remote model requirements ( #333 ) ( c91f70c )

0.19.1 (2024-01-17)

Bug Fixes

  • Handle multi-level columns for df aggregates properly ( #305 ) ( 5bb45ba )

  • Update max_output_token limitation. ( #308 ) ( 5cccd36 )

Documentation

  • Add code samples for Series.corr ( #316 ) ( 9150c16 )

0.19.0 (2024-01-09)

Features

  • Add ‘columns’ as an alias for ‘col_order’ ( #298 ) ( a01b271 )

  • Add Series dt.tz and dt.unit properties ( #303 ) ( 2e1a403 )

  • Add to_gbq() method for LLM models ( #299 ) ( dafbc1b )

  • Allow manually setting clustering_columns in dataframe.to_gbq ( #302 ) ( 9c21323 )

  • Support assigning to columns like a property ( #304 ) ( f645c56 )

  • Support upcasting numeric columns in concat ( #294 ) ( e3a056a )

Bug Fixes

  • DF.drop tuple input as multi-index ( #301 ) ( 21391a9 )

  • Fix bug converting non-string labels to sql ids ( #296 ) ( a61c5fe )

Documentation

  • Add code samples for Series.ffill and DataFrame.ffill ( #307 ) ( 1c63b45 )

0.18.0 (2024-01-02)

Features

  • Add dataframe.to_html ( #259 ) ( 2cd6489 )

  • Add IntervalIndex support to bigframes.pandas.cut ( #254 ) ( 6c1969a )

  • Add replace method to DataFrame ( #261 ) ( 5092215 )

  • Specific pyarrow mappings for decimal, bytes types ( #283 ) ( a1c0631 )

Bug Fixes

  • Dataframes to_gbq now creates dataset if it doesn’t exist ( #222 ) ( bac62f7 )

  • Exclude pandas 2.2.0rc0 to unblock prerelease tests ( #292 ) ( ac1a745 )

  • Fix DataFrameGroupby.agg() issue with as_index=False ( #273 ) ( ab49350 )

  • Make Series.str.replace work for simple strings ( #285 ) ( ad67465 )

  • Update dataframe.to_gbq to dedup column names. ( #286 ) ( 746115d )

  • Use setuptools.find_namespace_packages ( #246 ) ( 9ec352a )

Dependencies

  • Migrate to ibis-framework >= "7.1.0" ( #53 ) ( 9798a2b )

Documentation

  • Add code snippets for explore query result page ( #278 ) ( 7cbbb7d )

  • Code samples for astype common to DataFrame and Series ( #280 ) ( 95b673a )

  • Code samples for DataFrame.copy and Series.copy ( #290 ) ( 7cbc2b0 )

  • Code samples for drop and fillna ( #284 ) ( 9c5012e )

  • Code samples for isna, isnull, dropna, isin ( #289 ) ( ad51035 )

  • Code samples for rename, size ( #293 ) ( eb69f60 )

  • Code samples for reset_index and sort_values ( #282 ) ( acc0eb7 )

  • Code samples for sample, get, Series.round ( #295 ) ( c2b1892 )

  • Code samples for Series.{add, replace, unique, T, transpose} ( #287 ) ( 0e1bbfc )

  • Code samples for Series.{map, to_list, count} ( #290 ) ( 7cbc2b0 )

  • Code samples for Series.{name, std, agg} ( #293 ) ( eb69f60 )

  • Code samples for Series.groupby and Series.{sum,mean,min,max} ( #280 ) ( 95b673a )

  • Code samples for DataFrame set_index, items ( #295 ) ( c2b1892 )

  • Fix the rendering for get_dummies ( #291 ) ( 252f3a2 )

0.17.0 (2023-12-14)

Features

  • Add filters argument to read_gbq for enhanced data querying ( #198 ) ( 034f71f )

  • Add module/class level api tracking ( #272 ) ( 4f3db3d )

  • Deprecate use_regional_endpoints ( #199 ) ( 319a1f2 )

Bug Fixes

  • Increase recursion limit, cache compilation tree hashes ( #184 ) ( b54791c )

  • Replaced raise NotImplementedError with return NotImplemented ( #258 ) ( a133822 )
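
The last fix above matters because of how Python dispatches binary operators: returning the NotImplemented sentinel lets the interpreter try the other operand's reflected method, whereas raising NotImplementedError aborts the operation immediately. A minimal sketch with two hypothetical classes (unrelated to bigframes types):

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        # Signal "I cannot handle this type" so Python tries other.__radd__.
        return NotImplemented


class Feet:
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Meters):
            return Meters(other.value + self.value * 0.3048)
        return NotImplemented


# Meters.__add__ returns NotImplemented, so Feet.__radd__ handles the sum.
total = Meters(1.0) + Feet(10.0)

# When neither side can handle the operation, Python raises TypeError.
try:
    Meters(1.0) + 5
    unsupported_raises = False
except TypeError:
    unsupported_raises = True
```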

Documentation

  • Add code samples for values and value_counts ( #249 ) ( f247d95 )

  • Add sample for getting started with BQML ( #141 ) ( fb14f54 )

0.16.0 (2023-12-12)

Features

  • Add ARIMAPlus.predict parameters ( #264 ) ( 99598c7 )

  • Add DataFrame from_dict and from_records methods ( #244 ) ( 8d81e24 )

  • Add DataFrame.select_dtypes method ( #242 ) ( 1737acc )

  • Add nunique method to Series/DataFrameGroupby ( #256 ) ( c8ec245 )

  • Support dataframe.loc with conditional columns selection ( #233 ) ( 3febea9 )

Bug Fixes

  • Enforce pandas version requirement <2.1.4 ( #265 ) ( 9dd63f6 )

  • Exclude pandas 2.1.4 from prerelease tests to unblock e2e tests ( b02fc2c )

  • Fix value_counts column label for normalize=True ( #245 ) ( d3fa6f2 )

  • Migrate e2e tests to bigframes-load-testing project ( 8766ac6 )

  • Ml.sql logic ( #262 ) ( 68c6fdf )

  • Update the llm_kmeans notebook ( #247 ) ( 66d1839 )

Documentation

  • Add code samples for shape and head ( #257 ) ( 5bdcc65 )

  • Add example for dataframe.melt, dataframe.pivot, dataframe.stac… ( #252 ) ( 8c63697 )

  • Add example to dataframe.nlargest, dataframe.nsmallest, datafra… ( #234 ) ( e735412 )

  • Add examples for dataframe.cummin, dataframe.cummax, dataframe.cumsum, dataframe.cumprod ( #243 ) ( 0523a31 )

  • Add examples for dataframe.nunique, dataframe.diff, dataframe.a… ( #251 ) ( 77074ec )

  • Correct the docs for option_context ( #263 ) ( d21c6dd )

  • Correct the params rendering for ml.remote and ml.ensemble modules ( #248 ) ( c2829e3 )

  • Fix return annotation in API docstrings ( #253 ) ( 89a1c67 )

0.15.0 (2023-11-29)

⚠ BREAKING CHANGES

  • model.predict returns all the columns ( #204 )

Features

  • Add info and memory_usage methods to dataframe ( #219 ) ( 9d6613d )

  • Add remote vertex model support ( #237 ) ( 0bfc4fb )

  • Add the recent api method for ML component ( #225 ) ( ed8876d )

  • Model.predict returns all the columns ( #204 ) ( 416171a )

  • Send warnings on LLM prediction partial failures ( #216 ) ( 81125f9 )

Bug Fixes

  • Add df snapshots lookup for read_gbq ( #229 ) ( d0d9b84 )

  • Avoid unnecessary row_number() on sort key for io ( #211 ) ( a18d40e )

  • Dedup special character ( #209 ) ( dd78acb )

  • Invalid JSON type of the notebook ( #215 ) ( a729831 )

  • Make to_pandas override enable_downsampling when sampling_method is manually set. ( #200 ) ( ae03756 )

  • Polish the llm+kmeans notebook ( #208 ) ( e8532b1 )

  • Update the llm+kmeans notebook with recent change ( #236 ) ( f8917ab )

  • Use anonymous dataset to create remote_function ( #205 ) ( 69b016e )

Documentation

  • Add code samples for index and column properties ( #212 ) ( c88d38e )

  • Add code samples for df reshaping, function, merge, and join methods ( #203 ) ( 010486c )

  • Add examples for dataframe.kurt, dataframe.std, dataframe.count ( #232 ) ( f9c6e72 )

  • Add examples for dataframe.mean, dataframe.median, dataframe.va… ( #228 ) ( edd0522 )

  • Add examples for dataframe.min, dataframe.max and dataframe.sum ( #227 ) ( 3a375e8 )

  • Code samples for Series.dot and DataFrame.dot ( #226 ) ( b62a07a )

  • Code samples for Series.where and Series.mask ( #217 ) ( 52dfad2 )

  • Code samples for dataframe.any, dataframe.all and dataframe.prod ( #223 ) ( d7957fa )

  • Make the code samples reflect default bq connection usage ( #206 ) ( 71844b0 )

0.14.1 (2023-11-16)

Bug Fixes

  • Correctly handle null values when initializing fingerprint ordering ( #210 ) ( 8324f13 )

Documentation

  • Add an example notebook about line graphs ( #197 ) ( f957b27 )

0.14.0 (2023-11-14)

Features

  • Add ‘cross’ join support ( #176 ) ( 765446a )

  • Add ‘index’, ‘pad’, ‘nearest’ interpolate methods ( #162 ) ( 6a28403 )

  • Add series.sample (identical to existing dataframe.sample) ( #187 ) ( 37914a4 )

  • Add unordered sql compilation ( #156 ) ( 58f420c )

  • Log most recent API calls as recent-bigframes-api-xx labels on BigQuery jobs ( #145 ) ( 4ea33b7 )

  • Read_gbq creates order deterministically without table copy ( #191 ) ( 8ab81de )

  • Support date_series.astype("string[pyarrow]") to cast DATE to STRING ( #186 ) ( aee0e8e )

  • Support series.at[row_label] = scalar ( #173 ) ( 0c8bd33 )

  • Temporary resources no longer use BigQuery Sessions ( #194 ) ( 4a02cac )

Bug Fixes

  • All sort operations are now stable ( #195 ) ( 3a2761f )

  • Default to 7 days expiration for read_csv, read_json, read_parquet ( #193 ) ( 03606cd )

  • Deprecate the remote_service_type in llm model ( #180 ) ( a8a409a )

  • For reset_index on unnamed multiindex, always use level_[n] label ( #182 ) ( f95000d )

  • Match pandas behavior when assigning listlike to empty dfs ( #172 ) ( c1d1f42 )

  • Use anonymous dataset instead of session dataset for temp tables ( #181 ) ( 800d44e )

  • Use random table for read_pandas ( #192 ) ( 741c75e )

  • Use random table when loading data for read_csv, read_json, read_parquet ( #175 ) ( 9d2e6dc )
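
On the "stable sort" fix listed above: a stable sort keeps rows that compare equal on the sort key in their original relative order. The sketch below shows the property with Python's built-in sorted(), which is stable; it does not exercise bigframes itself, and the sample rows are made up.

```python
# (label, value) rows; "b" precedes "c" and "a" precedes "d" in the input.
rows = [("b", 2), ("a", 1), ("c", 2), ("d", 1)]

# Sort by value only. Ties keep their input order because sorted() is stable.
by_value = sorted(rows, key=lambda row: row[1])
```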

Documentation

  • Add code samples for read_gbq_function using community UDFs ( #188 ) ( 7506eab )

  • Add docstring code samples for Series.apply and DataFrame.map ( #185 ) ( c816d84 )

  • Add llm kmeans notebook as an included example ( #177 ) ( d49ae42 )

  • Use head() to get top n results, not to preview results ( #190 ) ( 87f84c9 )

0.13.0 (2023-11-07)

Features

  • to_gbq without a destination table writes to a temporary table ( #158 ) ( e1817c9 )

  • Add DataFrame.__iter__, DataFrame.iterrows, DataFrame.itertuples, and DataFrame.keys methods ( #164 ) ( c065071 )

  • Add Series.__iter__ method ( #164 ) ( c065071 )

  • Add interpolate() to series and dataframe ( #157 ) ( b9cb55c )

  • Support 32k text-generation and multilingual embedding models ( #161 ) ( 5f0ea37 )

Bug Fixes

  • Update default temp table expiration to 7 days ( #174 ) ( 4ff26cd )

0.12.0 (2023-11-01)

Features

  • Add DataFrame.melt ( #113 ) ( 4e4409c )

  • Add DataFrame.to_pandas_batches() to download large DataFrame objects ( #136 ) ( 3afd4a3 )

  • Add bigframes.options.compute.maximum_bytes_billed option that sets maximum bytes billed on query jobs ( #133 ) ( 63c7919 )

  • Add pandas.qcut ( #104 ) ( 8e44518 )

  • Add pd.get_dummies ( #149 ) ( d8baad5 )

  • Add unstack to series, add level param ( #115 ) ( 5edcd19 )

  • Implement operator @ for DataFrame.dot ( #139 ) ( 79a638e )

  • Populate ibis version in user agent ( #140 ) ( c639a36 )
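
The "operator @" entry above refers to Python's matrix-multiplication operator, which a class supports by defining __matmul__. The sketch below uses a hypothetical Matrix class (not the bigframes DataFrame) to show one way @ can delegate to a dot method.

```python
class Matrix:
    """Tiny list-of-lists matrix used only for illustration."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def dot(self, other):
        # Inner dimensions must agree: columns of self == rows of other.
        if len(self.rows[0]) != len(other.rows):
            raise ValueError("shape mismatch")
        columns = list(zip(*other.rows))
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in columns]
             for row in self.rows]
        )

    def __matmul__(self, other):
        # `a @ b` calls a.__matmul__(b); here it simply delegates to dot().
        return self.dot(other)


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[5, 6], [7, 8]])
product = a @ b
```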

Bug Fixes

  • Don’t override the global logging config ( #138 ) ( 2ddbf74 )

  • Fix bug with column names under repeated column assignment ( #150 ) ( 29032d0 )

  • Resolve plotly rendering issue by using ipython html for job pro… ( #134 ) ( 39df43e )

  • Use indexee’s session for loc listlike cases ( #152 ) ( 27c5725 )

Documentation

  • Add arithmetic df sample code ( #153 ) ( ac44ccd )

  • Fix indentation on read_gbq_function code sample ( #163 ) ( 0801d96 )

  • Link to ML.EVALUATE BQML page for score() methods ( #137 ) ( 45c617f )

0.11.0 (2023-10-26)

Features

  • Add back reset_session as an alias for close_session ( #124 ) ( 694a85a )

  • Change query parameter to query_or_table in read_gbq ( #127 ) ( f9bb3c4 )

Bug Fixes

  • Expose bigframes.pandas.reset_session as a public API ( #128 ) ( b17e1f4 )

  • Use series’s own session in series.reindex listlike case ( #135 ) ( 95bff3f )

Documentation

  • Add runnable code samples for DataFrames I/O methods and property ( #129 ) ( 6fea8ef )

  • Add runnable code samples for reading methods ( #125 ) ( a669919 )

0.10.0 (2023-10-19)

Features

  • Implement DataFrame.dot for matrix multiplication ( #67 ) ( 29dd414 )

0.9.0 (2023-10-18)

⚠ BREAKING CHANGES

  • rename bigframes.pandas.reset_session to close_session ( #101 )

Features

  • Add bigframes.options.bigquery.application_name for partner attribution ( #117 ) ( 52d64ff )

  • Add AtIndexer getitems ( #107 ) ( 752b01f )

  • Rename bigframes.pandas.reset_session to close_session ( #101 ) ( 36693bf )

  • Send BigQuery cancel request when canceling bigframes process ( #103 ) ( e325fbb )

  • Support external packages in remote_function ( #98 ) ( ec10c4a )

  • Use ArrowDtype for STRUCT columns in to_pandas ( #85 ) ( 9238fad )

Bug Fixes

  • Support multiindex for three loc getitem overloads ( #113 ) ( 68e3cd3 )

Performance Improvements

  • If primary keys are defined, read_gbq avoids copying table data ( #112 ) ( e6c0cd1 )

Documentation

  • Add documentation for Series.struct.field and Series.struct.explode ( #114 ) ( a6dab9c )

  • Add open-source link in API doc ( #106 ) ( db51fe3 )

  • Update ML overview API doc ( #105 ) ( 1b3f3a5 )

0.8.0 (2023-10-12)

⚠ BREAKING CHANGES

  • The default behavior of to_parquet is changing from no compression to 'snappy' compression.

Features

  • Support compression in to_parquet ( a8c286f )

Bug Fixes

  • Create session dataset for remote functions only when needed ( #94 ) ( 1d385be )

0.7.0 (2023-10-11)

Features

  • Add aliases for several series properties ( #80 ) ( c0efec8 )

  • Add equals methods to series/dataframe ( #76 ) ( 636a209 )

  • Add iat and iloc accessing by tuples of integers ( #90 ) ( 228aeba )

  • Add level param to DataFrame.stack ( #88 ) ( 97b8bec )

  • Allow df.drop to take an index object ( #68 ) ( 740c451 )

  • Use default session connection ( #87 ) ( 4ae4ef9 )

Bug Fixes

  • Change the invalid url in docs ( #93 ) ( 969800d )

Documentation

  • Add more preprocessing models into the docs menu. ( #97 ) ( 1592315 )

0.6.0 (2023-10-04)

Features

  • Add df.unstack ( #63 ) ( 4a84714 )

  • Add idxmin, idxmax to series, dataframe ( #74 ) ( 781307e )

  • Add ml.preprocessing.KBinsDiscretizer ( #81 ) ( 24c6256 )

  • Add multi-column dataframe merge ( #73 ) ( c9fa85c )

  • Add update and align methods to dataframe ( #57 ) ( bf050cf )

  • Support STRUCT data type with Series.struct.field to extract child fields ( #71 ) ( 17afac9 )

Bug Fixes

  • Avoid 403 response too large to return error with read_gbq and large query results ( #77 ) ( 8f3b5b2 )

  • Change return type of Series.loc[scalar] ( #40 ) ( fff3d45 )

  • Fix df/series.iloc by list with multiindex ( #79 ) ( 971d091 )

0.5.0 (2023-09-28)

Features

  • Add DataFrame.kurtosis / DF.kurt method ( c1900c2 )

  • Add DataFrame.rolling and DataFrame.expanding methods ( c1900c2 )

  • Add items, apply methods to DataFrame. ( #43 ) ( 3adc1b3 )

  • Add axis param to simple df aggregations ( #52 ) ( 9cf9972 )

  • Add index dtype, astype, drop, fillna, aggregate attributes. ( #38 ) ( 1a254a4 )

  • Add ml.preprocessing.LabelEncoder ( #50 ) ( 2510461 )

  • Add ml.preprocessing.MaxAbsScaler ( #56 ) ( 14b262b )

  • Add ml.preprocessing.MinMaxScaler ( #64 ) ( 392113b )

  • Add more index methods ( #54 ) ( a6e32aa )

  • Support calculate_p_values parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support class_weights="balanced" in LogisticRegression model ( c1900c2 )

  • Support df[column_name] = df_only_one_column ( c1900c2 )

  • Support early_stop parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support enable_global_explain parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support l2_reg parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support learn_rate_strategy parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support ls_init_learn_rate parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support max_iterations parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support min_rel_progress parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support optimize_strategy parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support casting string to integer or float ( #59 ) ( 3502f83 )

Bug Fixes

  • Fix header skipping logic in read_csv ( #49 ) ( d56258c )

  • Generate unique ids on join to avoid id collisions ( #65 ) ( 7ab65e8 )

  • LabelEncoder params consistent with Sklearn ( #60 ) ( 632caec )

  • Loosen filter items tests to accommodate shifting pandas impl ( #41 ) ( edabdbb )

Performance Improvements

  • Add ability to cache dataframe and series to session table ( #51 ) ( 416d7cb )

  • Inline small Series and DataFrames in query text ( #45 ) ( 5e199ec )

  • Reimplement unpivot to use cross join rather than union ( #47 ) ( f9a93ce )

  • Simplify join order to use multiple order keys instead of string. ( #36 ) ( 5056da6 )

Documentation

  • Link to Remote Functions code samples from README and API reference ( c1900c2 )

0.4.0 (2023-09-16)

Features

  • Add axis parameter to droplevel and reorder_levels ( 7c6b0dd )

  • Add bfill and ffill to DataFrame and Series ( 7c6b0dd )

  • Add DataFrame.combine and DataFrame.combine_first ( #27 ) ( 7c6b0dd )

  • Add DataFrame.nlargest, nsmallest ( 7c6b0dd )

  • Add DataFrame.pct_change and Series.pct_change ( 7c6b0dd )

  • Add DataFrame.skew and GroupBy.skew ( 7c6b0dd )

  • Add DataFrame.to_dict, to_excel, to_latex, to_records, to_string, to_markdown, to_pickle, to_orc ( 7c6b0dd )

  • Add diff method to DataFrame and GroupBy ( 7c6b0dd )

  • Add filter and reindex to Series and DataFrame ( 7c6b0dd )

  • Add reindex_like to DataFrame and Series ( 7c6b0dd )

  • Add swaplevel to DataFrame and Series ( 7c6b0dd )

  • Add partial support for Series.replace ( 7c6b0dd )

  • Support DataFrame.loc[bool_series, column] = scalar ( 7c6b0dd )

  • Support a persistent name in remote_function ( 7c6b0dd )

Bug Fixes

  • remote_function uses same credentials as other APIs ( 7c6b0dd )

  • Add type hints to models ( 7c6b0dd )

  • Raise error when ARIMAPlus is used with Pipeline ( 7c6b0dd )

  • Remove transforms parameter in model.fit (breaking change) ( 7c6b0dd )

  • Support column joins with “None indexer” ( 7c6b0dd )

  • Use Int64Dtype for literals in cut ( 7c6b0dd )

  • Use lowercase strings for parameter literals in bigframes.ml (breaking change) ( 7c6b0dd )

Performance Improvements

  • Add bigframes-api label to I/O query jobs ( 7c6b0dd )

Documentation

  • Document possible parameter values for PaLM2TextGenerator ( 7c6b0dd )

  • Document region logic in README ( 7c6b0dd )

  • Fix OneHotEncoder sample ( 7c6b0dd )

0.3.2 (2023-09-06)

Bug Fixes

  • Make release.sh script for PyPI upload executable ( #20 ) ( 9951610 )

0.3.1 (2023-09-05)

Bug Fixes

  • release: Use correct directory name for release build config ( #17 ) ( 3dd25b3 )

0.3.0 (2023-09-02)

Features

  • Add bigframes.get_global_session() and bigframes.reset_session() aliases ( a32b747 )

  • Add bigframes.pandas.read_pickle function ( a32b747 )

  • Add components_, explained_variance_, and explained_variance_ratio_ properties to bigframes.ml.decomposition.PCA ( 89b9503 )

  • Add fit_transform to bigquery.ml transformers ( a32b747 )

  • Add Series.dropna and DataFrame.fillna ( 8fab755 )

  • Add Series.str methods isalpha, isdigit, isdecimal, isalnum, isspace, islower, isupper, zfill, center ( a32b747 )

  • Support bigframes.pandas.merge() ( 8fab755 )

  • Support DataFrame.isin with list and dict inputs ( 8fab755 )

  • Support DataFrame.pivot ( a32b747 )

  • Support DataFrame.stack ( 89b9503 )

  • Support DataFrame - DataFrame binary operations ( 8fab755 )

  • Support df[my_column] = [a python list] ( 89b9503 )

  • Support Index.is_monotonic ( 8fab755 )

  • Support np.arcsin, np.arccos, np.arctan, np.sinh, np.cosh, np.tanh, np.arcsinh, np.arccosh, np.arctanh, np.exp with Series argument ( 89b9503 )

  • Support np.sin, np.cos, np.tan, np.log, np.log10, np.sqrt, np.abs with Series argument ( 89b9503 )

  • Support pow() and power operator in DataFrame and Series ( 8fab755 )

  • Support read_json with engine=bigquery for newline-delimited JSON files ( 89b9503 )

  • Support Series.corr ( 89b9503 )

  • Support Series.map ( 8fab755 )

  • Support for np.add, np.subtract, np.multiply, np.divide, np.power ( 8fab755 )

  • Support MultiIndex for DataFrame columns ( a32b747 )

  • Use pandas.Index for column labels ( a32b747 )

  • Use default session and connection in ml.llm and ml.imported ( 8fab755 )

Bug Fixes

  • Add error message to set_index ( a32b747 )

  • Align column names with pandas in DataFrame.agg results ( 89b9503 )

  • Allow (but still not recommended) ORDER BY in read_gbq input when an index_col is defined ( 89b9503 )

  • Check for IAM role on the BigQuery connection when initializing a remote_function ( 89b9503 )

  • Check that types are specified in read_gbq_function ( a32b747 )

  • Don’t use query cache for Session construction ( a32b747 )

  • Include survey link in abstract NotImplementedError exception messages ( 89b9503 )

  • Label temp table creation jobs with source=bigquery-dataframes-temp label ( 89b9503 )

  • Make X_train argument names consistent across methods ( 8fab755 )

  • Raise AttributeError for unimplemented pandas methods ( 89b9503 )

  • Raise exception for invalid function in read_gbq_function ( a32b747 )

  • Support spaces in column names in DataFrame initializer ( 89b9503 )

Performance Improvements

  • Add local cache for __repr_*__ methods ( a32b747 )

  • Lazily instantiate client library objects ( 89b9503 )

  • Use row_number() filter for head / tail ( 8fab755 )

Documentation

  • Add ML section under Overview ( a32b747 )

  • Add release status to table of contents ( a32b747 )

  • Add samples and best practices to read_gbq docs ( a32b747 )

  • Correct the return types of DataFrame and Series ( a32b747 )

  • Create subfolders for notebooks ( a32b747 )

  • Fix link to GitHub ( 89b9503 )

  • Highlight bigframes is open-source ( a32b747 )

  • Sample ML Drug Name Generation notebook ( a32b747 )

  • Set options.bigquery.project in sample code ( 89b9503 )

  • Transform remote function user guide into sample code ( a32b747 )

  • Update remote function notebook with read_gbq_function usage ( 8fab755 )

0.2.0 (2023-08-17)

Features

  • Add KMeans.cluster_centers_.

  • Allow column labels to be any type handled by bq df; column labels can now be integers.

  • Add dataframegroupby.agg().

  • Add Series Property is_monotonic_increasing and is_monotonic_decreasing.

  • Add match, fullmatch, get, pad str methods.

  • Add series isin function.

Bug Fixes

  • Update ML package to use sessions for queries.

  • Optimize read_gbq with index_col set to cluster by index_col.

  • Raise ValueError if the location mismatched.

  • read_gbq no longer uses ‘time travel’ with query inputs.

Documentation

  • Add docstring to _uniform_sampling to discourage users from calling it.

0.1.1 (2023-08-14)

Documentation

  • Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.

0.1.0 (2023-08-11)

Features

  • Add bigframes.pandas package with an API compatible with pandas. Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.

  • Add bigframes.ml package with an API inspired by scikit-learn. Train machine learning models and run batch prediction, powered by BigQuery ML.

0.0.0 (2023-02-22)

  • Empty package to reserve package name.