Changelog

PyPI History

0.5.0 (2023-09-28)

Features

  • Add DataFrame.kurtosis / DF.kurt method ( c1900c2 )

  • Add DataFrame.rolling and DataFrame.expanding methods ( c1900c2 )

  • Add items , apply methods to DataFrame . ( #43 ) ( 3adc1b3 )

  • Add axis param to simple df aggregations ( #52 ) ( 9cf9972 )

  • Add index dtype , astype , drop , fillna , aggregate attributes. ( #38 ) ( 1a254a4 )

  • Add ml.preprocessing.LabelEncoder ( #50 ) ( 2510461 )

  • Add ml.preprocessing.MaxAbsScaler ( #56 ) ( 14b262b )

  • Add ml.preprocessing.MinMaxScaler ( #64 ) ( 392113b )

  • Add more index methods ( #54 ) ( a6e32aa )

  • Support calculate_p_values parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support class_weights="balanced" in LogisticRegression model ( c1900c2 )

  • Support df[column_name] = df_only_one_column ( c1900c2 )

  • Support early_stop parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support enable_global_explain parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support l2_reg parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support learn_rate_strategy parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support ls_init_learn_rate parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support max_iterations parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support min_rel_progress parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support optimize_strategy parameter in bigframes.ml.linear_model.LinearRegression ( c1900c2 )

  • Support casting string to integer or float ( #59 ) ( 3502f83 )
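
As a rough illustration of what the new MinMaxScaler and MaxAbsScaler transformers compute, the sketch below implements the usual scikit-learn-style scaling formulas in plain Python. It is not the bigframes implementation. The function names and the handling of constant columns are assumptions made for this example only.

```python
# Illustrative only: plain-Python versions of the scaling formulas used by
# scikit-learn-style scalers. This is not the bigframes implementation.

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant column: return zeros rather than dividing by zero
        # (a choice made for this sketch).
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def max_abs_scale(values):
    """Divide every value by the largest absolute value, so results lie in [-1, 1]."""
    biggest = max(abs(v) for v in values)
    if biggest == 0:
        # All-zero column: nothing to scale.
        return [0.0 for _ in values]
    return [v / biggest for v in values]


scaled_min_max = min_max_scale([2, 4, 6])
scaled_max_abs = max_abs_scale([-4, 2, 1])
```

Min-max scaling shifts the data so that it starts at zero, while max-abs scaling only divides, so it keeps the sign of each value and leaves zeros untouched.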

Bug Fixes

  • Fix header skipping logic in read_csv ( #49 ) ( d56258c )

  • Generate unique ids on join to avoid id collisions ( #65 ) ( 7ab65e8 )

  • LabelEncoder params consistent with Sklearn ( #60 ) ( 632caec )

  • Loosen filter items tests to accommodate shifting pandas impl ( #41 ) ( edabdbb )

Performance Improvements

  • Add ability to cache dataframe and series to session table ( #51 ) ( 416d7cb )

  • Inline small Series and DataFrames in query text ( #45 ) ( 5e199ec )

  • Reimplement unpivot to use cross join rather than union ( #47 ) ( f9a93ce )

  • Simplify join order to use multiple order keys instead of string. ( #36 ) ( 5056da6 )
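
The changelog does not say how the unpivot rewrite works internally. The general idea of unpivoting with a cross join, rather than with one UNION branch per column, can be sketched in plain Python as pairing every input row with every column label. The function name and the "variable" / "value" output keys below are hypothetical, chosen to resemble a pandas-style melt.

```python
# Illustrative only: the cross-join idea behind an unpivot, in plain Python.
# Names and output layout are assumptions, not the bigframes internals.

def unpivot_cross_join(rows, id_col, value_cols):
    """Turn wide rows into long rows by pairing each row with each column label."""
    long_rows = []
    for row in rows:            # left side of the cross join: the data rows
        for col in value_cols:  # right side: a small table of column labels
            long_rows.append(
                {id_col: row[id_col], "variable": col, "value": row[col]}
            )
    return long_rows


wide = [
    {"id": 1, "a": 10, "b": 20},
    {"id": 2, "a": 30, "b": 40},
]
long_form = unpivot_cross_join(wide, "id", ["a", "b"])
```

A union-based unpivot typically reads the source once per column, whereas the cross-join form reads it once and fans each row out against the list of labels.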

Documentation

  • Link to Remote Functions code samples from README and API reference ( c1900c2 )

0.4.0 (2023-09-16)

Features

  • Add axis parameter to droplevel and reorder_levels ( 7c6b0dd )

  • Add bfill and ffill to DataFrame and Series ( 7c6b0dd )

  • Add DataFrame.combine and DataFrame.combine_first ( #27 ) ( 7c6b0dd )

  • Add DataFrame.nlargest , nsmallest ( 7c6b0dd )

  • Add DataFrame.pct_change and Series.pct_change ( 7c6b0dd )

  • Add DataFrame.skew and GroupBy.skew ( 7c6b0dd )

  • Add DataFrame.to_dict , to_excel , to_latex , to_records , to_string , to_markdown , to_pickle , to_orc ( 7c6b0dd )

  • Add diff method to DataFrame and GroupBy ( 7c6b0dd )

  • Add filter and reindex to Series and DataFrame ( 7c6b0dd )

  • Add reindex_like to DataFrame and Series ( 7c6b0dd )

  • Add swaplevel to DataFrame and Series ( 7c6b0dd )

  • Add partial support for Series.replace ( 7c6b0dd )

  • Support DataFrame.loc[bool_series, column] = scalar ( 7c6b0dd )

  • Support a persistent name in remote_function ( 7c6b0dd )
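
For readers unfamiliar with the pandas methods named above, the following plain-Python sketch shows the arithmetic behind diff, pct_change, ffill, and bfill with their default settings (a period of 1, no fill limit). It uses None for missing values and is not the bigframes implementation; the function names simply mirror the pandas ones.

```python
# Illustrative only: pandas-style semantics on plain lists, None = missing.

def diff(values):
    """Difference between each element and the one before it."""
    if not values:
        return []
    return [None] + [b - a for a, b in zip(values, values[1:])]


def pct_change(values):
    """Fractional change relative to the previous element."""
    if not values:
        return []
    return [None] + [(b - a) / a for a, b in zip(values, values[1:])]


def ffill(values):
    """Replace each missing value with the last non-missing value before it."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def bfill(values):
    """Replace each missing value with the next non-missing value after it."""
    return ffill(values[::-1])[::-1]


changes = pct_change([10, 15, 12])
forward = ffill([None, 1, None, None, 4])
backward = bfill([None, 1, None, None, 4])
```

The first element of diff and pct_change has no predecessor, so it is left missing; a leading gap cannot be forward-filled and a trailing gap cannot be back-filled for the same reason.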

Bug Fixes

  • remote_function uses same credentials as other APIs ( 7c6b0dd )

  • Add type hints to models ( 7c6b0dd )

  • Raise error when ARIMAPlus is used with Pipeline ( 7c6b0dd )

  • Remove transforms parameter in model.fit (breaking change) ( 7c6b0dd )

  • Support column joins with “None indexer” ( 7c6b0dd )

  • Use Int64Dtype for literals in cut ( 7c6b0dd )

  • Use lowercase strings for parameter literals in bigframes.ml (breaking change) ( 7c6b0dd )

Performance Improvements

  • Add bigframes-api label to I/O query jobs ( 7c6b0dd )

Documentation

  • Document possible parameter values for PaLM2TextGenerator ( 7c6b0dd )

  • Document region logic in README ( 7c6b0dd )

  • Fix OneHotEncoder sample ( 7c6b0dd )

0.3.2 (2023-09-06)

Bug Fixes

  • Make release.sh script for PyPI upload executable ( #20 ) ( 9951610 )

0.3.1 (2023-09-05)

Bug Fixes

  • release: Use correct directory name for release build config ( #17 ) ( 3dd25b3 )

0.3.0 (2023-09-02)

Features

  • Add bigframes.get_global_session() and bigframes.reset_session() aliases ( a32b747 )

  • Add bigframes.pandas.read_pickle function ( a32b747 )

  • Add components_ , explained_variance_ , and explained_variance_ratio_ properties to bigframes.ml.decomposition.PCA ( 89b9503 )

  • Add fit_transform to bigquery.ml transformers ( a32b747 )

  • Add Series.dropna and DataFrame.fillna ( 8fab755 )

  • Add Series.str methods isalpha , isdigit , isdecimal , isalnum , isspace , islower , isupper , zfill , center ( a32b747 )

  • Support bigframes.pandas.merge() ( 8fab755 )

  • Support DataFrame.isin with list and dict inputs ( 8fab755 )

  • Support DataFrame.pivot ( a32b747 )

  • Support DataFrame.stack ( 89b9503 )

  • Support DataFrame - DataFrame binary operations ( 8fab755 )

  • Support df[my_column] = [a python list] ( 89b9503 )

  • Support Index.is_monotonic ( 8fab755 )

  • Support np.arcsin , np.arccos , np.arctan , np.sinh , np.cosh , np.tanh , np.arcsinh , np.arccosh , np.arctanh , np.exp with Series argument ( 89b9503 )

  • Support np.sin , np.cos , np.tan , np.log , np.log10 , np.sqrt , np.abs with Series argument ( 89b9503 )

  • Support pow() and power operator in DataFrame and Series ( 8fab755 )

  • Support read_json with engine=bigquery for newline-delimited JSON files ( 89b9503 )

  • Support Series.corr ( 89b9503 )

  • Support Series.map ( 8fab755 )

  • Support for np.add , np.subtract , np.multiply , np.divide , np.power ( 8fab755 )

  • Support MultiIndex for DataFrame columns ( a32b747 )

  • Use pandas.Index for column labels ( a32b747 )

  • Use default session and connection in ml.llm and ml.imported ( 8fab755 )
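
The Series.str methods listed above share their names with Python's built-in str methods, and in pandas they apply the same operation to each element. The sketch below shows that element-wise behavior on a plain list; the apply_str helper is hypothetical and exists only for this example, and passing None through unchanged is an assumption about how missing values are treated.

```python
# Illustrative only: element-wise string methods on a plain list.
# apply_str is a hypothetical helper, not part of bigframes or pandas.

def apply_str(values, method, *args):
    """Call the named built-in str method on each element; keep None as None."""
    return [None if v is None else getattr(v, method)(*args) for v in values]


codes = ["7", "42", None, "abc"]

padded = apply_str(codes, "zfill", 4)           # left-pad with zeros to width 4
is_digit = apply_str(codes, "isdigit")          # True only for all-digit strings
centered = apply_str(["ab"], "center", 6, "*")  # pad both sides to width 6
```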

Bug Fixes

  • Add error message to set_index ( a32b747 )

  • Align column names with pandas in DataFrame.agg results ( 89b9503 )

  • Allow (but still not recommended) ORDER BY in read_gbq input when an index_col is defined ( 89b9503 )

  • Check for IAM role on the BigQuery connection when initializing a remote_function ( 89b9503 )

  • Check that types are specified in read_gbq_function ( a32b747 )

  • Don’t use query cache for Session construction ( a32b747 )

  • Include survey link in abstract NotImplementedError exception messages ( 89b9503 )

  • Label temp table creation jobs with source=bigquery-dataframes-temp label ( 89b9503 )

  • Make X_train argument names consistent across methods ( 8fab755 )

  • Raise AttributeError for unimplemented pandas methods ( 89b9503 )

  • Raise exception for invalid function in read_gbq_function ( a32b747 )

  • Support spaces in column names in DataFrame initializer ( 89b9503 )

Performance Improvements

  • Add local cache for __repr_*__ methods ( a32b747 )

  • Lazily instantiate client library objects ( 89b9503 )

  • Use row_number() filter for head / tail ( 8fab755 )

Documentation

  • Add ML section under Overview ( a32b747 )

  • Add release status to table of contents ( a32b747 )

  • Add samples and best practices to read_gbq docs ( a32b747 )

  • Correct the return types of DataFrame and Series ( a32b747 )

  • Create subfolders for notebooks ( a32b747 )

  • Fix link to GitHub ( 89b9503 )

  • Highlight bigframes is open-source ( a32b747 )

  • Sample ML Drug Name Generation notebook ( a32b747 )

  • Set options.bigquery.project in sample code ( 89b9503 )

  • Transform remote function user guide into sample code ( a32b747 )

  • Update remote function notebook with read_gbq_function usage ( 8fab755 )

0.2.0 (2023-08-17)

Features

  • Add KMeans.cluster_centers_.

  • Allow column labels to be any type handled by BigQuery DataFrames; column labels can now be integers.

  • Add dataframegroupby.agg().

  • Add Series properties is_monotonic_increasing and is_monotonic_decreasing.

  • Add match, fullmatch, get, pad str methods.

  • Add Series.isin method.

Bug Fixes

  • Update ML package to use sessions for queries.

  • Optimize read_gbq with index_col set to cluster by index_col .

  • Raise ValueError on a location mismatch.

  • read_gbq no longer uses ‘time travel’ with query inputs.

Documentation

  • Add docstring to _uniform_sampling to discourage users from calling it.

0.1.1 (2023-08-14)

Documentation

  • Correct link to code repository in setup.py and use correct terminology for console.cloud.google.com links.

0.1.0 (2023-08-11)

Features

  • Add bigframes.pandas package with an API compatible with pandas . Supported data sources include: BigQuery SQL queries, BigQuery tables, CSV (local and GCS), Parquet (local and Cloud Storage), and more.

  • Add bigframes.ml package with an API inspired by scikit-learn . Train machine learning models and run batch prediction, powered by BigQuery ML .

0.0.0 (2023-02-22)

  • Empty package to reserve package name.