Class DataFrame (0.2.0)

DataFrame(data=None, index: vendored_pandas_typing.Axes | None = None, columns: vendored_pandas_typing.Axes | None = None, dtype: typing.Optional[bigframes.dtypes.DtypeString | bigframes.dtypes.Dtype] = None, copy: typing.Optional[bool] = None, *, session: typing.Optional[bigframes.session.Session] = None)

Two-dimensional, size-mutable, potentially heterogeneous tabular data.

Data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.

Properties

axes

Return a list representing the axes of the DataFrame.

It has the row axis labels and column axis labels as the only members. They are returned in that order.

Examples

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
df.axes
[RangeIndex(start=0, stop=2, step=1), Index(['col1', 'col2'], dtype='object')]

columns

The column labels of the DataFrame.

dtypes

Return the dtypes in the DataFrame.

This returns a Series with the data type of each column. The result's index is the original DataFrame's columns. Columns with mixed types aren't supported yet in BigQuery DataFrames.

empty

Indicates whether Series/DataFrame is empty.

True if Series/DataFrame is entirely empty (no items), meaning any of the axes are of length 0.

Returns
Type
Description
bool
True if the Series/DataFrame is empty, False otherwise.

iloc

Purely integer-location based indexing for selection by position.

.iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.

Allowed inputs are:

  • (Not supported yet) An integer, e.g. 5.
  • (Not supported yet) A list or array of integers, e.g. [4, 3, 0].
  • A slice object with ints, e.g. 1:7.
  • (Not supported yet) A boolean array.
  • (Not supported yet) A callable function with one argument (the calling Series or DataFrame) that returns valid output for indexing (one of the above). This is useful in method chains, when you don't have a reference to the calling object, but would like to base your selection on some value.
  • (Not supported yet) A tuple of row and column indexes. The tuple elements consist of one of the above inputs, e.g. (0, 1).

.iloc will raise IndexError if a requested indexer is out-of-bounds, except slice indexers which allow out-of-bounds indexing (this conforms with python/numpy slice semantics).
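The slice form can be sketched as follows. This is a minimal illustration that uses plain pandas as a stand-in, on the assumption that BigQuery DataFrames follows the same slice semantics described above; the frame `df` and its values are made up for the example.

```python
import pandas as pd

# Illustrative data; plain pandas stands in for a BigQuery DataFrame here.
df = pd.DataFrame({"col1": [10, 20, 30, 40], "col2": ["a", "b", "c", "d"]})

# A slice of integer positions: rows at positions 1 and 2 (the stop is excluded).
middle = df.iloc[1:3]

# Slice indexers may run past the end of the axis without raising IndexError.
tail = df.iloc[2:100]

print(middle)
print(tail)
```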

index

The index (row labels) of the DataFrame.

The index of a DataFrame is a series of labels that identify each row. The labels can be integers, strings, or any other hashable type. The index is used for label-based access and alignment, and can be accessed or modified using this attribute.

loc

Access a group of rows and columns by label(s) or a boolean array.

.loc[] is primarily label based, but may also be used with a boolean array.

Allowed inputs are:

  • A single label, e.g. 5 or 'a' (note that 5 is interpreted as a label of the index, and never as an integer position along the index).
  • A list of labels, e.g. ['a', 'b', 'c'].
  • A boolean series of the same length as the axis being sliced, e.g. [True, False, True].
  • An alignable Index. The index of the returned selection will be the input.
  • (Not supported yet) An alignable boolean Series. The index of the key will be aligned before masking.
  • (Not supported yet) A slice object with labels, e.g. 'a':'f'. Note: contrary to usual Python slices, both the start and the stop are included.
  • (Not supported yet) A callable function with one argument (the calling Series or DataFrame) that returns valid output for indexing (one of the above).

Exceptions
Type
Description
NotImplementedError
if the inputs are not supported.
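A short sketch of label-based selection, again using plain pandas as a stand-in (an assumption; a BigQuery DataFrame would need a session and may reject some inputs with the exception above). The integer index is chosen to show that 5 is treated as a label, not a position.

```python
import pandas as pd

# Illustrative frame whose row labels are the integers 5, 7 and 9.
df = pd.DataFrame({"x": [1, 2, 3]}, index=[5, 7, 9])

# 5 is a label here: this selects the first row, although position 5 does not exist.
row = df.loc[5]

# A list of labels returns the matching rows in the order given.
subset = df.loc[[5, 9]]

# A boolean Series of the same length as the axis acts as a row mask.
masked = df.loc[df["x"] != 2]
```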

ndim

Return an int representing the number of axes / array dimensions.

Returns
Type
Description
int
1 for a Series, 2 for a DataFrame.

query_job

BigQuery job metadata for the most recent query.

shape

Return a tuple representing the dimensionality of the DataFrame.

size

Return an int representing the number of elements in this object.

Returns
Type
Description
int
The number of rows for a Series; the number of rows times the number of columns for a DataFrame.
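How the dimensional properties (`ndim`, `shape`, `size`, `empty`) relate to each other can be seen in a small sketch. Plain pandas is used as a stand-in here, assuming BigQuery DataFrames reports the same values; the frame is illustrative.

```python
import pandas as pd

# Illustrative 3-row, 2-column frame.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6]})

rows, cols = df.shape          # (number of rows, number of columns)
print(df.ndim, df.shape, df.size, df.empty)

# A zero-row slice keeps both columns but counts as empty,
# because one of its axes has length 0.
no_rows = df.iloc[0:0]
```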

sql

Compiles this DataFrame's expression tree to SQL.

values

Return the values of DataFrame in the form of a NumPy array.

Methods

__array_ufunc__

__array_ufunc__(ufunc: numpy.ufunc, method: str, *inputs, **kwargs) -> bigframes.dataframe.DataFrame

Used to support numpy ufuncs. See: https://numpy.org/doc/stable/reference/ufuncs.html

__getitem__

__getitem__(key: typing.Union[typing.Hashable, typing.Sequence[typing.Hashable], pandas.core.indexes.base.Index, bigframes.series.Series])

Gets the specified column(s) from the DataFrame.

__repr__

__repr__() -> str

Converts a DataFrame to a string. Calls compute.

Only represents the first bigframes.options.display.max_rows rows.

__setitem__

__setitem__(key: str, value: typing.Union[bigframes.series.Series, int, float, typing.Callable])

Modify or insert a column into the DataFrame.

Note: This does not modify the original table the DataFrame was derived from.

abs

abs() -> bigframes.dataframe.DataFrame

Return a Series/DataFrame with absolute numeric value of each element.

This function only applies to elements that are all numeric.

add

add(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get addition of DataFrame and other, element-wise (binary operator +).

Equivalent to dataframe + other. With reverse version, radd.

Among flexible wrappers (add, sub, mul, div, floordiv, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.
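The effect of `axis` on a Series operand is easiest to see in a small sketch. It uses plain pandas as a stand-in, on the assumption that the BigQuery DataFrames wrapper aligns the same way; the names `row_offsets` and `by_row` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Scalar operand: same result as df + 10.
plus_ten = df.add(10)

# Series operand matched against the row index instead of the column labels.
row_offsets = pd.Series([100, 200], index=df.index)
by_row = df.add(row_offsets, axis="index")
```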

add_prefix

add_prefix(prefix: str, axis: int | str | None = None) -> DataFrame

Prefix labels with string prefix.

For Series, the row labels are prefixed. For DataFrame, the column labels are prefixed.

Parameters
Name
Description
prefix
str

The string to add before each label.

axis
int or str or None, default None

{0 or 'index', 1 or 'columns', None}, default None. Axis to add prefix on.

add_suffix

add_suffix(suffix: str, axis: int | str | None = None) -> DataFrame

Suffix labels with string suffix.

For Series, the row labels are suffixed. For DataFrame, the column labels are suffixed.
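Both label helpers can be sketched together. Plain pandas is used as a stand-in here, assuming BigQuery DataFrames renames column labels the same way; the prefix and suffix strings are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({"a": [1], "b": [2]})

# For a DataFrame, the column labels are renamed; the data is unchanged.
prefixed = df.add_prefix("col_")
suffixed = df.add_suffix("_raw")

print(list(prefixed.columns), list(suffixed.columns))
```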

agg

agg(func: str | typing.Sequence[str]) -> DataFrame | bigframes.series.Series

Aggregate using one or more operations over the specified axis.

Parameter
Name
Description
func
function

Function to use for aggregating the data. Accepted combinations are: string function name, list of function names, e.g. ['sum', 'mean'] .

Returns
Type
Description
Aggregated results.

aggregate

aggregate(func: str | typing.Sequence[str]) -> DataFrame | bigframes.series.Series

Aggregate using one or more operations over the specified axis.

Parameter
Name
Description
func
function

Function to use for aggregating the data. Accepted combinations are: string function name, list of function names, e.g. ['sum', 'mean'] .

Returns
Type
Description
Aggregated results.
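The two accepted forms of `func` produce differently shaped results, which the following sketch illustrates. It runs on plain pandas as a stand-in, assuming `agg` and `aggregate` in BigQuery DataFrames return the same shapes; the data is illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

# A single function name yields a Series with one value per column.
totals = df.agg("sum")

# A list of function names yields a DataFrame with one row per function.
summary = df.aggregate(["sum", "mean"])
```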

all

all(*, bool_only: bool = False) -> bigframes.series.Series

Return whether all elements are True, potentially over an axis.

Returns True unless there is at least one element within a Series or along a DataFrame axis that is False or equivalent (e.g. zero or empty).

Parameter
Name
Description
bool_only
bool. default False

Include only boolean columns.

Returns
Type
Description
Series if all elements are True.

any

any(*, bool_only: bool = False) -> bigframes.series.Series

Return whether any element is True, potentially over an axis.

Returns False unless there is at least one element within a Series or along a DataFrame axis that is True or equivalent (e.g. non-zero or non-empty).

Parameter
Name
Description
bool_only
bool. default False

Include only boolean columns.
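A small sketch of `all`, `any` and `bool_only`, using plain pandas as a stand-in (an assumption about BigQuery DataFrames matching this behavior). The column names are illustrative; note that 0 counts as False.

```python
import pandas as pd

df = pd.DataFrame({"flags": [True, True], "counts": [0, 3]})

all_true = df.all()   # "counts" contains a 0, so it is not all-True
any_true = df.any()   # "counts" contains a non-zero value

# Restrict the reduction to boolean columns only.
flags_only = df.all(bool_only=True)
```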

applymap

applymap(func, na_action: typing.Optional[str] = None) -> bigframes.dataframe.DataFrame

Apply a function to a DataFrame elementwise.

This method applies a function that accepts and returns a scalar to every element of a DataFrame.

Parameter
Name
Description
na_action
Optional[str], default None

{None, 'ignore'}, default None. If 'ignore', propagate NaN values, without passing them to func.

Returns
Type
Description
bigframes.dataframe.DataFrame
Transformed DataFrame.

assign

assign(**kwargs) -> bigframes.dataframe.DataFrame

Assign new columns to a DataFrame.

Returns a new object with all original columns in addition to new ones. Existing columns that are re-assigned will be overwritten.

Returns
Type
Description
bigframes.dataframe.DataFrame
A new DataFrame with the new columns in addition to all the existing columns.
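The "new object, overwritten columns" behavior can be sketched as below. Plain pandas stands in for BigQuery DataFrames here (an assumption), and the column names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# "total" is added as a new column; "a" already exists and is overwritten
# in the result. The original frame is left untouched.
out = df.assign(total=df["a"] + df["b"], a=0)
```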

astype

astype(dtype: typing.Union[typing.Literal["boolean", "Float64", "Int64", "string", "string[pyarrow]", "timestamp[us, tz=UTC][pyarrow]", "timestamp[us][pyarrow]", "date32[day][pyarrow]", "time64[us][pyarrow]"], pandas.core.arrays.boolean.BooleanDtype, pandas.core.arrays.floating.Float64Dtype, pandas.core.arrays.integer.Int64Dtype, pandas.core.arrays.string_.StringDtype, pandas.core.arrays.arrow.dtype.ArrowDtype]) -> bigframes.dataframe.DataFrame

Cast a pandas object to a specified dtype.

Parameter
Name
Description
dtype
str or pandas.ExtensionDtype

Dtype strings supported by BigQuery DataFrames include 'boolean', 'Float64', 'Int64', 'string', 'string[pyarrow]', 'timestamp[us, tz=UTC][pyarrow]', 'timestamp[us][pyarrow]', 'date32[day][pyarrow]', and 'time64[us][pyarrow]'. Supported pandas.ExtensionDtype values include pandas.BooleanDtype(), pandas.Float64Dtype(), pandas.Int64Dtype(), pandas.StringDtype(storage="pyarrow"), pd.ArrowDtype(pa.date32()), pd.ArrowDtype(pa.time64("us")), pd.ArrowDtype(pa.timestamp("us")), and pd.ArrowDtype(pa.timestamp("us", tz="UTC")).
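A cast can be requested either by dtype string or by an ExtensionDtype instance. The sketch below uses plain pandas as a stand-in (an assumption) and sticks to two of the listed dtypes that do not need pyarrow; the frame is illustrative.

```python
import pandas as pd

df = pd.DataFrame({"n": [1, 2], "m": [3, 4]})

# Cast every column using a dtype string.
as_float = df.astype("Float64")

# Cast every column using a pandas ExtensionDtype instance.
as_int = df.astype(pd.Int64Dtype())

print(as_float.dtypes)
print(as_int.dtypes)
```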

copy

copy() -> bigframes.dataframe.DataFrame

Make a copy of this object's indices and data.

A new object will be created with a copy of the calling object's data and indices. Modifications to the data or indices of the copy will not be reflected in the original object.

count

count(*, numeric_only: bool = False) -> bigframes.series.Series

Count non-NA cells for each column or row.

The values None, NaN, NaT, and optionally numpy.inf (depending on pandas.options.mode.use_inf_as_na) are considered NA.

Parameter
Name
Description
numeric_only
bool, default False

Include only float , int or boolean data.

Returns
Type
Description
For each column/row the number of non-NA/null entries. If level is specified returns a DataFrame .
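A brief sketch of counting non-NA cells per column, with plain pandas as a stand-in (assuming BigQuery DataFrames counts the same way). The data is illustrative and mixes a numeric and a string column.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, None, 3.0], "b": ["x", "y", None]})

# One count per column; missing values are skipped.
counts = df.count()

# Restrict the result to numeric columns.
numeric_counts = df.count(numeric_only=True)
```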

cummax

cummax() -> bigframes.dataframe.DataFrame

Return cumulative maximum over a DataFrame axis.

Returns a DataFrame of the same size containing the cumulative maximum.

Returns
Type
Description
bigframes.dataframe.DataFrame
Return cumulative maximum of DataFrame.

cummin

cummin() -> bigframes.dataframe.DataFrame

Return cumulative minimum over a DataFrame axis.

Returns a DataFrame of the same size containing the cumulative minimum.

Returns
Type
Description
bigframes.dataframe.DataFrame
Return cumulative minimum of DataFrame.

cumprod

cumprod() -> bigframes.dataframe.DataFrame

Return cumulative product over a DataFrame axis.

Returns a DataFrame of the same size containing the cumulative product.

Returns
Type
Description
bigframes.dataframe.DataFrame
Return cumulative product of DataFrame.

cumsum

cumsum()

Return cumulative sum over a DataFrame axis.

Returns a DataFrame of the same size containing the cumulative sum.

Returns
Type
Description
bigframes.dataframe.DataFrame
Return cumulative sum of DataFrame.
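The four cumulative methods (`cummax`, `cummin`, `cumprod`, `cumsum`) can be compared side by side on one column. This sketch uses plain pandas as a stand-in, assuming BigQuery DataFrames accumulates down the rows in the same order; the values are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [3, 1, 2]})

# Each result has the same shape as the input; row i summarizes rows 0..i.
running_max = df.cummax()
running_min = df.cummin()
running_product = df.cumprod()
running_sum = df.cumsum()
```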

describe

describe() -> bigframes.dataframe.DataFrame

Generate descriptive statistics.

Descriptive statistics include those that summarize the central tendency, dispersion and shape of a dataset's distribution, excluding NaN values.

Only supports numeric columns.

Returns
Type
Description
bigframes.dataframe.DataFrame
Summary statistics of the Series or Dataframe provided.

div

div(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get floating division of DataFrame and other, element-wise (binary operator /).

Equivalent to dataframe / other. With reverse version, rtruediv.

Among flexible wrappers (add, sub, mul, div, floordiv, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

divide

divide(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get floating division of DataFrame and other, element-wise (binary operator /).

Equivalent to dataframe / other. With reverse version, rtruediv.

Among flexible wrappers (add, sub, mul, div, floordiv, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

drop

drop(labels: typing.Optional[typing.Any] = None, *, axis: typing.Union[int, str] = 0, index: typing.Optional[typing.Any] = None, columns: typing.Optional[typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]]] = None, level: typing.Optional[typing.Union[str, int]] = None) -> bigframes.dataframe.DataFrame

Drop specified labels from columns.

Remove columns by directly specifying column names.

Exceptions
Type
Description
KeyError
If any of the labels is not found in the selected axis.
Returns
Type
Description
bigframes.dataframe.DataFrame
DataFrame without the removed column labels.
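Dropping columns by name, and the KeyError for an unknown label, can be sketched as follows. Plain pandas is used as a stand-in (an assumption about matching behavior), and the `raised` flag is only there to make the error visible in the example.

```python
import pandas as pd

df = pd.DataFrame({"a": [1], "b": [2], "c": [3]})

# Remove a column by naming it directly; the original frame is unchanged.
trimmed = df.drop(columns=["b"])

# A label that is not present on the selected axis raises KeyError.
raised = False
try:
    df.drop(columns=["missing"])
except KeyError:
    raised = True
```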

drop_duplicates

drop_duplicates(subset: typing.Optional[typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]]] = None, *, keep: str = "first") -> bigframes.dataframe.DataFrame

Return DataFrame with duplicate rows removed.

Considering certain columns is optional. Indexes, including time indexes, are ignored.

Parameters
Name
Description
subset
column label or sequence of labels, optional

Only consider certain columns for identifying duplicates, by default use all of the columns.

keep
{'first', 'last', False }, default 'first'

Determines which duplicates (if any) to keep.

  • 'first': Drop duplicates except for the first occurrence.
  • 'last': Drop duplicates except for the last occurrence.
  • False: Drop all duplicates.

Returns
Type
Description
bigframes.dataframe.DataFrame
DataFrame with duplicates removed
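The three `keep` options can be compared on a tiny frame. This is a sketch on plain pandas as a stand-in, assuming BigQuery DataFrames keeps the same rows; the column names `k` and `v` are illustrative.

```python
import pandas as pd

# Rows 0 and 2 share the same value of "k".
df = pd.DataFrame({"k": ["x", "y", "x"], "v": [1, 2, 3]})

keep_first = df.drop_duplicates(subset="k")               # keeps rows 0 and 1
keep_last = df.drop_duplicates(subset="k", keep="last")   # keeps rows 1 and 2
keep_none = df.drop_duplicates(subset="k", keep=False)    # keeps row 1 only
```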

droplevel

droplevel(level: typing.Union[str, int, typing.Sequence[typing.Union[str, int]]])

Return DataFrame with requested index / column level(s) removed.

Parameter
Name
Description
level
int, str, or list-like

If a string is given, it must be the name of a level. If list-like, elements must be names or positional indexes of levels.

Returns
Type
Description
DataFrame
DataFrame with requested index / column level(s) removed.

dropna

dropna() -> bigframes.dataframe.DataFrame

Remove missing values.

Returns
Type
Description
bigframes.dataframe.DataFrame
DataFrame with NA entries dropped from it.

duplicated

duplicated(subset=None, keep: str = "first") -> bigframes.series.Series

Return boolean Series denoting duplicate rows.

Considering certain columns is optional.

Parameters
Name
Description
subset
column label or sequence of labels, optional

Only consider certain columns for identifying duplicates, by default use all of the columns.

keep
{'first', 'last', False}, default 'first'

Determines which duplicates (if any) to mark.

  • 'first': Mark duplicates as True except for the first occurrence.
  • 'last': Mark duplicates as True except for the last occurrence.
  • False: Mark all duplicates as True.

Returns
Type
Description
Boolean Series marking each duplicated row.

eq

eq(other: typing.Any, axis: str | int = "columns") -> DataFrame

Get equal to of DataFrame and other, element-wise (binary operator eq).

Among flexible wrappers (eq, ne, le, lt, ge, gt) to comparison operators.

Equivalent to ==, !=, <=, <, >=, > with support to choose axis (rows or columns) and level for comparison.

Parameters
Name
Description
other
scalar, sequence, Series, or DataFrame

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}, default 'columns'

Whether to compare by the index (0 or 'index') or columns (1 or 'columns').

floordiv

floordiv(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get integer division of DataFrame and other, element-wise (binary operator //).

Equivalent to dataframe // other. With reverse version, rfloordiv.

Among flexible wrappers (add, sub, mul, div, floordiv, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

ge

ge(other: typing.Any, axis: str | int = "columns") -> DataFrame

Get 'greater than or equal to' of DataFrame and other, element-wise (binary operator >=).

Among flexible wrappers (eq, ne, le, lt, ge, gt) to comparison operators.

Equivalent to ==, !=, <=, <, >=, > with support to choose axis (rows or columns) and level for comparison.

Parameters
Name
Description
other
scalar, sequence, Series, or DataFrame

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}, default 'columns'

Whether to compare by the index (0 or 'index') or columns (1 or 'columns').

Returns
Type
Description
DataFrame
DataFrame of bool. The result of the comparison.
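The comparison wrappers return a boolean frame of the same shape as the input. A minimal sketch with `ge`, using plain pandas as a stand-in (assuming the same element-wise semantics in BigQuery DataFrames); the threshold 3 is arbitrary.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 5], "b": [7, 2]})

# Element-wise "greater than or equal to"; same result as df >= 3.
mask = df.ge(3)

print(mask)
```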

get

get(key, default=None)

Get item from object for given key (ex: DataFrame column).

Returns default value if not found.

groupby

groupby(by: typing.Optional[typing.Union[typing.Hashable, bigframes.series.Series, typing.Sequence[typing.Union[typing.Hashable, bigframes.series.Series]]]] = None, *, level: typing.Optional[typing.Union[str, int, typing.Sequence[typing.Union[str, int]]]] = None, as_index: bool = True, dropna: bool = True) -> bigframes.core.groupby.DataFrameGroupBy

Group DataFrame by columns.

A groupby operation involves some combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups.

Parameters
Name
Description
by
str, Sequence[str]

A label or list of labels may be passed to group by the columns in self . Notice that a tuple is interpreted as a (single) key.

level
int, level name, or sequence of such, default None

If the axis is a MultiIndex (hierarchical), group by a particular level or levels. Do not specify both by and level .

as_index
bool, default True

Default True. Return object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively "SQL-style" grouped output. This argument has no effect on filtrations such as head() , tail() , nth() and in transformations.

dropna
bool, default True

Default True. If True, and if group keys contain NA values, NA values together with row/column will be dropped. If False, NA values will also be treated as the key in groups.

Returns
Type
Description
A groupby object that contains information about the groups.
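The effect of `as_index` and `dropna` is easier to see on a concrete frame. The sketch below runs on plain pandas as a stand-in, assuming BigQuery DataFrames groups the same way; the `team` and `score` columns are illustrative.

```python
import pandas as pd

# One row has a missing group key.
df = pd.DataFrame({"team": ["a", "b", "a", None], "score": [1, 2, 3, 4]})

# Default: group labels become the index and the NA key is dropped.
by_team = df.groupby("team").sum()

# as_index=False keeps the group labels as an ordinary column ("SQL-style").
flat = df.groupby("team", as_index=False).sum()

# dropna=False treats the NA key as its own group.
with_na = df.groupby("team", dropna=False).sum()
```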

gt

gt(other: typing.Any, axis: str | int = "columns") -> DataFrame

Get 'greater than' of DataFrame and other, element-wise (binary operator >).

Among flexible wrappers (eq, ne, le, lt, ge, gt) to comparison operators.

Equivalent to ==, !=, <=, <, >=, > with support to choose axis (rows or columns) and level for comparison.

Parameters
Name
Description
other
scalar, sequence, Series, or DataFrame

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}, default 'columns'

Whether to compare by the index (0 or 'index') or columns (1 or 'columns').

Returns
Type
Description
DataFrame
DataFrame of bool: The result of the comparison.

head

head(n: int = 5) -> bigframes.dataframe.DataFrame

Return the first n rows.

This function returns the first n rows for the object based on position. It is useful for quickly testing if your object has the right type of data in it.

(Not yet supported) For negative values of n, this function returns all rows except the last |n| rows, equivalent to df[:n].

If n is larger than the number of rows, this function returns all rows.

Parameter
Name
Description
n
int, default 5

Default 5. Number of rows to select.
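A quick sketch of the default, an explicit `n`, and an `n` larger than the frame. Plain pandas is used as a stand-in (an assumption), and only non-negative `n` is shown since negative values are marked as unsupported above.

```python
import pandas as pd

df = pd.DataFrame({"n": range(8)})

default_rows = df.head()      # first 5 rows
three_rows = df.head(3)       # first 3 rows
everything = df.head(100)     # n exceeds the row count, so all rows come back
```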

isna

isna() -> bigframes.dataframe.DataFrame

Detect missing values.

Return a boolean same-sized object indicating if the values are NA. NA values get mapped to True values. Everything else gets mapped to False values. Characters such as empty strings '' or numpy.inf are not considered NA values.

isnull

isnull() -> bigframes.dataframe.DataFrame

Detect missing values.

Return a boolean same-sized object indicating if the values are NA. NA values get mapped to True values. Everything else gets mapped to False values. Characters such as empty strings '' or numpy.inf are not considered NA values.
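The distinction between missing values and values that merely look empty can be sketched as follows. Plain pandas stands in for BigQuery DataFrames (an assumption), and the frame is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "s": ["", None],                      # empty string vs. missing value
    "x": [float("inf"), float("nan")],    # infinity vs. NaN
})

# Only None and NaN are flagged; "" and inf are ordinary values.
missing = df.isna()

print(missing)
```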

join

join(other: bigframes.dataframe.DataFrame, *, on: typing.Optional[str] = None, how: str = "left") -> bigframes.dataframe.DataFrame

Join columns of another DataFrame.

Join columns with other DataFrame on index.

Parameter
Name
Description
how
{'left', 'right', 'outer', 'inner'}, default 'left'

How to handle the operation of the two objects.

  • left: use calling frame's index (or column if on is specified).
  • right: use other's index.
  • outer: form union of calling frame's index (or column if on is specified) with other's index, and sort it lexicographically.
  • inner: form intersection of calling frame's index (or column if on is specified) with other's index, preserving the order of the calling frame's index.

Returns
Type
Description
bigframes.dataframe.DataFrame
A dataframe containing columns from both the caller and other .
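How the `how` options change the resulting index can be sketched on two small frames. This uses plain pandas as a stand-in, assuming BigQuery DataFrames joins on the index the same way; the frames `left` and `right` are illustrative.

```python
import pandas as pd

left = pd.DataFrame({"l": [1, 2]}, index=["a", "b"])
right = pd.DataFrame({"r": [10, 30]}, index=["a", "c"])

# Default how="left": keep the caller's index; unmatched rows get NA in "r".
left_join = left.join(right)

# Intersection of both indexes.
inner = left.join(right, how="inner")

# Union of both indexes, sorted lexicographically.
outer = left.join(right, how="outer")
```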

le

le(other: typing.Any, axis: str | int = "columns") -> DataFrame

Get 'less than or equal to' of DataFrame and other, element-wise (binary operator <=).

Among flexible wrappers (eq, ne, le, lt, ge, gt) to comparison operators.

Equivalent to ==, !=, <=, <, >=, > with support to choose axis (rows or columns) and level for comparison.

Parameters
Name
Description
other
scalar, sequence, Series, or DataFrame

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}, default 'columns'

Whether to compare by the index (0 or 'index') or columns (1 or 'columns').

Returns
Type
Description
DataFrame
DataFrame of bool. The result of the comparison.

lt

lt(other: typing.Any, axis: str | int = "columns") -> DataFrame

Get 'less than' of DataFrame and other, element-wise (binary operator <).

Among flexible wrappers (eq, ne, le, lt, ge, gt) to comparison operators.

Equivalent to ==, !=, <=, <, >=, > with support to choose axis (rows or columns) and level for comparison.

Parameters
Name
Description
other
scalar, sequence, Series, or DataFrame

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}, default 'columns'

Whether to compare by the index (0 or 'index') or columns (1 or 'columns').

Returns
Type
Description
DataFrame
DataFrame of bool. The result of the comparison.

map

map(func, na_action: typing.Optional[str] = None) -> bigframes.dataframe.DataFrame

Apply a function to a DataFrame elementwise.

This method applies a function that accepts and returns a scalar to every element of a DataFrame.

Parameter
Name
Description
na_action
Optional[str], default None

{None, 'ignore'}, default None. If 'ignore', propagate NaN values, without passing them to func.

Returns
Type
Description
bigframes.dataframe.DataFrame
Transformed DataFrame.

max

max(*, numeric_only: bool = False) -> bigframes.series.Series

Return the maximum of the values over the requested axis.

If you want the index of the maximum, use idxmax. This is the equivalent of the numpy.ndarray method argmax.

Parameter
Name
Description
numeric_only
bool. default False

Default False. Include only float, int, boolean columns.

Returns
Type
Description
Series after the maximum of values.

mean

mean(*, numeric_only: bool = False) -> bigframes.series.Series

Return the mean of the values over the requested axis.

Parameter
Name
Description
numeric_only
bool. default False

Default False. Include only float, int, boolean columns.

Returns
Type
Description
Series with the mean of values.

median

median(*, numeric_only: bool = False, exact: bool = False) -> bigframes.series.Series

Return the median of the values over the requested axis.

Parameters
Name
Description
numeric_only
bool. default False

Default False. Include only float, int, boolean columns.

exact
bool. default False

Default False. Get the exact median instead of an approximate one. Note: exact=True not yet supported.

Returns
Type
Description
Series with the median of values.
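The column-wise reductions (`max`, `mean`, `median`) and the `numeric_only` flag can be sketched together. Plain pandas is used as a stand-in here; note that pandas computes an exact median, whereas the text above indicates the BigQuery DataFrames result is approximate, so values may differ on real data. The frame is illustrative.

```python
import pandas as pd

df = pd.DataFrame({"name": ["x", "y", "z"], "a": [1, 2, 9], "b": [4.0, 5.0, 6.0]})

# numeric_only=True leaves the string column out of the result.
maxima = df.max(numeric_only=True)
means = df.mean(numeric_only=True)
medians = df.median(numeric_only=True)
```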

merge

merge(right: bigframes.dataframe.DataFrame, how: typing.Literal["inner", "left", "outer", "right"] = "inner", on: typing.Optional[str] = None, *, left_on: typing.Optional[str] = None, right_on: typing.Optional[str] = None, sort: bool = False, suffixes: tuple[str, str] = ("_x", "_y")) -> bigframes.dataframe.DataFrame

Merge DataFrame objects with a database-style join.

The join is done on columns or indexes. If joining columns on columns, the DataFrame indexes will be ignored. Otherwise if joining indexes on indexes or indexes on a column or columns, the index will be passed on. When performing a cross merge, no column specifications to merge on are allowed.

Returns
Type
Description
bigframes.dataframe.DataFrame
A DataFrame of the two merged objects.
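A column-on-column merge and the role of `suffixes` can be sketched as below. This runs on plain pandas as a stand-in, assuming BigQuery DataFrames resolves overlapping column names the same way; the `orders` and `refunds` frames are illustrative.

```python
import pandas as pd

orders = pd.DataFrame({"id": [1, 2, 3], "value": [10, 20, 30]})
refunds = pd.DataFrame({"id": [2, 3, 4], "value": [5, 6, 7]})

# Default how="inner": only ids present on both sides survive.
# The overlapping "value" column gets the default "_x" / "_y" suffixes.
merged = orders.merge(refunds, on="id")

# A left merge keeps every row of the caller; custom suffixes are optional.
left_merged = orders.merge(
    refunds, how="left", on="id", suffixes=("_order", "_refund")
)
```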

min

min(*, numeric_only: bool = False) -> bigframes.series.Series

Return the minimum of the values over the requested axis.

If you want the index of the minimum, use idxmin. This is the equivalent of the numpy.ndarray method argmin.

Parameter
Name
Description
numeric_only
bool, default False

Default False. Include only float, int, boolean columns.

Returns
Type
Description
Series with the minimum of the values.

mod

mod(other: int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get modulo of DataFrame and other, element-wise (binary operator %).

Equivalent to dataframe % other. With reverse version, rmod.

Among flexible wrappers (add, sub, mul, div, floordiv, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.

Parameter
Name
Description
axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

mul

mul(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get multiplication of DataFrame and other, element-wise (binary operator *).

Equivalent to dataframe * other. With reverse version, rmul.

Among flexible wrappers (add, sub, mul, div, floordiv, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

multiply

multiply(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get multiplication of DataFrame and other, element-wise (binary operator *).

Equivalent to dataframe * other. With reverse version, rmul.

Among flexible wrappers (add, sub, mul, div, floordiv, mod, pow) to arithmetic operators: +, -, *, /, //, %, **.

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

ne

ne(other: typing.Any, axis: str | int = "columns") -> DataFrame

Compare DataFrame and other for inequality, element-wise (binary operator ne).

Among flexible wrappers (eq, ne, le, lt, ge, gt) to comparison operators.

Equivalent to ==, !=, <=, <, >=, > with support for choosing the axis (rows or columns) and level for comparison.

Parameters
Name
Description
other
scalar, sequence, Series, or DataFrame

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}, default 'columns'

Whether to compare by the index (0 or 'index') or columns (1 or 'columns').

Returns
Type
Description
DataFrame
Result of the comparison.

notna

notna() -> bigframes.dataframe.DataFrame

Detect existing (non-missing) values.

Return a boolean same-sized object indicating if the values are not NA. Non-missing values get mapped to True. Characters such as empty strings '' or numpy.inf are not considered NA values. NA values get mapped to False values.

Returns
Type
Description
NDFrame
Mask of bool values for each element that indicates whether an element is not an NA value.
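
As a rough illustration of the rule above, the following plain-Python sketch builds the same kind of boolean mask for a flat list of values. It does not call BigQuery DataFrames; the helper name `is_missing` and the choice to treat only None and float NaN as NA are assumptions made for the example.

```python
import math


def is_missing(value):
    """Hypothetical helper: treat None and float NaN as NA, nothing else."""
    return value is None or (isinstance(value, float) and math.isnan(value))


values = ["", float("inf"), None, float("nan"), 0]

# Non-missing values map to True; NA values map to False.
# The empty string and infinity are not NA, so they map to True.
mask = [not is_missing(v) for v in values]
```

The mask keeps the shape of the input, which is what makes it usable as a filter.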

notnull

notnull() -> bigframes.dataframe.DataFrame

Detect existing (non-missing) values.

Return a boolean same-sized object indicating if the values are not NA. Non-missing values get mapped to True. Characters such as empty strings '' or numpy.inf are not considered NA values. NA values get mapped to False values.

Returns
Type
Description
NDFrame
Mask of bool values for each element that indicates whether an element is not an NA value.

nunique

nunique() -> bigframes.series.Series

Count number of distinct elements in specified axis.

Returns
Type
Description
bigframes.series.Series
Series with number of distinct elements.

pivot

pivot(
    *,
    columns: typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]],
    index: typing.Optional[typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]]] = None,
    values: typing.Optional[typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]]] = None
) -> bigframes.dataframe.DataFrame

Return reshaped DataFrame organized by given index / column values.

Reshape data (produce a "pivot" table) based on column values. Uses unique values from the specified index / columns to form the axes of the resulting DataFrame. This function does not support data aggregation; multiple values will result in a MultiIndex in the columns.

Parameters
Name
Description
columns
str or object or a list of str

Column to use to make new frame's columns.

index
str or object or a list of str, optional

Column to use to make new frame's index. If not given, uses existing index.

values
str, object or a list of the previous, optional

Column(s) to use for populating new frame's values. If not specified, all remaining columns will be used and the result will have hierarchically indexed columns.
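
The reshaping described above can be sketched in plain Python over a list of row dicts. This is only an illustration of the idea, not the BigQuery DataFrames implementation: the function name `pivot_records` and the sample data are hypothetical, and raising on duplicate (index, column) pairs is an assumption of this sketch that reflects the "no aggregation" rule.

```python
def pivot_records(records, index, columns, values):
    """Reshape row dicts into {index_value: {column_value: value}}."""
    table = {}
    for row in records:
        row_key, col_key = row[index], row[columns]
        cells = table.setdefault(row_key, {})
        if col_key in cells:
            # No aggregation in this sketch: a repeated cell is an error.
            raise ValueError(f"duplicate entry for ({row_key!r}, {col_key!r})")
        cells[col_key] = row[values]
    return table


records = [
    {"city": "Oslo", "year": 2022, "sales": 10},
    {"city": "Oslo", "year": 2023, "sales": 12},
    {"city": "Rome", "year": 2022, "sales": 7},
]

# Unique "city" values become row labels, unique "year" values become columns.
pivoted = pivot_records(records, index="city", columns="year", values="sales")
```

Cells with no matching input row (here Rome in 2023) are simply absent from the nested dict.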

prod

prod(*, numeric_only: bool = False) -> bigframes.series.Series

Return the product of the values over the requested axis.

Parameter
Name
Description
numeric_only
bool, default False

Include only float, int, boolean columns.

Returns
Type
Description
bigframes.series.Series
Series with the product of the values.

product

product(*, numeric_only: bool = False) -> bigframes.series.Series

Return the product of the values over the requested axis.

Parameter
Name
Description
numeric_only
bool, default False

Include only float, int, boolean columns.

Returns
Type
Description
bigframes.series.Series
Series with the product of the values.

radd

radd(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get addition of DataFrame and other, element-wise (binary operator +).

Equivalent to other + dataframe. With reverse version, add.

Among flexible wrappers ( add , sub , mul , div , mod , pow ) to arithmetic operators: + , - , * , / , // , % , ** .

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

rank

rank(axis=0, method: str = "average", numeric_only=False, na_option: str = "keep", ascending=True) -> bigframes.dataframe.DataFrame

Compute numerical data ranks (1 through n) along axis.

By default, equal values are assigned a rank that is the average of the ranks of those values.

Parameters
Name
Description
method
{'average', 'min', 'max', 'first', 'dense'}, default 'average'

How to rank the group of records that have the same value (i.e. ties): average: average rank of the group; min: lowest rank in the group; max: highest rank in the group; first: ranks assigned in the order they appear in the array; dense: like 'min', but rank always increases by 1 between groups.

numeric_only
bool, default False

For DataFrame objects, rank only numeric columns if set to True.

na_option
{'keep', 'top', 'bottom'}, default 'keep'

How to rank NaN values: keep: assign NaN rank to NaN values; top: assign lowest rank to NaN values; bottom: assign highest rank to NaN values.

ascending
bool, default True

Whether or not the elements should be ranked in ascending order.

Returns
Type
Description
same type as caller
Return a Series or DataFrame with data ranks as values.
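
To make the tie-breaking methods concrete, here is a small plain-Python sketch that ranks a single list in ascending order. It is an illustration of the documented semantics only: the function name `rank_values` is hypothetical, NaN handling (na_option) is left out, and nothing here reflects how BigQuery DataFrames computes ranks internally.

```python
def rank_values(values, method="average"):
    """Rank values 1 through n in ascending order, resolving ties by `method`."""
    # sorted() is stable, so tied values keep their order of appearance.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    dense = 0
    pos = 0
    while pos < len(order):
        end = pos
        # Extend `end` to cover the whole group of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        dense += 1
        for offset, i in enumerate(order[pos:end + 1]):
            if method == "average":
                ranks[i] = (pos + 1 + end + 1) / 2
            elif method == "min":
                ranks[i] = pos + 1
            elif method == "max":
                ranks[i] = end + 1
            elif method == "first":
                ranks[i] = pos + 1 + offset
            elif method == "dense":
                ranks[i] = dense
            else:
                raise ValueError(f"unknown method: {method}")
        pos = end + 1
    return ranks


scores = [10, 20, 10, 30]
average_ranks = rank_values(scores, "average")
dense_ranks = rank_values(scores, "dense")
```

The two tied 10s share rank 1.5 under 'average', while 'dense' gives them rank 1 and lets the next group continue at 2.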

rdiv

rdiv(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get floating division of DataFrame and other, element-wise (binary operator / ).

Equivalent to other / dataframe . With reverse version, truediv .

Among flexible wrappers ( add , sub , mul , div , mod , pow ) to arithmetic operators: + , - , * , / , // , % , ** .

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

rename

rename(*, columns: typing.Mapping[typing.Hashable, typing.Hashable]) -> bigframes.dataframe.DataFrame

Rename columns.

Dict values must be unique (1-to-1). Labels not contained in a dict will be left as-is. Extra labels listed don't throw an error.

Parameter
Name
Description
columns
Mapping

Dict-like from old column labels to new column labels.

Exceptions
Type
Description
KeyError
If any of the labels is not found.
Returns
Type
Description
bigframes.dataframe.DataFrame
DataFrame with the renamed axis labels.

rename_axis

rename_axis(mapper: typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]], **kwargs) -> bigframes.dataframe.DataFrame

Set the name of the axis for the index.

Returns
Type
Description
bigframes.dataframe.DataFrame
DataFrame with the new index name

reorder_levels

reorder_levels(order: typing.Union[str, int, typing.Sequence[typing.Union[str, int]]])

Rearrange index levels using input order. May not drop or duplicate levels.

Parameter
Name
Description
order
list of int or list of str

List representing new level order. Reference level by number (position) or by key (label).

Returns
Type
Description
DataFrame
DataFrame of rearranged index.

reset_index

reset_index(*, drop: bool = False) -> bigframes.dataframe.DataFrame

Reset the index.

Reset the index of the DataFrame, and use the default one instead.

Parameter
Name
Description
drop
bool, default False

Do not try to insert index into dataframe columns. This resets the index to the default integer index.

Returns
Type
Description
bigframes.dataframe.DataFrame
DataFrame with the new index.

rfloordiv

rfloordiv(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get integer division of DataFrame and other, element-wise (binary operator //).

Equivalent to other // dataframe. With reverse version, floordiv.

Among flexible wrappers ( add , sub , mul , div , mod , pow ) to arithmetic operators: + , - , * , / , // , % , ** .

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

rmod

rmod(other: int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get modulo of DataFrame and other, element-wise (binary operator % ).

Equivalent to other % dataframe . With reverse version, mod .

Among flexible wrappers ( add , sub , mul , div , mod , pow ) to arithmetic operators: + , - , * , / , // , % , ** .

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.
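
The difference between a flexible wrapper and its reverse version is only the operand order. The plain-Python sketch below shows this for modulo on a single column of integers; the variable names are illustrative and no BigQuery DataFrames call is made.

```python
column = [1, 2, 3, 4]
other = 10

# Forward version: each element is the left operand (like dataframe % other).
mod_result = [x % other for x in column]

# Reverse version: each element is the right operand (like other % dataframe).
rmod_result = [other % x for x in column]
```

The same pattern applies to the other reverse wrappers such as rsub and rtruediv, where swapping the operands also changes the result.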

rmul

rmul(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get multiplication of DataFrame and other, element-wise (binary operator *).

Equivalent to other * dataframe. With reverse version, mul.

Among flexible wrappers ( add , sub , mul , div , mod , pow ) to arithmetic operators: + , - , * , / , // , % , ** .

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

rsub

rsub(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get subtraction of DataFrame and other, element-wise (binary operator - ).

Equivalent to other - dataframe . With reverse version, sub .

Among flexible wrappers ( add , sub , mul , div , mod , pow ) to arithmetic operators: + , - , * , / , // , % , ** .

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

rtruediv

rtruediv(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get floating division of DataFrame and other, element-wise (binary operator / ).

Equivalent to other / dataframe . With reverse version, truediv .

Among flexible wrappers ( add , sub , mul , div , mod , pow ) to arithmetic operators: + , - , * , / , // , % , ** .

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

sample

sample(
    n: typing.Optional[int] = None,
    frac: typing.Optional[float] = None,
    *,
    random_state: typing.Optional[int] = None
) -> bigframes.dataframe.DataFrame

Return a random sample of items from an axis of object.

You can use random_state for reproducibility.

Parameters
Name
Description
n
Optional[int], default None

Number of items from axis to return. Cannot be used with frac . Default = 1 if frac = None.

frac
Optional[float], default None

Fraction of axis items to return. Cannot be used with n .

random_state
Optional[int], default None

Seed for random number generator.
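
The interplay of n, frac, and random_state can be sketched with the standard library. This is a hedged illustration over a plain list: the function name `sample_rows` is hypothetical, rounding `frac * len(rows)` to pick the sample size is an assumption of this sketch, and the random numbers will not match what BigQuery DataFrames produces for the same seed.

```python
import random


def sample_rows(rows, n=None, frac=None, random_state=None):
    """Draw rows without replacement; n and frac are mutually exclusive."""
    if n is not None and frac is not None:
        raise ValueError("n and frac cannot be used together")
    if frac is not None:
        n = round(frac * len(rows))  # assumption: round to the nearest row count
    elif n is None:
        n = 1  # default of one item when neither n nor frac is given
    # A dedicated generator seeded with random_state makes the draw repeatable.
    return random.Random(random_state).sample(rows, n)


rows = list(range(10))
first_draw = sample_rows(rows, n=3, random_state=42)
second_draw = sample_rows(rows, n=3, random_state=42)
```

Calling the function twice with the same seed returns the same rows, which is the reproducibility that random_state is meant to provide.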

set_index

set_index(
    keys: typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]],
    append: bool = False,
    drop: bool = True
) -> bigframes.dataframe.DataFrame

Set the DataFrame index using existing columns.

Set the DataFrame index (row labels) using one existing column. The index can replace the existing index.

Returns
Type
Description
DataFrame
Changed row labels.

shift

shift(periods: int = 1) -> bigframes.dataframe.DataFrame

Shift index by desired number of periods.

Shifts the index without realigning the data.

Returns
Type
Description
NDFrame
Copy of input object, shifted.

sort_index

sort_index(ascending: bool = True, na_position: typing.Literal["first", "last"] = "last") -> bigframes.dataframe.DataFrame

Sort object by labels (along an axis).

sort_values

sort_values(
    by: str | typing.Sequence[str],
    *,
    ascending: bool | typing.Sequence[bool] = True,
    kind: str = "quicksort",
    na_position: typing.Literal["first", "last"] = "last"
) -> DataFrame

Sort by the values along row axis.

Parameters
Name
Description
by
str or Sequence[str]

Name or list of names to sort by.

ascending
bool or Sequence[bool], default True

Sort ascending vs. descending. Specify list for multiple sort orders. If this is a list of bools, it must match the length of by.

kind
str, default 'quicksort'

Choice of sorting algorithm. Accepts 'quicksort', 'mergesort', 'heapsort', 'stable'. Ignored except when determining whether to sort stably. 'mergesort' or 'stable' will result in a stable reorder.

na_position
{'first', 'last'}, default 'last'

'first' puts NaNs at the beginning; 'last' puts NaNs at the end.
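
The effect of na_position and of a stable sort can be sketched in plain Python on a list of row dicts with a single sort key. The function name `sort_rows` and the sample data are hypothetical, None stands in for NaN, and the sketch says nothing about how BigQuery DataFrames orders rows internally.

```python
def sort_rows(rows, by, ascending=True, na_position="last"):
    """Sort row dicts by one key, placing missing (None) keys first or last."""
    present = [r for r in rows if r[by] is not None]
    missing = [r for r in rows if r[by] is None]
    # sorted() is stable, so rows with equal keys keep their original order.
    present = sorted(present, key=lambda r: r[by], reverse=not ascending)
    return missing + present if na_position == "first" else present + missing


rows = [
    {"k": 2, "id": "a"},
    {"k": None, "id": "b"},
    {"k": 1, "id": "c"},
    {"k": 2, "id": "d"},
]
ids_default = [r["id"] for r in sort_rows(rows, "k")]
ids_na_first = [r["id"] for r in sort_rows(rows, "k", na_position="first")]
```

Rows "a" and "d" share the key 2 and stay in their original relative order, which is what a stable reorder means.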

std

std(*, numeric_only: bool = False) -> bigframes.series.Series

Return sample standard deviation over requested axis.

Normalized by N-1 by default.

Parameter
Name
Description
numeric_only
bool, default False

Include only float, int, boolean columns.

Returns
Type
Description
bigframes.series.Series
Series with sample standard deviation.
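
The N-1 normalization can be checked by hand with the standard library. The sketch below computes the sample standard deviation of one illustrative list of numbers and, for contrast, the population version normalized by N; it does not use BigQuery DataFrames and the data is made up for the example.

```python
import math
import statistics

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(values)
mean = sum(values) / n
squared_deviations = sum((x - mean) ** 2 for x in values)

# Sample standard deviation: divide by N-1 (the normalization described above).
sample_std = math.sqrt(squared_deviations / (n - 1))

# Population standard deviation: divide by N, shown only for contrast.
population_std = math.sqrt(squared_deviations / n)
```

The sample value agrees with `statistics.stdev(values)`, which also normalizes by N-1, and it is always at least as large as the population value.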

sub

sub(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get subtraction of DataFrame and other, element-wise (binary operator - ).

Equivalent to dataframe - other . With reverse version, rsub .

Among flexible wrappers ( add , sub , mul , div , mod , pow ) to arithmetic operators: + , - , * , / , // , % , ** .

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

subtract

subtract(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get subtraction of DataFrame and other, element-wise (binary operator - ).

Equivalent to dataframe - other . With reverse version, rsub .

Among flexible wrappers ( add , sub , mul , div , mod , pow ) to arithmetic operators: + , - , * , / , // , % , ** .

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

sum

sum(*, numeric_only: bool = False) -> bigframes.series.Series

Return the sum of the values over the requested axis.

This is equivalent to the method numpy.sum .

Parameter
Name
Description
numeric_only
bool, default False

Include only float, int, boolean columns.

Returns
Type
Description
bigframes.series.Series
Series with the sum of values.

tail

tail(n: int = 5) -> bigframes.dataframe.DataFrame

Return the last n rows.

This function returns last n rows from the object based on position. It is useful for quickly verifying data, for example, after sorting or appending rows.

For negative values of n, this function returns all rows except the first |n| rows, equivalent to df[|n|:].

If n is larger than the number of rows, this function returns all rows.

Parameter
Name
Description
n
int, default 5

Number of rows to select.
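
The three cases described above (positive n, negative n, and n larger than the number of rows) map directly onto list slicing. The following plain-Python sketch uses a hypothetical helper `tail_rows` on a list standing in for the rows; it is not the BigQuery DataFrames implementation.

```python
def tail_rows(rows, n=5):
    """Return the last n rows, or all but the first |n| rows when n is negative."""
    if n == 0:
        return []
    if n < 0:
        return rows[abs(n):]  # drop the first |n| rows
    return rows[-n:]  # slicing returns every row when n exceeds len(rows)


rows = list(range(10))
last_three = tail_rows(rows, 3)
all_but_first_three = tail_rows(rows, -3)
everything = tail_rows(rows, 50)
```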

to_csv

to_csv(path_or_buf: str, sep=",", *, header: bool = True, index: bool = True) -> None

Write object to a comma-separated values (csv) file on Cloud Storage.

Parameters
Name
Description
path_or_buf
str

A destination URI of Cloud Storage file(s) to store the extracted dataframe, in the format gs://<bucket_name>/<object_name_or_glob>. If the data size is more than 1GB, you must use a wildcard to export the data into multiple files; the size of the files varies. None, file-like objects, and local file paths are not yet supported.

index
bool, default True

If True, write row names (index).

Returns
Type
Description
None
String output not yet supported.

to_gbq

to_gbq(
    destination_table: str,
    *,
    if_exists: typing.Optional[typing.Literal["fail", "replace", "append"]] = "fail",
    index: bool = True,
    ordering_id: typing.Optional[str] = None
) -> None

Write a DataFrame to a BigQuery table.

Parameters
Name
Description
destination_table
str

Name of table to be written, in the form dataset.tablename or project.dataset.tablename .

if_exists
str, default 'fail'

Behavior when the destination table exists. Value can be one of:
'fail': If the table exists, raise pandas_gbq.gbq.TableCreationError.
'replace': If the table exists, drop it, recreate it, and insert data.
'append': If the table exists, insert data. Create the table if it does not exist.

index
bool, default True

Whether to write row names (index).

ordering_id
Optional[str], default None

If set, write the ordering of the DataFrame as a column in the result table with this name.

to_json

to_json(
    path_or_buf: str,
    orient: typing.Literal["split", "records", "index", "columns", "values", "table"] = "columns",
    *,
    lines: bool = False,
    index: bool = True
) -> None

Convert the object to a JSON string, written to Cloud Storage.

Note: NaN and None values will be converted to null, and datetime objects will be converted to UNIX timestamps.

Parameters
Name
Description
path_or_buf
str

A destination URI of Cloud Storage file(s) to store the extracted dataframe, in the format gs://<bucket_name>/<object_name_or_glob>. Must contain a wildcard * character. If the data size is more than 1GB, you must use a wildcard to export the data into multiple files; the size of the files varies. None, file-like objects, and local file paths are not yet supported.

orient
{'split', 'records', 'index', 'columns', 'values', 'table'}, default 'columns'

Indication of expected JSON string format.
* Series:
  - default is 'index'
  - allowed values are: {'split', 'records', 'index', 'table'}.
* DataFrame:
  - default is 'columns'
  - allowed values are: {'split', 'records', 'index', 'columns', 'values', 'table'}.
* The format of the JSON string:
  - 'split': dict like {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
  - 'records': list like [{column -> value}, ... , {column -> value}]
  - 'index': dict like {index -> {column -> value}}
  - 'columns': dict like {column -> {index -> value}}
  - 'values': just the values array
  - 'table': dict like {'schema': {schema}, 'data': {data}}, describing the data, where the data component is like orient='records'.

index
bool, default True

If True, write row names (index).

lines
bool, default False

If 'orient' is 'records', write out line-delimited JSON format. Will throw ValueError for any other 'orient', since the others are not list-like.

Returns
Type
Description
None
String output not yet supported.
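
To make the orient layouts easier to compare, the sketch below builds five of them by hand for a tiny two-row table using only the standard library. It illustrates the shapes listed above and is not how BigQuery DataFrames serializes data: the labels and values are invented, 'table' is omitted because its schema part is not spelled out here, and nothing is written to Cloud Storage.

```python
import json

index = ["r1", "r2"]
columns = ["a", "b"]
data = [[1, 2], [3, 4]]

layouts = {
    "split": {"index": index, "columns": columns, "data": data},
    "records": [dict(zip(columns, row)) for row in data],
    "index": {i: dict(zip(columns, row)) for i, row in zip(index, data)},
    "columns": {
        c: {i: row[k] for i, row in zip(index, data)}
        for k, c in enumerate(columns)
    },
    "values": data,
}

# Line-delimited output only makes sense for the list-like 'records' layout:
# one JSON object per line.
records_lines = "\n".join(json.dumps(record) for record in layouts["records"])
```

Only 'records' is a flat list of row objects, which is why it is the one layout that pairs with lines=True.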

to_numpy

to_numpy(dtype=None, copy=False, na_value=None, **kwargs) -> numpy.ndarray

Convert the DataFrame to a NumPy array.

Parameters
Name
Description
dtype
None

The dtype to pass to numpy.asarray() .

copy
bool, default False

Whether to ensure that the returned value is not a view on another array.

na_value
Any, default None

The value to use for missing values. The default value depends on dtype and the dtypes of the DataFrame columns.

Returns
Type
Description
numpy.ndarray
The converted NumPy array.

to_pandas

to_pandas(
    max_download_size: typing.Optional[int] = None,
    sampling_method: typing.Optional[str] = None,
    random_state: typing.Optional[int] = None
) -> pandas.core.frame.DataFrame

Write DataFrame to pandas DataFrame.

Parameters
Name
Description
max_download_size
int, default None

Download size threshold in MB. If max_download_size is exceeded when downloading data (e.g., to_pandas()), the data will be downsampled if bigframes.options.sampling.enable_downsampling is True; otherwise, an error will be raised. If set to a value other than None, this will supersede the global config.

sampling_method
str, default None

Downsampling algorithm to use. The choices are: "head": returns a portion of the data from the beginning; it is fast and requires minimal computation to perform the downsampling. "uniform": returns uniform random samples of the data. If set to a value other than None, this will supersede the global config.

random_state
int, default None

The seed for the uniform downsampling algorithm. If provided, the uniform method may take longer to execute and require more computation. If set to a value other than None, this will supersede the global config.

Returns
Type
Description
pandas.DataFrame
A pandas DataFrame with all rows and columns of this DataFrame if the data_sampling_threshold_mb is not exceeded; otherwise, a pandas DataFrame with downsampled rows and all columns of this DataFrame.

to_parquet

to_parquet(path: str, *, index: bool = True) -> None

Write a DataFrame to the binary Parquet format.

This function writes the dataframe as a parquet file (https://parquet.apache.org/) to Cloud Storage.

Parameters
Name
Description
path
str

Destination URI(s) of Cloud Storage file(s) to store the extracted dataframe, in the format gs://<bucket_name>/<object_name_or_glob>. If the data size is more than 1GB, you must use a wildcard to export the data into multiple files; the size of the files varies.

index
bool, default True

If True , include the dataframe's index(es) in the file output. If False , they will not be written to the file.

truediv

truediv(other: float | int | bigframes.series.Series, axis: str | int = "columns") -> DataFrame

Get floating division of DataFrame and other, element-wise (binary operator / ).

Equivalent to dataframe / other . With reverse version, rtruediv .

Among flexible wrappers ( add , sub , mul , div , mod , pow ) to arithmetic operators: + , - , * , / , // , % , ** .

Parameters
Name
Description
other
float, int, or Series

Any single or multiple element data structure, or list-like object.

axis
{0 or 'index', 1 or 'columns'}

Whether to compare by the index (0 or 'index') or columns (1 or 'columns'). For Series input, axis to match Series index on.

Returns
Type
Description
DataFrame
DataFrame result of the arithmetic operation.

value_counts

value_counts(
    subset: typing.Optional[typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]]] = None,
    normalize: bool = False,
    sort: bool = True,
    ascending: bool = False,
    dropna: bool = True
)

Return a Series containing counts of unique rows in the DataFrame.

Parameters
Name
Description
subset
label or list of labels, optional

Columns to use when counting unique combinations.

normalize
bool, default False

Return proportions rather than frequencies.

sort
bool, default True

Sort by frequencies.

ascending
bool, default False

Sort in ascending order.

dropna
bool, default True

Don’t include counts of rows that contain NA values.

Returns
Type
Description
Series
Series containing counts of unique rows in the DataFrame
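
The parameters above can be illustrated with `collections.Counter` over a list of row tuples. This is a plain-Python sketch of the described behavior only: the function name `count_rows` is hypothetical, None stands in for NA, the subset parameter is not modeled, and the result is a list of (row, count) pairs rather than a Series.

```python
from collections import Counter


def count_rows(rows, normalize=False, ascending=False, dropna=True):
    """Count identical rows, sorted by frequency."""
    if dropna:
        # Skip rows that contain a missing value.
        rows = [r for r in rows if None not in r]
    counts = Counter(rows)
    total = sum(counts.values())
    items = sorted(counts.items(), key=lambda kv: kv[1], reverse=not ascending)
    if normalize:
        # Proportions instead of raw frequencies.
        return [(row, count / total) for row, count in items]
    return items


rows = [("a", 1), ("b", 2), ("a", 1), ("c", None), ("a", 1), ("b", 2)]
frequencies = count_rows(rows)
proportions = count_rows(rows, normalize=True)
```

With the defaults, the row containing None is dropped before counting, so the proportions are computed over the five remaining rows.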

var

var(*, numeric_only: bool = False) -> bigframes.series.Series

Return unbiased variance over requested axis.

Normalized by N-1 by default.

Parameter
Name
Description
numeric_only
bool, default False

Include only float, int, boolean columns.

Returns
Type
Description
bigframes.series.Series
Series with unbiased variance over requested axis.