fsds_100719.ft package

A collection of submodules by online-ds-ft-100719. Maintained by James Irving (GitHub: jirvingphd) james.irving@flatironschool.com

Submodules

fsds_100719.ft.hakkeray module

My Template Module

Name: Ru Keïn
Email: rukeine@gmail.com
GitHub Profile: https://github.com/hakkeray

fsds_100719.ft.hakkeray.hot_stats(data, column, verbose=False, t=None)

Scans the values of a column within a dataframe and displays its datatype, nulls (incl. pct of total), unique values, non-null value counts, and statistical info (if the datatype is numeric).

Parameters:

**args:

data: accepts dataframe

column: accepts the name of a column within the dataframe, passed as a quoted string

**kwargs:

verbose: (optional) accepts a boolean (default=False); verbose=True will display all unique values found.

t: (optional) accepts the name of a target column (default=None); the correlation coefficient between column and the target is calculated with the pandas data.corr() function.

Examples:

hot_stats(df, 'str_column') --> where df = data, 'str_column' = column you want to scan

hot_stats(df, 'numeric_column', t='target') --> where 'target' = column to check correlation value
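
The kind of report described above can be approximated with plain pandas calls. The sketch below is not the package's source: column_summary, the toy dataframe, and its column names are hypothetical, and it only assumes that the report combines the datatype, null counts, unique values, value counts, .describe(), and an optional correlation against a target column.

```python
import pandas as pd

# Toy data; the column names are made up for illustration.
df = pd.DataFrame({
    "condition": [3, 4, 3, 5, None, 3],
    "price": [200.0, 310.0, 190.0, 420.0, 250.0, 205.0],
})

def column_summary(data, column, t=None):
    """Hypothetical stand-in for the kind of report hot_stats describes."""
    col = data[column]
    n_null = int(col.isna().sum())
    summary = {
        "dtype": str(col.dtype),
        "n_null": n_null,
        "pct_null": round(100 * n_null / len(col), 2),
        "n_unique": int(col.nunique()),      # NaN is not counted
        "value_counts": col.value_counts(),  # non-null values only
    }
    if pd.api.types.is_numeric_dtype(col):
        summary["describe"] = col.describe()
        if t is not None:
            # Pairwise correlation between the scanned column and the target
            summary["corr_with_target"] = data[[column, t]].corr().loc[column, t]
    return summary

report = column_summary(df, "condition", t="price")
print(report["dtype"], report["n_null"], report["pct_null"], report["n_unique"])
```

Returning a dict instead of printing keeps the sketch easy to inspect; hot_stats itself displays its results rather than returning them.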

Developer notes: additional features to add in the future:

- get mode(s)
- functionality for string objects
- pass multiple columns at once and display all

-----------------

SAMPLE OUTPUT:

    ************************************

    --------> HOT!STATS <--------

    CONDITION
    Data Type: int64

    count    21597.000000
    mean         3.409825
    std          0.650546
    min          1.000000
    25%          3.000000
    50%          3.000000
    75%          4.000000
    max          5.000000
    Name: condition, dtype: float64

    à-la-Mode:
    0    3
    dtype: int64

    No Nulls Found!

    Non-Null Value Counts:
    3    14020
    4     5677
    5     1701
    2      170
    1       29
    Name: condition, dtype: int64

    # Unique Values: 5

fsds_100719.ft.jirving module

My Template Module

Name: James M. Irving
Email: james.irving.phd@gmail.com
GitHub Profile: https://github.com/jirvingphd

fsds_100719.ft.jirving.check_column(df, col_name, n_unique=10)

Displays info on null values, datatype, and unique values, along with the output of .describe().

Args:
    df (DataFrame): contains the column.
    col_name (str): name of the df column to show.
    n_unique (int): number of unique values to show.

Return:
    fig, ax (Matplotlib Figure and Axes)
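
One plausible reading of the n_unique argument is a cap on how many distinct values are listed. The sketch below illustrates that reading with a hypothetical helper, top_unique, which is not part of this package; showing the most frequent values first and counting nulls as a value are assumptions made for the example.

```python
import pandas as pd

def top_unique(df, col_name, n_unique=10):
    """Hypothetical helper: the n_unique most frequent values, nulls included."""
    counts = df[col_name].value_counts(dropna=False)
    return counts.head(n_unique)

# Toy data; the column name is made up for illustration.
df = pd.DataFrame({"grade": ["B", "A", "B", None, "C", "B", "A"]})
top = top_unique(df, "grade", n_unique=2)
print(top)
```

Capping the listing keeps the display readable for high-cardinality columns, where printing every unique value would bury the null and datatype information.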

fsds_100719.ft.jirving.testing()