battetl package

Subpackages

Submodules

battetl.BattETL module

class battetl.BattETL.BattETL(config_path: str, user_transform_test_data: Callable[[DataFrame], DataFrame] | None = None, user_transform_cycle_stats: Callable[[DataFrame], DataFrame] | None = None, env_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/battetl/checkouts/latest/docs/.env')

Bases: object

extract()

Extracts the data in the target directory specified in the config.

Returns:

self – Returns a reference to the instance object

Return type:

BattETL

load()

Loads the test data, cycle stats, and schedule file to the target database(s) specified in the config.

Returns:

num_rows_inserted – The number of rows inserted into the target_table.

Return type:

int

transform()

Transforms the test data from the target directory.

Returns:

self – Returns a reference to the instance object

Return type:

BattETL
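
Because extract() and transform() are documented to return the instance, the three steps can be chained. The following is a minimal sketch under that assumption; the helper name run_full_etl is hypothetical, and the import is deferred so the function can be defined without battetl installed.

```python
def run_full_etl(config_path: str) -> int:
    """Run extract, transform, and load for a single config file."""
    # Deferred import: only needed when the function is actually called.
    from battetl.BattETL import BattETL

    etl = BattETL(config_path)
    # extract() and transform() return the instance, so calls can be chained;
    # load() is documented to return the number of rows inserted.
    return etl.extract().transform().load()
```

Calling run_full_etl with the path to a config file would then return the row count reported by load().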

battetl.battetl_quick module

battetl.battetl_quick.battetl_quick(file_path: str, file_meta=None) DataFrame

The BattETL Quick Mode. Extracts test data or cycle stats from a file, transforms it, and loads it to the database.

Parameters:
  • file_path (str) – Name of the file to be processed.

  • file_meta (dict) – A dictionary used to decode unstructured data.

Returns:

The test data or cycle stats that were loaded to the database.

Return type:

pd.DataFrame

battetl.battetl_quick.clone_battdb(battdb_folder: str = 'battdb')

Clones the BattDB repo to the specified folder.

Parameters:

battdb_folder (str, optional) – Name of the folder to clone battdb to. The default is ‘battdb’.

battetl.battetl_quick.convert_time_to_seconds(df: DataFrame, column_name: str)

Converts a column of time values in the format 0:00:00.000 to seconds.

Parameters:
  • df (pd.DataFrame) – The DataFrame to be processed.

  • column_name (str) – The name of the column to be converted.
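
To illustrate the 0:00:00.000 format, here is a small standard-library sketch that converts a single time string to seconds. It is not the library's implementation (which operates on a DataFrame column); the helper name time_string_to_seconds is hypothetical.

```python
def time_string_to_seconds(value: str) -> float:
    """Convert 'H:MM:SS.mmm' to a number of seconds."""
    hours, minutes, seconds = value.strip().split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


print(time_string_to_seconds("1:02:03.500"))  # 3723.5
```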

battetl.battetl_quick.is_docker_compose_installed() bool

Checks if Docker Compose is installed.

Return type:

bool

battetl.battetl_quick.is_docker_installed() bool

Checks if Docker is installed.

Return type:

bool
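
One common way to perform this kind of check is to look for the executable on the PATH. The sketch below assumes that approach; the library may instead invoke the tools directly, and the helper name tool_on_path is hypothetical.

```python
import shutil


def tool_on_path(name: str) -> bool:
    """Return True if an executable called `name` is found on the PATH."""
    return shutil.which(name) is not None


docker_ok = tool_on_path("docker")
compose_ok = tool_on_path("docker-compose")
print(f"docker: {docker_ok}, docker-compose: {compose_ok}")
```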

battetl.battetl_quick.run_docker_compose(battdb_folder: str = 'battdb') CompletedProcess

Runs docker-compose up -d with the docker-compose.yml file in the BattDB repo.

Parameters:

battdb_folder (str, optional) – Name of the folder BattDB was cloned to. The default is ‘battdb’.

Return type:

subprocess.CompletedProcess
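
The command described above can be sketched with subprocess.run, which returns a CompletedProcess. This is an illustration only: the helper names and the use of the -f flag to point at the compose file are assumptions, not the library's actual invocation.

```python
import subprocess
from pathlib import Path


def compose_up_command(battdb_folder: str = "battdb") -> list[str]:
    """Build the docker-compose command for the compose file in the folder."""
    compose_file = Path(battdb_folder) / "docker-compose.yml"
    return ["docker-compose", "-f", str(compose_file), "up", "-d"]


def compose_up(battdb_folder: str = "battdb") -> subprocess.CompletedProcess:
    """Run the command; requires Docker Compose to be installed."""
    return subprocess.run(
        compose_up_command(battdb_folder),
        check=False,          # inspect returncode instead of raising
        capture_output=True,  # keep stdout/stderr on the result object
        text=True,
    )
```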

battetl.constants module

class battetl.constants.Constants

Bases: object

ARBIN_SCHEDULE_FILE_ENCODING = 'latin-1'
BATTDB_QUICK_SCHEMA_VERSION = 1.1
BATTDB_SCHEMA_VERSION = 11.2
BATTVIZ_CYCLE_STATS_PATH = 'd/cycling-results/cycling-results'
BATTVIZ_TEST_DATA_PATH = 'd/Test-Data/test-data'
COLUMNS_ARBIN_CYCLE_STATS_ONLY = {'Charge Time (s)', 'Coulombic Efficiency (%)', 'Date_Time', 'Discharge Time (s)', 'V_Max_On_Cycle (V)', 'mAh/g'}
COLUMNS_ARBIN_TEST_DATA_ONLY = {'ACR (Ohm)', 'Data Point', 'Date Time', 'Internal Resistance (Ohm)', 'dQ/dV (Ah/V)', 'dV/dQ (V/Ah)', 'dV/dt (V/s)'}
COLUMNS_CYCLE_STATS = {'calculated_cc_capacity_mah', 'calculated_cc_charge_time_s', 'calculated_charge_capacity_mah', 'calculated_charge_energy_mwh', 'calculated_coulombic_efficiency', 'calculated_cv_capacity_mah', 'calculated_cv_charge_time_s', 'calculated_discharge_capacity_mah', 'calculated_discharge_energy_mwh', 'calculated_eighty_percent_charge_time_s', 'calculated_fifty_percent_charge_time_s', 'calculated_max_charge_temp_c', 'calculated_max_discharge_temp_c', 'cycle', 'cycle_stats_id', 'other_details', 'reported_charge_capacity_mah', 'reported_charge_energy_mwh', 'reported_charge_time_s', 'reported_coulombic_efficiency', 'reported_discharge_capacity_mah', 'reported_discharge_energy_mwh', 'reported_discharge_time_s', 'test_id', 'test_time_s'}
COLUMNS_MACCOR_CYCLE_STATS_CUSTOMER1 = {'AH-IN', 'AH-OUT', 'Cycle', 'Date', 'T1_End', 'T1_Max', 'T1_Min', 'T1_Start'}
COLUMNS_MACCOR_CYCLE_STATS_ONLY = {'AH-IN', 'AH-OUT', 'Cycle', 'Date', 'T1_End', 'T1_Max', 'T1_Min', 'T1_Start'}
COLUMNS_MACCOR_TEST_DATA_CUSTOMER1 = {'AtRate (0x02)', 'AtRateTimeToEmpty (0x04)', 'AverageCurrent (0x14)', 'AverageTimeToEmpty (0x16)', 'AverageTimeToFull (0x18)', 'BatteryStatus (0x0A)', 'Capacity(Ah)', 'ChargingCurrent (0x32)', 'ChargingVoltage (0x30)', 'Current (0x0C)', 'Current(A)Voltage(V)', 'Cyc#', 'DPt Time', 'DesignCapacity (0x3C)', 'ES', 'FullChargeCapacity (0x12)', 'ManufacturerAccess (0x00)', 'RelativeStateOfCharge (0x2C)', 'RemainingCapacity (0x10)', 'Step', 'StepTime(s)', 'Temperature (0x06)', 'TestTime(s)', 'Volt 1', 'Voltage (0x08)', 'Watt-hr'}
COLUMNS_MACCOR_TEST_DATA_ONLY = {'Capacity(Ah)', 'Current(A)', 'Cyc#', 'DPt Time', 'EV Temp', 'Step', 'StepTime(s)', 'Temp 1', 'TestTime(s)', 'Voltage(V)'}
COLUMNS_MACCOR_TEST_DATA_TYPE2_ONLY = {'Capacity', 'Cycle C', 'Cycle P', 'DPT Time', 'ES', 'Energy', 'MD', 'Rec'}
COLUMNS_MAPPING_ARBIN_CYCLE_STATS = {'Charge Capacity (Ah)': 'reported_charge_capacity_ah', 'Charge Capacity(Ah)': 'reported_charge_capacity_ah', 'Charge Energy (Wh)': 'reported_charge_energy_wh', 'Charge Time (s)': 'reported_charge_time_s', 'Charge Time(s)': 'reported_charge_time_s', 'Charge_Energy(Wh)': 'reported_charge_energy_wh', 'Coulombic Efficiency (%)': 'reported_coulombic_efficiency', 'Current (A)': 'current_a', 'Current(A)': 'current_a', 'Cycle Index': 'cycle', 'Date_Time': 'recorded_datetime', 'Discharge Capacity (Ah)': 'reported_discharge_capacity_ah', 'Discharge Capacity(Ah)': 'reported_discharge_capacity_ah', 'Discharge Energy (Wh)': 'reported_discharge_energy_wh', 'Discharge Energy(Wh)': 'reported_discharge_energy_wh', 'Discharge Time (s)': 'reported_discharge_time_s', 'Step Index': 'step', 'Step Time (s)': 'step_time_s', 'Step Time(s)': 'step_time_s', 'TC_Counter1': 'tc_counter1', 'Test Time (s)': 'test_time_s', 'Test Time(s)': 'test_time_s', 'Voltage (V)': 'voltage_v', 'Voltage(V)': 'voltage_v'}
COLUMNS_MAPPING_ARBIN_TEST_DATA = {'Charge Capacity (Ah)': 'arbin_charge_capacity_ah', 'Charge Capacity(Ah)': 'arbin_charge_capacity_ah', 'Charge Energy (Wh)': 'arbin_charge_energy_wh', 'Charge_Capacity(Ah)': 'arbin_charge_capacity_ah', 'Charge_Energy (Wh)': 'arbin_charge_energy_wh', 'Charge_Energy(Wh)': 'arbin_charge_energy_wh', 'Current (A)': 'current_a', 'Current(A)': 'current_a', 'Cycle Index': 'cycle', 'Cycle_Index': 'cycle', 'Data Point': 'data_point', 'Date Time': 'recorded_datetime', 'Date_Time': 'recorded_datetime', 'Discharge Capacity (Ah)': 'arbin_discharge_capacity_ah', 'Discharge Capacity(Ah)': 'arbin_discharge_capacity_ah', 'Discharge Energy (Wh)': 'arbin_discharge_energy_wh', 'Discharge Energy(Wh)': 'arbin_discharge_energy_wh', 'Discharge_Capacity(Ah)': 'arbin_discharge_capacity_ah', 'Discharge_Energy(Wh)': 'arbin_discharge_energy_wh', 'Internal Resistance (Ohm)': 'impedance_ohm', 'Power (W)': 'power_w', 'Step Index': 'step', 'Step Time (s)': 'step_time_s', 'Step Time(s)': 'step_time_s', 'Step_Index': 'step', 'Step_Time(s)': 'step_time_s', 'TC_Counter1': 'tc_counter1', 'Test Time (s)': 'test_time_s', 'Test Time(s)': 'test_time_s', 'Test_Time(s)': 'test_time_s', 'Voltage (V)': 'voltage_v', 'Voltage(V)': 'voltage_v'}
COLUMNS_MAPPING_MACCOR_CYCLE_STATS = {'ACR': 'acr_ohm', 'AH-IN': 'reported_charge_capacity_ah', 'AH-OUT': 'reported_discharge_capacity_ah', 'Current': 'maccor_min_current_ma', 'Cycle': 'cycle', 'T1_End': 'maccor_charge_thermocouple_end_c', 'T1_End.1': 'maccor_discharge_thermocouple_end_c', 'T1_Max': 'maccor_charge_thermocouple_max_c', 'T1_Max.1': 'maccor_discharge_thermocouple_max_c', 'T1_Min': 'maccor_charge_thermocouple_min_c', 'T1_Min.1': 'maccor_discharge_thermocouple_min_c', 'T1_Start': 'maccor_charge_thermocouple_start_c', 'T1_Start.1': 'maccor_discharge_thermocouple_start_c', 'Test Time': 'test_time_s', 'Voltage': 'maccor_min_voltage_mv', 'WH-IN': 'reported_charge_energy_wh', 'WH-OUT': 'reported_discharge_energy_wh'}
COLUMNS_MAPPING_MACCOR_TEST_DATA = {'Capacity(Ah)': 'maccor_capacity_ah', 'Current(A)': 'current_a', 'Cyc#': 'cycle', 'Cycle P': 'cycle', 'DPt Time': 'recorded_datetime', 'EV Temp': 'ev_temp_c', 'Step': 'step', 'Step Time': 'step_time_s', 'StepTime(s)': 'step_time_s', 'Test Time': 'test_time_s', 'TestTime(s)': 'test_time_s', 'Voltage(V)': 'voltage_v', 'Watt-hr': 'maccor_energy_wh'}
COLUMNS_TEST_DATA = {'current_ma', 'cycle', 'other_details', 'recorded_datetime', 'step', 'step_time_s', 'test_data_id', 'test_id', 'test_time_s', 'thermocouple_temps_c', 'unixtime_s', 'voltage_mv'}
COLUMNS_TO_MILLI = {'arbin_charge_capacity_ah': 'arbin_charge_capacity_mah', 'arbin_charge_energy_wh': 'arbin_charge_energy_mwh', 'arbin_discharge_capacity_ah': 'arbin_discharge_capacity_mah', 'arbin_discharge_energy_wh': 'arbin_discharge_energy_mwh', 'capacity': 'maccor_capacity_mah', 'capacity_ah': 'capacity_mah', 'charge_capacity_ah': 'charge_capacity_mah', 'charge_energy_wh': 'charge_energy_mwh', 'current': 'current_ma', 'current_a': 'current_ma', 'discharge_capacity_ah': 'discharge_capacity_mah', 'discharge_energy_wh': 'discharge_energy_mwh', 'energy': 'maccor_energy_mwh', 'impedance_ohm': 'impedance_mohm', 'maccor_capacity_ah': 'maccor_capacity_mah', 'maccor_energy_wh': 'maccor_energy_mwh', 'power_w': 'power_mw', 'reported_charge_capacity_ah': 'reported_charge_capacity_mah', 'reported_charge_energy_wh': 'reported_charge_energy_mwh', 'reported_discharge_capacity_ah': 'reported_discharge_capacity_mah', 'reported_discharge_energy_wh': 'reported_discharge_energy_mwh', 'voltage': 'voltage_mv', 'voltage_v': 'voltage_mv'}
COLUMNS_UNSTRUCTURED_TEST_DATA = {'current_ma', 'other_details', 'time_s', 'voltage_mv'}
DATABASE_MAX_RETRIES = 10
DATABASE_MAX_RETRY_DELAY = 60
DATABASE_RETRY_DELAY = 10
DATA_TYPE_CYCLE_STATS = 'cycle_stats'
DATA_TYPE_TEST_DATA = 'test_data'
DEFAULT_TIME_ZONE = 'America/Los_Angeles'
MACCOR_CHARGE_STEP_NAMES = ['Charge', 'Chg Func', 'FastWave']
MACCOR_DISCHARGE_STEP_NAMES = ['Dischrge', 'Dis Func']
MACCOR_PROCEDURE_FILE_ENCODING = 'UTF-8'
MAKE_ARBIN = 'arbin'
MAKE_MACCOR = 'maccor'
PREFIX_ARBIN_THERMOCOUPLE = 'aux_temperature_'
PREFIX_MACCOR_THERMOCOUPLE = 'temp '
TEMPLATE_RENAMED_THERMOCOUPLE = 'thermocouple_X_c'
UNSTRUCTURED_DATA_REQUIRED_KEYS_CSV = {'current_ma', 'pandas_read_csv_args', 'voltage_mv'}
UNSTRUCTURED_DATA_REQUIRED_KEYS_XLSX = {'current_ma', 'pandas_read_excel_args', 'voltage_mv'}
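
The mapping constants suggest a two-stage normalization: cycler column names are first renamed, then base-unit columns are renamed to their milli- equivalents. The sketch below copies a few entries from COLUMNS_MAPPING_MACCOR_TEST_DATA and COLUMNS_TO_MILLI and applies them to a plain dict; the factor of 1000 and the helper name normalize_row are assumptions inferred from the column names.

```python
# A few entries copied from COLUMNS_MAPPING_MACCOR_TEST_DATA.
RENAME = {"Voltage(V)": "voltage_v", "Current(A)": "current_a", "Cyc#": "cycle"}
# A few entries copied from COLUMNS_TO_MILLI.
TO_MILLI = {"voltage_v": "voltage_mv", "current_a": "current_ma"}


def normalize_row(row: dict) -> dict:
    """Rename cycler columns, then convert base units to milli-units."""
    out = {}
    for name, value in row.items():
        name = RENAME.get(name, name)
        if name in TO_MILLI:
            # Assumed scaling: volts -> millivolts, amps -> milliamps.
            name, value = TO_MILLI[name], value * 1000
        out[name] = value
    return out


row = normalize_row({"Voltage(V)": 3.5, "Current(A)": -0.5, "Cyc#": 2})
print(row)  # {'voltage_mv': 3500.0, 'current_ma': -500.0, 'cycle': 2}
```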

battetl.logger module

BattETL logger module

battetl.utils module

class battetl.utils.DashOrderedDict

Bases: OrderedDict

Pulled from BEEP: https://github.com/TRI-AMDD/beep/blob/master/beep/utils/__init__.py

Nested data structure with pydash-enabled getters and setters. Nested values can be set using dot notation, e.g.

>>> dod = DashOrderedDict()
>>> dod.set('key1.key2', 5)
>>> print(dod['key1']['key2'])
5

get_path(string, default=None)

merge(obj)

set(string, value)

unset(string)

class battetl.utils.Utils

Bases: object

convert_datetime(column: str, timezone: str) DataFrame

Convert datetime to UTC format with time zone

Parameters:
  • df (pandas.DataFrame) – The input DataFrame

  • column (str) – column to convert

  • timezone (str) – Time zone name from the IANA Time Zone Database, for example ‘America/Los_Angeles’

Returns:

df – Converted data

Return type:

pandas.DataFrame

convert_timedelta_to_seconds(column: str) DataFrame

Convert time delta to seconds

Parameters:
  • df (pandas.DataFrame) – Original data

  • column (str) – Column to convert

Returns:

df – Converted data

Return type:

pandas.DataFrame

convert_to_float()

Converts value to float if it is a string.

Parameters:

value (str or float) – The value to convert.

Return type:

float

convert_to_milli() DataFrame

Convert columns to milli- and rename column name

Parameters:

df (pandas.DataFrame) – Original data

Returns:

df – Converted data

Return type:

pandas.DataFrame

drop_columns(columns_to_drop: list[str]) DataFrame

Drops the specified columns from the passed DataFrame.

Parameters:
  • df (pandas.DataFrame) – DataFrame to drop columns from.

  • columns_to_drop (list) – List of strings that give column names to drop

Returns:

df – Pandas DataFrame with the dropped columns.

Return type:

pandas.DataFrame

drop_empty_rows() DataFrame

Drops empty rows from the DataFrame.

Parameters:

df (pandas.DataFrame) – DataFrame to drop empty rows from.

Returns:

df – Pandas DataFrame with the empty rows dropped.

Return type:

pandas.DataFrame

drop_unnamed_columns() DataFrame

Drops unnamed columns from the DataFrame.

Parameters:

df (pandas.DataFrame) – DataFrame to drop unnamed columns from.

Returns:

df – Pandas DataFrame with the unnamed columns dropped.

Return type:

pandas.DataFrame

get_cycle_make() tuple[str, str]

Determine the make and type of cycler. Currently supported:

  • Arbin Test Data

  • Arbin Cycle Stats

  • Maccor Test Data

  • Maccor Cycle Stats

Parameters:

data (list[str]) – Original data

Returns:

  • (str) – Cycler make

  • (str) – Data type

get_lower_strip_set() set

Transform string list to lower case set

Parameters:

data (list[str]) – Original data

Returns:

Lower case set data

Return type:

(set)
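
A normalized header set makes it straightforward to compare a file's columns against the make-specific column sets in Constants. The sketch below shows one plausible way to do this, using subsets of COLUMNS_ARBIN_TEST_DATA_ONLY and COLUMNS_MACCOR_TEST_DATA_ONLY; the function names and the intersection rule are assumptions, not the library's detection logic.

```python
# Subsets copied from COLUMNS_ARBIN_TEST_DATA_ONLY and COLUMNS_MACCOR_TEST_DATA_ONLY.
ARBIN_ONLY = {"Data Point", "Date Time", "Internal Resistance (Ohm)"}
MACCOR_ONLY = {"Cyc#", "DPt Time", "StepTime(s)", "TestTime(s)"}


def lower_strip_set(data):
    """Return the set of stripped, lower-cased strings."""
    return {item.strip().lower() for item in data}


def guess_make(header):
    """Guess the cycler make from header names (hypothetical rule)."""
    columns = lower_strip_set(header)
    if columns & lower_strip_set(ARBIN_ONLY):
        return "arbin"
    if columns & lower_strip_set(MACCOR_ONLY):
        return "maccor"
    return None


print(guess_make(["Cyc#", " Step ", "Voltage(V)"]))  # maccor
```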

load_config() dict

Load config file from path

Parameters:

config_path (str) – Path to config file

Returns:

config – Config file as dictionary

Return type:

dict
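
Assuming the config file is JSON (as create_config generates), loading it into a dictionary can be sketched with the standard library. The helper name read_config and the temporary file are for illustration only.

```python
import json
import tempfile
from pathlib import Path


def read_config(config_path: str) -> dict:
    """Read a JSON config file and return it as a dictionary."""
    with open(config_path, encoding="utf-8") as handle:
        return json.load(handle)


# Write a tiny config to a temporary folder, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo_config.json"
    sample = {"timezone": "America/Los_Angeles", "data_file_path": []}
    path.write_text(json.dumps(sample), encoding="utf-8")
    config = read_config(str(path))

print(config["timezone"])  # America/Los_Angeles
```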

load_env() None

Set environment variables from .env file

Parameters:

env_path (str) – Path to .env file

rename_df_columns(columnsMapping: dict) DataFrame

Rename column names to BattETL format

Parameters:
  • df (pandas.DataFrame) – Original data

  • columnsMapping (dict) – Mapping from original column names to BattETL column names

Returns:

df – Renamed data

Return type:

pandas.DataFrame

sort_dataframe(columns: list[str]) DataFrame

Sort pandas.DataFrame with input columns

Parameters:
  • df (pandas.DataFrame) – Original data

  • columns (list[str]) – Sort by columns

Returns:

df – Sorted data

Return type:

pandas.DataFrame

validate_file_meta(file_type: str) bool

Validate file_meta

Parameters:
  • file_meta (dict) –

    Dictionary containing the meta data for the file. For example:

    {
        "voltage_mv" :
        {
            "column_name":"volt",
            "scaling_factor":1,
        },
        "current_ma" :
        {
            "column_name":"curr",
            "scaling_factor":1,
        },
    }
    

  • file_type (str) – Type of file. Valid values are ‘csv’ and ‘xlsx’.

Returns:

True if the file_meta is valid.

Return type:

bool
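
The required keys for each file type are listed in Constants (UNSTRUCTURED_DATA_REQUIRED_KEYS_CSV and UNSTRUCTURED_DATA_REQUIRED_KEYS_XLSX). The sketch below checks a file_meta dictionary against those sets; the helper name is hypothetical, and whether the library returns False or raises on an invalid file_meta is not stated here.

```python
# Copied from the Constants class.
REQUIRED_KEYS = {
    "csv": {"current_ma", "pandas_read_csv_args", "voltage_mv"},
    "xlsx": {"current_ma", "pandas_read_excel_args", "voltage_mv"},
}


def missing_file_meta_keys(file_meta: dict, file_type: str) -> set:
    """Return the required keys that are absent from file_meta."""
    if file_type not in REQUIRED_KEYS:
        raise ValueError(f"Unsupported file type: {file_type!r}")
    return REQUIRED_KEYS[file_type] - set(file_meta)


file_meta = {
    "voltage_mv": {"column_name": "volt", "scaling_factor": 1},
    "current_ma": {"column_name": "curr", "scaling_factor": 1},
}
print(missing_file_meta_keys(file_meta, "csv"))  # {'pandas_read_csv_args'}
```

Under these constants, the example dictionary would also need a pandas_read_csv_args entry to pass a CSV check.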

Module contents

battetl.create_config(data_folder_path)

Create a configuration file based on the contents of the specified data folder.

This function scans the provided data folder for relevant files and generates a configuration file (in JSON format) with metadata about the files and the test setup.

Arguments: data_folder_path (str): The path to the data folder containing the test files.

Returns: None

Example Usage: create_config('/path/to/data/folder')

File Naming Conventions:

  • Maccor data files: files ending with a number followed by ‘.txt’.

  • Maccor stats files: files ending with ‘[STATS].txt’.

  • Maccor schedule files: files ending with ‘.000’.

  • Arbin data files: files containing ‘Wb’ in the name and ending with ‘.CSV’.

  • Arbin stats files: files ending with ‘StatisticByCycle.CSV’.

  • Arbin schedule files: files ending with ‘.sdx’.
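
The naming conventions above can be sketched as a small classifier that sorts a file name into data, stats, or schedule. This is an illustration of the rules as stated, not the code create_config uses; the function name and the sample file names are hypothetical.

```python
import re


def classify_file(name: str):
    """Return 'data', 'stats', 'schedule', or None for a file name."""
    # Stats checks come first because stats names can also match data rules.
    if name.endswith("[STATS].txt") or name.endswith("StatisticByCycle.CSV"):
        return "stats"
    if re.search(r"\d\.txt$", name):  # a number followed by '.txt'
        return "data"
    if "Wb" in name and name.endswith(".CSV"):
        return "data"
    if name.endswith(".000") or name.endswith(".sdx"):
        return "schedule"
    return None


for example in ["Cell24 075.txt", "Cell24 075 [STATS].txt", "procedure.000"]:
    print(example, "->", classify_file(example))
```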

Generated Configuration Structure: The configuration file includes metadata about the test setup, including timezone, file paths, and various parameters related to the test, cell, schedule, cycler, customers, and projects.

The configuration structure is as follows:

    {
        "timezone": "America/Los_Angeles",
        "data_file_path": data_files,
        "stats_file_path": stats_files,
        "schedule_file_path": schedule_files,
        "meta_data": {
            "test_meta": {
                "cell_id": None,
                "schedule_id": None,
                "test_name": os.path.basename(data_files[0]).split(' ', 1)[0],
                "start_date": '2020-10-06',
                "end_date": '2020-10-11',
                "channel": int(os.path.basename(data_files[0]).split('.', 1)[0].split(' ')[-1]),
                "ev_chamber": 12,
                "ev_chamber_slot": None,
                "thermocouples": None,
                "thermocouple_channels": None,
                "comments": "Ran at 45 degrees C",
                "project_id": None,
                "test_capacity_mah": 2650,
                "potentiostat_id": None,
                "cycler_id": None,
            },
            "cell": {
                "cell_type_id": None,
                "batch_number": "BATCH_NUMBER",
                "label": "24",
                "date_received": "2020-09-01",
                "comments": None,
                "date_manufactured": None,
                "manufacturer_sn": "BattGenie_SN",
                "dims": None,
                "weight_g": None,
                "first_received_at_voltage_mv": None,
            },
            "cell_meta": {
                "manufacturer": "BattGenie",
                "manufacturer_pn": "BattGenie_PN",
                "form_factor": "pouch",
                "capacity_mah": 2720,
                "chemistry": None,
                "dimensions": '{"x_mm":"54.25", "y_mm":106.96, "z_mm":3.19}',
                "datasheet": None,
            },
            "schedule_meta": {
                "schedule_name": "BG_Characterization_v1",
                "test_type": "Characterization",
                "cycler_make": "Maccor",
                "date_created": "2020-10-06",
                "created_by": "BattGenie",
                "comments": None,
                "cv_voltage_threshold_mv": None,
                "details": None,
            },
            "cycler": {
                "sn": "SN",
                "calibration_date": None,
                "calibration_due_date": None,
                "location": "BattGenie",
                "timezone_based": None,
            },
            "cycler_meta": {
                "manufacturer": "Maccor",
                "model": "SERIES 4000M",
                "datasheet": None,
                "num_channels": None,
                "lower_current_limit_a": None,
                "upper_current_limit_a": None,
                "lower_voltage_limit_v": None,
                "upper_voltage_limit_v": None,
            },
            "customers": {
                "customer_name": "FakeCustomer"
            },
            "projects": {
                "project_name": "FakeProject"
            }
        }
    }

battetl.run_battetl()

Run the BattETL application with command-line interface.

This function parses command-line arguments and executes appropriate actions based on the provided commands.

Command-Line Arguments:

  • -c, --config: Configuration command. If specified, it creates a new configuration file.

  • -e, --extract: Extract command. If specified, it triggers the data extraction process.

  • -t, --transform: Transform command. If specified, it triggers the data transformation process.

  • -l, --load: Load command. If specified, it triggers the data loading process.

  • -etl, --etl: ETL command. If specified, it triggers the full ETL (Extract, Transform, Load) process.

Optional Argument: config_file_path: Path to the configuration file. If provided, it overrides the default configuration file path.

If not provided, a default configuration file named ‘demo_config.json’ is used.

Returns: None

Raises: None

Example Usage:

To create a new configuration file:

    python script.py --config /path/to/config.json

To run a specific step of the ETL process:

    python script.py --extract
    python script.py --transform
    python script.py --load

To run the full ETL process:

    python script.py --etl

To run BattETL in quick mode with the default configuration:

    python script.py
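
The documented flags can be mirrored with argparse as a rough sketch of how such a command line might be parsed. The parser below is illustrative only: build_parser is a hypothetical name, and treating every flag as a boolean switch with an optional positional config_file_path is an assumption based on the description above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented BattETL flags."""
    parser = argparse.ArgumentParser(description="Illustrative BattETL-style CLI")
    parser.add_argument("-c", "--config", action="store_true",
                        help="create a new configuration file")
    parser.add_argument("-e", "--extract", action="store_true",
                        help="run the extract step")
    parser.add_argument("-t", "--transform", action="store_true",
                        help="run the transform step")
    parser.add_argument("-l", "--load", action="store_true",
                        help="run the load step")
    parser.add_argument("-etl", "--etl", action="store_true",
                        help="run extract, transform, and load")
    # Optional positional path; falls back to the documented default name.
    parser.add_argument("config_file_path", nargs="?",
                        default="demo_config.json")
    return parser


args = build_parser().parse_args(["--etl", "my_config.json"])
print(args.etl, args.config_file_path)  # True my_config.json
```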