battetl package¶
Subpackages¶
- battetl.extract package
- battetl.load package
- Submodules
- battetl.load.Loader module
- battetl.load.batt_db_test_helper module
- BattDbTestHelper
- BattDbTestHelper.cell_id
- BattDbTestHelper.cell_type_id
- BattDbTestHelper.create_test_db_entries()
- BattDbTestHelper.customer_id
- BattDbTestHelper.cycle_id
- BattDbTestHelper.cycler_type_id
- BattDbTestHelper.delete_entry()
- BattDbTestHelper.delete_test_data()
- BattDbTestHelper.delete_test_db_entries()
- BattDbTestHelper.generate_random_string()
- BattDbTestHelper.load_df_to_db()
- BattDbTestHelper.load_sil_data()
- BattDbTestHelper.load_sim_data()
- BattDbTestHelper.profile_id
- BattDbTestHelper.read_first_row()
- BattDbTestHelper.read_last_row()
- BattDbTestHelper.schedule_id
- BattDbTestHelper.test_id
- battetl.load.quick_loader module
- Module contents
- battetl.transform package
Submodules¶
battetl.BattETL module¶
- class battetl.BattETL.BattETL(config_path: str, user_transform_test_data: Callable[[DataFrame], DataFrame] | None = None, user_transform_cycle_stats: Callable[[DataFrame], DataFrame] | None = None, env_path: str = '/home/docs/checkouts/readthedocs.org/user_builds/battetl/checkouts/latest/docs/.env')¶
Bases: object
- extract()¶
Extracts the data in the target directory specified in the config.
- Returns:
self – Returns a reference to the instance object
- Return type:
BattETL
- load()¶
Loads the test data, cycle stats, and schedule file to the target database(s) specified in the config.
- Returns:
num_rows_inserted – The number of rows inserted into the target_table.
- Return type:
int
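The return values above allow calls to be chained: extract() hands back the instance and load() ends the chain with a row count. The sketch below illustrates only that contract. PipelineSketch is a hypothetical stand-in, not the real BattETL class, which needs a config file and a reachable database.

```python
class PipelineSketch:
    """Hypothetical stand-in that mimics the documented return values."""

    def __init__(self, rows):
        self.rows = rows        # pretend source data
        self.extracted = []

    def extract(self):
        # Like the documented extract(), return the instance for chaining.
        self.extracted = list(self.rows)
        return self

    def load(self):
        # Like the documented load(), return the number of rows inserted.
        return len(self.extracted)


num_rows_inserted = PipelineSketch([{"cycle": 1}, {"cycle": 2}]).extract().load()
print(num_rows_inserted)
```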
battetl.battetl_quick module¶
- battetl.battetl_quick.battetl_quick(file_path: str, file_meta=None) DataFrame¶
The BattETL Quick Mode. Extracts test data or cycle stats from a file, transforms it, and loads it to the database.
- Parameters:
file_path (str) – Name of the file to be processed.
file_meta (dict) – A dictionary used to decode unstructured data.
- Returns:
The test data or cycle stats that were loaded to the database.
- Return type:
pd.DataFrame
- battetl.battetl_quick.clone_battdb(battdb_folder: str = 'battdb')¶
Clones the BattDB repo to the specified folder.
- Parameters:
battdb_folder (str, optional) – Name of the folder to clone battdb to. The default is ‘battdb’.
- battetl.battetl_quick.convert_time_to_seconds(df: DataFrame, column_name: str)¶
Converts a column of time values in the format 0:00:00.000 to seconds.
- Parameters:
df (pd.DataFrame) – The DataFrame to be processed.
column_name (str) – The name of the column to be converted.
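The conversion itself is simple arithmetic on the hours, minutes, and seconds fields. A minimal sketch on plain strings follows, assuming the 0:00:00.000 layout described above. The helper name time_string_to_seconds is hypothetical and this is not the library's DataFrame-based implementation.

```python
def time_string_to_seconds(value: str) -> float:
    """Convert an 'H:MM:SS.mmm' string (e.g. '0:00:00.000') to seconds."""
    hours, minutes, seconds = value.strip().split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


samples = ["0:00:00.000", "0:01:30.500", "1:02:03.250"]
converted = [time_string_to_seconds(s) for s in samples]
print(converted)
```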
- battetl.battetl_quick.is_docker_compose_installed() bool¶
Checks if Docker Compose is installed.
- Return type:
bool
- battetl.battetl_quick.is_docker_installed() bool¶
Checks if Docker is installed.
- Return type:
bool
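One common way to implement checks like these is to look for the executable on PATH. The sketch below takes that approach with the standard library. It is an assumption about the technique, not a description of how battetl_quick does it (it may, for instance, run a version command instead), and is_tool_on_path is a hypothetical name.

```python
import shutil


def is_tool_on_path(executable: str) -> bool:
    """Return True if `executable` can be found on the current PATH."""
    return shutil.which(executable) is not None


print("docker found:", is_tool_on_path("docker"))
print("docker-compose found:", is_tool_on_path("docker-compose"))
```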
- battetl.battetl_quick.run_docker_compose(battdb_folder: str = 'battdb') CompletedProcess¶
Runs docker-compose up -d with the docker-compose.yml file in the BattDB repo.
- Parameters:
battdb_folder (str, optional) – Name of the folder BattDB was cloned to. The default is ‘battdb’.
- Return type:
subprocess.CompletedProcess
battetl.constants module¶
- class battetl.constants.Constants¶
Bases: object
- ARBIN_SCHEDULE_FILE_ENCODING = 'latin-1'¶
- BATTDB_QUICK_SCHEMA_VERSION = 1.1¶
- BATTDB_SCHEMA_VERSION = 11.2¶
- BATTVIZ_CYCLE_STATS_PATH = 'd/cycling-results/cycling-results'¶
- BATTVIZ_TEST_DATA_PATH = 'd/Test-Data/test-data'¶
- COLUMNS_ARBIN_CYCLE_STATS_ONLY = {'Charge Time (s)', 'Coulombic Efficiency (%)', 'Date_Time', 'Discharge Time (s)', 'V_Max_On_Cycle (V)', 'mAh/g'}¶
- COLUMNS_ARBIN_TEST_DATA_ONLY = {'ACR (Ohm)', 'Data Point', 'Date Time', 'Internal Resistance (Ohm)', 'dQ/dV (Ah/V)', 'dV/dQ (V/Ah)', 'dV/dt (V/s)'}¶
- COLUMNS_CYCLE_STATS = {'calculated_cc_capacity_mah', 'calculated_cc_charge_time_s', 'calculated_charge_capacity_mah', 'calculated_charge_energy_mwh', 'calculated_coulombic_efficiency', 'calculated_cv_capacity_mah', 'calculated_cv_charge_time_s', 'calculated_discharge_capacity_mah', 'calculated_discharge_energy_mwh', 'calculated_eighty_percent_charge_time_s', 'calculated_fifty_percent_charge_time_s', 'calculated_max_charge_temp_c', 'calculated_max_discharge_temp_c', 'cycle', 'cycle_stats_id', 'other_details', 'reported_charge_capacity_mah', 'reported_charge_energy_mwh', 'reported_charge_time_s', 'reported_coulombic_efficiency', 'reported_discharge_capacity_mah', 'reported_discharge_energy_mwh', 'reported_discharge_time_s', 'test_id', 'test_time_s'}¶
- COLUMNS_MACCOR_CYCLE_STATS_CUSTOMER1 = {'AH-IN', 'AH-OUT', 'Cycle', 'Date', 'T1_End', 'T1_Max', 'T1_Min', 'T1_Start'}¶
- COLUMNS_MACCOR_CYCLE_STATS_ONLY = {'AH-IN', 'AH-OUT', 'Cycle', 'Date', 'T1_End', 'T1_Max', 'T1_Min', 'T1_Start'}¶
- COLUMNS_MACCOR_TEST_DATA_CUSTOMER1 = {'AtRate (0x02)', 'AtRateTimeToEmpty (0x04)', 'AverageCurrent (0x14)', 'AverageTimeToEmpty (0x16)', 'AverageTimeToFull (0x18)', 'BatteryStatus (0x0A)', 'Capacity(Ah)', 'ChargingCurrent (0x32)', 'ChargingVoltage (0x30)', 'Current (0x0C)', 'Current(A)Voltage(V)', 'Cyc#', 'DPt Time', 'DesignCapacity (0x3C)', 'ES', 'FullChargeCapacity (0x12)', 'ManufacturerAccess (0x00)', 'RelativeStateOfCharge (0x2C)', 'RemainingCapacity (0x10)', 'Step', 'StepTime(s)', 'Temperature (0x06)', 'TestTime(s)', 'Volt 1', 'Voltage (0x08)', 'Watt-hr'}¶
- COLUMNS_MACCOR_TEST_DATA_ONLY = {'Capacity(Ah)', 'Current(A)', 'Cyc#', 'DPt Time', 'EV Temp', 'Step', 'StepTime(s)', 'Temp 1', 'TestTime(s)', 'Voltage(V)'}¶
- COLUMNS_MACCOR_TEST_DATA_TYPE2_ONLY = {'Capacity', 'Cycle C', 'Cycle P', 'DPT Time', 'ES', 'Energy', 'MD', 'Rec'}¶
- COLUMNS_MAPPING_ARBIN_CYCLE_STATS = {'Charge Capacity (Ah)': 'reported_charge_capacity_ah', 'Charge Capacity(Ah)': 'reported_charge_capacity_ah', 'Charge Energy (Wh)': 'reported_charge_energy_wh', 'Charge Time (s)': 'reported_charge_time_s', 'Charge Time(s)': 'reported_charge_time_s', 'Charge_Energy(Wh)': 'reported_charge_energy_wh', 'Coulombic Efficiency (%)': 'reported_coulombic_efficiency', 'Current (A)': 'current_a', 'Current(A)': 'current_a', 'Cycle Index': 'cycle', 'Date_Time': 'recorded_datetime', 'Discharge Capacity (Ah)': 'reported_discharge_capacity_ah', 'Discharge Capacity(Ah)': 'reported_discharge_capacity_ah', 'Discharge Energy (Wh)': 'reported_discharge_energy_wh', 'Discharge Energy(Wh)': 'reported_discharge_energy_wh', 'Discharge Time (s)': 'reported_discharge_time_s', 'Step Index': 'step', 'Step Time (s)': 'step_time_s', 'Step Time(s)': 'step_time_s', 'TC_Counter1': 'tc_counter1', 'Test Time (s)': 'test_time_s', 'Test Time(s)': 'test_time_s', 'Voltage (V)': 'voltage_v', 'Voltage(V)': 'voltage_v'}¶
- COLUMNS_MAPPING_ARBIN_TEST_DATA = {'Charge Capacity (Ah)': 'arbin_charge_capacity_ah', 'Charge Capacity(Ah)': 'arbin_charge_capacity_ah', 'Charge Energy (Wh)': 'arbin_charge_energy_wh', 'Charge_Capacity(Ah)': 'arbin_charge_capacity_ah', 'Charge_Energy (Wh)': 'arbin_charge_energy_wh', 'Charge_Energy(Wh)': 'arbin_charge_energy_wh', 'Current (A)': 'current_a', 'Current(A)': 'current_a', 'Cycle Index': 'cycle', 'Cycle_Index': 'cycle', 'Data Point': 'data_point', 'Date Time': 'recorded_datetime', 'Date_Time': 'recorded_datetime', 'Discharge Capacity (Ah)': 'arbin_discharge_capacity_ah', 'Discharge Capacity(Ah)': 'arbin_discharge_capacity_ah', 'Discharge Energy (Wh)': 'arbin_discharge_energy_wh', 'Discharge Energy(Wh)': 'arbin_discharge_energy_wh', 'Discharge_Capacity(Ah)': 'arbin_discharge_capacity_ah', 'Discharge_Energy(Wh)': 'arbin_discharge_energy_wh', 'Internal Resistance (Ohm)': 'impedance_ohm', 'Power (W)': 'power_w', 'Step Index': 'step', 'Step Time (s)': 'step_time_s', 'Step Time(s)': 'step_time_s', 'Step_Index': 'step', 'Step_Time(s)': 'step_time_s', 'TC_Counter1': 'tc_counter1', 'Test Time (s)': 'test_time_s', 'Test Time(s)': 'test_time_s', 'Test_Time(s)': 'test_time_s', 'Voltage (V)': 'voltage_v', 'Voltage(V)': 'voltage_v'}¶
- COLUMNS_MAPPING_MACCOR_CYCLE_STATS = {'ACR': 'acr_ohm', 'AH-IN': 'reported_charge_capacity_ah', 'AH-OUT': 'reported_discharge_capacity_ah', 'Current': 'maccor_min_current_ma', 'Cycle': 'cycle', 'T1_End': 'maccor_charge_thermocouple_end_c', 'T1_End.1': 'maccor_discharge_thermocouple_end_c', 'T1_Max': 'maccor_charge_thermocouple_max_c', 'T1_Max.1': 'maccor_discharge_thermocouple_max_c', 'T1_Min': 'maccor_charge_thermocouple_min_c', 'T1_Min.1': 'maccor_discharge_thermocouple_min_c', 'T1_Start': 'maccor_charge_thermocouple_start_c', 'T1_Start.1': 'maccor_discharge_thermocouple_start_c', 'Test Time': 'test_time_s', 'Voltage': 'maccor_min_voltage_mv', 'WH-IN': 'reported_charge_energy_wh', 'WH-OUT': 'reported_discharge_energy_wh'}¶
- COLUMNS_MAPPING_MACCOR_TEST_DATA = {'Capacity(Ah)': 'maccor_capacity_ah', 'Current(A)': 'current_a', 'Cyc#': 'cycle', 'Cycle P': 'cycle', 'DPt Time': 'recorded_datetime', 'EV Temp': 'ev_temp_c', 'Step': 'step', 'Step Time': 'step_time_s', 'StepTime(s)': 'step_time_s', 'Test Time': 'test_time_s', 'TestTime(s)': 'test_time_s', 'Voltage(V)': 'voltage_v', 'Watt-hr': 'maccor_energy_wh'}¶
- COLUMNS_TEST_DATA = {'current_ma', 'cycle', 'other_details', 'recorded_datetime', 'step', 'step_time_s', 'test_data_id', 'test_id', 'test_time_s', 'thermocouple_temps_c', 'unixtime_s', 'voltage_mv'}¶
- COLUMNS_TO_MILLI = {'arbin_charge_capacity_ah': 'arbin_charge_capacity_mah', 'arbin_charge_energy_wh': 'arbin_charge_energy_mwh', 'arbin_discharge_capacity_ah': 'arbin_discharge_capacity_mah', 'arbin_discharge_energy_wh': 'arbin_discharge_energy_mwh', 'capacity': 'maccor_capacity_mah', 'capacity_ah': 'capacity_mah', 'charge_capacity_ah': 'charge_capacity_mah', 'charge_energy_wh': 'charge_energy_mwh', 'current': 'current_ma', 'current_a': 'current_ma', 'discharge_capacity_ah': 'discharge_capacity_mah', 'discharge_energy_wh': 'discharge_energy_mwh', 'energy': 'maccor_energy_mwh', 'impedance_ohm': 'impedance_mohm', 'maccor_capacity_ah': 'maccor_capacity_mah', 'maccor_energy_wh': 'maccor_energy_mwh', 'power_w': 'power_mw', 'reported_charge_capacity_ah': 'reported_charge_capacity_mah', 'reported_charge_energy_wh': 'reported_charge_energy_mwh', 'reported_discharge_capacity_ah': 'reported_discharge_capacity_mah', 'reported_discharge_energy_wh': 'reported_discharge_energy_mwh', 'voltage': 'voltage_mv', 'voltage_v': 'voltage_mv'}¶
- COLUMNS_UNSTRUCTURED_TEST_DATA = {'current_ma', 'other_details', 'time_s', 'voltage_mv'}¶
- DATABASE_MAX_RETRIES = 10¶
- DATABASE_MAX_RETRY_DELAY = 60¶
- DATABASE_RETRY_DELAY = 10¶
- DATA_TYPE_CYCLE_STATS = 'cycle_stats'¶
- DATA_TYPE_TEST_DATA = 'test_data'¶
- DEFAULT_TIME_ZONE = 'America/Los_Angeles'¶
- MACCOR_CHARGE_STEP_NAMES = ['Charge', 'Chg Func', 'FastWave']¶
- MACCOR_DISCHARGE_STEP_NAMES = ['Dischrge', 'Dis Func']¶
- MACCOR_PROCEDURE_FILE_ENCODING = 'UTF-8'¶
- MAKE_ARBIN = 'arbin'¶
- MAKE_MACCOR = 'maccor'¶
- PREFIX_ARBIN_THERMOCOUPLE = 'aux_temperature_'¶
- PREFIX_MACCOR_THERMOCOUPLE = 'temp '¶
- TEMPLATE_RENAMED_THERMOCOUPLE = 'thermocouple_X_c'¶
- UNSTRUCTURED_DATA_REQUIRED_KEYS_CSV = {'current_ma', 'pandas_read_csv_args', 'voltage_mv'}¶
- UNSTRUCTURED_DATA_REQUIRED_KEYS_XLSX = {'current_ma', 'pandas_read_excel_args', 'voltage_mv'}¶
battetl.logger module¶
BattETL logger module
battetl.utils module¶
- class battetl.utils.DashOrderedDict¶
Bases: OrderedDict
Pulled from BEEP: https://github.com/TRI-AMDD/beep/blob/master/beep/utils/__init__.py
Nested data structure with pydash-enabled getters and setters. Nested values can be set using dot notation, e.g.
>>> dod = DashOrderedDict()
>>> dod.set('key1.key2', 5)
>>> print(dod['key1']['key2'])
5
- get_path(string, default=None)¶
- merge(obj)¶
- set(string, value)¶
- unset(string)¶
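To show what dot-notation access amounts to, here is a pydash-free sketch on a plain OrderedDict. The module-level helpers set_path and get_path are hypothetical stand-ins written for illustration. The real class delegates to pydash and may behave differently in edge cases such as list indices.

```python
from collections import OrderedDict


def set_path(data: dict, path: str, value) -> None:
    """Set a nested value, creating intermediate dicts as needed."""
    *parents, leaf = path.split(".")
    for key in parents:
        data = data.setdefault(key, OrderedDict())
    data[leaf] = value


def get_path(data: dict, path: str, default=None):
    """Read a nested value, returning `default` if any key is missing."""
    for key in path.split("."):
        if not isinstance(data, dict) or key not in data:
            return default
        data = data[key]
    return data


dod = OrderedDict()
set_path(dod, "key1.key2", 5)
print(dod["key1"]["key2"])
print(get_path(dod, "key1.missing", default="n/a"))
```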
- class battetl.utils.Utils¶
Bases: object
- convert_datetime(column: str, timezone: str) DataFrame¶
Convert datetime to UTC format with time zone
- Parameters:
df (pandas.DataFrame) – The input DataFrame
column (str) – column to convert
timezone (str) – Time zone name from the IANA Time Zone Database, e.g. ‘America/Los_Angeles’
- Returns:
df – Converted data
- Return type:
pandas.DataFrame
- convert_timedelta_to_seconds(column: str) DataFrame¶
Convert time delta to seconds
- Parameters:
df (pandas.DataFrame) – Original data
column (str) – Column to convert
- Returns:
df – Converted data
- Return type:
pandas.DataFrame
- convert_to_float()¶
Converts value to float if it is a string.
- Parameters:
value (str or float) – The value to convert.
- Return type:
float
- convert_to_milli() DataFrame¶
Convert columns to milli- and rename column name
- Parameters:
df (pandas.DataFrame) – Original data
- Returns:
df – Converted data
- Return type:
pandas.DataFrame
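The sketch below illustrates the idea on plain dictionaries rather than a DataFrame: any column listed in the mapping is scaled by 1000 and renamed, and everything else passes through. The mapping is a three-entry excerpt of Constants.COLUMNS_TO_MILLI. The helper rows_to_milli is hypothetical and only approximates what the method is documented to do.

```python
# Excerpt of Constants.COLUMNS_TO_MILLI as listed in the constants module.
COLUMNS_TO_MILLI = {
    "current_a": "current_ma",
    "voltage_v": "voltage_mv",
    "power_w": "power_mw",
}


def rows_to_milli(rows: list[dict]) -> list[dict]:
    """Scale mapped columns by 1000 and rename them; leave others untouched."""
    converted = []
    for row in rows:
        new_row = {}
        for name, value in row.items():
            if name in COLUMNS_TO_MILLI:
                new_row[COLUMNS_TO_MILLI[name]] = value * 1000
            else:
                new_row[name] = value
        converted.append(new_row)
    return converted


rows = [{"cycle": 1, "current_a": 2, "voltage_v": 4}]
print(rows_to_milli(rows))
```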
- drop_columns(columns_to_drop: list[str]) DataFrame¶
This function drops the specified columns from the passed DataFrame.
- Parameters:
df (pandas.DataFrame) – DataFrame to drop the columns from.
columns_to_drop (list) – List of strings that give column names to drop
- Returns:
df – Pandas DataFrame with the dropped columns.
- Return type:
pandas.DataFrame
- drop_empty_rows() DataFrame¶
The function drops empty rows from DataFrame
- Parameters:
df (pandas.DataFrame) – DataFrame to drop empty rows from.
- Returns:
df – Pandas DataFrame with the empty rows dropped.
- Return type:
pandas.DataFrame
- drop_unnamed_columns() DataFrame¶
The function drops unnamed columns from DataFrame
- Parameters:
df (pandas.DataFrame) – DataFrame to drop unnamed columns from.
- Returns:
df – Pandas DataFrame with the unnamed columns dropped.
- Return type:
pandas.DataFrame
- get_cycle_make() tuple[str, str]¶
Determine the make and type of cycler. Currently supported:
- Arbin Test Data
- Arbin Cycle Stats
- Maccor Test Data
- Maccor Cycle Stats
- Parameters:
data (list[str]) – Original data
- Returns:
(str) – Cycle make
(str) – Data type
- get_lower_strip_set() set¶
Transform string list to lower case set
- Parameters:
data (list[str]) – Original data
- Returns:
Lower case set data
- Return type:
(set)
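A sketch of how header normalisation can feed cycler detection follows. The normalisation mirrors the description above (strip, lower-case, de-duplicate). The detection rule, matching on any overlap with a few columns taken from the *_ONLY constants, is an assumption made for illustration. The names lower_strip_set, SIGNATURES, and guess_cycler are hypothetical, and the library's actual logic is not shown in this reference.

```python
def lower_strip_set(data: list[str]) -> set[str]:
    """Strip whitespace, lower-case, and de-duplicate a list of strings."""
    return {item.strip().lower() for item in data}


# Small excerpts of COLUMNS_ARBIN_TEST_DATA_ONLY / COLUMNS_MACCOR_TEST_DATA_ONLY.
SIGNATURES = {
    ("arbin", "test_data"): lower_strip_set(["Data Point", "Date Time"]),
    ("maccor", "test_data"): lower_strip_set(["Cyc#", "DPt Time"]),
}


def guess_cycler(headers: list[str]):
    """Return (make, data_type) if any signature column is present, else None."""
    normalised = lower_strip_set(headers)
    for (make, data_type), signature in SIGNATURES.items():
        if normalised & signature:  # assumed rule: any overlap counts
            return make, data_type
    return None


print(guess_cycler([" Data Point ", "Voltage(V)"]))
print(guess_cycler(["Cyc#", "Step"]))
```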
- load_config() dict¶
Load config file from path
- Parameters:
config_path (str) – Path to config file
- Returns:
config – Config file as dictionary
- Return type:
dict
- load_env() None¶
Set environment variables from .env file
- Parameters:
env_path (str) – Path to .env file
- rename_df_columns(columnsMapping: dict) DataFrame¶
Rename column names to BattETL format
- Parameters:
df (pandas.DataFrame) – Original data
- Returns:
df – Renamed data
- Return type:
pandas.DataFrame
- sort_dataframe(columns: list[str]) DataFrame¶
Sort pandas.DataFrame with input columns
- Parameters:
df (pandas.DataFrame) – Original data
columns (list[str]) – Sort by columns
- Returns:
df – Sorted data
- Return type:
pandas.DataFrame
- validate_file_meta(file_type: str) bool¶
Validate file_meta
- Parameters:
file_meta (dict) –
Dictionary containing the meta data for the file. For example:
    {
        "voltage_mv" : {
            "column_name":"volt",
            "scaling_factor":1,
        },
        "current_ma" : {
            "column_name":"curr",
            "scaling_factor":1,
        },
    }
file_type (str) – Type of file. Valid values are ‘csv’ and ‘xlsx’.
- Returns:
True if the file_meta is valid.
- Return type:
bool
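One plausible reading is that validation checks file_meta against the UNSTRUCTURED_DATA_REQUIRED_KEYS_CSV and UNSTRUCTURED_DATA_REQUIRED_KEYS_XLSX sets from the constants module. The sketch below implements only that assumed check and reports which keys are missing. The helper missing_file_meta_keys is hypothetical, and the real method may validate more (or differently).

```python
# Values copied from Constants.UNSTRUCTURED_DATA_REQUIRED_KEYS_CSV / _XLSX.
REQUIRED_KEYS = {
    "csv": {"current_ma", "pandas_read_csv_args", "voltage_mv"},
    "xlsx": {"current_ma", "pandas_read_excel_args", "voltage_mv"},
}


def missing_file_meta_keys(file_meta: dict, file_type: str) -> set[str]:
    """Return the required top-level keys that `file_meta` lacks."""
    if file_type not in REQUIRED_KEYS:
        raise ValueError(f"Unsupported file_type: {file_type!r}")
    return REQUIRED_KEYS[file_type] - set(file_meta)


file_meta = {
    "voltage_mv": {"column_name": "volt", "scaling_factor": 1},
    "current_ma": {"column_name": "curr", "scaling_factor": 1},
}
print(missing_file_meta_keys(file_meta, "csv"))

# Adding the reader arguments (an empty dict here, purely illustrative)
# satisfies the assumed check.
file_meta["pandas_read_csv_args"] = {}
print(missing_file_meta_keys(file_meta, "csv"))
```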
Module contents¶
- battetl.create_config(data_folder_path)¶
Create a configuration file based on the contents of the specified data folder.
This function scans the provided data folder for relevant files and generates a configuration file (in JSON format) with metadata about the files and the test setup.
Arguments: data_folder_path (str): The path to the data folder containing the test files.
Returns: None
Example Usage: create_config('/path/to/data/folder')
File Naming Conventions:
- For Maccor data files: Files ending with a number followed by ‘.txt’.
- For Maccor stats files: Files ending with ‘[STATS].txt’.
- For Maccor schedule files: Files ending with ‘.000’.
- For Arbin data files: Files containing ‘Wb’ in the name and ending with ‘.CSV’.
- For Arbin stats files: Files ending with ‘StatisticByCycle.CSV’.
- For Arbin schedule files: Files ending with ‘.sdx’.
Generated Configuration Structure: The configuration file includes metadata about the test setup, including timezone, file paths, and various parameters related to the test, cell, schedule, cycler, customers, and projects.
The configuration structure is as follows:
    {
        "timezone": "America/Los_Angeles",
        "data_file_path": data_files,
        "stats_file_path": stats_files,
        "schedule_file_path": schedule_files,
        "meta_data": {
            "test_meta": {
                "cell_id": None,
                "schedule_id": None,
                "test_name": os.path.basename(data_files[0]).split(' ', 1)[0],
                "start_date": '2020-10-06',
                "end_date": '2020-10-11',
                "channel": int(os.path.basename(data_files[0]).split('.', 1)[0].split(' ')[-1]),
                "ev_chamber": 12,
                "ev_chamber_slot": None,
                "thermocouples": None,
                "thermocouple_channels": None,
                "comments": "Ran at 45 degrees C",
                "project_id": None,
                "test_capacity_mah": 2650,
                "potentiostat_id": None,
                "cycler_id": None,
            },
            "cell": {
                "cell_type_id": None,
                "batch_number": "BATCH_NUMBER",
                "label": "24",
                "date_received": "2020-09-01",
                "comments": None,
                "date_manufactured": None,
                "manufacturer_sn": "BattGenie_SN",
                "dims": None,
                "weight_g": None,
                "first_received_at_voltage_mv": None,
            },
            "cell_meta": {
                "manufacturer": "BattGenie",
                "manufacturer_pn": "BattGenie_PN",
                "form_factor": "pouch",
                "capacity_mah": 2720,
                "chemistry": None,
                "dimensions": '{"x_mm":"54.25", "y_mm":106.96, "z_mm":3.19}',
                "datasheet": None,
            },
            "schedule_meta": {
                "schedule_name": "BG_Characterization_v1",
                "test_type": "Characterization",
                "cycler_make": "Maccor",
                "date_created": "2020-10-06",
                "created_by": "BattGenie",
                "comments": None,
                "cv_voltage_threshold_mv": None,
                "details": None,
            },
            "cycler": {
                "sn": "SN",
                "calibration_date": None,
                "calibration_due_date": None,
                "location": "BattGenie",
                "timezone_based": None,
            },
            "cycler_meta": {
                "manufacturer": "Maccor",
                "model": "SERIES 4000M",
                "datasheet": None,
                "num_channels": None,
                "lower_current_limit_a": None,
                "upper_current_limit_a": None,
                "lower_voltage_limit_v": None,
                "upper_voltage_limit_v": None,
            },
            "customers": {
                "customer_name": "FakeCustomer"
            },
            "projects": {
                "project_name": "FakeProject"
            }
        }
    }
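The file naming conventions listed for create_config can be expressed as a small classifier. The sketch below sorts file names into data, stats, and schedule groups using only the suffix and substring rules stated above. The function classify_file and the sample file names are hypothetical, matching is case-sensitive by assumption, and the real function may apply the rules in a different order.

```python
import re


def classify_file(name: str):
    """Return 'data', 'stats', 'schedule', or None for an unrecognised file."""
    # Stats files are checked first so they are never mistaken for data files.
    if name.endswith("[STATS].txt") or name.endswith("StatisticByCycle.CSV"):
        return "stats"
    if name.endswith(".000") or name.endswith(".sdx"):
        return "schedule"
    # Data: a digit right before '.txt', or 'Wb' in the name with a '.CSV' suffix.
    if re.search(r"\d\.txt$", name) or ("Wb" in name and name.endswith(".CSV")):
        return "data"
    return None


names = ["Cell_A 012.txt", "Cell_A 012 [STATS].txt", "proc.000",
         "Run_Wb_1.CSV", "Run_StatisticByCycle.CSV", "plan.sdx", "notes.md"]
for name in names:
    print(name, "->", classify_file(name))
```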
- battetl.run_battetl()¶
Run the BattETL application with command-line interface.
This function parses command-line arguments and executes appropriate actions based on the provided commands.
Command-Line Arguments:
-c, --config: Configuration command. If specified, it creates a new configuration file.
-e, --extract: Extract command. If specified, it triggers the data extraction process.
-t, --transform: Transform command. If specified, it triggers the data transformation process.
-l, --load: Load command. If specified, it triggers the data loading process.
-etl, --etl: ETL command. If specified, it triggers the full ETL (Extract, Transform, Load) process.
Optional Argument: config_file_path: Path to the configuration file. If provided, it overrides the default configuration file path.
If not provided, a default configuration file named ‘demo_config.json’ is used.
Returns: None
Raises: None
Example Usage:
- To create a new configuration file:
python script.py --config /path/to/config.json
- To run a specific step of the ETL process:
python script.py --extract
python script.py --transform
python script.py --load
- To run the full ETL process:
python script.py --etl
- To run BattETL in quick mode with the default configuration:
python script.py
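For orientation, a command-line interface with these flags could be declared with argparse roughly as follows. This is a sketch built from the flag names and the 'demo_config.json' default documented above. Treating every flag as a boolean switch and config_file_path as an optional positional is an assumption, and build_parser is a hypothetical name rather than part of battetl.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags; each is assumed to be a boolean switch."""
    parser = argparse.ArgumentParser(description="BattETL CLI sketch")
    parser.add_argument("-c", "--config", action="store_true",
                        help="create a new configuration file")
    parser.add_argument("-e", "--extract", action="store_true",
                        help="run the extraction step")
    parser.add_argument("-t", "--transform", action="store_true",
                        help="run the transformation step")
    parser.add_argument("-l", "--load", action="store_true",
                        help="run the loading step")
    parser.add_argument("-etl", "--etl", action="store_true",
                        help="run the full ETL process")
    # Optional positional path; falls back to the documented default.
    parser.add_argument("config_file_path", nargs="?",
                        default="demo_config.json")
    return parser


args = build_parser().parse_args(["--etl", "/path/to/config.json"])
print(args.etl, args.config_file_path)
```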