Python Code to Analyze Runtime

L001 · January 16, 2024, 7:35pm

Dear Community

Good Afternoon!
I have been working on some code to parse and analyze Hamilton TRC files, method by method. I would like to share what I have done so far and what my main goal is.

Code

import glob
import os.path
import re

import pandas as pd


class HAMILTONLogParser:
    """Parse Hamilton TRC files and tabulate the time stamps of each run."""

    TIME = re.compile(r"[0-9][0-9]:[0-9][0-9]")  # HH:MM time stamps
    DATE = re.compile(r"\d{4}-\d{2}-\d{2}")      # YYYY-MM-DD dates

    @staticmethod
    def get_data(path: str) -> dict:
        runtime, rundate, results = [], [], []

        for file in glob.glob(path):
            # Read each file once; the with-block closes it automatically.
            with open(file, "r") as f:
                text = f.read()

            # Skip runs that logged an error or were aborted.
            if "Error" in text or "Abort" in text:
                continue

            dates = HAMILTONLogParser.DATE.findall(text)
            results.append(file)
            runtime.append(HAMILTONLogParser.TIME.findall(text))
            # The first date in the file is the run date (None if the file has none).
            rundate.append(dates[0] if dates else None)

        return {"times": runtime, "dates": rundate, "filename": results}

    @staticmethod
    def hami_dataframe(data: dict, output: str, raw: bool = True, verbose: bool = False) -> pd.DataFrame:
        # One row per file, one column per time stamp; shorter runs are padded with NaN.
        times = pd.DataFrame(data["times"], index=data["dates"])

        table = times.copy()
        table.insert(0, column="filename", value=data["filename"])
        # Count only the time stamps, not the filename column.
        table.insert(1, column="records", value=times.count(axis=1).to_numpy())
        table.sort_index(ascending=False, inplace=True)

        if not raw:
            stamps = table.drop(columns=["filename", "records"])
            # "HH:MM" -> timedelta; padding and unparsable values become NaT.
            deltas = stamps.apply(lambda col: pd.to_timedelta(col + ":00", errors="coerce"))
            if deltas.shape[1] > 0:
                # Last valid time stamp of each row minus the first one.
                # Times carry no date, so a run that crosses midnight comes out negative.
                table["duration"] = deltas.ffill(axis=1).iloc[:, -1] - deltas.iloc[:, 0]
            else:
                table["duration"] = pd.NaT
            table = table[["filename", "records", "duration"]]
            print(table)

        # Write the file once, whichever branch was taken.
        table.to_excel(output)

        if verbose:
            print("NUMBER OF FILES:  ", len(data["filename"]))
            print("NUMBER OF RECORDS IN TIME:  ", len(data["times"]))
            print("NUMBER OF RECORDS IN DATE:  ", len(data["dates"]))
            table.info()  # info() prints its report itself and returns None

        if os.path.exists(output):
            print("File Written")
        else:
            print("FILE NOT WRITTEN")

        return table

run as:

f1 = HAMILTONLogParser.get_data(path="path/to/protocol_*")
f2 = HAMILTONLogParser.hami_dataframe(f1, output="path/outputname/file.xlsx", verbose=True, raw=True)

My code works but is far from ideal. With raw=True, the function returns the whole table with multiple hour:minute values for each run, so you can calculate the duration manually (last time stamp of a run minus its first time stamp) using a tool such as Excel. With raw=False the duration is supposed to be calculated automatically, but some rows are not being calculated.
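The manual calculation (last time stamp of a run minus the first) can also be done per file before any table is built, which avoids the padded cells altogether. The following is only a minimal sketch, not part of the original code: the helper name `run_duration` and the sample values are made up, and it assumes "HH:MM" strings and at most one midnight crossing per run.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def run_duration(stamps: List[str]) -> Optional[timedelta]:
    """Last-minus-first duration for one run (hypothetical helper)."""
    if len(stamps) < 2:
        return None  # nothing to subtract
    first = datetime.strptime(stamps[0], "%H:%M")
    last = datetime.strptime(stamps[-1], "%H:%M")
    delta = last - first
    # Assumption: a negative result means the run crossed midnight exactly once.
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return delta


# Made-up time stamps for three runs of different length.
runs = [["09:00", "09:10", "09:45"], ["13:30", "14:05"], ["23:50", "00:20"]]
durations = [run_duration(r) for r in runs]
print(durations)
```

Because each run is handled on its own list, a run with fewer time stamps than the others cannot affect the result of another run.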

I would like to hear whether better, more mature code already exists and whether this concept would be useful in your daily routine. Where I work we are continuously improving our methods, and so far we have not had a way to compare multiple runs before and after changes to the code (Venus 4).
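One possible shape for the before/after comparison is sketched here, assuming the run date and the duration in minutes of each run are already known. The function name `compare_runs`, the change date, and all numbers are invented for illustration.

```python
from datetime import date
from statistics import mean


def compare_runs(runs, change_date):
    """Mean duration in minutes before and after change_date (hypothetical helper)."""
    before = [minutes for day, minutes in runs if day < change_date]
    after = [minutes for day, minutes in runs if day >= change_date]
    return {
        "before": mean(before) if before else None,
        "after": mean(after) if after else None,
    }


# Made-up (run date, duration in minutes) pairs.
runs = [
    (date(2024, 1, 8), 46),
    (date(2024, 1, 9), 44),
    (date(2024, 1, 15), 39),
    (date(2024, 1, 16), 41),
]
summary = compare_runs(runs, change_date=date(2024, 1, 12))
print(summary)
```

A plain mean hides run-to-run variation, so with real data it may be worth reporting the number of runs and the spread alongside it.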

I tried to stay as close as possible to PEP and KISS. It is entirely possible that I committed some sins, and I am open to suggestions and fixes if you believe this tool could be useful! If it turns out to be relevant, I will create a repository on GitHub.

All the best!

BirdBare · January 17, 2024, 4:21am

I like the idea. You may want to get in touch with Tecan. They have dashboard-like software that similarly analyzes trace files. I cannot remember what it is called off the top of my head (Command Center?), so maybe someone can chime in. They did seem interested in integrating alternative systems.

luisvillaautomata · January 17, 2024, 3:00pm

Here’s my two cents on this: run time is fine for a dataset generated from the start because it can be predictive, but it’s not great. One should also capture how long certain steps took, or how long it took from the start of the method to reach a given point. This is the level of granularity a team needs to make troubleshooting or analysis a breeze.
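Per-step timing of this kind could be sketched as follows. This is an illustration under stated assumptions, not a parser for the real TRC format: it assumes every relevant line starts with a "YYYY-MM-DD HH:MM:SS" time stamp followed by the step text, and the function name `step_timings` and the sample lines are made up.

```python
import re
from datetime import datetime

# Assumption: a line starts with "YYYY-MM-DD HH:MM:SS"; the rest is the step text.
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\W*(.*)$")


def step_timings(lines):
    """Return (step text, seconds since start, seconds until next step) tuples."""
    events = []
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append((stamp, match.group(2)))

    rows = []
    for i, (stamp, text) in enumerate(events):
        elapsed = (stamp - events[0][0]).total_seconds()
        # The last event has no successor, so its duration is unknown.
        if i + 1 < len(events):
            duration = (events[i + 1][0] - stamp).total_seconds()
        else:
            duration = None
        rows.append((text, elapsed, duration))
    return rows


# Made-up lines; real TRC content will differ.
sample = [
    "2024-01-16 19:35:00> Initialize - start",
    "2024-01-16 19:35:40> Aspirate - start",
    "2024-01-16 19:37:10> Dispense - start",
]
timings = step_timings(sample)
for row in timings:
    print(row)
```

Using full date-and-time stamps rather than HH:MM alone also keeps runs that cross midnight from producing negative values.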

You could imagine feeding a SQL or NoSQL database with real-time time stamps to help with this.
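As a rough idea of what the SQL variant might look like, here is a sketch using SQLite from the Python standard library. The table name `step_events`, its columns, the helper `log_step`, and the sample events are all assumptions made for illustration; a production setup would likely use a server database and a richer schema.

```python
import sqlite3
from datetime import datetime

# In-memory database for illustration; pass a file path to keep the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE step_events (run_id TEXT, step TEXT, logged_at TEXT)")


def log_step(run_id: str, step: str, when: datetime) -> None:
    """Store one time-stamped step event (hypothetical schema)."""
    conn.execute(
        "INSERT INTO step_events VALUES (?, ?, ?)",
        (run_id, step, when.isoformat(sep=" ")),
    )
    conn.commit()


# Made-up events for a single run.
log_step("run-001", "Initialize", datetime(2024, 1, 16, 19, 35, 0))
log_step("run-001", "Aspirate", datetime(2024, 1, 16, 19, 35, 40))
log_step("run-001", "Dispense", datetime(2024, 1, 16, 19, 37, 10))

# Seconds from the first event of the run to each step.
query = """
    SELECT e.step,
           CAST(strftime('%s', e.logged_at) AS INTEGER)
           - (SELECT MIN(CAST(strftime('%s', s.logged_at) AS INTEGER))
              FROM step_events AS s
              WHERE s.run_id = e.run_id) AS seconds_from_start
    FROM step_events AS e
    WHERE e.run_id = ?
    ORDER BY e.logged_at
"""
rows = conn.execute(query, ("run-001",)).fetchall()
print(rows)
```

Storing one row per event keeps the raw time stamps available, so step durations and run totals can be derived later with queries instead of being fixed at logging time.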

By the way, the Tecan service is called Introspect. They have a couple of dashboard options. Check them out at SLAS for more.

Starrif2263 · January 19, 2024, 9:16pm

Hello community,

Last time I checked, Introspect only worked for Tecan. If things have changed, that would be great. Their dashboard is pretty good.

BirdBare · January 19, 2024, 9:44pm

Contact Marco Licheri at Tecan. He was the one who suggested using the Tecan dashboard software (still not sure if it is Introspect) on Hamilton systems. He did confirm it had already been tested.