Counting occurrences of event B that occur between occurrences of event A in Python - Stack Overflow

I am attempting to summarize data in an event log. I have two events I want to track: Event A and Event B. I would like to count how many times Event B occurs between consecutive occurrences of Event A. For example:

      Date       Time      Event
0   2025-02-01  03:51:40     A
1   2025-02-01  05:53:31     B
2   2025-02-01  07:55:05     B
3   2025-02-01  10:14:52     B
4   2025-02-01  12:17:01     A
5   2025-02-01  14:20:15     B
6   2025-02-01  20:26:04     A
7   2025-02-01  22:31:27     A
8   2025-02-02  03:50:48     B
9   2025-02-02  05:52:28     B
10  2025-02-02  14:00:45     A

I would like to return,

     Date       Time      Event  B Count
0  2025-02-01  03:51:40     A        0
1  2025-02-01  12:17:01     A        3
2  2025-02-01  20:26:04     A        1
3  2025-02-01  22:31:27     A        0
4  2025-02-02  14:00:45     A        2

I have no idea how to accomplish this. Any help is appreciated. Also, this is my first Stack Overflow question, so I apologize if I have done or formatted anything wrong.

asked Mar 26 at 19:07 by RMAC52
  • What you are looking for is a simple Python-based state machine – JonSG Commented Mar 26 at 19:24
  • Is the 1st file ongoing, and do you need the 2nd file to be generated automatically on each change? Is this supposed to be run manually? Is the 1st file a CSV? – Uberhumus Commented Mar 26 at 19:58
  • If you show us where you are code-wise, we can offer suggestions. – JonSG Commented Mar 26 at 20:28
  • Is the data a pandas DataFrame, a .csv file, or something else? – user19077881 Commented Mar 26 at 23:36

2 Answers

It looks as though you're working with a pandas DataFrame. If that's the case then let's assume that the origin of the data is a CSV file that looks like:

Date,Time,Event
2025-02-01,03:51:40,A
2025-02-01,05:53:31,B
2025-02-01,07:55:05,B
2025-02-01,10:14:52,B
2025-02-01,12:17:01,A
2025-02-01,14:20:15,B
2025-02-01,20:26:04,A
2025-02-01,22:31:27,A
2025-02-02,03:50:48,B
2025-02-02,05:52:28,B
2025-02-02,14:00:45,A

Construct a DataFrame from the CSV file contents. Iterate over the DataFrame rows and build a dictionary that records, for each A event, the number of B events seen since the previous A event. Then create a new DataFrame from the dictionary.

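import pandas as pd
from collections import defaultdict

FILENAME = "foo.csv"
b_count = 0  # B events seen since the previous A event
d = defaultdict(list)  # column name -> list of values for the output rows

# Each row unpacks into its three column values: Date, Time, Event.
for _, (_date, _time, _event) in pd.read_csv(FILENAME).iterrows():
    if _event == "A":
        # Emit one output row per A event, then restart the count.
        d["Date"].append(_date)
        d["Time"].append(_time)
        d["Event"].append(_event)
        d["B Count"].append(b_count)
        b_count = 0
    else:
        b_count += 1

print(pd.DataFrame.from_dict(d))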

Output:

         Date      Time Event  B Count
0  2025-02-01  03:51:40     A        0
1  2025-02-01  12:17:01     A        3
2  2025-02-01  20:26:04     A        1
3  2025-02-01  22:31:27     A        0
4  2025-02-02  14:00:45     A        2
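
For larger logs, the row-by-row loop can be replaced by a vectorized variant. The following is only a sketch of one possible approach, not the method used above: it assumes the same three columns, and the names count_b_between_a, CSV_TEXT and block are made up for illustration. Each row is labelled with the number of A events that occur strictly before it, so every A row shares a label with the B rows that precede it.

```python
import io

import pandas as pd

# Sample data copied from the CSV shown above (held in a string for the demo).
CSV_TEXT = """Date,Time,Event
2025-02-01,03:51:40,A
2025-02-01,05:53:31,B
2025-02-01,07:55:05,B
2025-02-01,10:14:52,B
2025-02-01,12:17:01,A
2025-02-01,14:20:15,B
2025-02-01,20:26:04,A
2025-02-01,22:31:27,A
2025-02-02,03:50:48,B
2025-02-02,05:52:28,B
2025-02-02,14:00:45,A
"""


def count_b_between_a(df: pd.DataFrame) -> pd.DataFrame:
    """Return the A rows with a 'B Count' column (hypothetical helper)."""
    is_a = df["Event"].eq("A")
    # Number of A events strictly before each row; an A row and the B rows
    # leading up to it therefore share the same block label.
    block = is_a.cumsum().shift(fill_value=0)
    # Count the B rows inside each block.
    b_per_block = df["Event"].eq("B").astype(int).groupby(block).sum()
    result = df[is_a].copy()
    # Look up the B count of each A row's block.
    result["B Count"] = block[is_a].map(b_per_block)
    return result.reset_index(drop=True)


result = count_b_between_a(pd.read_csv(io.StringIO(CSV_TEXT)))
print(result)  # the B Count column is 0, 3, 1, 0, 2
```

B events that follow the last A event end up in a block with no A row and are ignored, which matches the behaviour of the loop.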

The user mentioned an event log and showed what appears to be a whitespace-separated file rather than a conventional CSV. Here is a simple example that loads and processes such a file as required.


cat bilvit.py 
import sys

if len(sys.argv) != 2:
    print(f"usage: {sys.argv[0]} FILENAME")
    sys.exit(1)

bCount = 0  # B events seen since the previous A event
i = 0       # index of the next A row to print

with open(sys.argv[1], "r") as file:
    records = file.readlines()

for record in records:
    fields = record.split()  # split on any run of whitespace
    if len(fields) != 4:
        continue  # skip empty lines and a header row, which has no index column

    _junk, _date, _time, _event = fields
    if _event == "A":
        if i == 0:
            print("      Date         Time       Event   B Count")  # header

        print(f"{i:3d}   {_date}   {_time}   {_event}       {bCount}")
        bCount = 0
        i += 1
    else:
        bCount += 1

#
# run it
python bilvit.py event.log
      Date         Time       Event   B Count
  0   2025-02-01   03:51:40   A       0
  1   2025-02-01   12:17:01   A       3
  2   2025-02-01   20:26:04   A       1
  3   2025-02-01   22:31:27   A       0
  4   2025-02-02   14:00:45   A       2
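
If the counts are needed elsewhere in a program rather than only printed, the same logic could be wrapped in a generator. This is a minimal sketch under the assumption that every data line has four whitespace-separated fields (index, date, time, event). The names count_b_between_a and SAMPLE are hypothetical and not part of the script above. Unlike the script, this sketch ignores events other than A and B instead of counting them as B.

```python
from typing import Iterable, Iterator, Tuple

# Sample log copied from the question, including its header row.
SAMPLE = """\
      Date       Time      Event
0   2025-02-01  03:51:40     A
1   2025-02-01  05:53:31     B
2   2025-02-01  07:55:05     B
3   2025-02-01  10:14:52     B
4   2025-02-01  12:17:01     A
5   2025-02-01  14:20:15     B
6   2025-02-01  20:26:04     A
7   2025-02-01  22:31:27     A
8   2025-02-02  03:50:48     B
9   2025-02-02  05:52:28     B
10  2025-02-02  14:00:45     A
"""


def count_b_between_a(lines: Iterable[str]) -> Iterator[Tuple[str, str, str, int]]:
    """Yield (date, time, event, b_count) for every A record."""
    b_count = 0  # B events seen since the previous A event
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # blank line, or the header row without an index column
        _index, date, time, event = fields
        if event == "A":
            yield date, time, event, b_count
            b_count = 0
        elif event == "B":
            b_count += 1


rows = list(count_b_between_a(SAMPLE.splitlines()))
for i, (date, time, event, b_count) in enumerate(rows):
    print(f"{i:3d}   {date}   {time}   {event}       {b_count}")
```

Because the function accepts any iterable of lines, an open file object can be passed in directly instead of SAMPLE.splitlines().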