CAT — Detailed Data Processing

is614
4 min read · Nov 7, 2021

Supplementary Considerations for Data Preparation and Processing

This article provides supporting details for our main article, Computer Aided Tablevision.

1. Extracting QR codes from Raw String in Splunk

The data stream is piped into Splunk for parsing and information processing.

A typical string of raw text from each capture is shown below.

Raw string from Splunk

It comprises “code”, which contains all the QR codes sensed by the Raspberry Pi in one capture, and “time”, which is the timestamp of the capture. The code string for a seat starts with the centre ID, “C001”, followed by the table ID, “T002”, and ends with the seat ID for that table, “S1”. The code string for the cleaner also starts with the centre ID but ends with the cleaner ID, “M001”.
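As an illustration of the code layout described above, the Python sketch below splits a single QR code token into its ID parts. It is not part of the Splunk pipeline. The regular expressions and the `parse_code` helper are assumptions made for this example, and since the raw string format is only shown as an image, the sketch handles one token at a time.

```python
import re
from typing import Optional

# Assumed token layout, taken from the description in the text:
#   seat:    C<centre>T<table>S<seat>   e.g. C001T002S1
#   cleaner: C<centre>M<cleaner>        e.g. C001M001
SEAT_RE = re.compile(r"^(C\d+)(T\d+)(S\d+)$")
CLEANER_RE = re.compile(r"^(C\d+)(M\d+)$")


def parse_code(token: str) -> Optional[dict]:
    """Split one QR code token into its ID parts; return None if unrecognised."""
    m = SEAT_RE.match(token)
    if m:
        centre, table, seat = m.groups()
        return {"kind": "seat", "centre": centre, "table": table, "seat": seat}
    m = CLEANER_RE.match(token)
    if m:
        centre, cleaner = m.groups()
        return {"kind": "cleaner", "centre": centre, "cleaner": cleaner}
    return None


print(parse_code("C001T002S1"))
print(parse_code("C001M001"))
```

Returning `None` for unknown tokens lets the caller skip anything that is not a seat or cleaner code without raising an error.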

2. Accuracy of each Seat Position

When testing combinations of window periods and threshold values for test scenario 1 (seat position changing from occupied to unoccupied and vice versa), we found that the best result came from an eleven-second window in which a seat QR code is detected at least eight times, giving ~94% accuracy. We show the calculations below.

The figure below shows the distribution of detected QR codes for seat position S1 over four hours under test scenario 1. If detection were perfect, every column in the bar chart would show a value of 1. We average the readings to obtain the final accuracy for each seat, then average across the four seats to obtain the overall accuracy of the system.

Average accuracy of S1 detection by hour
Average accuracy of detection of all 4 seat positions
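The averaging described above (per-hour readings averaged per seat, then averaged across seats) can be sketched in Python as follows. The hourly values are made up for illustration and are not the measurements behind the charts. `seat_accuracy` and `overall_accuracy` are hypothetical helper names.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical hourly detection rates per seat (1.0 = code read in every
# capture of that hour). These are NOT the article's measurements.
hourly_rates: Dict[str, List[float]] = {
    "S1": [1.0, 0.75, 1.0, 0.75],
    "S2": [1.0, 1.0, 1.0, 1.0],
    "S3": [0.5, 1.0, 1.0, 1.0],
    "S4": [1.0, 1.0, 0.75, 0.75],
}


def seat_accuracy(rates: List[float]) -> float:
    """Average the hourly readings for one seat position."""
    return mean(rates)


def overall_accuracy(by_seat: Dict[str, List[float]]) -> float:
    """Average the per-seat accuracies to get the system-level figure."""
    return mean(seat_accuracy(rates) for rates in by_seat.values())


for seat, rates in hourly_rates.items():
    print(seat, seat_accuracy(rates))
print("overall", overall_accuracy(hourly_rates))
```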

Below is the Splunk search used for the aggregation and bar chart.

index="main" source="table001"
| eval flag="clean"
| stats count(eval(searchmatch("initialisation"))) AS init_state count(eval(searchmatch("C001T001S01"))) AS S1 count(eval(searchmatch("C001T001S02"))) AS S2 count(eval(searchmatch("C001T001S03"))) AS S3 count(eval(searchmatch("C001T001S04"))) AS S4 count(eval(searchmatch("C001M001"))) AS Cleaner_ID by _time, source
| streamstats window=11 sum(S1) AS SumS1 sum(S2) AS SumS2 sum(S3) AS SumS3 sum(S4) AS SumS4 sum(Cleaner_ID) AS SumCleaner
| streamstats window=10 max(SumCleaner) AS MaxCleaner
| eval minS=min(SumS1,SumS2,SumS3,SumS4)
| eval Occupancy=case(minS<=7 AND SumCleaner=0 AND init_state=0, "is_occupied", 1=1, "not_occupied")
| eval flag=case(init_state=1, "clean", MaxCleaner>7, "clean", Occupancy="is_occupied" AND MaxCleaner<=7 AND init_state=0, "req_clean", 1=1, flag)
| filldown
| where Occupancy="not_occupied"
| timechart avg(S1)

Hence, for the seat-position QR codes, our prototype setup can detect a QR code, when it is present, at least eight out of 11 times with ~94% accuracy.
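The eight-out-of-eleven rule can be sketched in Python as a trailing sum followed by a threshold check. This mirrors the idea of the `streamstats window=11` step rather than reproducing the Splunk search. The function names are hypothetical, and the sketch assumes one capture per second, with 1 meaning the code was read and 0 meaning it was not.

```python
from collections import deque
from typing import Iterable, List


def rolling_sum(values: Iterable[int], window: int) -> List[int]:
    """Trailing sum over the current value and up to window-1 earlier ones."""
    buf = deque(maxlen=window)  # oldest value drops out automatically
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf))
    return out


def is_present(detections: Iterable[int], window: int = 11, threshold: int = 8) -> List[bool]:
    """Flag each second as 'code present' when enough recent captures saw it."""
    return [s >= threshold for s in rolling_sum(detections, window)]


# Eleven consecutive successful reads: the flag turns on at the eighth one.
print(is_present([1] * 11))
```

Early seconds use a partial window, so the flag cannot turn on until at least eight captures have been seen.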

3. Accuracy of detecting Cleaner

We next find the window and threshold value for detecting a cleaner QR code. The best result was obtained from an eleven-second window in which the cleaner is detected at least eight times, with 64% accuracy.

Average accuracy of Cleaner_ID detection by hour
Average detection of Cleaner_ID

The main cause of the difference in accuracy between the cleaner and the seat positions is the relatively shorter presence of the cleaner (test scenario 2) in each hour. We expect the cleaner to spend around 20 seconds wiping down a table, as there are no crockery or trays. The cleaner's movement also makes their QR code harder to capture than those at the fixed seat positions.

To improve the accuracy of detecting the cleaner, we perform another aggregation over a ten-second rolling window, using only the maximum count obtained from the previous aggregation.

For example, if the per-second counts from the first eleven-second moving window are {11, 11, 10, 10, 9, 9, 9, 9, 9, 9, 9, 8, 8, 8, 11, …}, the maximum counts will be {11, 11, 10, 10, 11, …}. We then apply a ten-second window on the maximum counts to obtain the per-second status of the cleaner. The average maximum counts from test scenario 2 are shown below.

Average maximum Cleaner_ID count

In the bar chart above, every column should read eleven if the QR codes are captured perfectly. Table 3.1.8 shows the percentage accuracy per hour. The average accuracy of this second-order moving average in detecting the cleaner QR code is an improved ~90%.

Percentage accuracy of Cleaner_ID detected per hour
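The two-stage aggregation described in this section (an eleven-second trailing sum, then a ten-second trailing maximum of those sums) can be sketched in Python as below. It is an illustration of the idea, not the Splunk search itself. The helper names and the input sequence are hypothetical, and the threshold of eight follows the `MaxCleaner>7` condition in the search.

```python
from collections import deque
from typing import Callable, Iterable, List


def rolling(values: Iterable[int], window: int, fn: Callable) -> List[int]:
    """Apply fn to the current value and up to window-1 earlier ones."""
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        out.append(fn(buf))
    return out


def cleaner_seen(detections: Iterable[int], sum_window: int = 11,
                 max_window: int = 10, threshold: int = 8) -> List[bool]:
    """Second-order check: threshold the rolling max of the rolling sums."""
    sums = rolling(detections, sum_window, sum)   # first-order counts
    peaks = rolling(sums, max_window, max)        # hold the recent peak
    return [p >= threshold for p in peaks]


# Hypothetical input: cleaner code read for 8 seconds, then gone for 13.
detections = [1] * 8 + [0] * 13
print(cleaner_seen(detections))
```

Taking the maximum holds a recent peak for a few extra seconds, so a short visit by a moving cleaner is less likely to be missed once the first-order count starts to fall.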
