https://leetcode.com/problems/average-time-of-process-per-machine/description/
Table: Activity
Column Name | Type |
---|---|
machine_id | int |
process_id | int |
activity_type | enum |
timestamp | float |
The table shows the user activities for a factory website.
(machine_id, process_id, activity_type) is the primary key (combination of columns with unique values) of this table.
machine_id is the ID of a machine.
process_id is the ID of a process running on the machine with ID machine_id.
activity_type is an ENUM (category) of type ('start', 'end').
timestamp is a float representing the current time in seconds.
'start' means the machine starts the process at the given timestamp and 'end' means the machine ends the process at the given timestamp.
The 'start' timestamp will always be before the 'end' timestamp for every (machine_id, process_id) pair.
It is guaranteed that each (machine_id, process_id) pair has a 'start' and 'end' timestamp.
There is a factory website that has several machines each running the same number of processes. Write a solution to find the average time each machine takes to complete a process.
The time to complete a process is the 'end' timestamp minus the 'start' timestamp. The average time is calculated by the total time to complete every process on the machine divided by the number of processes that were run.
The resulting table should have the machine_id along with the average time as processing_time, which should be rounded to 3 decimal places.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input:
Activity table:
machine_id | process_id | activity_type | timestamp |
---|---|---|---|
0 | 0 | start | 0.712 |
0 | 0 | end | 1.520 |
0 | 1 | start | 3.140 |
0 | 1 | end | 4.120 |
1 | 0 | start | 0.550 |
1 | 0 | end | 1.550 |
1 | 1 | start | 0.430 |
1 | 1 | end | 1.420 |
2 | 0 | start | 4.100 |
2 | 0 | end | 4.512 |
2 | 1 | start | 2.500 |
2 | 1 | end | 5.000 |
Output:
machine_id | processing_time |
---|---|
0 | 0.894 |
1 | 0.995 |
2 | 1.456 |
Explanation:
There are 3 machines running 2 processes each.
Machine 0's average time is ((1.520 - 0.712) + (4.120 - 3.140)) / 2 = 0.894
Machine 1's average time is ((1.550 - 0.550) + (1.420 - 0.430)) / 2 = 0.995
Machine 2's average time is ((4.512 - 4.100) + (5.000 - 2.500)) / 2 = 1.456
내 문제 풀이 접근
WITH NUM AS
(SELECT machine_id, COUNT(machine_id)/2 machine_num
FROM Activity
GROUP BY machine_id)
SELECT machine_id, activity_type, SUM(timestamp) timestamp
FROM Activity
GROUP BY machine_id,activity_type
실행결과
machine_id | activity_type | timestamp |
---|---|---|
0 | start | 3.852000117301941 |
0 | end | 5.639999866485596 |
1 | start | 0.9800000190734863 |
1 | end | 2.96999990940094 |
2 | start | 6.599999904632568 |
2 | end | 9.51200008392334 |
이렇게 구해놓고 ID 별 Start-end 를 하려고 했는데 머리가 새하얘졌다.
다 구해놓고 못하는 느낌이었다.
저번에 튜터님이 SUM(), COUNT() 등 집계함수를 써버리면 코드를 더 만지기 힘들어지니까 상황을 잘 보고 사용하라는 말씀이 생각이 났고 어차피 테이블을 두번 참조해야 하니 접근 방법을 바꾸어 보았다.
정답 쿼리
SELECT s.machine_id, ROUND(AVG(e.timestamp- s.timestamp), 3) processing_time
FROM Activity s, Activity e
WHERE s.machine_id = e.machine_id and s.process_id = e.process_id
and s.activity_type = 'start'
and e.activity_type = 'end'
GROUP BY 1
요즘 SQL 문제를 풀면서 접근 방법이 적절하지 않다면 코드가 많이 복잡해지거나
연산해야 하는 경우가 많아져서 어디서부터 건드려야 할지에 대한 고민이 많고
머리가 복잡하다.