SQL_코드카타(2024.01.22)

김수경·2024년 1월 22일

86. Average Time of Process per Machine

테이블 : Activity

+----------------+---------+
| Column Name | Type |
+----------------+---------+
| machine_id | int |
| process_id | int |
| activity_type | enum |
| timestamp | float |
+----------------+---------+
표에는 공장 웹사이트에 대한 사용자 활동이 나와 있습니다.
(machine_id, process_id, Activity_type)은 이 테이블의 기본 키(고유한 값을 가진 열의 조합)입니다.
machine_id는 머신의 ID입니다.
process_id는 ID가 machine_id인 머신에서 실행 중인 프로세스의 ID입니다.
Activity_type은 ('start', 'end') 유형의 ENUM(범주)입니다.
타임스탬프는 현재 시간을 초 단위로 나타내는 부동소수점입니다.
'start'는 기계가 주어진 타임스탬프에서 프로세스를 시작한다는 것을 의미하고 'end'는 기계가 주어진 타임스탬프에서 프로세스를 종료한다는 것을 의미합니다.
'시작' 타임스탬프는 항상 모든 (machine_id, process_id) 쌍에 대한 '종료' 타임스탬프 이전입니다.

문제

각각 동일한 수의 프로세스를 실행하는 여러 컴퓨터가 있는 공장 웹사이트가 있습니다. 각 기계가 프로세스를 완료하는 데 걸리는 평균 시간을 구하는 솔루션을 작성하세요.
프로세스를 완료하는 데 걸리는 시간은'end' timestamp - 'start' timestamp.입니다.
평균 시간은 머신의 모든 프로세스를 완료하는 데 걸린 총 시간을 실행된 프로세스 수로 나누어 계산합니다.
결과 테이블에는 machine_id와 함께 평균 시간이 표시 되어야 하며 소수점 이하 3자리에서 반올림processing_time 되어야 합니다 .
어떤 순서로든 결과 테이블을 반환합니다 .

output

+------------+-----------------+
| machine_id | processing_time |
+------------+-----------------+
| 0 | 0.894 |
| 1 | 0.995 |
| 2 | 1.456 |
+------------+-----------------+
Explanation:
There are 3 machines running 2 processes each.
Machine 0's average time is ((1.520 - 0.712) + (4.120 - 3.140)) / 2 = 0.894
Machine 1's average time is ((1.550 - 0.550) + (1.420 - 0.430)) / 2 = 0.995
Machine 2's average time is ((4.512 - 4.100) + (5.000 - 2.500)) / 2 = 1.456

👩🏻‍💻 My Coding

초보수준의 간단한 구문으로 풀 수 있는데도 행과의 연산..어케해야할지 모르겠다. 한개의 테이블이지만 조인을 사용해 조건을 주면 될 것 같다.
두 테이블의 machine_id와 process_id는 같고 각 테이블에서 하나는 'start' 하나는 'end'의 값만 뽑아서 계산 (해설참고함😭)

select
    *
from Activity a1 join Activity a2
on a1.machine_id = a2.machine_id
and a1.process_id = a2.process_id
and a1.activity_type = 'start'
and a2.activity_type = 'end'

machine_id	process_id	activity_type	timestamp	machine_id	process_id	activity_type	timestamp
0	0	start	0.712	0	0	end	1.52
0	1	start	3.14	0	1	end	4.12
1	0	start	0.55	1	0	end	1.55
1	1	start	0.43	1	1	end	1.42
2	0	start	4.1	2	0	end	4.512
2	1	start	2.5	2	1	end	5

이러면 위의 표처럼 나온다.
machine_id, process_id로 잘 묶였고 시간의 평균들만 계산하고 group by로 묶어준다.

select
    a1.machine_id as machine_id,
    round((sum(a2.timestamp - a1.timestamp)/2),3) as processing_time
from Activity a1 join Activity a2
on a1.machine_id = a2.machine_id
and a1.process_id = a2.process_id
and a1.activity_type = 'start'
and a2.activity_type = 'end'
group by 1

오답이다.
sum으로 직접 나누는 것이 아닌 avg로 평균을 구하자.

select
    a1.machine_id as machine_id,
    round(avg(a2.timestamp - a1.timestamp),3) as processing_time
from Activity a1 join Activity a2
on a1.machine_id = a2.machine_id
and a1.process_id = a2.process_id
and a1.activity_type = 'start'
and a2.activity_type = 'end'
group by 1

88. Students and Examinations

테이블 1 : Students

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| student_id | int |
| student_name | varchar |
+---------------+---------+
student_id is the primary key (column with unique values) for this table.
Each row of this table contains the ID and the name of one student in the school.

테이블 2 : Subjects

+--------------+---------+
| Column Name | Type |
+--------------+---------+
| subject_name | varchar |
+--------------+---------+
subject_name is the primary key (column with unique values) for this table.
Each row of this table contains the name of one subject in the school.

테이블 3 : Examinations

+--------------+---------+
| Column Name | Type |
+--------------+---------+
| student_id | int |
| subject_name | varchar |
+--------------+---------+
There is no primary key (column with unique values) for this table. It may contain duplicates.
Each student from the Students table takes every course from the Subjects table.
Each row of this table indicates that a student with ID student_id attended the exam of subject_name.

문제

각 학생이 각 시험에 참석한 횟수를 구하는 풀이를 작성하세요.
student_id 및 subject_name에 의해 정렬된 결과 테이블을 반환합니다.

output

+------------+--------------+--------------+----------------+
| student_id | student_name | subject_name | attended_exams |
+------------+--------------+--------------+----------------+
| 1 | Alice | Math | 3 |
| 1 | Alice | Physics | 2 |
| 1 | Alice | Programming | 1 |
| 2 | Bob | Math | 1 |
| 2 | Bob | Physics | 0 |
| 2 | Bob | Programming | 1 |
| 6 | Alex | Math | 0 |
| 6 | Alex | Physics | 0 |
| 6 | Alex | Programming | 0 |
| 13 | John | Math | 1 |
| 13 | John | Physics | 1 |
| 13 | John | Programming | 1 |
+------------+--------------+--------------+----------------+

👩🏻‍💻 My Coding

select
    student_id,
    student_name,
    subject_name,
    attended_exams
from 
    (
    select e.student_id as student_id,
    st.student_name as student_name,
    sb.subject_name as subject_name,
    count(*) as attended_exams
    from examinations e join students st on e.student_id = st.student_id
                    join subjects sb on e.subject_name = sb.subject_name 
    group by 1, 2, 3  
    ) a
where attended_exams >= 1

서브쿼리를 사용해, attended_exams = 0이 아닌 것들만 뽑아보려고하는데 어떤 조건을 줘도 계속 0이 같이 나온다..
문제는 그게 아니라, examination 테이블엔 student_id=6이 없어서 뽑히지 않았던게 문제였다.
cross join이나 left join을 활용해보자. (문제 제대로 읽어야겠음, 정렬 기준도 그냥 지나쳤다)
0이랑은 상관 없는 문제이다.

select 
    st.student_id,
    st.student_name,
    su.subject_name,
    COALESCE(count(e.student_id), 0) as attended_exams
from students st  cross join subjects su
                 left join Examinations e on st.student_id = e.student_id and su.subject_name = e.subject_name  
group by 1, 3
order by 1, 3

핵심 포인트!

corss join은 on 절이 없다
join 절에서 조건은 and로 묶는다
coalesce는 일치하는 행이 없으면 0을 반환하고 일치하면 count를 반환한다.

김수경

잘 하고 있는겨?

이전 포스트

SQL_코드카타(2024.01.10)

다음 포스트

SQL_코드카타(2024.01.22)

코드카타

86. Average Time of Process per Machine

테이블 : Activity

문제

output

👩🏻‍💻 My Coding

87. Employee Bonus

테이블1 : Employee

테이블2 : Bonus

문제

output

👩🏻‍💻 My Coding

88. Students and Examinations

테이블 1 : Students

테이블 2 : Subjects

테이블 3 : Examinations

문제

output

👩🏻‍💻 My Coding

핵심 포인트!

SQL_코드카타(2024.01.10)

Python_코드카타(2024.01.22)

0개의 댓글