문제출처 : https://www.hackerrank.com/challenges/interviews/problem?isFullScreen=true
Samantha interviews many candidates from different colleges using coding challenges and contests. Write a query to print the contest_id, hacker_id, name, and the sums of total_submissions, total_accepted_submissions, total_views, and total_unique_views for each contest sorted by contest_id. Exclude the contest from the result if all four sums are 0.
Note: A specific contest can be used to screen candidates at more than one college, but each college only holds 1 screening contest.
The following tables hold interview data:
Contests: The contest_id is the id of the contest, hacker_id is the id of the hacker who created the contest, and name is the name of the hacker.
Colleges: The college_id is the id of the college, and contest_id is the id of the contest that Samantha used to screen the candidates.
Challenges: The challenge_id is the id of the challenge that belongs to one of the contests whose contest_id Samantha forgot, and college_id is the id of the college where the challenge was given to candidates.
View_Stats: The challenge_id is the id of the challenge, total_views is the number of times the challenge was viewed by candidates, and total_unique_views is the number of times the challenge was viewed by unique candidates.
Submission_Stats: The challenge_id is the id of the challenge, total_submissions is the number of submissions for the challenge, and total_accepted_submission is the number of submissions that achieved full scores.
테이블도 너무 많고.. hackerrank 문제의 경우 이해하는게 문제 해결의 대부분을 차지한다고 생각할만큼 내가 문제를 이해하는데 조금 시간이 오래 걸리는 것 같다...
우선 문제는 다음과 같은 요구사항을 가지고 있다.
Samantha interviews many candidates from different colleges using coding challenges and contests.
코딩챌린지,대회를 통해 많은 후보자들을 인터뷰한다.
Write a query to print the contest_id, hacker_id, name, and the sums of total_submissions
contest_id,hacker_id,name,submission의 합을 쿼리해라.
total_accepted_submissions, total_views, and total_unique_views for each contest sorted by contest_id. Exclude the contest from the result if all four sums are 0.
각 콘테스트의 total_accepted_submissions, total_views 및 total_unique_views를 contest_id 정렬해라. 그리고 네 가지 합계가 모두 0이면 결과에서 해당 콘테스트를 제외시켜라.
그럼 나는 위에서 제공한 여러 테이블을 가지고 total_accepted_submissions, total_views, and total_unique_views 을 먼저 구해야겠다고 생각하였다.
일단 college,contest,challenge 3개의 테이블을 조합하여 대학별 속한 챌린지,콘테스트 들을 알기 위해 조인을 진행하였다.
SELECT
CT.CONTEST_ID,CT.HACKER_ID,CT.NAME
FROM
CONTESTS CT
JOIN
COLLEGES CL ON CT.CONTEST_ID = CL.CONTEST_ID
JOIN
CHALLENGES CH ON CH.COLLEGE_ID = CL.COLLEGE_ID;
이렇게 구한 테이블에서 이제 문제에서 요구하는 총 점수들을 구해야 하기 떄문에 다음과 같이 진행하였다.
SELECT
CHALLENGE_ID,
SUM(TOTAL_VIEWS) TOTAL_VIEWS,
SUM(TOTAL_UNIQUE_VIEW) TOTAL_UNIQUE_VIEW
FROM
VIEW_STATS
GROUP BY
CHALLENGE_ID
SELECT
CHALLENGE_ID,
SUM(TOTAL_SUBMISSIONS) TOTAL_SUBMISSIONS,
SUM(TOTAL_ACCEPTED_SUBMISSIONS) TOTAL_ACCEPTED_SUBMISSIONS
FROM
SUBMISSION_STATS
GROUP BY
CHALLENGE_ID
SELECT
CT.CONTEST_ID,CT.HACKER_ID,CT.NAME
FROM
CONTESTS CT
JOIN
COLLEGES CL ON CT.CONTEST_ID = CL.CONTEST_ID
JOIN
CHALLENGES CH ON CH.COLLEGE_ID = CL.COLLEGE_ID
JOIN
(
SELECT
CHALLENGE_ID,
SUM(TOTAL_VIEWS) TOTAL_VIEWS,
SUM(TOTAL_UNIQUE_VIEW) TOTAL_UNIQUE_VIEW
FROM
VIEW_STATS
GROUP BY
CHALLENGE_ID
) V ON CH.CHALLENGE_ID = V.CHALLENGED_ID
JOIN
(
SELECT
CHALLENGE_ID,
SUM(TOTAL_SUBMISSIONS) TOTAL_SUBMISSIONS,
SUM(TOTAL_ACCEPTED_SUBMISSIONS) TOTAL_ACCEPTED_SUBMISSIONS
FROM
SUBMISSION_STATS
GROUP BY
CHALLENGE_ID
) SM ON CH.CHALLENGE_ID = SM.CHALLENGE_ID
마지막으로 문제에서 주어진 조건 CONTEST_ID,HACKER_ID,NAME 별 4개 항목의 총합이 0인 경우는 제외하는 옵션과 정렬 옵션 추가해주자.
/*
Enter your query here.
*/
SELECT
CT.CONTEST_ID,CT.HACKER_ID,CT.NAME,
SUM(TOTAL_VIEWS),
SUM(TOTAL_UNIQUE_VIEW),
SUM(TOTAL_SUBMISSIONS),
SUM(TOTAL_ACCEPTED_SUBMISSIONS)
FROM
CONTESTS CT
JOIN
COLLEGES CL ON CT.CONTEST_ID = CL.CONTEST_ID
JOIN
CHALLENGES CH ON CH.COLLEGE_ID = CL.COLLEGE_ID
JOIN
(
SELECT
CHALLENGE_ID,
SUM(TOTAL_VIEWS) TOTAL_VIEWS,
SUM(TOTAL_UNIQUE_VIEW) TOTAL_UNIQUE_VIEW
FROM
VIEW_STATS
GROUP BY
CHALLENGE_ID
) V ON CH.CHALLENGE_ID = V.CHALLENGED_ID
JOIN
(
SELECT
CHALLENGE_ID,
SUM(TOTAL_SUBMISSIONS) TOTAL_SUBMISSIONS,
SUM(TOTAL_ACCEPTED_SUBMISSIONS) TOTAL_ACCEPTED_SUBMISSIONS
FROM
SUBMISSION_STATS
GROUP BY
CHALLENGE_ID
) SM ON CH.CHALLENGE_ID = SM.CHALLENGE_ID
GROUP BY 1,2,3
HAVING HAVING SUM(TOTAL_SUBMISSIONS) != 0 OR
SUM(TOTAL_ACCEPTED_SUBMISSIONS) != 0 OR
SUM(TOTAL_VIEWS) != 0 OR
SUM(TOTAL_UNIQUE_VIEW) != 0
ORDER BY 1;
위처럼 작성했을 때 조인 시 각 테이블간의 관계를 고려하지 않고 조인하여 테스트 케이스에서 통과하지 못했다
따라서 테이블 간의 관계를 고려하여 다음과 같이 쿼리를 수정하였더니(INNER,LEFT) 테스트 케이스에 통과하였다.
/*
Enter your query here.
*/
SELECT
CT.CONTEST_ID,CT.HACKER_ID,CT.NAME,
SUM(TOTAL_SUBMISSIONS),
SUM(TOTAL_ACCEPTED_SUBMISSIONS),
SUM(TOTAL_VIEWS),
SUM(TOTAL_UNIQUE_VIEWS)
FROM
CONTESTS CT
INNER JOIN
COLLEGES CL ON CT.CONTEST_ID = CL.CONTEST_ID
LEFT JOIN
CHALLENGES CH ON CH.COLLEGE_ID = CL.COLLEGE_ID
LEFT JOIN
(
SELECT
CHALLENGE_ID,
SUM(TOTAL_VIEWS) TOTAL_VIEWS,
SUM(TOTAL_UNIQUE_VIEWS) TOTAL_UNIQUE_VIEWS
FROM
VIEW_STATS
GROUP BY
CHALLENGE_ID
) V ON CH.CHALLENGE_ID = V.CHALLENGE_ID
LEFT JOIN
(
SELECT
CHALLENGE_ID,
SUM(TOTAL_SUBMISSIONS) TOTAL_SUBMISSIONS,
SUM(TOTAL_ACCEPTED_SUBMISSIONS) TOTAL_ACCEPTED_SUBMISSIONS
FROM
SUBMISSION_STATS
GROUP BY
CHALLENGE_ID
) SM ON CH.CHALLENGE_ID = SM.CHALLENGE_ID
GROUP BY 1,2,3
HAVING
SUM(TOTAL_SUBMISSIONS) != 0 OR
SUM(TOTAL_ACCEPTED_SUBMISSIONS) != 0 OR
SUM(TOTAL_VIEWS) != 0 OR
SUM(TOTAL_UNIQUE_VIEWS)
ORDER BY 1;
JOIN 시 테이블간의 관계를 명확하게 판단하여 조인 방식을 지정해야겠다는 생각을 다시한번 할 수 있는 문제였다 문제는 크게 어렵지 않았으나 JOIN 방식을 달리하여 꽤나 고생했던 문제이다..