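Biologists are studying basic patterns in DNA sequences. Write a solution to identify sample_id with the following patterns:
- Sequences that start with ATG
- Sequences that end with TAA, TAG, or TGA
- Sequences containing the motif ATAT
- Sequences that have at least 3 consecutive G (like GGG or GGGG)
Return the result table ordered by sample_id in ascending order.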
The result format is in the following example.
CASE
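REGEXP_LIKE(STR, regex): checks whether the string STR matches the regular expression regex; TRUE if it matches, FALSE otherwise.
REGEXP_SUBSTR(STR, regex): extracts the substring of STR that matches the regular expression regex; NULL if there is no match.
DECODE(COL, VAL, A, B): returns A if the value of COL equals VAL, otherwise B.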
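As a quick check of the three functions described above, here is a minimal sketch that runs them against string literals. It assumes Oracle (the DUAL table and DECODE are Oracle features), and the sample strings 'ATGCTAGGG' and 'ATGCAT' are made up for illustration. In Oracle, REGEXP_LIKE is a condition rather than a value-returning function, so it is wrapped in CASE here.

```sql
-- Hypothetical literals, not rows from the Samples table.
SELECT
    -- REGEXP_LIKE is a condition, so CASE turns it into 1/0.
    CASE WHEN REGEXP_LIKE('ATGCTAGGG', 'G{3,}') THEN 1 ELSE 0 END AS like_flag,
    -- Returns the matched substring, or NULL when nothing matches.
    REGEXP_SUBSTR('ATGCTAGGG', 'G{3,}') AS matched,
    -- 'ATGCAT' has no run of three G, so REGEXP_SUBSTR is NULL and DECODE yields 0.
    DECODE(REGEXP_SUBSTR('ATGCAT', 'G{3,}'), NULL, 0, 1) AS decode_flag
FROM dual;
```

DECODE treats two NULLs as equal, which is why comparing against NULL works here even though `= NULL` would not in an ordinary condition.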
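SELECT
    sample_id,
    dna_sequence,
    species,
    -- Starts with the start codon ATG
    CASE
        WHEN dna_sequence LIKE 'ATG%' THEN 1
        ELSE 0
    END AS has_start,
    -- Ends with one of the stop codons TAA, TAG, TGA
    CASE
        WHEN REGEXP_LIKE(dna_sequence, '(TAA|TAG|TGA)$') THEN 1
        ELSE 0
    END AS has_stop,
    -- Contains the motif ATAT anywhere
    CASE
        WHEN dna_sequence LIKE '%ATAT%' THEN 1
        ELSE 0
    END AS has_atat,
    -- Contains at least 3 consecutive G: REGEXP_SUBSTR is NULL when there is no match
    DECODE(REGEXP_SUBSTR(dna_sequence, 'G{3,}'), NULL, 0, 1) AS has_ggg
FROM Samples
ORDER BY sample_id;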
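The has_ggg column could also be written in the same CASE style as the other flags. The following is only an alternative sketch, assuming the same Samples table and Oracle's REGEXP_LIKE condition; it should produce the same 1/0 values as the DECODE version.

```sql
-- Alternative form of has_ggg only; the other columns are omitted for brevity.
SELECT
    sample_id,
    CASE
        WHEN REGEXP_LIKE(dna_sequence, 'G{3,}') THEN 1
        ELSE 0
    END AS has_ggg
FROM Samples
ORDER BY sample_id;
```

This form does not rely on Oracle-specific DECODE, which may make it easier to adapt to other databases that offer a similar regex condition.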