CodeWars Coding Problem 2021/02/02 - Most frequently used words in a text

이호현 · February 2, 2021

[Problem]

Write a function that, given a string of text (possibly with punctuation and line-breaks), returns an array of the top-3 most occurring words, in descending order of the number of occurrences.

Assumptions:

  • A word is a string of letters (A to Z) optionally containing one or more apostrophes (') in ASCII. (No need to handle fancy punctuation.)
  • Matches should be case-insensitive, and the words in the result should be lowercased.
  • Ties may be broken arbitrarily.
  • If a text contains fewer than three unique words, then either the top-2 or top-1 words should be returned, or an empty array if a text contains no words.

Examples:

top_3_words("In a village of La Mancha, the name of which I have no desire to call to mind, there lived not long since one of those gentlemen that keep a lance in the lance-rack, an old buckler, a lean hack, and a greyhound for coursing. An olla of rather more beef than mutton, a salad on most nights, scraps on Saturdays, lentils on Fridays, and a pigeon or so extra on Sundays, made away with three-quarters of his income.")
# => ["a", "of", "on"]

top_3_words("e e e e DDD ddd DdD: ddd ddd aa aA Aa, bb cc cC e e e")
# => ["e", "ddd", "aa"]

top_3_words(" //wont won't won't")
# => ["won't", "wont"]

Bonus points (not really, but just for fun):
1. Avoid creating an array whose memory footprint is roughly as big as the input text.
2. Avoid sorting the entire array of unique words.

(Summary) Return the three most frequent words, ordered by occurrence count from highest to lowest.

[Solution]

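function topThreeWords(text) {
  // Split on runs of anything other than letters and apostrophes.
  const reg = /[^a-zA-Z']{1,}/;

  // Maps each lowercased word to its occurrence count.
  const mapText = new Map();

  text.toLowerCase().split(reg).filter(str => str.length).forEach(str => {
    mapText.get(str) ? mapText.set(str, mapText.get(str) + 1) : mapText.set(str, 1);
  });

  // Drop apostrophe-only tokens before taking the top three, so that such a
  // token cannot use up one of the three slots.
  const textCountArr = [...mapText]
    .filter(arr => /[a-z]/.test(arr[0]))
    .sort((a, b) => b[1] - a[1])
    .slice(0, 3);

  return textCountArr.map(arr => arr[0]);
}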

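Split the string on every run of characters other than letters and '.
Use a Map object to record each word together with the number of times it occurs.
Then sort by count; words with the same count are simply left in the order they already have.
Finally, exclude entries that consist only of ' and return the top 3 words.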
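
The solution above builds an array of every token and sorts every unique word, which is what the two bonus points in the problem ask to avoid. The following is only a minimal sketch of one way to address them, not the author's solution. The name topThreeWordsNoSort is hypothetical, and the sketch assumes a runtime that supports String.prototype.matchAll (ES2020). It reads words from an iterator instead of a split array, and it keeps a running list of at most three entries instead of sorting.

```javascript
// Hypothetical variant for the bonus points; not part of the original solution.
function topThreeWordsNoSort(text) {
  const counts = new Map();

  // matchAll returns an iterator, so no array holding every token is built.
  // The pattern requires at least one letter, so a token made only of
  // apostrophes is never counted as a word.
  for (const match of text.matchAll(/'*[a-z][a-z']*/gi)) {
    const word = match[0].toLowerCase();
    counts.set(word, (counts.get(word) || 0) + 1);
  }

  // Keep at most three [word, count] pairs, ordered by count (descending).
  const top = [];
  for (const entry of counts) {
    // Walk left past every kept entry with a strictly smaller count.
    let i = top.length;
    while (i > 0 && top[i - 1][1] < entry[1]) i--;
    if (i < 3) {
      top.splice(i, 0, entry);
      if (top.length > 3) top.pop();
    }
  }

  return top.map(([word]) => word);
}

console.log(topThreeWordsNoSort("e e e e DDD ddd DdD: ddd ddd aa aA Aa, bb cc cC e e e"));
console.log(topThreeWordsNoSort(" //wont won't won't"));
```

The Map of unique words is still needed, but each unique word is compared against at most three kept entries, so the cost of the selection step grows linearly with the number of unique words.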

filter(str => str.length)
// This part can also be written as below, because an empty string is falsy.
filter(str => str)
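
To see why the filter is needed at all and why the two forms are interchangeable here, this small sketch runs the split on the third example from the problem. The variable names tokens, byLength, and byTruthiness are chosen only for illustration.

```javascript
// Same separator pattern as in the solution.
const reg = /[^a-zA-Z']{1,}/;

// The leading " //" is itself a separator, so split produces an empty
// string as the first element.
const tokens = " //wont won't won't".toLowerCase().split(reg);
// tokens is ["", "wont", "won't", "won't"]

// Both callbacks reject the empty string: its length is 0, and "" is falsy.
const byLength = tokens.filter(str => str.length);
const byTruthiness = tokens.filter(str => str);

console.log(byLength);
console.log(byTruthiness);
```

Both calls leave the same three tokens, since every non-empty string is truthy and has a non-zero length.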

JS - Writing code more concisely
