[Kaggle] Courses - Python (4)

HO94 · June 24, 2021

2021.06.24

The post I uploaded on the 23rd is set to private, and even when I change it to public it stays private... why...?

6. Strings and Dictionaries (Exercise)

  1. Validating zip codes
    There is a saying that "Data scientists spend 80% of their time cleaning data, and 20% of their time complaining about cleaning data." Let's see if you can write a function to help clean US zip code data. Given a string, it should return whether or not that string represents a valid zip code. For our purposes, a valid zip code is any string consisting of exactly 5 digits.
    HINT: str has a method that will be useful here. Use help(str) to review a list of string methods.
def is_valid_zip(zip_code):
    """Returns whether the input string is a valid (5 digit) zip code
    """
    if zip_code.isdigit() and len(zip_code) == 5:
        return True
    return False
# Check your answer
q1.check()

This was my first time seeing str.isdigit().
It is a string method that tells you whether the string consists only of digits.
At first I simply wrote return len(zip_code) == 5, and got this:

Expected return value of False given zip_code='1234x', but got True instead.

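That was the error message.
Seeing things like this, I think the people who wrote the exercise are impressive too... how did they know someone would solve it this way and prepare an error example for it?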
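
To see why the length check alone is not enough, here is a small sketch that runs both versions side by side. The name is_valid_zip_length_only is hypothetical and stands in for my first attempt; it is not part of the exercise.

```python
def is_valid_zip_length_only(zip_code):
    # First attempt (hypothetical name): only checks the length,
    # so a string like '1234x' is wrongly accepted
    return len(zip_code) == 5


def is_valid_zip(zip_code):
    # Both conditions are needed: exactly 5 characters, all of them digits
    return len(zip_code) == 5 and zip_code.isdigit()


for candidate in ["12345", "1234x", "1234", "123456"]:
    print(candidate, is_valid_zip_length_only(candidate), is_valid_zip(candidate))
```

Returning the boolean expression directly behaves the same as the if / return True / return False version above; it is just a bit shorter.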


  2. Returning the indices of documents that contain the keyword
    A researcher has gathered thousands of news articles. But she wants to focus her attention on articles including a specific word. Complete the function below to help her filter her list of articles.
    Your function should meet the following criteria:
  • Do not include documents where the keyword string shows up only as a part of a larger word. For example, if she were looking for the keyword “closed”, you would not include the string “enclosed.”
  • She does not want you to distinguish upper case from lower case letters. So the phrase “Closed the case.” would be included when the keyword is “closed”
  • Do not let periods or commas affect what is matched. “It is closed.” would be included when the keyword is “closed”. But you can assume there are no other types of punctuation.
def word_search(doc_list, keyword):
    """
    Takes a list of documents (each document is a string) and a keyword. 
    Returns list of the index values into the original list for all documents 
    containing the keyword.
    Example:
    >>> doc_list = ["The Learn Python Challenge Casi.", "They bought a car", "Casiville"]
    >>> word_search(doc_list, 'casi')
    [0]
    """
    # list to hold the indices of matching documents
    indices = [] 
    # Iterate through the indices (i) and elements (doc) of documents
    for i, doc in enumerate(doc_list):
        # Split the string doc into a list of words (according to whitespace)
        tokens = doc.split()
        # Make a transformed list where we 'normalize' each word to facilitate matching.
        # Periods and commas are removed from the end of each word, and it's set to all lowercase.
        normalized = [token.rstrip('.,').lower() for token in tokens]
        # Is there a match? If so, update the list of matching indices.
        if keyword.lower() in normalized:
            indices.append(i)
    return indices
# Check your answer
q2.check()

enumerate
I was stuck on how to get hold of the index and ended up looking at the solution.
I had seen enumerate before, but I didn't know it was meant to be used like this...
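
A minimal sketch of what enumerate does, using the document list from the docstring. The variable names (docs, pairs, long_doc_indices) and the length condition are made up for illustration only.

```python
docs = ["The Learn Python Challenge Casi.", "They bought a car", "Casiville"]

# enumerate yields (index, element) pairs, so no manual counter is needed
pairs = list(enumerate(docs))
print(pairs)

# The same pattern collects the indices of elements that meet a condition
# (here an arbitrary one: documents longer than 10 characters)
long_doc_indices = [i for i, doc in enumerate(docs) if len(doc) > 10]
print(long_doc_indices)
```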
strip()
Until now I had only used remove to get rid of , or ., so I learned that strip can do it too.
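
A small sketch of how the stripping in the solution behaves. The sample strings are my own. One thing worth noting: rstrip and strip only touch the ends of a string, while str.replace removes a character everywhere, so they are not interchangeable.

```python
tokens = "It is closed.".split()

# rstrip('.,') removes any trailing periods or commas from each word
normalized = [token.rstrip(".,").lower() for token in tokens]
print(normalized)

# rstrip only affects the right end; the period in the middle stays
right_only = "a.b,".rstrip(".,")
print(right_only)

# strip works on both ends
both_ends = ".a.b,".strip(".,")
print(both_ends)

# replace removes every occurrence, including the ones in the middle
everywhere = "a.b,".replace(".", "").replace(",", "")
print(everywhere)
```

For this exercise rstrip is enough, since the problem states that periods and commas are the only punctuation and they follow a word.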


  3. Same as number 2, but for multiple keywords, returned as a dictionary
    Now the researcher wants to supply multiple keywords to search for. Complete the function below to help her.
    (You're encouraged to use the word_search function you just wrote when implementing this function. Reusing code in this way makes your programs more robust and readable - and it saves typing!)
def multi_word_search(doc_list, keywords):
    """
    Takes list of documents (each document is a string) and a list of keywords.  
    Returns a dictionary where each key is a keyword, and the value is a list of indices
    (from doc_list) of the documents containing that keyword
    >>> doc_list = ["The Learn Python Challenge Casi.", "They bought a car and a casi", "Casiville"]
    >>> keywords = ['casi', 'they']
    >>> multi_word_search(doc_list, keywords)
    {'casi': [0, 1], 'they': [1]}
    """
    answer = {}
    for keyword in keywords:
        answer[keyword] = word_search(doc_list, keyword)
    return answer
# Check your answer
q3.check()
