Swift - Strings and Characters

임성빈 · March 4, 2022

String Literals

A string literal is a sequence of characters enclosed in double quotation marks (").

let something = "Some string literal value"

Multiline String Literals

To write a string that spans several lines, enclose it in three double quotation marks (""").

let nationalAnthem = """
동해물과 백두산이 마르고 닳도록 
하느님이 보우하사 우리나라 만세
무궁화 삼천리 화려 강산
대한 사람 대한으로 길이 보전하세
"""

A multiline string literal includes everything from the line after the opening """ up to the line before the closing """.

As a result, singleLineString and multiLineString below hold the same value.

let singleLineString = "They are the same."
let multiLineString = """
They are the same.
"""

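Inside a multiline string literal, a line break in the source normally becomes a line break in the string's value. To wrap a long line in the source without adding a line break to the value, end the line with a backslash (\). To make the string begin or end with a line feed, leave a blank line as the first or last line of the literal.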

let softWrappednationalAnthem = """

동해물과 백두산이 마르고 닳도록 \
하느님이 보우하사 우리나라 만세 \
무궁화 삼천리 화려 강산 \
대한 사람 대한으로 길이 보전하세

"""

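A multiline string literal can also be indented to match the surrounding code. The whitespace before the closing """ tells Swift how much leading whitespace to strip from every other line.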
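The soft-wrapping and indentation rules described above can be checked with a small sketch. The names softWrapped, padded, and indentedExample are hypothetical and exist only for this illustration.

```swift
// A backslash at the end of a source line joins it to the next line,
// so no line break ends up in the string's value.
let softWrapped = """
one \
two
"""

// A blank first and last line inside the delimiters becomes
// a leading and a trailing line feed.
let padded = """

middle

"""

// The closing delimiter is indented by 8 spaces, so 8 spaces are
// stripped from every line; extra indentation beyond that is kept.
func indentedExample() -> String {
    let text = """
        first line
            indented line
        last line
        """
    return text
}

print(softWrapped == "one two")                                          // true
print(padded == "\nmiddle\n")                                            // true
print(indentedExample() == "first line\n    indented line\nlast line")  // true
```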

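Special Characters in String Literals

String literals can include the following special characters.

\0 (null character), \\ (backslash), \t (horizontal tab), \n (line feed), \r (carriage return), \" (double quotation mark), \' (single quotation mark)
\u{n}, an arbitrary Unicode scalar value, where n is a 1-8 digit hexadecimal number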

let wiseWords = "\"Imagination is more important than knowledge\" - Einstein"
// "Imagination is more important than knowledge" - Einstein
let dollarSign = "\u{24}"           // $,  Unicode scalar U+0024
let blackHeart = "\u{2665}"         // ♥,  Unicode scalar U+2665
let sparklingHeart = "\u{1F496}"    // 💖, Unicode scalar U+1F496

Initializing an Empty String

The two variables below hold the same value: both are empty strings.

var emptyString = ""
var anotherEmptyString = String()
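As a small sketch of how this equivalence can be confirmed, the isEmpty property reports whether a String holds no characters. The constant names here are chosen for illustration.

```swift
let emptyString = ""
let anotherEmptyString = String()

// Both initializations produce an empty string, and the two values compare equal.
if emptyString.isEmpty && anotherEmptyString.isEmpty {
    print("Nothing to see here")
}
print(emptyString == anotherEmptyString)  // true
```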

String Mutability

var variableString = "Horse"
variableString += " and carriage"
// variableString = Horse and carriage

let constantString = "Highlander"
constantString += " and another Highlander"
// Compile-time error: a constant string (let) cannot be modified

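Strings Are Value Types

Swift's String is a value type.
When a String value is passed to a function or method, or assigned to a constant or variable, the value is copied; the receiver gets its own copy rather than a reference to the original.
Put another way, you can freely modify a string you received from elsewhere, because changing it never alters the original.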
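The copy behavior can be seen in a minimal sketch. The function shout(_:) and the variable names are hypothetical, introduced only to illustrate the point.

```swift
// Hypothetical helper: modifies its own local copy of the argument.
func shout(_ text: String) -> String {
    var copy = text   // parameters are constants, so take a mutable copy
    copy += "!"
    return copy
}

let original = "Horse"
var duplicate = original        // assignment copies the value
duplicate += " and carriage"    // only the copy changes

print(original)          // Horse
print(duplicate)         // Horse and carriage
print(shout(original))   // Horse!
print(original)          // Horse
```

Swift's standard library can defer the actual copy until a mutation happens, but that optimization is not observable from code like this: each value always behaves as if it were independent.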

Working with Characters

You can access the individual Character values of a string by iterating over it with a for-in loop.

for character in "Dog!🐶" {
	print(character)
}
// D
// o
// g
// !
// 🐶

You can declare a standalone Character constant by providing a Character type annotation.

let exclamationMark: Character = "!"

A String can also be created by passing an array of Character values to its initializer.

let catCharacters: [Character] = ["C", "a", "t", "!", "🐱"]
let catString = String(catCharacters)
print(catString)
// Prints "Cat!🐱"

Concatenating Strings and Characters

let string1 = "hi"
let string2 = " there"
var welcome = string1 + string2
// welcome : "hi there"

var instruction = "look over"
instruction += string2
// instruction : "look over there"

let exclamationMark: Character = "!"
welcome.append(exclamationMark)
//welcome : "hi there!"

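String Interpolation

Wrap a constant, variable, literal, or expression in parentheses prefixed by a backslash, \(...), to insert its value into a string literal.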

let multiplier = 3
let message = "\(multiplier) times 2 is \(Double(multiplier) * 2)"
// message : "3 times 2 is 6.0"

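Unicode

Unicode is an international standard for encoding, representing, and processing text in the world's writing systems consistently. Swift's String and Character types are fully Unicode-compliant.

Unicode Scalar Values

Swift's native String type is built from Unicode scalar values.
A Unicode scalar value is a unique 21-bit number identifying a character or modifier, in the range U+0000 to U+D7FF or U+E000 to U+10FFFF.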
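A short sketch of inspecting scalar values follows. The constant names heart and hexValues are illustrative. The surrogate range U+D800 to U+DFFF is excluded from Unicode scalar values, which is why the failable initializer returns nil for it.

```swift
let heart: Character = "\u{1F496}"

// Collect each scalar's numeric value, formatted as hexadecimal.
var hexValues: [String] = []
for scalar in heart.unicodeScalars {
    // scalar.value is a UInt32
    hexValues.append("U+" + String(scalar.value, radix: 16, uppercase: true))
}
print(hexValues)  // ["U+1F496"]

// Surrogate code points are not scalar values, so initialization fails.
let surrogate = Unicode.Scalar(0xD800 as UInt32)
let valid = Unicode.Scalar(0x1F496 as UInt32)
print(surrogate == nil)  // true
print(valid != nil)      // true
```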

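Extended Grapheme Clusters

Every Character value represents a single extended grapheme cluster: a sequence of one or more Unicode scalars that, when combined, produce a single human-readable character.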

let eAcute: Character = "\u{E9}"					// é
let combinedEAcute: Character = "\u{65}\u{301}" 	// e +  ́

The following writes the Hangul syllable 한 once as a single precomposed scalar and once as a sequence of three conjoining jamo: ᄒ, ᅡ, and ᆫ.

let precomposed: Character = "\u{D55C}"
let decomposed: Character = "\u{1112}\u{1161}\u{11AB}"
// precomposed: 한, decomposed: ᄒ + ᅡ + ᆫ, rendered as 한

An enclosing mark such as the combining enclosing circle (U+20DD) can wrap another character; here it encloses é.

let enclosedEAcute: Character = "\u{E9}\u{20DD}"
// enclosedEAcute : é⃝

The regional indicator symbols U (U+1F1FA) and S (U+1F1F8) combine into a single Character.

let regionalIndicatorForUS: Character = "\u{1F1FA}\u{1F1F8}"
// regionalIndicatorForUS : 🇺🇸
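One way to see what a grapheme cluster is made of is to compare a string's count with the count of its unicodeScalars view. This is a small sketch, and the constant names are illustrative.

```swift
let precomposedHan = "\u{D55C}"                   // one scalar
let decomposedHan = "\u{1112}\u{1161}\u{11AB}"    // three conjoining jamo
let flag = "\u{1F1FA}\u{1F1F8}"                   // two regional indicators

// Each string is a single Character, whatever the number of scalars behind it.
print(precomposedHan.count, precomposedHan.unicodeScalars.count)  // 1 1
print(decomposedHan.count, decomposedHan.unicodeScalars.count)    // 1 3
print(flag.count, flag.unicodeScalars.count)                      // 1 2
```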

Counting Characters

To get the number of Character values in a string, use its count property.

let unusualMenagerie = "Koala 🐨, Snail 🐌, Penguin 🐧, Dromedary 🐪"
print("unusualMenagerie has \(unusualMenagerie.count) characters")
// Prints "unusualMenagerie has 40 characters"
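Because count is measured in extended grapheme clusters, appending a combining scalar does not necessarily change it. A minimal sketch, with the variable names chosen for illustration:

```swift
var word = "cafe"
let countBefore = word.count
print(countBefore)        // 4

// U+0301 (combining acute accent) attaches to the final "e"
// instead of forming a new character.
word += "\u{301}"
print(word, word.count)   // café 4
```

A practical consequence is that count has to walk the string's scalars to find the cluster boundaries, so it is not a constant-time property lookup.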

Accessing and Modifying a String

You can access and modify a string through its methods and properties, or with subscript syntax.

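String Indices

Use the startIndex and endIndex properties and the index(before:), index(after:), and index(_:offsetBy:) methods to reach a character at a particular position in a string. Note that endIndex is the position just past the last character, so it is not a valid subscript argument.

Note
These properties and methods are available on any type that conforms to the Collection protocol, including Array, Dictionary, and Set.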

let greeting = "Hi there!"
greeting[greeting.startIndex]
// H
greeting[greeting.index(before: greeting.endIndex)]
// !
greeting[greeting.index(after: greeting.startIndex)]
// i
let index = greeting.index(greeting.startIndex, offsetBy: 7)
greeting[index]
// e

Accessing an index outside the string's range, or the character at such an index, triggers a runtime error.

greeting[greeting.endIndex]
// Runtime error
greeting.index(after: greeting.endIndex)
// Runtime error
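One way to avoid this kind of runtime error is index(_:offsetBy:limitedBy:), which returns nil instead of trapping when the offset would pass the limit. The helper character(in:at:) below is a hypothetical function written for this sketch; it is not part of the standard library.

```swift
// Hypothetical helper: returns the character at an integer offset,
// or nil when the offset is out of range.
func character(in text: String, at offset: Int) -> Character? {
    guard offset >= 0,
          let index = text.index(text.startIndex, offsetBy: offset, limitedBy: text.endIndex),
          index < text.endIndex   // endIndex itself is not a valid subscript
    else { return nil }
    return text[index]
}

let sample = "Hi there!"
print(character(in: sample, at: 7) == "e")   // true
print(character(in: sample, at: 9) == nil)   // true, offset 9 is endIndex
print(character(in: sample, at: 42) == nil)  // true
```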

Use the indices property to iterate over the indices of all the individual characters in a string.

for index in greeting.indices {
	print("\(greeting[index]) ", terminator: "")
}
// H i   t h e r e ! 

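Inserting and Removing

To insert and remove characters, use the insert(_:at:), insert(contentsOf:at:), remove(at:), and removeSubrange(_:) methods.

Note
These methods are available on any type that conforms to the RangeReplaceableCollection protocol, such as Array. Dictionary and Set conform to Collection but not to RangeReplaceableCollection.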

var welcome = "hello"
welcome.insert("!", at: welcome.endIndex)
// welcome : hello!

welcome.insert(contentsOf: " there", at: welcome.index(before: welcome.endIndex))
// welcome : hello there!

welcome.remove(at: welcome.index(before: welcome.endIndex))
//  welcome : hello there

let range = welcome.index(welcome.endIndex, offsetBy: -6)..<welcome.endIndex
welcome.removeSubrange(range)
// welcome : hello

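Substrings

When you take part of a string, for example with a subscript or a method such as prefix(_:), the result is an instance of Substring, not String.
Use a Substring only for a short time while working on the string; when you need to keep the result longer, convert it to a String instance.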

let greeting = "Hello, World!"
let index = greeting.firstIndex(of: ",") ?? greeting.endIndex
let beginning = greeting[..<index]
// beginning : Hello

// Convert beginning, a Substring, to a String
let newString = String(beginning)

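The reason for the conversion is memory management.
A Substring does not hold its own copy of the characters; it reuses the memory of the original String.

As long as the Substring is in use, the entire original String has to stay in memory, including the parts that are no longer needed.
For a long-lived value, creating a new String from the Substring keeps only the characters you actually need in memory, which is more efficient.

Note
Both String and Substring conform to StringProtocol.
The convenient string-manipulation methods are therefore available on both.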
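Because both types conform to StringProtocol, a function can be written once and accept either. The function wordCount(in:) is a hypothetical example for this sketch.

```swift
// Hypothetical helper: generic over StringProtocol, so it accepts
// a String or a Substring without any conversion.
func wordCount<S: StringProtocol>(in text: S) -> Int {
    return text.split(separator: " ").count
}

let sentence = "Hello, World! Nice to meet you"
let firstPart = sentence.prefix(13)   // Substring: "Hello, World!"
let stored = String(firstPart)        // independent String for long-term use

print(wordCount(in: sentence))    // 6
print(wordCount(in: firstPart))   // 2
print(stored)                     // Hello, World!
```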

Comparing Strings and Characters

Strings and characters are compared with the == and != operators.

let quotation = "We're a lot alike, you and I."
let sameQuotation = "We're a lot alike, you and I."

if quotation == sameQuotation {
    print("These two strings are considered equal.")
} else {
    print("These two strings are not equivalent.")
}
// Prints "These two strings are considered equal."

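String and Character comparison is based on extended grapheme clusters: two values are equal if they are canonically equivalent, even when they are composed of different Unicode scalars.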

let eAcuteQuestion = "Voulez-vous un caf\u{E9}?"
// "Voulez-vous un café?"

let combinedEAcuteQuestion = "Voulez-vous un caf\u{65}\u{301}?"
// "Voulez-vous un café?"

if eAcuteQuestion == combinedEAcuteQuestion {
    print("These two strings are considered equal.")
} else {
    print("These two strings are not equivalent.")
}
// Prints "These two strings are considered equal."
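To make the distinction concrete, the sketch below compares the same two spellings at the String level and at the scalar level. The constant names are illustrative.

```swift
let precomposedCafe = "caf\u{E9}"          // é as a single scalar
let combinedCafe = "caf\u{65}\u{301}"      // e followed by a combining accent

// String equality uses canonical equivalence, so these match.
let stringsEqual = precomposedCafe == combinedCafe

// The underlying scalar sequences differ.
let scalarsEqual = precomposedCafe.unicodeScalars
    .elementsEqual(combinedCafe.unicodeScalars)

print(stringsEqual)   // true
print(scalarsEqual)   // false
print(precomposedCafe.unicodeScalars.count, combinedCafe.unicodeScalars.count)  // 4 5
```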

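Conversely, characters that look identical but are distinct Unicode characters with different linguistic meanings are not considered equal.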

let latinCapitalLetterA: Character = "\u{41}"
// A (U+0041, Latin capital letter A)

let cyrillicCapitalLetterA: Character = "\u{0410}"
// А (U+0410, Cyrillic capital letter A)

if latinCapitalLetterA == cyrillicCapitalLetterA {
    print("These two characters are considered equal.")
} else {
    print("These two characters are not equivalent.")
}
// Prints "These two characters are not equivalent."

Prefix and Suffix Equality

To check whether a string has a particular prefix or suffix, use the hasPrefix(_:) and hasSuffix(_:) methods.

let romeoAndJuliet = [
    "Act 1 Scene 1: Verona, A public place",
    "Act 1 Scene 2: Capulet's mansion",
    "Act 1 Scene 3: A room in Capulet's mansion",
    "Act 1 Scene 4: A street outside Capulet's mansion",
    "Act 1 Scene 5: The Great Hall in Capulet's mansion",
    "Act 2 Scene 1: Outside Capulet's mansion",
    "Act 2 Scene 2: Capulet's orchard",
    "Act 2 Scene 3: Outside Friar Lawrence's cell",
    "Act 2 Scene 4: A street in Verona",
    "Act 2 Scene 5: Capulet's mansion",
    "Act 2 Scene 6: Friar Lawrence's cell"
]

The following code counts how many strings in the array begin with the prefix "Act 1 ".

var act1SceneCount = 0
for scene in romeoAndJuliet {
    if scene.hasPrefix("Act 1 ") {
        act1SceneCount += 1
    }
}
print("There are \(act1SceneCount) scenes in Act 1")
// There are 5 scenes in Act 1

The following code counts how many strings in the array end with the suffix "Capulet's mansion" and how many end with "Friar Lawrence's cell".

var mansionCount = 0
var cellCount = 0
for scene in romeoAndJuliet {
    if scene.hasSuffix("Capulet's mansion") {
        mansionCount += 1
    } else if scene.hasSuffix("Friar Lawrence's cell") {
        cellCount += 1
    }
}
print("\(mansionCount) mansion scenes; \(cellCount) cell scenes")
// 6 mansion scenes; 2 cell scenes
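The same counts can also be written with filter instead of an explicit loop. This is an alternative sketch, not the approach used above; it uses a shortened, hypothetical scenes array to stay self-contained.

```swift
// Shortened sample data for illustration only.
let scenes = [
    "Act 1 Scene 1: Verona, A public place",
    "Act 1 Scene 2: Capulet's mansion",
    "Act 2 Scene 3: Outside Friar Lawrence's cell"
]

// filter keeps the matching elements; count reports how many there are.
let act1Count = scenes.filter { $0.hasPrefix("Act 1 ") }.count
let mansionScenes = scenes.filter { $0.hasSuffix("Capulet's mansion") }.count
let cellScenes = scenes.filter { $0.hasSuffix("Friar Lawrence's cell") }.count

print(act1Count, mansionScenes, cellScenes)  // 2 1 1
```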

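Unicode Representations of Strings

When a Unicode string is written to a text file or other storage, its Unicode scalars are encoded in one of several Unicode encoding forms, such as UTF-8, UTF-16, or UTF-32.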

let dogString = "Dog‼🐶"

UTF-8 Representation

for codeUnit in dogString.utf8 {
    print("\(codeUnit) ", terminator: "")
}
print("")
// 68 111 103 226 128 188 240 159 144 182

UTF-16 Representation

for codeUnit in dogString.utf16 {
    print("\(codeUnit) ", terminator: "")
}
print("")
// 68 111 103 8252 55357 56374

Unicode Scalar Representation

for scalar in dogString.unicodeScalars {
    print("\(scalar.value) ", terminator: "")
}
print("")
// 68 111 103 8252 128054

for scalar in dogString.unicodeScalars {
    print("\(scalar) ")
}
// D
// o
// g
// ‼
// 🐶
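As a quick summary of the three views, the sketch below compares their lengths for the same string. The constant name dog is illustrative, and the string is written with \u escapes so the scalars involved are explicit.

```swift
// D, o, g, U+203C (double exclamation mark), U+1F436 (dog face)
let dog = "Dog\u{203C}\u{1F436}"

print(dog.count)                 // 5  Characters
print(dog.unicodeScalars.count)  // 5  scalars
print(dog.utf16.count)           // 6  code units: U+1F436 needs a surrogate pair
print(dog.utf8.count)            // 10 code units: 3 + 3 + 4 bytes
```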
