Charset and Collation

Blaster · August 14, 2021

MySQL Tutorial (part 2 of 3)

🔠 About Character Encoding

Computers work in binary, representing numbers with 0s and 1s. As a result, everything a user types, including letters, digits, and whitespace, is stored as binary numbers. The full collection of characters a computer can use is called a character set.

The process of turning these characters into codes the computer can read is called character encoding. ASCII and UNICODE are the best-known examples.

ASCII

  • One of the earliest character sets, introduced in the 1960s
  • Uses only 7 bits
  • Represents 128 characters
  • Among natural languages, only English is fully covered
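
To make the 7-bit idea concrete, here is a minimal Python sketch. The helper name `ascii_bits` is made up for this post; it only shows that each ASCII character maps to a number below 128 and that Hangul has no place in that range.

```python
# ASCII maps each character to a number in the range 0..127 (7 bits).
def ascii_bits(ch: str) -> str:
    """Return the 7-bit binary pattern of a single ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    return format(code, "07b")

print(ascii_bits("A"))       # 1000001 (code 65)
print("A".encode("ascii"))   # b'A', stored as the single byte 0x41

# Hangul is outside ASCII, so encoding it as ASCII fails.
try:
    "한".encode("ascii")
except UnicodeEncodeError as err:
    print("cannot encode:", err.reason)
```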

UNICODE

  • A character set introduced in the 1990s
  • Represents the characters of almost every language
  • Emoji are supported 😉
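
UNICODE gives every character a number called a code point, regardless of language. The short Python sketch below (the helper `code_point` is just an illustrative name) prints the code point and official name for a Latin letter, a Hangul syllable, and an emoji.

```python
import unicodedata

def code_point(ch: str) -> str:
    """Format a character's Unicode code point as U+XXXX."""
    return f"U+{ord(ch):04X}"

for ch in ["A", "한", "😉"]:
    print(ch, code_point(ch), unicodedata.name(ch))
```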

EUC

  • Extended Unix Code, an encoding family for East Asian languages

😳 UTF-8? EUC-KR?

UTF-8 and EUC-KR are the two character encodings most commonly used in Korea, so it is natural to wonder how they differ and which one to choose.

UTF-8

  • An encoding of UNICODE
  • Uses 1 to 4 bytes per character
  • Hangul takes 3 bytes, emoji take 4 bytes
  • Can represent the characters of every language
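
The byte counts above are easy to check yourself. This is a small Python sketch (with a helper name, `utf8_len`, chosen for this post) that encodes one sample character from each size class and counts the bytes.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))

for ch in ["A", "é", "한", "😉"]:
    encoded = ch.encode("utf-8")
    print(ch, utf8_len(ch), "byte(s):", encoded.hex(" "))
```

ASCII characters stay at 1 byte, which is why plain English text is identical in ASCII and UTF-8.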

EUC-KR

  • Hangul takes 2 bytes
  • ASCII characters take 1 byte
  • Well suited to pages that contain only Korean and English
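
To compare the two encodings side by side, the Python sketch below encodes the same text both ways. The function `encoded_sizes` is a hypothetical helper, and it relies on Python's built-in `euc_kr` codec. A size of `None` means the text cannot be represented in that encoding.

```python
def encoded_sizes(text: str) -> dict:
    """Byte length of text in EUC-KR and UTF-8 (None if not encodable)."""
    sizes = {}
    for name in ("euc_kr", "utf-8"):
        try:
            sizes[name] = len(text.encode(name))
        except UnicodeEncodeError:
            sizes[name] = None
    return sizes

print(encoded_sizes("한글 ABC"))   # {'euc_kr': 8, 'utf-8': 10}
print(encoded_sizes("한글 😉"))    # EUC-KR has no emoji, so its size is None
```

EUC-KR is more compact for Korean text, but it gives up everything outside its own repertoire, emoji included.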

📚 MySQL Character Sets

If you list the character sets in MySQL, you will find both utf8 and utf8mb4. Both are UTF-8 encodings of UNICODE, and utf8 works as long as your text contains no 4-byte characters, but with emoji use growing along with mobile, utf8mb4 is widely recommended.

The UTF-8 we know uses up to 4 bytes and can encode emoji, so why does MySQL need a separate utf8mb4? The reason is as follows.

utf8

  • Supports at most 3 bytes per character

utf8mb4

  • Available since MySQL 5.5.3
  • Supports up to 4 bytes per character
  • Can store emoji
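
The difference comes down to a per-character byte limit. The Python sketch below only imitates that limit outside the database. `fits_in_3_bytes` is a made-up helper and not anything MySQL provides, but it shows which strings a 3-byte utf8 column could hold.

```python
def fits_in_3_bytes(text: str) -> bool:
    """True if every character needs at most 3 bytes in UTF-8,
    which is the limit of MySQL's 3-byte utf8 charset."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)

print(fits_in_3_bytes("한글 ABC"))   # True: Hangul needs 3 bytes at most
print(fits_in_3_bytes("좋아요 😉"))  # False: the emoji needs 4 bytes
```

Any string for which this returns `False` needs utf8mb4.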

🤷 Collation

A collation is the set of rules that defines how strings stored in a given charset are compared and sorted.
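
As a rough illustration of what such rules change, the Python sketch below sorts the same words two ways: by raw code point, and with a key that ignores case and accents. That is what the `ci` (case-insensitive) and `ai` (accent-insensitive) suffixes in a name like utf8mb4_0900_ai_ci refer to. The `ai_ci_key` function is only an approximation written for this post; MySQL's real collations follow the Unicode Collation Algorithm and are more involved.

```python
import unicodedata

def ai_ci_key(s: str) -> str:
    """Rough accent- and case-insensitive comparison key."""
    decomposed = unicodedata.normalize("NFD", s)
    # Drop combining marks (accents), then fold case.
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()

words = ["banana", "Apple", "cherry", "ápple"]
print(sorted(words))                 # ['Apple', 'banana', 'cherry', 'ápple']
print(sorted(words, key=ai_ci_key))  # ['Apple', 'ápple', 'banana', 'cherry']
print(ai_ci_key("Apple") == ai_ci_key("ápple"))  # True: treated as equal
```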


⚙️ Configuring the MySQL Charset

In MySQL 8.0, the default character set is utf8mb4.

SHOW VARIABLES LIKE 'character_set%';

However, running the command above shows that the client and connection character sets (character_set_client and character_set_connection) are set to euckr, so let's change these two to utf8mb4.
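
If you want to spot the odd ones out at a glance, you can filter the result rows. The Python sketch below assumes the rows have already been fetched as (Variable_name, Value) pairs. The sample rows are hypothetical and only mirror the situation described above, and `non_utf8mb4` is a helper name invented for this post.

```python
def non_utf8mb4(rows):
    """Return the names of variables whose value is not utf8mb4."""
    return [name for name, value in rows if value != "utf8mb4"]

# Hypothetical sample shaped like the output of
# SHOW VARIABLES LIKE 'character_set%';
rows = [
    ("character_set_client", "euckr"),
    ("character_set_connection", "euckr"),
    ("character_set_database", "utf8mb4"),
    ("character_set_server", "utf8mb4"),
]

print(non_utf8mb4(rows))  # ['character_set_client', 'character_set_connection']
```

In a real result a few variables are expected to differ, for example character_set_filesystem, whose default is binary, so treat the list as a hint rather than an error report.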

Setup

Copy the path of the folder that contains my.ini, then open that folder.

Windows search box [Services] → MYSQL80 [right-click] → [Properties]

To be able to save my.ini, change its permissions as follows.

my.ini [right-click] → [Properties] → [Security] → [Edit] → Users [select] → allow the permissions

Open my.ini and add the following at the very bottom.

[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_0900_ai_ci
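
Before restarting, it can be worth double-checking that the charset lines landed in the right sections. The Python sketch below reads the fragment with the standard `configparser` module. This is not MySQL's own option-file parser, so take it as a quick sanity check only; `charset_settings` and `FRAGMENT` are names made up for this post.

```python
import configparser

FRAGMENT = """
[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_0900_ai_ci
"""

def charset_settings(ini_text: str) -> dict:
    """Return the charset configured in each relevant section (None if missing)."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # tolerate options written without a value
        strict=False,          # tolerate repeated sections in the same file
        interpolation=None,    # keep '%' characters literal
    )
    parser.read_string(ini_text)
    wanted = {
        "client": "default-character-set",
        "mysql": "default-character-set",
        "mysqld": "character-set-server",
    }
    return {
        section: parser.get(section, option, fallback=None)
        for section, option in wanted.items()
    }

print(charset_settings(FRAGMENT))
# {'client': 'utf8mb4', 'mysql': 'utf8mb4', 'mysqld': 'utf8mb4'}
```

To check the real file, you could pass the contents of your own my.ini to `charset_settings` instead of `FRAGMENT`.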

After editing, restart the MySQL server from [Services]; you can then confirm in MySQL that the charset has changed to utf8mb4.

[Services] → MYSQL80 [Restart]


