Chapter 7. Basic Types

지환 · November 7, 2021

K.N.K

Arithmetic types (C99)

Integer types
- char
- Signed integer types, both standard(signed char, short int, int, long int, long long int) and extended
- Unsigned integer types, both standard(unsigned char, unsigned short int, unsigned int, unsigned long int, unsigned long long int, _Bool) and extended
- Enumerated types
Floating types
- Real floating types(float, double, long double)
- Complex types(float _Complex, double _Complex, long double _Complex)

7.1 Integer Types

signed integer : the most significant bit is used as the sign bit. (0 means zero or positive, 1 means negative)

signed short int
unsigned short int
signed int
unsigned int
signed long int
unsigned long int
signed long long int
unsigned long long int

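signed can be omitted. (A type not marked unsigned is signed by default.)
int can be omitted.
The specifiers that come before int can appear in any order. (unsigned short int is the same as short unsigned int)

The size of each type depends on the machine. Typical sizes are easy to look up, but they are not mandated.
However, the C standard does require that
(1) each type cover at least a certain minimum range (the ranges are listed in Section 23.2), and
(2) the range of int be at least that of short int, and the range of long int at least that of int.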

Integer Constants

Decimal
: must not begin with zero
Octal
: must begin with zero and contain only the digits 0-7 (a number such as 098 is invalid, so the compiler will issue a diagnostic)

Hexadecimal
: must begin with 0x(0X)

Octal and hexadecimal are just alternative ways of writing numbers.
They are used mostly in low-level programming. (Chapter 20)

The type of a decimal constant is
the first of int, long int, long long int that can represent its value. (C99)

The type of an octal or hexadecimal constant is
the first of int, unsigned int, long int, unsigned long int, long long int, unsigned long long int that can represent its value. (C99)

A suffix changes this list of candidate types.
For example, a decimal constant with the suffix l skips int and considers only long int and long long int.

Suffixes such as u(U), l(L), ul(UL), ll(LL), ... can be used to specify the type of an integer constant.

Integer overflow

When the result of an operation falls outside the range its type can store, this is called overflow.
Overflow of a signed integer is undefined behavior, but for unsigned integers the behavior is defined: the result is the correct value modulo 2^n. (n is the number of bits used to store the result)

Reading and Writing Integers

conversion specification
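(d works only for signed int. What about the other types?)
1. unsigned integer
: u(decimal notation), o(octal notation), x(hexadecimal notation)
-> Does this mean negative numbers cannot be handled? According to the Q&A, a signed value also works with these conversions as long as it is not negative; negative values apparently cannot be read or written correctly this way.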
2. short integer
: put h in front of d, u, o, x
3. long integer
: put l in front of d, u, o, x
4. long long integer
: put ll in front of d, u, o, x

7.2 Floating Types

float
double
long double

The C standard does not specify how floating types are stored, but most modern computers follow IEEE Standard 754 (IEC 60559).
(I plan to cover IEEE 754 in detail in a later post.)
The characteristics of the floating types are given by macros in the <float.h> header.

Floating Constants

A floating constant MUST contain (1) a decimal point and/or (2) an exponent.
By default, floating constants are stored as double-precision numbers.
-> Why is double the default? float has historically been the second choice. Even K&R says that the main reason to use float is to save space and time when double is not needed.

The suffixes f(F) and l(L) specify the type. (When storing a constant in a float variable, it is a good idea to add the f suffix, because an unsuffixed floating constant is treated as double.)

Floating constants can also be written in hexadecimal. They begin with 0x(0X) and use p(P) instead of e(E) for the exponent, which is a power of 2. (C99)

Reading and Writing Floating-Point Numbers

Conversion specification
1. f, e, g : reading/writing float, writing double
2. lf, le, lg : reading/(writing) double
3. Lf, Le, Lg : reading/writing long double

Why can f, e, g be used as-is when writing a double?
: printf and scanf take variable-length argument lists.
When a function with a variable-length argument list is called, the compiler converts float arguments to double.
(My guess: the prototype says nothing about the types of the variable arguments,
so the default argument promotions apply and float becomes double. p.194)
So printf has no way to distinguish float from double (in effect everything arrives as double),
and f, e, g work whether a float or a double is being written.
(Using lf, le, lg to write a double has been allowed only since C99.)
scanf, however, takes pointers as arguments, so the distinction between the types matters.
That is why it uses separate conversion specifications.

7.3 Character types

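The values of type char can differ from one computer to another, because different machines may use different character sets.
Most, however, use the ASCII character set. (7-bit) (American Standard Code for Information Interchange)

Character constants are written in single quotes, as in char a = 'A';.
C treats a char as a small integer whose value is the character's code.
So many kinds of arithmetic are possible, but it is better to hold back for the sake of portability
(another computer may use a different character set).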

Signed and Unsigned Characters

The C standard does not define whether plain char is signed or unsigned.
So when it matters, state it explicitly.

Escape Sequences

character escape
numeric escape

Character escapes are simple, but they cannot represent every nonprinting ASCII character. Numeric escapes cover all the rest.

numeric escapes : written using the character's code as a number
1. octal escape sequence : begins with \ and has at most three octal digits. (\33 == \033)
2. hexadecimal escape sequence : begins with \x and may have any number of hex digits. (the x must be lower case)

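Why is the escape sequence \? needed?
: Because ?? followed by certain characters can be taken as the start of a trigraph. Trigraphs, which begin with two question marks, were introduced to write characters that some older keyboards lacked.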

Reading and Writing Characters

printf and scanf use %c.
Note that a plain "%c" in scanf does not skip leading white space; to skip it, put a space in front of the conversion, as in " %c".
putchar(ch);
ch = getchar();

while (getchar() != '\n') ;
: skips rest of line
while ((ch = getchar()) == ' ') ;
: skips blanks
: when the loop ends, ch holds the first nonblank character.

printf("Enter an integer: ");
scanf("%d", &i);
printf("Enter a command: ");
command = getchar();

Here command will most likely end up holding the new-line character that scanf left behind. ***Be careful.***

7.4 Type Conversion

In an arithmetic operation, the operands must (1) have the same size and (2) be stored in the same way; in other words, they must have the same type. So when the two types differ, a type conversion takes place.

implicit conversion
explicit conversion : using cast operator

When do implicit conversions happen?
1. when the operands in an arithmetic or logical expression have different types
2. when the types on the two sides of an assignment differ
3. when the type of an argument in a function call does not match the corresponding parameter
4. when the type of the expression in a return statement does not match the function's return type
(3 and 4 are covered in Chapter 9)

In case 1, the usual arithmetic conversions apply.

"integer conversion rank", highest to lowest (C99) (For simplicity, extended integer types and enumerated types are left out.)
1. long long int, unsigned long long int
2. long int, unsigned long int
3. int, unsigned int
4. short int, unsigned short int
5. char, signed char, unsigned char
6. _Bool

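If either operand has a floating type,
: the operand with the narrower type is converted to the type of the other, in the order float -> double -> long double. (ex. if a is double and b is int, b is converted to double in a+b;)
: if a complex type is involved, see 27.3
If neither operand has a floating type,
: first perform integer promotion on both operands (<- a type whose rank is lower than that of int and unsigned int is converted to int or unsigned int)
: if the types still differ after integer promotion, apply the first of the following rules that fits

If both operands are signed or both are unsigned,
: convert the operand of lower rank to the type of the operand of higher rank
If the unsigned operand has rank greater than or equal to that of the signed operand,
: convert the signed operand to the type of the unsigned operand
If the type of the signed operand can represent all values of the type of the unsigned operand,
: convert the unsigned operand to the type of the signed operand
Otherwise,
: convert both operands to the unsigned type corresponding to the type of the signed operand
(ex. if the signed operand was long int, both become unsigned long int)

Any arithmetic type can be converted to _Bool: the result is 0 if the value is 0, and 1 otherwise.

It is best not to mix unsigned and signed operands.
In a comparison such as s>u, s is converted to unsigned,
so a negative s can give the wrong result.
On a typical two's-complement machine, converting between signed and unsigned types of the same width seems to change only how the bits are interpreted.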

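In case 2,

the rule is simple: the right operand is converted to the type of the left operand.

This is harmless when the type on the left is at least as wide,
but a narrowing assignment is a problem.

Assigning a floating-point number to an integer variable drops the fractional part.
Other narrowing assignments can trigger a warning, and the result is meaningless (or even undefined behavior) when the value is out of range for the target type, so avoid them where possible.
That is why it is a good idea to write f = 3.14f; with the f suffix.
Without it, a double is being assigned to a float.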

Casting(explicit conversion)

( type-name ) expression

A point to watch out for

long i;
int j = 100000;

i = j * j;

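Written like this, the multiplication overflows.
That is because 100000*100000 (10,000,000,000) is outside the range of a 32-bit int.
i is long, but j*j on the right-hand side has type int, so the overflow happens before the result is assigned to i.
So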

long i;
int j = 100000;

i = (long)j * j;

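is the best way to write it: the cast converts j to long before the multiplication (assuming long is wide enough to hold the product, e.g. 64 bits).

On the other hand,

i = (long)(j * j);

still overflows, because j * j is evaluated as int before the cast is applied.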

7.5 Type Definitions

typedef int Bool;

Capitalizing the first letter of a type name is just a convention.

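Advantages
1. Array and pointer types cannot be defined properly with a macro. (A pointer macro does expand, but it goes wrong when two or more variables are declared at once.)
For anything beyond the simplest types, typedef is far better than a macro. (A macro may not even work in the first place.)
2. A typedef name follows the same scope rules as a variable.
3. It makes programs easier to modify.
(That said, changing the typedef alone does not fix everything:
even if the type is changed by editing only the typedef,
things like conversion specifications stay as they were.)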

Difference from macros

7.6 The sizeof Operator

sizeof ( type-name )

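Parentheses are required when the operand is a type name,
but not strictly when it is an expression (ex. i).
It is still better to use them: because of precedence, sizeof i + j; is parsed as (sizeof i) + j;.
The result has type size_t.
Its value is the size in bytes.

With printf, the "%zu" conversion specification is normally used. (C99)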

Q&A

carriage-return and line-feed ??
A carriage return moves the cursor to the beginning of the current line.
A line feed moves the cursor to the next line.

printf("test\r");
printf("1004");

Output:
1004

printf("test\n");
printf("1004");

Output:
test
1004

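In C, \n is treated as a single line-feed character. When the user presses Enter, the library converts it to a line feed, and the reverse conversion happens on output.
The same is true when reading from or writing to a file. This was inherited from UNIX.
It may look confusing, but it insulates programs from details that can differ between operating systems.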

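Why is _Bool named like that? Why not just bool or boolean?
(from Chapter 5)
Existing programs already used bool or boolean, so defining them as keywords would have broken those programs.
Names that begin with an underscore followed by an upper-case letter, however, were reserved for future use, so there is no conflict.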
