Week 3 Unit 9.3: DataInputStream / DataOutputStream

Psj · May 20, 2026


F-LAB JAVA · Week 3 · Phase 9 · Strengthening I/O

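📌 Learning objectives

By the end of this Unit you should be able to answer the following.

What are DataInputStream / DataOutputStream, and when should you use them?
How do the primitive-type methods (readInt, readLong, readDouble, readUTF, and so on) behave?
How does a binary format differ from a text format?
What is endianness, and which byte order does Java choose (Big-Endian)?
What is special about the format used by readUTF / writeUTF (Modified UTF-8)?
How does readFully differ from read?
What role do the DataInput / DataOutput interfaces play?
How many bytes does each type occupy (int 4, long 8, double 8, and so on)?
How does this relate to serialization? (the foundation for the next Unit)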

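🎯 Key summary

DataInputStream and DataOutputStream are Decorator streams that read and write Java primitive types (boolean, byte, char, short, int, long, float, double) and String in a binary format.
writeInt(42) stores not the text "42" (2 bytes) but 4 binary bytes (0x00 0x00 0x00 0x2A, Big-Endian): no parsing is needed, the size is fixed, and the value round-trips exactly.
readUTF / writeUTF store a string in Modified UTF-8 (a 2-byte length prefix followed by the encoded data), so the reader knows exactly where the string ends.
readFully reads until the requested number of bytes has been filled and throws EOFException if the stream ends first, which makes it essential for fixed-size data such as headers.
The object serialization covered in the next Unit (ObjectInputStream) implements DataInput and builds on the same primitive-encoding mechanism, so this Unit is its foundation.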

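Analogy: sending mail

Text format (PrintStream):
  Write "42" (two characters) in the letter
  - Easy for people to read
  - Needs parsing (Integer.parseInt)
  - Variable size ("42" is 2 bytes, "12345" is 5 bytes)

Binary format (DataOutputStream):
  Put the integer 42 itself (4 bytes) in the letter
  - Fast for machines to process
  - No parsing
  - Fixed size (every int is 4 bytes)

writeUTF / readUTF:
  "안녕" → [length (2 bytes)][6 bytes of UTF-8]
  - Carries its own length
  - Stores a String safely

→ DataInput/DataOutput = Java types ↔ binary.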
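
To make the analogy concrete, the following minimal sketch captures the bytes each approach produces in memory and prints them as hex. The class name TextVsBinaryDemo and its helper methods are made up for illustration; only the java.io and java.nio calls are real JDK API.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; only the java.io / java.nio calls are real JDK API.
class TextVsBinaryDemo {

    // Binary form: writeInt always emits 4 bytes, most significant byte first.
    static byte[] binaryInt(int value) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeInt(value);
        }
        return baos.toByteArray();
    }

    // Text form: one ASCII byte per decimal digit, so the size varies.
    static byte[] textInt(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.US_ASCII);
    }

    // writeUTF form: 2-byte length prefix followed by the encoded characters.
    static byte[] utf(String s) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeUTF(s);
        }
        return baos.toByteArray();
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(hex(binaryInt(42)));        // 00 00 00 2A
        System.out.println(hex(textInt(42)));          // 34 32
        // "\uC548\uB155" is the string "안녕" written with Unicode escapes
        System.out.println(hex(utf("\uC548\uB155")));  // 00 06 EC 95 88 EB 85 95
    }
}
```

Writing into a ByteArrayOutputStream instead of a file keeps the experiment free of side effects, which is also a convenient way to unit-test a binary layout.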

🧭 Roadmap: 9 sections

1. Definition of DataInputStream / DataOutputStream
2. Primitive-type methods in full
3. Binary vs text formats
4. Endianness (Big-Endian)
5. Modified UTF-8 in readUTF / writeUTF
6. readFully: exact reads
7. The DataInput / DataOutput interfaces
8. Practical use and pitfalls
9. Interview questions + self-check

1️⃣ Definition of DataInputStream / DataOutputStream

1.1 Definition of DataInputStream

package java.io;

public class DataInputStream extends FilterInputStream implements DataInput {
    
    public DataInputStream(InputStream in);
    
    // Implements the DataInput interface
    public boolean readBoolean() throws IOException;
    public byte readByte() throws IOException;
    public char readChar() throws IOException;
    public short readShort() throws IOException;
    public int readInt() throws IOException;
    public long readLong() throws IOException;
    public float readFloat() throws IOException;
    public double readDouble() throws IOException;
    public String readUTF() throws IOException;
    
    public void readFully(byte[] b) throws IOException;
    public void readFully(byte[] b, int off, int len) throws IOException;
    public int skipBytes(int n) throws IOException;
    
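    public String readLine() throws IOException;   // deprecated
    
    // Plain byte reads remain available
    public int read(byte[] b, int off, int len) throws IOException;
    public int read() throws IOException;   // inherited from FilterInputStream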
}

Key points:

Child of FilterInputStream (a Decorator)
Implements the DataInput interface
Provides methods for primitive types
Available since Java 1.0

1.2 Definition of DataOutputStream

package java.io;

public class DataOutputStream extends FilterOutputStream implements DataOutput {
    
    protected int written;   // number of bytes written so far
    
    public DataOutputStream(OutputStream out);
    
    // Implements the DataOutput interface
    public void writeBoolean(boolean v) throws IOException;
    public void writeByte(int v) throws IOException;
    public void writeChar(int v) throws IOException;
    public void writeShort(int v) throws IOException;
    public void writeInt(int v) throws IOException;
    public void writeLong(long v) throws IOException;
    public void writeFloat(float v) throws IOException;
    public void writeDouble(double v) throws IOException;
    public void writeUTF(String str) throws IOException;
    
    public void writeBytes(String s) throws IOException;
    public void writeChars(String s) throws IOException;
    
    public final int size() { return written; }
}

Key points:

Child of FilterOutputStream
Implements the DataOutput interface
Every write increments written
Available since Java 1.0

1.3 Position in the class hierarchy

InputStream
  ├── FileInputStream
  ├── FilterInputStream
  │   ├── BufferedInputStream
  │   ├── DataInputStream         ← here
  │   └── PushbackInputStream
  └── ...

OutputStream
  ├── FileOutputStream
  ├── FilterOutputStream
  │   ├── BufferedOutputStream
  │   ├── DataOutputStream         ← here
  │   └── PrintStream
  └── ...

1.4 Basic usage

// Write
try (DataOutputStream dos = new DataOutputStream(
        new FileOutputStream("data.bin"))) {
    
    dos.writeInt(42);
    dos.writeLong(1234567890L);
    dos.writeDouble(3.14);
    dos.writeBoolean(true);
    dos.writeUTF("Hello");
}

// Read (in the same order!)
try (DataInputStream dis = new DataInputStream(
        new FileInputStream("data.bin"))) {
    
    int i = dis.readInt();          // 42
    long l = dis.readLong();         // 1234567890
    double d = dis.readDouble();     // 3.14
    boolean b = dis.readBoolean();   // true
    String s = dis.readUTF();        // "Hello"
}

1.5 Using the Decorator

// Combine with a buffered stream (efficiency)
try (DataOutputStream dos = new DataOutputStream(
        new BufferedOutputStream(
            new FileOutputStream("data.bin")))) {
    
    // BufferedOutputStream reduces the number of system calls
    // DataOutputStream converts typed values to bytes
    dos.writeInt(42);
}

// Reading works the same way
try (DataInputStream dis = new DataInputStream(
        new BufferedInputStream(
            new FileInputStream("data.bin")))) {
    
    int i = dis.readInt();
}

1.6 Use in ILIC

public class ShipmentBinaryStorage {
    
    // 1. Save as binary
    public void save(Path path, Shipment s) throws IOException {
        try (DataOutputStream dos = new DataOutputStream(
                new BufferedOutputStream(
                    new FileOutputStream(path.toFile())))) {
            
            dos.writeLong(s.getId());
            dos.writeUTF(s.getBlNo());
            dos.writeDouble(s.getWeight().doubleValue());
            dos.writeLong(s.getCreatedAt().toEpochMilli());
            dos.writeBoolean(s.isActive());
        }
    }
    
    // 2. Load from binary
    public Shipment load(Path path) throws IOException {
        try (DataInputStream dis = new DataInputStream(
                new BufferedInputStream(
                    new FileInputStream(path.toFile())))) {
            
            return Shipment.builder()
                .id(dis.readLong())
                .blNo(dis.readUTF())
                .weight(BigDecimal.valueOf(dis.readDouble()))
                .createdAt(Instant.ofEpochMilli(dis.readLong()))
                .active(dis.readBoolean())
                .build();
        }
    }
}

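1.7 Self-check answer

What are DataInputStream / DataOutputStream?

Answer:
1. Definition:
- Children of FilterInputStream / FilterOutputStream
- Implement the DataInput / DataOutput interfaces
- Decorators
2. Purpose:
- Java primitive types ↔ binary
- readInt, writeLong, and so on
- Modified UTF-8 for strings
3. Usage:
- Combining with a buffered stream is recommended
- Read in the same order as written
4. Location:
- java.io
- Java 1.0+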

2️⃣ Primitive-type methods in full

2.1 Size of every type

Type	Size (bytes)	read method	write method
boolean	1	readBoolean	writeBoolean
byte	1	readByte	writeByte
char	2	readChar	writeChar
short	2	readShort	writeShort
int	4	readInt	writeInt
long	8	readLong	writeLong
float	4	readFloat	writeFloat
double	8	readDouble	writeDouble
String	variable	readUTF	writeUTF
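
The sizes in the table can be checked empirically. The sketch below is one way to do it, assuming a small hypothetical helper (DataSizes.bytesWritten) that runs a single write call and returns the counter reported by size().

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper: measures how many bytes a single write call produces.
class DataSizes {

    // Functional interface so each write call can be passed as a lambda.
    interface WriteAction {
        void apply(DataOutputStream dos) throws IOException;
    }

    static int bytesWritten(WriteAction action) throws IOException {
        DataOutputStream dos = new DataOutputStream(new ByteArrayOutputStream());
        action.apply(dos);
        return dos.size();   // counter maintained by DataOutputStream
    }

    public static void main(String[] args) throws IOException {
        System.out.println(bytesWritten(d -> d.writeBoolean(true)));  // 1
        System.out.println(bytesWritten(d -> d.writeChar('A')));      // 2
        System.out.println(bytesWritten(d -> d.writeInt(42)));        // 4
        System.out.println(bytesWritten(d -> d.writeDouble(3.14)));   // 8
        System.out.println(bytesWritten(d -> d.writeUTF("Hello")));   // 7 (2 + 5)
    }
}
```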

2.2 boolean (1 byte)

// write
dos.writeBoolean(true);   // 0x01
dos.writeBoolean(false);  // 0x00

// read
boolean b = dis.readBoolean();
// 0x01 → true
// 0x00 → false
// anything else → true (any nonzero byte)

2.3 Integer types

// byte (1 byte, signed -128 to 127)
dos.writeByte(65);   // 0x41
byte b = dis.readByte();   // 65

// writeByte takes an int and writes only its low 8 bits.
// To read the byte as 0 to 255, use readUnsignedByte(), which returns an int.

// short (2 bytes, -32768 to 32767)
dos.writeShort(1000);   // 0x03 0xE8
short s = dis.readShort();   // 1000

// int (4 bytes, -2,147,483,648 to 2,147,483,647)
dos.writeInt(42);   // 0x00 0x00 0x00 0x2A
int i = dis.readInt();   // 42

// long (8 bytes, about ±9.2 × 10^18)
dos.writeLong(1234567890L);
long l = dis.readLong();

2.4 Floating point

// float (4 bytes, IEEE 754 single precision)
dos.writeFloat(3.14f);
float f = dis.readFloat();

// double (8 bytes, IEEE 754 double precision)
dos.writeDouble(3.14);
double d = dis.readDouble();

// Precision:
// float: about 7 significant decimal digits
// double: about 15 to 16 significant decimal digits

2.5 Characters

// char (2 bytes, one UTF-16 code unit)
dos.writeChar('A');     // 0x00 0x41
dos.writeChar('안');     // 0xC5 0x48
char c = dis.readChar();

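2.6 String: three approaches

// Approach 1: writeUTF (recommended)
dos.writeUTF("안녕");
// [length: 2 bytes][data: 6 bytes] = 8 bytes in total
String s = dis.readUTF();   // "안녕"

// Approach 2: writeChars (UTF-16, inefficient)
dos.writeChars("안녕");
// 2 bytes per char = 4 bytes
// No length information is written
// There is no matching readChars, so avoid it

// Approach 3: writeBytes (low byte only, loses information)
dos.writeBytes("Hello");
// Writes only the low 8 bits of each char
// Korean text is corrupted
// Avoid it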
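
As a rough illustration of why only writeUTF is safe for non-ASCII text, the sketch below captures the output of all three methods for the same two-character Korean string. StringWriteModes and its method names are hypothetical; the string is written with Unicode escapes so the result does not depend on the source file encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class comparing the three String-writing methods.
class StringWriteModes {

    interface WriteAction {
        void apply(DataOutputStream dos) throws IOException;
    }

    // Runs one write call and returns the bytes it produced.
    private static byte[] capture(WriteAction action) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            action.apply(dos);
        }
        return baos.toByteArray();
    }

    // 2-byte length prefix + Modified UTF-8 data; readable with readUTF.
    static byte[] viaWriteUTF(String s) throws IOException {
        return capture(d -> d.writeUTF(s));
    }

    // High byte then low byte of every char; no length prefix.
    static byte[] viaWriteChars(String s) throws IOException {
        return capture(d -> d.writeChars(s));
    }

    // Low 8 bits of every char only; the high byte is discarded.
    static byte[] viaWriteBytes(String s) throws IOException {
        return capture(d -> d.writeBytes(s));
    }

    public static void main(String[] args) throws IOException {
        String annyeong = "\uC548\uB155";   // "안녕"
        System.out.println(viaWriteUTF(annyeong).length);    // 8
        System.out.println(viaWriteChars(annyeong).length);  // 4
        System.out.println(viaWriteBytes(annyeong).length);  // 2
    }
}
```

With writeBytes the two characters U+C548 and U+B155 collapse to 0x48 and 0x55, which decode as the unrelated ASCII letters "HU"; the original text cannot be recovered.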

2.7 Calculating sizes

// Predicting the exact byte count
public class Shipment {
    long id;          // 8
    String blNo;      // variable (writeUTF: 2 + encoded length)
    double weight;    // 8
    boolean active;   // 1
}

// Size when saved:
// 8 + (2 + encoded length of blNo) + 8 + 1 = 19 + encoded length of blNo

// Or check with the size() method
DataOutputStream dos = new DataOutputStream(...);
dos.writeLong(id);
dos.writeUTF(blNo);
int written = dos.size();   // bytes written so far

2.8 Use in ILIC

public class ShipmentBinaryFormat {
    
    // 1. Uses every type
    public void save(Path path, Shipment s) throws IOException {
        try (DataOutputStream dos = new DataOutputStream(
                new BufferedOutputStream(
                    new FileOutputStream(path.toFile())))) {
            writeFields(dos, s);
        }
    }
    
    // Field layout shared by save and estimateSize
    private void writeFields(DataOutputStream dos, Shipment s) throws IOException {
        // Basic information
        dos.writeLong(s.getId());
        dos.writeUTF(s.getBlNo());
        dos.writeUTF(s.getConsignee());
        
        // Numbers
        dos.writeDouble(s.getWeight().doubleValue());
        dos.writeInt(s.getQuantity());
        
        // Time
        dos.writeLong(s.getCreatedAt().toEpochMilli());
        
        // Status
        dos.writeBoolean(s.isActive());
        dos.writeByte(s.getStatus().ordinal());
        
        // Location
        dos.writeFloat(s.getLatitude());
        dos.writeFloat(s.getLongitude());
    }
    
    // 2. Exact size calculation: write to memory and read the counter
    public int estimateSize(Shipment s) throws IOException {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
             DataOutputStream dos = new DataOutputStream(baos)) {
            
            writeFields(dos, s);
            return dos.size();
        }
    }
}

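2.9 Self-check answer

What are the sizes of the primitive types, and which methods handle them?

Answer:
1. Fixed sizes:
- boolean, byte: 1
- char, short: 2
- int, float: 4
- long, double: 8
2. Methods:
- readXxx / writeXxx
- String uses readUTF / writeUTF
3. Ways to write a String:
- writeUTF: length + Modified UTF-8 (recommended)
- writeChars: UTF-16 (no length, not recommended)
- writeBytes: low byte only (lossy, not recommended)
4. size() method:
- On DataOutputStream
- Number of bytes written so far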

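3️⃣ Binary vs text formats

3.1 How the two formats differ

Text format:
  Integer 42 → "42" (2 bytes of characters)
  - Readable by people
  - Needs parsing (Integer.parseInt)
  - Variable size
  - Larger numbers take more bytes

Binary format:
  Integer 42 → 0x00 0x00 0x00 0x2A (4 bytes)
  - Fast for machines
  - No parsing
  - Fixed size
  - Every int is 4 bytes

3.2 Size comparison

Integer 42:
  Text: 2 bytes ("42")
  Binary: 4 bytes (int)
  Text wins

Integer 1234567890:
  Text: 10 bytes ("1234567890")
  Binary: 4 bytes
  Binary wins

double 3.14159265:
  Text: 10 bytes ("3.14159265")
  Binary: 8 bytes
  Binary wins

Longer text "Hello, World!":
  Text: 13 bytes
  Binary (writeUTF): 15 bytes (2-byte length + 13)
  Text wins slightly

Conclusion:
  - Small numbers: text is smaller
  - Large numbers, double: binary is smaller
  - Predictable size: binary is preferable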

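3.3 Processing-speed comparison

// Text format
try (BufferedWriter bw = Files.newBufferedWriter(path, UTF_8)) {
    for (int i = 0; i < 1_000_000; i++) {
        bw.write(String.valueOf(i));
        bw.newLine();
    }
}
// For every i:
// 1. Integer.toString (creates a String, adds GC pressure)
// 2. write (applies the character encoding)
// More work per value

// Binary format
try (DataOutputStream dos = new DataOutputStream(
        new BufferedOutputStream(
            new FileOutputStream(path.toFile())))) {
    for (int i = 0; i < 1_000_000; i++) {
        dos.writeInt(i);
    }
}
// For every i:
// 4 bytes are written directly
// Less work per value

// Result (rough expectation, not a measurement):
// Binary is usually several times faster (figures up to about 10x are often quoted)
// The actual gap depends on the JVM, the disk, and the buffering
// Measure in your own environment before relying on a number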

3.4 Readability

Text file (can be opened directly):
  42
  1234567890
  3.14
  안녕

Binary file (cannot be read directly):
  hex dump:
  00 00 00 2A 00 00 00 00 49 96 02 D2 40 09 1E B8 51 EB 85 1F 00 06 EC 95 88 EB 85 95

  - Hard for people to read
  - Needs a hex viewer
  - Harder to debug

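3.5 Compatibility

Text format:
  - Works with every tool (vi, cat, less)
  - Usable anywhere
  - Watch the character encoding

Binary format:
  - Needs dedicated code
  - The reader must know the exact layout
  - The meaning of every byte has to be defined
  - Versioning is harder

3.6 Selection guide

Prefer text when:
  ✓ People read it
  ✓ Many different tools consume it
  ✓ Debuggability comes first
  ✓ The data is small
  ✓ Logs, configuration

Prefer binary when:
  ✓ Performance comes first
  ✓ The data is large
  ✓ Sizes must be exact (network protocols)
  ✓ Floating-point values must round-trip exactly
  ✓ Internal data that people never look at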

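3.7 A middle ground: JSON

JSON:
  - Text based (readable)
  - Structured (objects, arrays)
  - Standard (supported by every major language)
  - Many parsing libraries

Advantages:
  - Text plus structure
  - Readability plus easy processing

Disadvantages:
  - Larger than binary
  - Parsing cost

Comparison:
  Text: "42"
  JSON: {"value": 42}
  Binary: 0x0000002A

  JSON is the usual recommendation
  Binary is for special cases only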

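3.8 Self-check answer

How do binary and text formats differ?

Answer:
1. Size:
- Small numbers: text is smaller
- Large numbers / double: binary is smaller
2. Speed:
- Binary is usually faster (no formatting or parsing); measure to get an actual ratio
3. Readability:
- Text: readable by people
- Binary: needs tools
4. Choice:
- People read it: text
- Performance first: binary
- General purpose: JSON (middle ground)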

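4️⃣ Endianness (Big-Endian)

4.1 What endianness is

Endianness:

  The order in which the bytes of a multi-byte value are stored.

Big-Endian (BE):
  - Most significant byte first (the order people read)
  - 0x12345678 → 12 34 56 78
  - The network standard (network byte order)

Little-Endian (LE):
  - Least significant byte first
  - 0x12345678 → 78 56 34 12
  - Native order of x86 CPUs

4.2 Java's choice

Java's DataOutputStream:
  → always Big-Endian
  
Reasons:
  - Java is platform independent
  - The same bytes are produced on every JVM
  - Matches the network standard (BE)

Consequences:
  - May differ from what C/C++ writes natively (x86 is LE)
  - Java ↔ C communication needs a conversion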

4.3 Visualization

// Write int 0x12345678
dos.writeInt(0x12345678);

// Bytes stored in the file:
// 12 34 56 78   (Big-Endian)

// hex dump:
// $ xxd file.bin
// 00000000: 1234 5678   .4Vx

4.4 ByteOrder in NIO

// NIO's ByteBuffer lets you change the byte order
ByteBuffer buf = ByteBuffer.allocate(4);
buf.order(ByteOrder.BIG_ENDIAN);     // the default
buf.putInt(0x12345678);
// [0x12, 0x34, 0x56, 0x78]

buf.clear();
buf.order(ByteOrder.LITTLE_ENDIAN);
buf.putInt(0x12345678);
// [0x78, 0x56, 0x34, 0x12]

// Native order of the platform
ByteOrder nativeOrder = ByteOrder.nativeOrder();
// x86: LITTLE_ENDIAN
// PowerPC (traditionally): BIG_ENDIAN

// DataOutputStream is always BE
// Use ByteBuffer when LE is required
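
The two byte orders can be compared side by side. The sketch below is a minimal example, assuming a hypothetical EndianDemo class: it writes the same int through DataOutputStream and through a little-endian ByteBuffer, then reads a little-endian value back by reversing the bytes of a normal readInt.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; all stream and buffer calls are standard JDK API.
class EndianDemo {

    // DataOutputStream always writes the most significant byte first.
    static byte[] bigEndian(int value) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeInt(value);
        }
        return baos.toByteArray();
    }

    // ByteBuffer can be switched to little-endian explicitly.
    static byte[] littleEndian(int value) {
        return ByteBuffer.allocate(4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(value)
                .array();
    }

    // Reads 4 bytes as big-endian, then swaps them to get the LE value.
    // readInt throws EOFException if fewer than 4 bytes remain.
    static int readLittleEndianInt(DataInputStream dis) throws IOException {
        return Integer.reverseBytes(dis.readInt());
    }

    public static void main(String[] args) throws IOException {
        for (byte b : bigEndian(0x12345678)) System.out.printf("%02X ", b);
        System.out.println();   // 12 34 56 78
        for (byte b : littleEndian(0x12345678)) System.out.printf("%02X ", b);
        System.out.println();   // 78 56 34 12
    }
}
```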

4.5 C ↔ Java interoperability

// C code (x86, Little-Endian):
// int value = 0x12345678;
// fwrite(&value, sizeof(int), 1, file);
// File: 78 56 34 12

// Java code (Big-Endian):
// dos.writeInt(0x12345678);
// File: 12 34 56 78

// The two layouts differ!
// To interoperate:
public int readLittleEndianInt(DataInputStream dis) throws IOException {
    // readInt assembles 4 bytes as Big-Endian and throws EOFException
    // if the stream ends early; reversing the bytes yields the LE value
    return Integer.reverseBytes(dis.readInt());
}

// Or with ByteBuffer (bytes must hold at least 4 bytes)
public int readLittleEndianInt(byte[] bytes) {
    return ByteBuffer.wrap(bytes)
        .order(ByteOrder.LITTLE_ENDIAN)
        .getInt();
}

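4.6 Network protocols

Most binary network protocols: Big-Endian
  - IP and TCP headers
  - DNS
  - Most binary standards
  (Text protocols such as HTTP/1.1 have no byte order to worry about.)

Reasons:
  - Standardized in the 1980s (many machines were BE at the time)
  - "network byte order" = BE

Java's advantage:
  - DataOutputStream is BE
  - Fits network protocols naturally
  - No conversions such as htonl / ntohl needed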

4.7 Use in ILIC

public class NetworkProtocol {
    
    // Java ↔ Java (BE)
    // The streams are not closed here: closing them would also close the socket
    public void sendMessage(Socket socket, int messageId, String content) 
            throws IOException {
        DataOutputStream dos = new DataOutputStream(socket.getOutputStream());
        dos.writeInt(messageId);     // BE
        dos.writeUTF(content);
        dos.flush();
    }
    
    public Message receiveMessage(Socket socket) throws IOException {
        DataInputStream dis = new DataInputStream(socket.getInputStream());
        int messageId = dis.readInt();   // BE
        String content = dis.readUTF();
        return new Message(messageId, content);
    }
    
    // Java ↔ C/C++ (LE)
    public int readCInt(InputStream is) throws IOException {
        byte[] buf = new byte[4];
        // read(buf) may return fewer than 4 bytes; readFully keeps reading
        // and throws EOFException if the stream ends first
        new DataInputStream(is).readFully(buf);
        
        // If the C side writes LE
        return ByteBuffer.wrap(buf)
            .order(ByteOrder.LITTLE_ENDIAN)
            .getInt();
    }
}

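4.8 Self-check answer

What is endianness, and what did Java choose?

Answer:
1. Endianness:
- Big-Endian: most significant byte first
- Little-Endian: least significant byte first
2. Java's choice:
- DataOutputStream: always BE
- Network standard
- Platform independent
3. C/C++ interoperability:
- x86 is LE
- Differs from Java
- Convert with ByteBuffer
4. NIO ByteBuffer:
- Byte order changeable via order(ByteOrder)
- Supports both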

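5️⃣ Modified UTF-8 in readUTF / writeUTF

5.1 What Modified UTF-8 is

Modified UTF-8:
  
  Java's own string serialization encoding.
  It differs slightly from standard UTF-8.

Format written by writeUTF:
  [length (2 bytes, unsigned short)]
  [data encoded as Modified UTF-8]

Characteristics:
  - Length prefix included (first 2 bytes)
  - At most 65535 bytes of encoded data
  - Special handling of the null character (\0)
  - Different handling of supplementary characters

5.2 Differences from standard UTF-8

Difference 1: the null character
  Standard UTF-8: 0x00 (1 byte)
  Modified UTF-8: 0xC0 0x80 (2 bytes)
  
  Reason: the encoded data never contains a 0x00 byte, which avoids clashes with C-style null-terminated strings
  
Difference 2: supplementary characters (outside the BMP)
  Standard UTF-8: 4 bytes
  Modified UTF-8: each char of the surrogate pair is encoded in 3 bytes
  → 6 bytes in total
  
  Emoji 😀 (U+1F600):
    UTF-8: 0xF0 0x9F 0x98 0x80 (4 bytes)
    Modified UTF-8: 0xED 0xA0 0xBD + 0xED 0xB8 0x80 (6 bytes)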
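
Both differences can be observed directly. The sketch below, built around a hypothetical ModifiedUtf8Demo class, compares the encoded length produced by String.getBytes(UTF_8) with the payload length produced by writeUTF (total size minus the 2-byte prefix).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; writeUTF and getBytes are standard JDK API.
class ModifiedUtf8Demo {

    // Length in bytes under standard UTF-8.
    static int standardLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Full writeUTF output: 2-byte length prefix + Modified UTF-8 data.
    static byte[] encode(String s) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeUTF(s);
        }
        return baos.toByteArray();
    }

    // Length in bytes under Modified UTF-8 (prefix excluded).
    static int modifiedLength(String s) throws IOException {
        return encode(s).length - 2;
    }

    public static void main(String[] args) throws IOException {
        String nul = "\u0000";            // null character
        String emoji = "\uD83D\uDE00";    // U+1F600 as a surrogate pair
        String hangul = "\uC548\uB155";   // "안녕", inside the BMP

        System.out.println(standardLength(nul) + " vs " + modifiedLength(nul));       // 1 vs 2
        System.out.println(standardLength(emoji) + " vs " + modifiedLength(emoji));   // 4 vs 6
        System.out.println(standardLength(hangul) + " vs " + modifiedLength(hangul)); // 6 vs 6
    }
}
```

The last line shows that ordinary BMP text, including Hangul, is encoded identically in both schemes; only U+0000 and supplementary characters differ.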

5.3 How writeUTF works

public void writeUTF(String str) throws IOException;

// Steps:
// 1. Compute the Modified UTF-8 length of str
// 2. If the length exceeds 65535, throw UTFDataFormatException
// 3. Write the 2-byte length (Big-Endian)
// 4. Write the Modified UTF-8 data

// Example: writeUTF("안녕")
// "안" = 0xEC 0x95 0x88 (3 bytes, Modified UTF-8)
// "녕" = 0xEB 0x85 0x95 (3 bytes)
// Length = 6
// Result: 0x00 0x06 0xEC 0x95 0x88 0xEB 0x85 0x95
//         [length][data]
// 8 bytes in total

5.4 How readUTF works

public String readUTF() throws IOException;

// Steps:
// 1. Read the 2-byte length (BE)
// 2. Read that many bytes of data
// 3. Decode them as Modified UTF-8
// 4. Return the String

// Example: reading the bytes from the previous example
// readUTF() → "안녕"

// Or use the static method
String s = DataInputStream.readUTF(dis);

5.5 Length limit

// 65535-byte limit
String huge = "x".repeat(70000);

try {
    dos.writeUTF(huge);
} catch (UTFDataFormatException e) {
    // Encoded length exceeds the limit
}

// For Korean strings:
// 3 bytes per character (Modified UTF-8)
// At most 21,845 Hangul characters (65535 / 3)

// Alternatives for large strings
// 1. Split into chunks
// 2. Write the length as an int (4 bytes), then the byte[]
public void writeLongString(DataOutput dos, String s) throws IOException {
    byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
    dos.writeInt(bytes.length);
    dos.write(bytes);
}

public String readLongString(DataInput dis) throws IOException {
    int len = dis.readInt();
    if (len < 0) throw new IOException("Negative length: " + len);
    byte[] bytes = new byte[len];
    dis.readFully(bytes);
    return new String(bytes, StandardCharsets.UTF_8);
}

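5.6 When to use which

writeUTF / readUTF:
  - Short strings (encoded size ≤ 65535 bytes)
  - Java ↔ Java communication
  - Serialization

Prefer standard UTF-8 for:
  - Long strings
  - Communication with other languages
  - Standards compliance

5.7 Java ↔ other languages

// Java (Modified UTF-8)
dos.writeUTF("Hello");
// File: 00 05 48 65 6C 6C 6F

// Reading from C (assuming standard UTF-8):
// Skip the first 2 bytes (the length), then decode the next 5 bytes = "Hello"
// ASCII text is byte-for-byte identical in both encodings

// Korean text:
// Hangul lies in the BMP and takes 3 bytes per character in both encodings,
// so it is identical as well
//
// What does differ:
// - the null character U+0000 (C0 80 instead of 00)
// - supplementary characters such as emoji (6 bytes instead of 4)
// A strict standard UTF-8 decoder may reject or garble those

// Safe interoperability:
public void writeStandardUtf8(DataOutput dos, String s) throws IOException {
    byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
    dos.writeInt(bytes.length);
    dos.write(bytes);
}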

5.8 Use in ILIC

public class ShipmentDataSerializer {
    
    // Normal case (Java ↔ Java)
    public void writeNormal(DataOutputStream dos, Shipment s) throws IOException {
        dos.writeLong(s.getId());
        dos.writeUTF(s.getBlNo());        // short
        dos.writeUTF(s.getConsignee());   // short
    }
    
    // Long text: standard UTF-8
    public void writeLongText(DataOutputStream dos, String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        dos.writeInt(bytes.length);
        dos.write(bytes);
    }
    
    public String readLongText(DataInputStream dis) throws IOException {
        int len = dis.readInt();
        byte[] bytes = new byte[len];
        dis.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
    
    // Option: writeUTF or long text, chosen per string
    public void writeText(DataOutputStream dos, String s) throws IOException {
        // A char needs at most 3 bytes in Modified UTF-8, so this bound is safe.
        // The standard UTF-8 length is not: null and supplementary characters
        // take more bytes in Modified UTF-8.
        if (s.length() <= 65535 / 3) {
            dos.writeBoolean(false);   // short UTF
            dos.writeUTF(s);
        } else {
            dos.writeBoolean(true);    // long text
            writeLongText(dos, s);
        }
    }
    
    public String readText(DataInputStream dis) throws IOException {
        boolean longText = dis.readBoolean();
        return longText ? readLongText(dis) : dis.readUTF();
    }
}

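5.9 Self-check answer

How does Modified UTF-8 differ from standard UTF-8?

Answer:
1. writeUTF format:
- [2-byte length][data]
- At most 65535 bytes
2. Differences:
- Null character: 0xC0 0x80 (2 bytes)
- Supplementary characters: 3 bytes per surrogate (6 in total)
3. Reasons:
- Avoids clashes with C null-terminated strings
- Matches Java's internal UTF-16 char handling
4. Usage:
- Java ↔ Java: writeUTF is fine
- Other languages: prefer standard UTF-8
- Long text: writeInt + byte[]
5. Limits:
- 65535 bytes
- About 21,845 Hangul characters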

6️⃣ readFully: exact reads

6.1 How read and readFully differ

// read(byte[]): partial reads are allowed
public int read(byte[] b) throws IOException;
// Returns: the number of bytes actually read (may be < b.length)
// EOF: -1

// readFully: fills the buffer exactly
public void readFully(byte[] b) throws IOException;
public void readFully(byte[] b, int off, int len) throws IOException;
// Reads exactly b.length (or len) bytes
// Throws EOFException if the stream ends first

6.2 Scenario comparison

// Each case below starts from a fresh stream over the same file
// File size: 100 bytes
byte[] buf = new byte[150];

// Behavior of read
int n = fis.read(buf);
// n ≤ 100 (only what was available; typically 100 for a local file)
// A further read returns -1 (EOF)

// Behavior of readFully
try {
    dis.readFully(buf);
} catch (EOFException e) {
    // Thrown because 100 < 150
}

// readFully(buf, 0, 50): only 50 bytes
dis.readFully(buf, 0, 50);
// buf[0..49] is filled
// Succeeds (100 ≥ 50)

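6.3 When readFully is needed

Cases that need it:

1. Reading a header
   - Guarantees the exact size
   - A short read means a format error

2. Reading primitive types
   - readInt needs all 4 bytes
   - A short read would corrupt the value

3. Protocol parsing
   - Split the stream at exact sizes
   - Length + payload

Inside readInt (classic implementation, for example in JDK 8):
  public int readInt() throws IOException {
      int ch1 = in.read();
      int ch2 = in.read();
      int ch3 = in.read();
      int ch4 = in.read();
      if ((ch1 | ch2 | ch3 | ch4) < 0)
          throw new EOFException();
      return (ch1 << 24) + (ch2 << 16) + (ch3 << 8) + ch4;
  }
  
  // Throws EOFException if fewer than 4 bytes remain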

6.4 How readFully is implemented

public void readFully(byte b[], int off, int len) throws IOException {
    if (len < 0) throw new IndexOutOfBoundsException();
    int n = 0;
    while (n < len) {
        int count = in.read(b, off + n, len - n);
        if (count < 0) throw new EOFException();
        n += count;
    }
}

// Behavior:
// 1. Keep calling read while n < len
// 2. If read returns -1 → EOFException
// 3. Exactly len bytes are filled
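
The difference only becomes visible when the underlying stream delivers data in small pieces, as sockets often do. The sketch below simulates that with a hypothetical trickle() wrapper that hands out at most one byte per call; the names ReadFullyDemo, trickle, plainRead, and readFullySucceeds are all made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class showing read vs readFully on a slow source.
class ReadFullyDemo {

    // Wraps the data in a stream that returns at most 1 byte per bulk read,
    // imitating a network stream that delivers data in small pieces.
    static InputStream trickle(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, 1));
            }
        };
    }

    // A single read call: returns however many bytes happened to arrive.
    static int plainRead(byte[] data, int want) throws IOException {
        byte[] buf = new byte[want];
        return trickle(data).read(buf);
    }

    // readFully loops until the buffer is full or the stream ends.
    static boolean readFullySucceeds(byte[] data, int want) throws IOException {
        try {
            new DataInputStream(trickle(data)).readFully(new byte[want]);
            return true;
        } catch (EOFException e) {
            return false;   // the stream ended before 'want' bytes arrived
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10];
        System.out.println(plainRead(data, 10));          // 1
        System.out.println(readFullySucceeds(data, 10));  // true
        System.out.println(readFullySucceeds(data, 15));  // false
    }
}
```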

6.5 Application: parsing a header

public class FileHeader {
    int magic;
    int version;
    long size;
    String name;
}

public FileHeader readHeader(DataInputStream dis) throws IOException {
    FileHeader h = new FileHeader();
    
    // Magic number
    h.magic = dis.readInt();   // exactly 4 bytes
    if (h.magic != 0xDEADBEEF) {
        throw new IOException("Invalid magic");
    }
    
    // Version
    h.version = dis.readInt();
    
    // Size
    h.size = dis.readLong();
    
    // Name
    h.name = dis.readUTF();
    
    return h;
}

// Every header field is read in full
// EOFException if the data runs short

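6.6 Comparison with readNBytes

// readFully: throws if the data runs short
dis.readFully(buf);

// readNBytes(int) (Java 11+): returns a shorter array if the data runs short
byte[] data = is.readNBytes(100);
// data.length is 100 or less
// readNBytes(byte[], int, int) (Java 9+) returns the count instead

// Difference:
// - readFully: strict, throws on EOF
// - readNBytes: lenient, reads up to EOF

// Choosing:
// - Exact size required: readFully
// - As much as is available: readNBytes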
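
The contrast can be reproduced with a 100-byte source and a 150-byte request. This is a minimal sketch using the Java 9+ overload readNBytes(byte[], int, int); the class and method names (ReadNBytesDemo, viaReadNBytes, viaReadFully) are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; requires Java 9 or later for readNBytes.
class ReadNBytesDemo {

    // Lenient: returns the number of bytes actually read, up to 'want'.
    static int viaReadNBytes(byte[] source, int want) throws IOException {
        InputStream in = new ByteArrayInputStream(source);
        byte[] buf = new byte[want];
        return in.readNBytes(buf, 0, want);
    }

    // Strict: succeeds only if 'want' bytes are available.
    static boolean viaReadFully(byte[] source, int want) throws IOException {
        try {
            new DataInputStream(new ByteArrayInputStream(source))
                    .readFully(new byte[want]);
            return true;
        } catch (EOFException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[100];
        System.out.println(viaReadNBytes(source, 150));  // 100
        System.out.println(viaReadFully(source, 150));   // false
        System.out.println(viaReadFully(source, 50));    // true
    }
}
```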

6.7 Use in ILIC

public class ShipmentRecordReader {
    
    // Fixed-length records
    public List<ShipmentRecord> readFixedRecords(Path path) throws IOException {
        List<ShipmentRecord> records = new ArrayList<>();
        
        try (DataInputStream dis = new DataInputStream(
                new BufferedInputStream(
                    new FileInputStream(path.toFile())))) {
            
            byte[] buf = new byte[ShipmentRecord.SIZE];
            
            try {
                while (true) {
                    dis.readFully(buf);   // exactly SIZE bytes
                    records.add(ShipmentRecord.parse(buf));
                }
            } catch (EOFException e) {
                // End of file. A truncated final record also ends up here,
                // so compare the file length with SIZE if that must be detected.
            }
        }
        
        return records;
    }
    
    // Header + variable-length body
    public Message readMessage(DataInputStream dis) throws IOException {
        // Header (16 bytes)
        byte[] header = new byte[16];
        dis.readFully(header);
        
        int magic = ByteBuffer.wrap(header, 0, 4).getInt();
        int version = ByteBuffer.wrap(header, 4, 4).getInt();
        int bodyLen = ByteBuffer.wrap(header, 8, 4).getInt();
        int crc = ByteBuffer.wrap(header, 12, 4).getInt();
        
        // Body (reject a corrupted length before allocating)
        if (bodyLen < 0) throw new IOException("Negative body length: " + bodyLen);
        byte[] body = new byte[bodyLen];
        dis.readFully(body);
        
        // CRC check
        if (computeCrc(body) != crc) {
            throw new IOException("CRC mismatch");
        }
        
        return new Message(magic, version, body);
    }
}

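6.8 Self-check answer

How does readFully differ from read?

Answer:
1. read(byte[]):
- Partial reads allowed
- Returns the actual count, or -1 at EOF
2. readFully:
- Fills the buffer exactly
- Throws EOFException if the data runs short
3. When it is needed:
- Headers (fixed size)
- Primitive types (readInt and so on)
- Protocol parsing
4. Internals:
- while loop + read
- EOFException on -1
5. readNBytes (Java 9+; the readNBytes(int) overload is Java 11+):
- Lenient (reads up to EOF)
- Not a replacement for readFully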

7️⃣ The DataInput / DataOutput interfaces

7.1 The DataInput interface

package java.io;

public interface DataInput {
    
    void readFully(byte b[]) throws IOException;
    void readFully(byte b[], int off, int len) throws IOException;
    int skipBytes(int n) throws IOException;
    
    boolean readBoolean() throws IOException;
    byte readByte() throws IOException;
    int readUnsignedByte() throws IOException;
    short readShort() throws IOException;
    int readUnsignedShort() throws IOException;
    char readChar() throws IOException;
    int readInt() throws IOException;
    long readLong() throws IOException;
    float readFloat() throws IOException;
    double readDouble() throws IOException;
    
    String readLine() throws IOException;
    String readUTF() throws IOException;
}

7.2 The DataOutput interface

public interface DataOutput {
    
    void write(int b) throws IOException;
    void write(byte b[]) throws IOException;
    void write(byte b[], int off, int len) throws IOException;
    
    void writeBoolean(boolean v) throws IOException;
    void writeByte(int v) throws IOException;
    void writeShort(int v) throws IOException;
    void writeChar(int v) throws IOException;
    void writeInt(int v) throws IOException;
    void writeLong(long v) throws IOException;
    void writeFloat(float v) throws IOException;
    void writeDouble(double v) throws IOException;
    
    void writeBytes(String s) throws IOException;
    void writeChars(String s) throws IOException;
    void writeUTF(String s) throws IOException;
}

7.3 Implementing classes

Implementations of DataInput:
  - DataInputStream (the most common)
  - RandomAccessFile
  - ObjectInputStream

Implementations of DataOutput:
  - DataOutputStream
  - RandomAccessFile
  - ObjectOutputStream

Key points:
  - Abstraction through interfaces
  - Multiple implementations
  - Enables polymorphism

7.4 RandomAccessFile

// Read and write + random access
RandomAccessFile raf = new RandomAccessFile("file.dat", "rw");

// DataInput methods
int i = raf.readInt();
String s = raf.readUTF();

// DataOutput methods
raf.writeInt(42);
raf.writeUTF("Hello");

// Moving the position
raf.seek(100);
raf.getFilePointer();   // current position
raf.length();           // file size

// When to use:
// - Random access
// - Reading and writing the same file
// - Note: NIO's FileChannel is generally preferred

7.5 Using ObjectInputStream

// ObjectInputStream is also a DataInput
ObjectInputStream ois = new ObjectInputStream(in);

// Primitive-type methods
int i = ois.readInt();
long l = ois.readLong();
String s = ois.readUTF();

// Object method
Object obj = ois.readObject();

// The foundation for the next Unit

7.6 Using polymorphism

// The method accepts the interface
public void process(DataInput input) throws IOException {
    int id = input.readInt();
    String name = input.readUTF();
    // ...
}

// Different callers
DataInputStream dis = ...;
process(dis);

RandomAccessFile raf = ...;
process(raf);

ObjectInputStream ois = ...;
process(ois);

// Same interface, different implementations

7.7 Use in ILIC

public class ShipmentSerializer {
    
    // Abstracted over DataInput / DataOutput
    public void writeShipment(DataOutput out, Shipment s) throws IOException {
        out.writeLong(s.getId());
        out.writeUTF(s.getBlNo());
        out.writeDouble(s.getWeight().doubleValue());
        out.writeBoolean(s.isActive());
    }
    
    public Shipment readShipment(DataInput in) throws IOException {
        return Shipment.builder()
            .id(in.readLong())
            .blNo(in.readUTF())
            .weight(BigDecimal.valueOf(in.readDouble()))
            .active(in.readBoolean())
            .build();
    }
    
    // Different callers
    public void saveToFile(Path path, Shipment s) throws IOException {
        try (DataOutputStream dos = new DataOutputStream(
                new BufferedOutputStream(
                    new FileOutputStream(path.toFile())))) {
            writeShipment(dos, s);
        }
    }
    
    public void saveToRandom(Path path, Shipment s, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw")) {
            raf.seek(offset);
            writeShipment(raf, s);
        }
    }
    
    public void serializeWithObject(OutputStream os, Shipment s) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(os)) {
            writeShipment(oos, s);
            // The object itself can be stored as well
            oos.writeObject(s);
        }
    }
}

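7.8 Self-check answer

What role do the DataInput / DataOutput interfaces play?

Answer:
1. Definition:
- Abstraction of primitive-type read/write
- Interfaces
2. Implementing classes:
- DataInputStream / DataOutputStream (the most common)
- RandomAccessFile (read and write + random access)
- ObjectInputStream / ObjectOutputStream (serialization)
3. Usage:
- Polymorphism
- Interfaces as method parameters
- Works with any implementation
4. RandomAccessFile:
- Read and write + random access
- seek, getFilePointer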

8️⃣ Practical use and pitfalls

8.1 Recommended patterns

// Pattern 1: basic (Buffered + Data)
try (DataOutputStream dos = new DataOutputStream(
        new BufferedOutputStream(
            new FileOutputStream(path.toFile())))) {
    dos.writeInt(42);
    dos.writeUTF("Hello");
}

// Pattern 2: compression
try (DataOutputStream dos = new DataOutputStream(
        new GZIPOutputStream(
            new BufferedOutputStream(
                new FileOutputStream(path.toFile()))))) {
    dos.writeInt(42);
}

// Pattern 3: in memory
try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
     DataOutputStream dos = new DataOutputStream(baos)) {
    dos.writeInt(42);
    dos.writeUTF("Hello");
    byte[] result = baos.toByteArray();
}

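// Pattern 4: network
// Closing the stream also closes the socket, so use try-with-resources
// only when the connection should end after this message
DataOutputStream dos = new DataOutputStream(socket.getOutputStream());
dos.writeInt(messageType);
dos.writeUTF(payload);
dos.flush();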

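8.2 Common pitfalls

Pitfall 1: order mismatch
  // Write
  dos.writeInt(id);
  dos.writeUTF(name);
  
  // Read in the wrong order
  String name = dis.readUTF();   // ❌ interprets the int's bytes as a UTF length
  
  → garbage data or an exception

Pitfall 2: type mismatch
  // Write
  dos.writeLong(id);   // 8 bytes
  
  // Read
  int id = dis.readInt();   // ❌ consumes only 4 bytes
  
  → every following value is misaligned

Pitfall 3: confusing read with readFully
  // read(byte[]) may return fewer bytes than requested without any error
  // readFully fills the buffer or throws EOFException
  // Use readFully for fixed-size blocks (readInt and friends already behave this way)

Pitfall 4: the 65535-byte limit of writeUTF
  // Large text → UTFDataFormatException
  // Fix: int length + byte[]

Pitfall 5: endianness (talking to C)
  // Java: BE
  // C on x86: LE
  // Convert with ByteBuffer

Pitfall 6: char takes 2 bytes
  // writeChar writes 2 bytes (one UTF-16 code unit)
  // For ASCII-only data, writeByte is smaller

Pitfall 7: format versioning
  // Once the layout changes, old data can no longer be read
  // Include a version field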
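
Pitfall 2 is easy to reproduce, and it fails silently rather than with an exception. The sketch below uses a hypothetical TypeMismatchDemo class that writes a long followed by an int and then reads the same bytes back twice, once with the right types and once with the wrong one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class illustrating a silent type mismatch.
class TypeMismatchDemo {

    // Layout: long id (8 bytes) followed by int quantity (4 bytes).
    static byte[] writeRecord() throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeLong(1L);
            dos.writeInt(7);
        }
        return baos.toByteArray();
    }

    // Correct: same types in the same order as the writer.
    static long[] readCorrectly(byte[] data) throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            long id = dis.readLong();
            int quantity = dis.readInt();
            return new long[] {id, quantity};
        }
    }

    // Wrong: the 8-byte long is read as a 4-byte int, so the next readInt
    // starts in the middle of the long instead of at the quantity field.
    static int[] readWithWrongType(byte[] data) throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = dis.readInt();        // high 4 bytes of the long
            int quantity = dis.readInt();  // low 4 bytes of the long
            return new int[] {id, quantity};
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeRecord();
        long[] ok = readCorrectly(data);
        int[] bad = readWithWrongType(data);
        System.out.println(ok[0] + ", " + ok[1]);    // 1, 7
        System.out.println(bad[0] + ", " + bad[1]);  // 0, 1
    }
}
```

No exception is raised in the wrong case; the values are simply incorrect, which is why a magic number and a version field at the start of the format are worth the extra bytes.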

8.3 Versioning

// Versioning a file format
public class ShipmentFile {
    
    private static final int MAGIC = 0xDEADBEEF;
    private static final int VERSION_1 = 1;
    private static final int VERSION_2 = 2;
    private static final int CURRENT_VERSION = VERSION_2;
    
    public void write(DataOutputStream dos, Shipment s) throws IOException {
        // Header
        dos.writeInt(MAGIC);
        dos.writeInt(CURRENT_VERSION);
        
        // Data
        dos.writeLong(s.getId());
        dos.writeUTF(s.getBlNo());
        
        if (CURRENT_VERSION >= VERSION_2) {
            // Field added in version 2
            dos.writeBoolean(s.isUrgent());
        }
    }
    
    public Shipment read(DataInputStream dis) throws IOException {
        int magic = dis.readInt();
        if (magic != MAGIC) throw new IOException("Invalid format");
        
        int version = dis.readInt();
        
        Shipment.Builder builder = Shipment.builder()
            .id(dis.readLong())
            .blNo(dis.readUTF());
        
        if (version >= VERSION_2) {
            builder.urgent(dis.readBoolean());
        }
        
        return builder.build();
    }
}
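
The value of the version field shows up when a newer reader meets older data. The following self-contained sketch mirrors the idea with a simplified record (id, blNo, urgent) instead of the Shipment class; VersionedRecordDemo, its MAGIC value, and the string returned by read are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo: a reader that accepts both version 1 and version 2 data.
class VersionedRecordDemo {

    static final int MAGIC = 0xCAFED00D;   // arbitrary marker chosen for this example
    static final int VERSION_1 = 1;
    static final int VERSION_2 = 2;        // adds the 'urgent' flag

    // Writes a record in the requested format version.
    static byte[] write(int version, long id, String blNo, boolean urgent) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeInt(MAGIC);
            dos.writeInt(version);
            dos.writeLong(id);
            dos.writeUTF(blNo);
            if (version >= VERSION_2) {
                dos.writeBoolean(urgent);
            }
        }
        return baos.toByteArray();
    }

    // Reads either version; fields missing from old data get a default.
    static String read(byte[] data) throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            if (dis.readInt() != MAGIC) {
                throw new IOException("Invalid magic");
            }
            int version = dis.readInt();
            if (version < VERSION_1 || version > VERSION_2) {
                throw new IOException("Unsupported version: " + version);
            }
            long id = dis.readLong();
            String blNo = dis.readUTF();
            // Only version 2 and later carry the flag; default to false otherwise.
            boolean urgent = version >= VERSION_2 && dis.readBoolean();
            return id + "/" + blNo + "/" + urgent;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(read(write(VERSION_1, 7L, "BL-1", true)));  // 7/BL-1/false
        System.out.println(read(write(VERSION_2, 7L, "BL-1", true)));  // 7/BL-1/true
    }
}
```

Rejecting unknown versions explicitly is a design choice: a reader that guesses at the layout of a newer format tends to produce the silent misalignment described in the pitfalls.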

8.4 Putting it all together in ILIC

@Service
public class ShipmentBinaryService {
    
    private static final int MAGIC = 0xDEADBEEF;
    private static final int VERSION = 1;
    
    // 1. Compact storage
    public void saveCompact(Path path, List<Shipment> shipments) throws IOException {
        try (DataOutputStream dos = new DataOutputStream(
                new BufferedOutputStream(
                    new GZIPOutputStream(
                        new FileOutputStream(path.toFile()))))) {
            
            // Header
            dos.writeInt(MAGIC);
            dos.writeInt(VERSION);
            dos.writeInt(shipments.size());
            
            // Data
            for (Shipment s : shipments) {
                dos.writeLong(s.getId());
                dos.writeUTF(s.getBlNo());
                dos.writeUTF(s.getConsignee());
                dos.writeDouble(s.getWeight().doubleValue());
                dos.writeLong(s.getCreatedAt().toEpochMilli());
            }
        }
    }
    
    // 2. Load and validate
    public List<Shipment> loadCompact(Path path) throws IOException {
        try (DataInputStream dis = new DataInputStream(
                new BufferedInputStream(
                    new GZIPInputStream(
                        new FileInputStream(path.toFile()))))) {
            
            // Validate the header
            int magic = dis.readInt();
            if (magic != MAGIC) throw new IOException("Invalid magic");
            
            int version = dis.readInt();
            if (version != VERSION) throw new IOException("Unsupported version");
            
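            int count = dis.readInt();
            if (count < 0) throw new IOException("Invalid count: " + count);
            // Cap the initial capacity so a corrupt count cannot trigger a huge allocation
            List<Shipment> shipments = new ArrayList<>(Math.min(count, 1024));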
            
            for (int i = 0; i < count; i++) {
                Shipment s = Shipment.builder()
                    .id(dis.readLong())
                    .blNo(dis.readUTF())
                    .consignee(dis.readUTF())
                    .weight(BigDecimal.valueOf(dis.readDouble()))
                    .createdAt(Instant.ofEpochMilli(dis.readLong()))
                    .build();
                shipments.add(s);
            }
            
            return shipments;
        }
    }
    
    // 3. Network protocol
    public void sendShipment(Socket socket, Shipment s) throws IOException {
        DataOutputStream dos = new DataOutputStream(socket.getOutputStream());
        
        // Message type + payload
        dos.writeInt(1);   // SHIPMENT_MESSAGE
        dos.writeLong(s.getId());
        dos.writeUTF(s.getBlNo());
        dos.flush();
    }
    
    public Shipment receiveShipment(Socket socket) throws IOException {
        DataInputStream dis = new DataInputStream(socket.getInputStream());
        
        int type = dis.readInt();
        if (type != 1) throw new IOException("Unexpected type: " + type);
        
        return Shipment.builder()
            .id(dis.readLong())
            .blNo(dis.readUTF())
            .build();
    }
    
    // 4. Size measurement
    public int measureSize(Shipment s) throws IOException {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
             DataOutputStream dos = new DataOutputStream(baos)) {
            
            dos.writeLong(s.getId());
            dos.writeUTF(s.getBlNo());
            
            return dos.size();
        }
    }
}
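
One detail of the service above deserves a closer look: the weight is a `BigDecimal`, but it travels through `writeDouble`. The sketch below compares that path with a string-based alternative. `DecimalFieldDemo` and its method names are hypothetical, and storing the value with `writeUTF(value.toString())` is only one possible encoding, chosen here because `BigDecimal.toString()` is documented to round-trip through the `BigDecimal(String)` constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.math.BigDecimal;

class DecimalFieldDemo {

    // Same path as the service: BigDecimal -> double -> 8 bytes -> double -> BigDecimal.
    static BigDecimal viaDouble(BigDecimal value) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeDouble(value.doubleValue());
        }
        try (DataInputStream dis = new DataInputStream(
                new ByteArrayInputStream(baos.toByteArray()))) {
            return BigDecimal.valueOf(dis.readDouble());
        }
    }

    // Alternative: store the canonical string form, which keeps every digit and the scale.
    static BigDecimal viaString(BigDecimal value) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeUTF(value.toString());
        }
        try (DataInputStream dis = new DataInputStream(
                new ByteArrayInputStream(baos.toByteArray()))) {
            return new BigDecimal(dis.readUTF());
        }
    }
}
```

For a value such as `12345678901234567.89`, which has more significant digits than a double can hold, `viaDouble` returns a different number while `viaString` returns an equal one. The string form costs more bytes, so the trade-off depends on whether exact decimals matter for the field.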

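8.5 Self-check answer

What are the practical uses and pitfalls?

Answer:

1. Recommended patterns:
- Wrap a BufferedOutputStream with DataOutputStream
- Combine with compression (GZIP)
- Serialize to memory (ByteArrayOutputStream)

2. Seven common pitfalls:
- Reading fields in a different order than they were written
- Reading a field as a different type than it was written
- Using read where readFully is required
- The 65535-byte limit of writeUTF / readUTF
- Endianness (interop with C and other little-endian formats)
- char is 2 bytes, not 1
- No version management

3. Version management:
- MAGIC + VERSION header
- Stay compatible when new fields are added

4. These streams are the foundation for the next Unit (serialization).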
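
The 65535-byte limit is the pitfall that tends to surface only in production, so here is a minimal sketch of one common workaround: write an `int` length followed by the raw UTF-8 bytes. `LongStringDemo` and its method names are hypothetical. Note that this layout uses standard UTF-8 with a 4-byte length, so it is not readable with `readUTF`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.nio.charset.StandardCharsets;

class LongStringDemo {

    // Layout: [int byte length][UTF-8 bytes]. No 65535-byte ceiling.
    static void writeLongString(DataOutputStream dos, String s) throws IOException {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        dos.writeInt(utf8.length);
        dos.write(utf8);
    }

    static String readLongString(DataInputStream dis) throws IOException {
        int length = dis.readInt();
        if (length < 0) throw new IOException("Negative length: " + length);
        byte[] utf8 = new byte[length];
        dis.readFully(utf8); // read() alone might return fewer bytes
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Returns true when writeUTF refuses the string because it encodes to more than 65535 bytes.
    static boolean writeUtfRejects(String s) throws IOException {
        try (DataOutputStream dos = new DataOutputStream(new ByteArrayOutputStream())) {
            dos.writeUTF(s);
            return false;
        } catch (UTFDataFormatException e) {
            return true;
        }
    }

    static String roundTrip(String s) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            writeLongString(dos, s);
        }
        try (DataInputStream dis = new DataInputStream(
                new ByteArrayInputStream(baos.toByteArray()))) {
            return readLongString(dis);
        }
    }
}
```

A real reader would usually also enforce an upper bound on `length` before allocating, since the value comes from the file or the network.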

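9️⃣ Interview + Self-Check

9.1 Common interview questions

Question	Key answer
What is DataInputStream?	A FilterInputStream subclass that implements DataInput
Primitive-type methods?	readInt, readLong, readDouble, readUTF
Binary vs text?	Fixed size and fast vs human-readable
Endianness?	Java uses Big-Endian (network byte order)
readUTF format?	Modified UTF-8: 2-byte length + data
readUTF limit?	65535 bytes
readFully vs read?	Fills the buffer completely vs may return fewer bytes
DataInput/DataOutput interfaces?	Polymorphism across several implementations
RandomAccessFile?	Implements both interfaces, plus random access
Relation to serialization?	ObjectInput/ObjectOutput extend DataInput/DataOutput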
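
The readFully vs read row is easiest to remember after seeing a partial read happen. The sketch below simulates a slow source with a hypothetical `trickle` wrapper that hands over at most `chunk` bytes per call, roughly the way a socket may behave; `ReadFullyDemo` and its method names are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadFullyDemo {

    // Simulates a source that returns at most `chunk` bytes per read call.
    static InputStream trickle(byte[] data, final int chunk) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, chunk));
            }
        };
    }

    // A single read() call: returns however many bytes the source chose to deliver.
    static int bytesFromSingleRead(byte[] data, int chunk) throws IOException {
        try (DataInputStream dis = new DataInputStream(trickle(data, chunk))) {
            byte[] header = new byte[data.length];
            return dis.read(header);
        }
    }

    // readFully(): keeps reading until the buffer is full, or throws EOFException.
    static byte[] readExactly(byte[] data, int chunk) throws IOException {
        try (DataInputStream dis = new DataInputStream(trickle(data, chunk))) {
            byte[] header = new byte[data.length];
            dis.readFully(header);
            return header;
        }
    }
}
```

With 8 bytes of data and a chunk size of 3, `bytesFromSingleRead` returns 3 while `readExactly` returns all 8 bytes.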

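9.2 Self-check checklist

Definition

DataInputStream / DataOutputStream
Subclasses of FilterInputStream / FilterOutputStream
DataInput / DataOutput interfaces

Methods

Format

Binary vs text
Endianness (BE)
Modified UTF-8
writeUTF layout ([length][data])

In practice

Combining with buffered streams
Combining with compression
Version management
Common pitfalls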
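
Of the checklist items, endianness is the one most worth trying by hand. The following sketch shows the big-endian bytes that `writeInt` produces and two ways to read a little-endian `int`, such as one written by a C program on x86. `EndianDemo` and its method names are hypothetical; `ByteBuffer.order` and `Integer.reverseBytes` are standard JDK calls.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianDemo {

    // DataOutputStream always writes the most significant byte first.
    static byte[] writeIntBigEndian(int value) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeInt(value);
        }
        return baos.toByteArray();
    }

    // Option 1: let ByteBuffer interpret the bytes as little-endian.
    static int readIntLittleEndian(byte[] fourBytes) {
        return ByteBuffer.wrap(fourBytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    // Option 2: read as big-endian with DataInputStream, then swap the byte order.
    static int readIntLittleEndianBySwap(byte[] fourBytes) throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(fourBytes))) {
            return Integer.reverseBytes(dis.readInt());
        }
    }
}
```

`writeIntBigEndian(42)` yields `00 00 00 2A`, while the little-endian bytes `2A 00 00 00` decode to 42 through either reader.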

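9.3 Further in-depth questions

Q1: What is the exact format of writeUTF?

Answer:

[unsigned short length][Modified UTF-8 data]

- Length: 2 bytes, big-endian
- It counts the bytes of the encoded data, not the number of characters
- Maximum 65535

Example: "안녕"
  Modified UTF-8: 6 bytes
  File: 0x00 0x06 0xEC 0x95 0x88 0xEB 0x85 0x95
        [length 6][6 bytes of data]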
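
The layout can be confirmed by dumping what `writeUTF` actually writes. This is a small sketch with hypothetical names (`WriteUtfLayoutDemo`, `encode`, `hex`); the string is given with Unicode escapes (`\uC548\uB155` is "안녕") so that the source file encoding does not matter.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class WriteUtfLayoutDemo {

    // Returns the exact bytes that writeUTF emits for the given string.
    static byte[] encode(String s) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeUTF(s);
        }
        return baos.toByteArray();
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(hex(encode("\uC548\uB155"))); // 00 06 EC 95 88 EB 85 95
        System.out.println(hex(encode("\u0000")));       // 00 02 C0 80
    }
}
```

The second line shows where Modified UTF-8 departs from standard UTF-8: the NUL character is written as the two bytes `C0 80` rather than a single `00`.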

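Q2: Is DataInputStream thread-safe?

Answer:

No. Thread safety is left to the caller, so synchronize externally.
The read methods themselves are not synchronized.
A synchronized underlying stream (BufferedInputStream, for example) is not enough: readInt consumes 4 bytes and a record spans several calls, so two threads can still interleave in the middle of a value or a record.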
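
External synchronization here means holding one lock for a whole record, not for a single call. The sketch below has several threads append `(int id, long id * 1000)` records to one shared `DataOutputStream` under a common lock and then verifies that no record was torn. `SharedStreamDemo`, the record layout, and the lock object are all hypothetical choices for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class SharedStreamDemo {

    // Each thread writes `perThread` records; the lock covers both fields of a record.
    static byte[] writeConcurrently(int threads, final int perThread)
            throws IOException, InterruptedException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        final DataOutputStream dos = new DataOutputStream(baos);
        final Object lock = new Object();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    int id = base + i;
                    synchronized (lock) {
                        try {
                            dos.writeInt(id);
                            dos.writeLong(id * 1000L);
                        } catch (IOException e) {
                            throw new UncheckedIOException(e);
                        }
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        dos.flush();
        return baos.toByteArray();
    }

    // Counts records and fails if any (id, payload) pair does not belong together.
    static int countIntactRecords(byte[] bytes) throws IOException {
        int count = 0;
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(bytes))) {
            while (dis.available() > 0) {
                int id = dis.readInt();
                long payload = dis.readLong();
                if (payload != id * 1000L) throw new IOException("Torn record for id " + id);
                count++;
            }
        }
        return count;
    }
}
```

Removing the `synchronized` block makes the outcome unpredictable: records from different threads may interleave, which is exactly what the verification step is meant to catch.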

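Q3: What happens when readInt runs out of data?

Answer:

// Only 3 bytes remain, but readInt needs 4
try {
    int i = dis.readInt();
} catch (EOFException e) {
    // The 4 bytes could not be filled; the stream ended mid-value
}
// EOFException is an IOException: handle it explicitly wherever a truncated stream is possible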
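
A self-contained version of the snippet above can be run against an in-memory stream. `ShortReadDemo` and `readIntFails` are hypothetical names used only for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

class ShortReadDemo {

    // Returns true when readInt() reaches end of stream before it has read 4 bytes.
    static boolean readIntFails(byte[] data) throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            dis.readInt();
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readIntFails(new byte[3])); // true
        System.out.println(readIntFails(new byte[4])); // false
    }
}
```

Because the exception is the only end-of-stream signal for the typed read methods, formats usually store a record count up front (as `saveCompact` does) instead of looping until `EOFException`.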

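Q4: How precise is writeFloat?

Answer:

IEEE 754 single precision
32 bits (4 bytes)
1 sign bit + 8 exponent bits + 23 mantissa bits
About 7 significant decimal digits
Integers above 2^24 (16,777,216) are no longer all representable
writeFloat / readFloat add no further loss; the rounding comes from the float type itself

dos.writeFloat(1.123456789f);
float f = dis.readFloat();
// f prints as 1.1234568: the literal was already rounded to the nearest float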
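
To separate the two effects, a minimal sketch can compare bit patterns before and after the round trip. `FloatPrecisionDemo` and `roundTrip` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class FloatPrecisionDemo {

    // Writes a float as 4 bytes and reads it back.
    static float roundTrip(float value) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeFloat(value);
        }
        try (DataInputStream dis = new DataInputStream(
                new ByteArrayInputStream(baos.toByteArray()))) {
            return dis.readFloat();
        }
    }

    public static void main(String[] args) throws IOException {
        float original = 1.123456789f;          // already rounded by the compiler
        float restored = roundTrip(original);
        // Same bit pattern before and after: the stream added no loss.
        System.out.println(Float.floatToIntBits(original) == Float.floatToIntBits(restored)); // true
        System.out.println(restored);                                                        // 1.1234568
        // 2^24 + 1 is not representable as a float and rounds to 2^24.
        System.out.println((float) 16_777_217 == 16_777_216f);                                // true
    }
}
```

When more digits are needed, `writeDouble` (about 15 to 17 significant digits) is the drop-in alternative at twice the size.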

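Q5: Does ObjectInputStream also implement DataInput?

Answer:

Yes. ObjectInputStream extends InputStream implements ObjectInput
ObjectInput extends DataInput
So it offers the primitive-type methods plus the object methods (readObject)
It is the topic of the next Unit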
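
The practical consequence is that parsing code written against `DataInput` works unchanged with either stream class. The sketch below uses hypothetical names (`DataInputPolymorphismDemo`, `readRecord`) and a made-up `(long id, String blNo)` record.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class DataInputPolymorphismDemo {

    // Depends only on the interface, so any DataInput implementation will do.
    static String readRecord(DataInput in) throws IOException {
        long id = in.readLong();
        String blNo = in.readUTF();
        return id + ":" + blNo;
    }

    static String viaDataStream() throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(baos)) {
            dos.writeLong(7L);
            dos.writeUTF("BL-7");
        }
        try (DataInputStream dis = new DataInputStream(
                new ByteArrayInputStream(baos.toByteArray()))) {
            return readRecord(dis);
        }
    }

    static String viaObjectStream() throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeLong(7L);
            oos.writeUTF("BL-7");
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(baos.toByteArray()))) {
            return readRecord(ois);
        }
    }
}
```

Both methods return `7:BL-7`. The bytes on disk are not interchangeable, though: `ObjectOutputStream` adds its own stream header and framing, so data written by one class should be read by its matching counterpart.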

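🎯 Key Summary: 3 Takeaways

1. DataInputStream / DataOutputStream

Primitive types ↔ binary
readInt, readUTF, and so on
DataInput / DataOutput interfaces

2. Characteristics

Java uses Big-Endian
writeUTF uses Modified UTF-8 (65535-byte limit)
readFully = fills the buffer exactly

3. In practice

Combine with buffered streams
Version management (header)
Foundation for the next Unit (serialization)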

📚 Up next...

Unit 9.4: Serialization

This Unit covered primitive-type I/O; the next one covers object serialization.

The Serializable interface
ObjectInputStream / ObjectOutputStream
Object graphs
The transient keyword
Security considerations

Phase 9 progress

🚀 Phase 9: I/O Deep Dive
  ✅ Unit 9.1 try-with-resources
  ✅ Unit 9.2 BufferedInputStream / BufferedOutputStream
  ✅ Unit 9.3 DataInputStream / DataOutputStream ← you are here
  ⏭ Unit 9.4 Serialization
  ⏭ Unit 9.5 serialVersionUID

Week 3 cumulative progress

✅ Phases 1 to 8 complete (37 Units)
🚀 Phase 9: I/O Deep Dive (3/5 in progress)

Total: 40/43 Units (about 93%)
