Chapter 9: Processing Input and Output

Jasik · December 12, 2021

202112 Study - Core Java 8

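Input stream: a source from which bytes are read.
Bytes can come from a file, a network connection, or an array in memory.

Output stream: a destination for bytes.

Readers and writers: consume and produce sequences of characters.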

Obtaining Streams

InputStream in = Files.newInputStream(path);
OutputStream out = Files.newOutputStream(path);

path: an instance of the Path class

URL url = new URL("http://example.domain.com/index.html");
InputStream in = url.openStream();

byte[] bytes = ...;
InputStream in = new ByteArrayInputStream(bytes);

ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] bytes = out.toByteArray();

InputStream in = ...;
int b = in.read(); // reads one byte

byte[] bytes = ...;
int bytesRead = in.read(bytes); // reads up to bytes.length bytes in bulk
bytesRead = in.read(bytes, start, length); // reads up to length bytes into bytes, beginning at index start

// reads all bytes from input stream
public static byte[] readAllBytes(InputStream in) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    copy(in, out);
    out.close();
    
    return out.toByteArray();
}

After you finish writing to a stream, you must close it so that any buffered output is committed. A try-with-resources statement is the best way to do this.

byte[] bytes = ...;
try (OutputStream out = ...) {
    out.write(bytes);
}

// from input stream to output stream
public static void copy(InputStream in, OutputStream out) throws IOException {
    final int BLOCKSIZE = 1024;
    byte[] bytes = new byte[BLOCKSIZE];
    
    int length;
    while ((length = in.read(bytes)) != -1) {
        out.write(bytes, 0, length);
    }
}
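
As a quick check of the two helpers above, here is a minimal sketch that runs them against in-memory streams, so no file or network access is needed. The class name StreamCopyDemo and the sample text are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; only the two helper methods come from the notes above.
class StreamCopyDemo {
    // Copies everything from in to out, one block at a time.
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[1024];
        int length;
        // read returns the number of bytes actually read, or -1 at end of input
        while ((length = in.read(buffer)) != -1) {
            out.write(buffer, 0, length);
        }
    }

    // Drains the input stream into a byte array.
    static byte[] readAllBytes(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copy(in, out);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] source = "hello, streams".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(source)) {
            byte[] result = readAllBytes(in);
            System.out.println(new String(result, StandardCharsets.UTF_8)); // hello, streams
        }
    }
}
```

Writing only `length` bytes per iteration matters: the last block is usually shorter than the buffer.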

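Character Encodings

Java uses the Unicode standard for characters. Each character, or code point, is a 21-bit integer. A character encoding is a scheme for packaging those 21-bit numbers into bytes.

Common encodings include UTF-8 and UTF-16.

There is no reliable way to automatically detect the character encoding from a stream of bytes, so you should always specify the encoding explicitly. For example, when reading a web page, check the Content-Type header.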
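
To make the point concrete, the sketch below encodes the same text with two encodings and then decodes UTF-8 bytes with the wrong charset. The class EncodingDemo and its helper names are illustrative only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing that byte counts and decoding depend on the encoding.
class EncodingDemo {
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static int utf16Length(String s) {
        // UTF_16BE is used so that no byte order mark is added to the output
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Encodes as UTF-8, then decodes with whatever charset the caller claims.
    static String decodeUtf8BytesAs(String s, Charset claimed) {
        return new String(s.getBytes(StandardCharsets.UTF_8), claimed);
    }

    public static void main(String[] args) {
        String cafe = "caf\u00e9"; // "café", written with an escape to avoid source-encoding issues
        System.out.println(utf8Length(cafe));  // 5
        System.out.println(utf16Length(cafe)); // 8

        // A code point outside the Basic Multilingual Plane needs two chars in a Java String
        String emoji = new String(Character.toChars(0x1F600));
        System.out.println(emoji.length());     // 2
        System.out.println(utf8Length(emoji));  // 4

        // Decoding with the wrong charset silently produces garbage instead of failing
        String wrong = decodeUtf8BytesAs(cafe, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.equals(cafe)); // false
    }
}
```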

Text Input

InputStream inputStream = ...;
Reader in = new InputStreamReader(inputStream, characterSet);

String content = new String(Files.readAllBytes(path), characterSet);

// Read a file lazily as a stream of lines
try (Stream<String> lines = Files.lines(path, characterSet)) {
    ...
}

// To read numbers or words from a file
Scanner in = new Scanner(path, "UTF-8");
while (in.hasNextDouble()) {
    double value = in.nextDouble();
}
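
The snippets above can be tried end to end with a temporary file. This is a minimal sketch: the class TextInputDemo, its method names, and the sample data are invented for illustration, and the Scanner locale is fixed to Locale.US so that "1.5" parses the same way on every machine.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Locale;
import java.util.Scanner;
import java.util.stream.Stream;

// Hypothetical demo class for line-based and token-based text input.
class TextInputDemo {
    // Counts non-blank lines; Files.lines reads lazily and must be closed.
    static long countNonBlankLines(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    // Sums the leading run of whitespace-separated numbers in the file.
    static double sumNumbers(Path path) throws IOException {
        double sum = 0;
        try (Scanner in = new Scanner(path, "UTF-8")) {
            in.useLocale(Locale.US); // assumption: numbers use '.' as the decimal separator
            while (in.hasNextDouble()) {
                sum += in.nextDouble();
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("numbers", ".txt");
        Files.write(path, Arrays.asList("1.5 2.5", "", "4"), StandardCharsets.UTF_8);
        System.out.println(countNonBlankLines(path)); // 2
        System.out.println(sumNumbers(path));         // 8.0
        Files.delete(path);
    }
}
```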

Text Output

Use a Writer to write text.

OutputStream outStream = ...;
Writer out = new OutputStreamWriter(outStream, characterSet);
out.write(str);

PrintWriter is more convenient: it provides println, printf, and so on.

PrintWriter out = new PrintWriter(Files.newBufferedWriter(path, characterSet));
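
Here is a small sketch of PrintWriter in use, writing a few formatted lines to a file and closing the writer with try-with-resources so the buffered output is flushed. The class TextOutputDemo, the method writeSalaries, and the CSV layout are assumptions made up for this example.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

// Hypothetical demo class for formatted text output.
class TextOutputDemo {
    // Writes a header line followed by one "name,salary" line per entry.
    static void writeSalaries(Path path, String[] names, double[] salaries) throws IOException {
        try (PrintWriter out = new PrintWriter(
                Files.newBufferedWriter(path, StandardCharsets.UTF_8))) {
            out.println("name,salary");
            for (int i = 0; i < names.length; i++) {
                // Locale.US keeps '.' as the decimal separator regardless of the default locale
                out.printf(Locale.US, "%s,%.2f%n", names[i], salaries[i]);
            }
        } // closing the PrintWriter flushes the underlying BufferedWriter
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("salaries", ".csv");
        writeSalaries(path, new String[] {"Peter", "Paul"}, new double[] {90000, 180000});
        for (String line : Files.readAllLines(path, StandardCharsets.UTF_8)) {
            System.out.println(line);
        }
        Files.delete(path);
    }
}
```

Note that PrintWriter does not throw IOException from println or printf; call checkError if write failures need to be detected.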

Paths, Files, and Directories

A path that starts with a root component is absolute; otherwise it is relative.

Path absolute = Paths.get("/", "home", "cay");
Path relative = Paths.get("myapp", "conf", "user.properties");

Path homeDirectory = Paths.get("/home/cay");

Resolving Paths

Path homeDirectory = Paths.get("/home/cay");
Path workPath = homeDirectory.resolve("myapp/work");
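
A few related Path operations are sketched below, assuming a Unix-like file system where "/home/cay" is an absolute path. The class PathResolveDemo and the helper workDirectory are hypothetical; resolve, resolveSibling, relativize, and normalize are the standard Path methods. None of them touch the disk, they only manipulate names.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical demo class; assumes Unix-style paths.
class PathResolveDemo {
    // Hypothetical helper: the work directory of an application under a home directory.
    static Path workDirectory(Path home, String app) {
        return home.resolve(app).resolve("work");
    }

    public static void main(String[] args) {
        Path home = Paths.get("/home/cay");
        Path work = workDirectory(home, "myapp");

        System.out.println(work);                        // /home/cay/myapp/work
        // resolveSibling replaces the last component of the path
        System.out.println(work.resolveSibling("temp")); // /home/cay/myapp/temp
        // relativize is the opposite of resolve
        System.out.println(home.relativize(work));       // myapp/work
        // normalize removes redundant "." and ".." components
        System.out.println(Paths.get("/home/cay/../fred/./myapp").normalize()); // /home/fred/myapp
    }
}
```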

Creating Files and Directories

// All but the last component of the path must already exist
Files.createDirectory(path);

// Also creates any missing intermediate directories
Files.createDirectories(path);

Files.createFile(path);
Files.exists(path);

Copying, Moving, and Deleting Files

Files.copy(fromPath, toPath);

Files.move(fromPath, toPath);

Files.deleteIfExists(path);
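
The create, copy, move, and delete calls can be combined into one small scenario. The sketch below works inside a temporary directory; the class FileOpsDemo, the method demo, and all file names are invented for illustration. StandardCopyOption.REPLACE_EXISTING is passed because copy and move otherwise fail when the target already exists.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical demo class exercising the Files operations from the notes.
class FileOpsDemo {
    // Creates a file under root, copies it, moves the copy, deletes the original.
    // Returns the path of the moved copy.
    static Path demo(Path root) throws IOException {
        // createDirectories also creates the intermediate "myapp" directory
        Path dir = Files.createDirectories(root.resolve("myapp/conf"));

        // createFile throws FileAlreadyExistsException if the file is already there
        Path original = Files.createFile(dir.resolve("user.properties"));

        Path backup = dir.resolve("user.properties.bak");
        Files.copy(original, backup, StandardCopyOption.REPLACE_EXISTING);

        Path archived = root.resolve("archived.properties");
        Files.move(backup, archived, StandardCopyOption.REPLACE_EXISTING);

        // deleteIfExists returns false instead of throwing when the file is missing
        Files.deleteIfExists(original);
        return archived;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("fileops");
        Path archived = demo(root);
        System.out.println(Files.exists(archived));                                     // true
        System.out.println(Files.exists(root.resolve("myapp/conf/user.properties")));   // false
    }
}
```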

Visiting Directory Entries

Files.list returns a Stream<Path> of the entries in a directory.

try (Stream<Path> entries = Files.list(pathToDirectory)) {
    ...
}

try (Stream<Path> entries = Files.walk(pathToRoot)) {
    // Contains all descendants, visited in depth-first order
}

// Limit the depth of the traversal
Files.walk(pathToRoot, depth)
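
To see the effect of the depth limit, here is a sketch that builds a tiny directory tree in a temporary directory and lists it with Files.walk. The class DirectoryWalkDemo, the method listTree, and the file names are assumptions for illustration. The results are sorted so that the output does not depend on the order in which the file system returns entries.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class for walking a directory tree.
class DirectoryWalkDemo {
    // Returns the descendants of root up to maxDepth levels, as sorted relative paths.
    static List<String> listTree(Path root, int maxDepth) throws IOException {
        try (Stream<Path> entries = Files.walk(root, maxDepth)) {
            return entries
                .filter(p -> !p.equals(root)) // walk also yields the starting directory itself
                .map(p -> root.relativize(p).toString().replace('\\', '/'))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("walk");
        Files.createDirectories(root.resolve("a/b"));
        Files.createFile(root.resolve("a/b/file.txt"));
        Files.createFile(root.resolve("top.txt"));

        System.out.println(listTree(root, 1));                 // [a, top.txt]
        System.out.println(listTree(root, Integer.MAX_VALUE)); // [a, a/b, a/b/file.txt, top.txt]
    }
}
```

With a depth of 1, Files.walk visits the same entries as Files.list, plus the starting directory.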

URL Connections

A common use case: posting form data.

URL url = ...;
URLConnection connection = url.openConnection();

connection.setDoOutput(true);
try (
    Writer out = new OutputStreamWriter(connection.getOutputStream(), StandardCharsets.UTF_8)
) {
    Map<String, String> postData = ...;
    boolean first = true;
    for (Map.Entry<String, String> entry : postData.entrySet()) {
        if (first) {
            first = false;
        } else {
            out.write("&");
        }
        out.write(URLEncoder.encode(entry.getKey(),"UTF-8"));
        out.write("=");
        out.write(URLEncoder.encode(entry.getValue(), "UTF-8"));
    }
}
try (InputStream in = connection.getInputStream()) {
    ...
}
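
The request body that the loop above writes can be inspected without opening a connection. The sketch below builds the same name=value&name=value string in memory; the class FormEncodingDemo, the method encodeForm, and the sample fields are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class that builds a form-encoded request body as a string.
class FormEncodingDemo {
    static String encodeForm(Map<String, String> postData) throws UnsupportedEncodingException {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : postData.entrySet()) {
            if (body.length() > 0) {
                body.append('&'); // separator between pairs, not before the first one
            }
            body.append(URLEncoder.encode(entry.getKey(), "UTF-8"));
            body.append('=');
            body.append(URLEncoder.encode(entry.getValue(), "UTF-8"));
        }
        return body.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        Map<String, String> postData = new LinkedHashMap<>(); // keeps insertion order
        postData.put("name", "Peter Lee");
        postData.put("q", "a&b=c");
        System.out.println(encodeForm(postData)); // name=Peter+Lee&q=a%26b%3Dc
    }
}
```

Encoding each key and value separately is what keeps a literal & or = inside a value from being mistaken for a separator.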

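Serialization

Object serialization is a mechanism for converting an object into a bundle of bytes that can be shipped elsewhere or stored on disk, and for reconstructing the object from those bytes.
It is an essential tool for distributed processing, where objects are sent from one virtual machine to another.

The Serializable Interface

To serialize an object (convert it into a bundle of bytes), it must be an instance of a class that implements the Serializable interface. Serializable is a marker interface with no methods.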

public class Employee implements Serializable {
    private String name;
    private double salary;
}

Serializing Objects

ObjectOutputStream out = new ObjectOutputStream(
    Files.newOutputStream(path)
);

Employee peter = new Employee("Peter", 90000);
Employee paul = new Manager("Paul", 180000);
out.writeObject(peter);
out.writeObject(paul);

Reading Objects Back

ObjectInputStream in = new ObjectInputStream(
    Files.newInputStream(path)
);

Employee e1 = (Employee) in.readObject(); // peter
Employee e2 = (Employee) in.readObject(); // paul

Employee peter = new Employee("Peter", 90000);
Employee paul = new Manager("Paul", 120000);
Manager mary = new Manager("Mary", 180000);

peter.setBoss(mary);
paul.setBoss(mary);

out.writeObject(peter);
out.writeObject(paul);

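In this case, peter and paul both hold a reference to the single object mary, not two references to separate objects with identical contents.
To make this work, each object receives a serial number when it is saved. ObjectOutputStream checks whether it has already written a given object reference. If so, it writes only the serial number and does not duplicate the object's contents.
ObjectInputStream remembers every object it has read. When it encounters a reference to a repeated object, it returns a reference to the object it read earlier.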
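
The shared-reference behavior can be verified with an in-memory round trip. In this sketch the class SerializationDemo, its simplified nested Employee class (a name and a boss field only), and the method roundTrip are all made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class: serializes objects to a byte array and reads them back.
class SerializationDemo {
    // Simplified stand-in for the Employee class in the notes
    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Employee boss;

        Employee(String name) {
            this.name = name;
        }
    }

    // Writes all objects to ONE object stream, then reads them back from one stream.
    static Object[] roundTrip(Object... objects) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            for (Object o : objects) {
                out.writeObject(o);
            }
        }
        Object[] result = new Object[objects.length];
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            for (int i = 0; i < result.length; i++) {
                result[i] = in.readObject();
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        Employee mary = new Employee("Mary");
        Employee peter = new Employee("Peter");
        Employee paul = new Employee("Paul");
        peter.boss = mary;
        paul.boss = mary;

        Object[] copies = roundTrip(peter, paul);
        Employee e1 = (Employee) copies[0];
        Employee e2 = (Employee) copies[1];
        System.out.println(e1.boss == e2.boss); // true: one shared copy of mary
        System.out.println(e1.boss == mary);    // false: the copy is a new object
    }
}
```

The sharing only holds within a single stream. Writing peter and paul to two separate ObjectOutputStream instances would produce two independent copies of mary.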

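Versioning

When an object is serialized, the name of its class and the class's serialVersionUID are written to the object stream. The implementor of the class assigns this unique identifier by declaring a static constant:
private static final long serialVersionUID = 1L; // version 1

When a class evolves in an incompatible way, the implementor should change the UID. The readObject method throws an InvalidClassException if the UID of the object being deserialized does not match.
If the serialVersionUID matches, deserialization proceeds even if the implementation has changed.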
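
One way to check which UID the serialization mechanism will use for a class is ObjectStreamClass.lookup. The sketch below is only an illustration: the class VersionDemo, its nested classes, and the method uidOf are hypothetical names.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical demo class that inspects serialVersionUID values.
class VersionDemo {
    // Declares its UID explicitly, as recommended in the notes
    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
    }

    // Declares no UID, so one is computed from the structure of the class
    static class Legacy implements Serializable {
        String name;
    }

    // Returns the UID that would be written to an object stream for this class.
    static long uidOf(Class<?> cls) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(cls);
        if (descriptor == null) { // lookup returns null for classes that are not serializable
            throw new IllegalArgumentException(cls.getName() + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Employee.class)); // 1
        // The computed value can change when fields or methods change,
        // which is why declaring the UID explicitly is safer.
        System.out.println(uidOf(Legacy.class));
    }
}
```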
