[Java] Downloading a File with HttpURLConnection

최대한 · February 9, 2021

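Overview

As a solution engineer, I often need to test HTTP communication using only HttpURLConnection, without the help of external libraries. I call this "raw Java", and in this post I want to summarize how to download a file over HTTP 📂 with raw Java.

Main

The basic flow can be divided into two steps.

Connect with HttpURLConnection
Download through a Stream

1. Connect with `HttpURLConnection`

Create a URL object
Obtain an HttpURLConnection object from the URL object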

import java.net.HttpURLConnection;
import java.net.URL;

public class FileDownloadTest {

    public static void main(String[] args) {
        String spec = "https://file-examples-com.github.io/uploads/2017/02/file-sample_100kB.doc";
        String outputDir = "D:/sample/output/download"; // used in step 2

        try {
            // Create the URL object and obtain a connection object from it
            URL url = new URL(spec);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            // getResponseCode() sends the request and reads the status line
            int responseCode = conn.getResponseCode();
            System.out.println("responseCode " + responseCode);
            conn.disconnect();
        } catch (Exception e) {
            System.out.println("An error occurred while trying to download a file.");
            e.printStackTrace();
        }
    }
}

console : responseCode 200
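
For test code like this, it can help to configure the connection before the request is sent. The following is a minimal sketch and not part of the original post: the class name `ConnectionSetup`, the helper `open`, and the timeout values are all hypothetical. Because `openConnection()` does not touch the network yet, the settings can be inspected without a server.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class ConnectionSetup {

    // Hypothetical helper: creates a connection object and configures it.
    // No network traffic happens until getResponseCode() or getInputStream() is called.
    static HttpURLConnection open(String spec, int connectTimeoutMs, int readTimeoutMs) throws IOException {
        URL url = new URL(spec);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");             // GET is the default; set explicitly for clarity
        conn.setConnectTimeout(connectTimeoutMs); // give up if connecting takes too long
        conn.setReadTimeout(readTimeoutMs);       // give up if the server stops sending data
        return conn;
    }

    public static void main(String[] args) throws IOException {
        String spec = "https://file-examples-com.github.io/uploads/2017/02/file-sample_100kB.doc";
        HttpURLConnection conn = open(spec, 5000, 10000); // assumed values, tune as needed
        System.out.println("method = " + conn.getRequestMethod());
        System.out.println("connectTimeout = " + conn.getConnectTimeout());
        System.out.println("readTimeout = " + conn.getReadTimeout());
    }
}
```

Without explicit timeouts the default is 0, which means waiting indefinitely, so a stalled server can hang a test run.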

Once the request succeeds, all that remains is to fetch the file through a Stream.

2. Download through a `Stream`

Get the file name from Content-Disposition or the URL
Download the file via InputStream ➡️ FileOutputStream
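
The first step, choosing a file name, can be sketched on its own so that it can be checked without a network call. This is a minimal sketch under assumptions: the class `FileNameResolver`, the method `resolve`, and the `example.com` URL are hypothetical names used only for illustration. It handles the plain `filename=` parameter with optional quotes and does not attempt the encoded `filename*=` form.

```java
public class FileNameResolver {

    // Hypothetical helper: returns the file name from a Content-Disposition value,
    // or falls back to the last path segment of the URL when none is found.
    static String resolve(String disposition, String spec) {
        if (disposition != null) {
            String target = "filename=";
            int index = disposition.indexOf(target);
            if (index != -1) {
                String value = disposition.substring(index + target.length());
                // Drop any parameters that follow the file name, e.g. "; size=123".
                int semicolon = value.indexOf(';');
                if (semicolon != -1) {
                    value = value.substring(0, semicolon);
                }
                // Strip optional surrounding quotes and whitespace.
                value = value.replace("\"", "").trim();
                if (!value.isEmpty()) {
                    return value;
                }
            }
        }
        // Fallback: last path segment of the URL, without the query string.
        String path = spec;
        int query = path.indexOf('?');
        if (query != -1) {
            path = path.substring(0, query);
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // Header present: the quoted file name wins over the URL.
        System.out.println(resolve("inline; filename=\"152-536x354.jpg\"", "https://example.com/x"));
        // Header absent: the name is taken from the URL.
        System.out.println(resolve(null,
                "https://file-examples-com.github.io/uploads/2017/02/file-sample_100kB.doc"));
    }
}
```

A file name that itself contains a semicolon inside quotes would be cut short by this sketch, so a stricter parser is needed if such names are expected.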

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class FileDownloadTest {

    public static void main(String[] args) {
        String spec = "https://file-examples-com.github.io/uploads/2017/02/file-sample_100kB.doc";
        String outputDir = "D:/sample/output/download"; // must already exist
        HttpURLConnection conn = null;
        try {
            URL url = new URL(spec);
            conn = (HttpURLConnection) url.openConnection();
            int responseCode = conn.getResponseCode();

            System.out.println("responseCode " + responseCode);

            // Proceed only when the status is 200
            if (responseCode == HttpURLConnection.HTTP_OK) {
                String fileName = "";
                String disposition = conn.getHeaderField("Content-Disposition");
                String contentType = conn.getContentType();

                // The file name is usually in the Content-Disposition header.
                if (disposition != null) {
                    String target = "filename=";
                    int index = disposition.indexOf(target);
                    if (index != -1) {
                        // Take the value after "filename=" and strip both surrounding quotes
                        fileName = disposition.substring(index + target.length()).replace("\"", "");
                    }
                }
                // If the header is missing or carries no file name, extract it from the URL.
                if (fileName.isEmpty()) {
                    fileName = spec.substring(spec.lastIndexOf("/") + 1);
                }

                System.out.println("Content-Type = " + contentType);
                System.out.println("Content-Disposition = " + disposition);
                System.out.println("fileName = " + fileName);

                // try-with-resources closes both streams, even if the copy fails
                try (InputStream is = conn.getInputStream();
                     FileOutputStream os = new FileOutputStream(new File(outputDir, fileName))) {
                    final int BUFFER_SIZE = 4096;
                    int bytesRead;
                    byte[] buffer = new byte[BUFFER_SIZE];
                    // read() returns -1 at the end of the stream
                    while ((bytesRead = is.read(buffer)) != -1) {
                        os.write(buffer, 0, bytesRead);
                    }
                }
                System.out.println("File downloaded");
            } else {
                System.out.println("No file to download. Server replied HTTP code: " + responseCode);
            }
        } catch (IOException e) {
            System.out.println("An error occurred while trying to download a file.");
            e.printStackTrace();
        } finally {
            // Release the connection on both the success and the failure path
            if (conn != null) {
                conn.disconnect();
            }
        }
    }
}

console :

responseCode 200
Content-Type = application/msword
Content-Disposition = null
fileName = file-sample_100kB.doc
File downloaded

Process finished with exit code 0
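
The copy loop does not depend on HTTP, so it can be tried with in-memory streams. The sketch below is an illustration under assumptions: the class `StreamCopy` and the helper `copy` are hypothetical names, and the 10000-byte array merely stands in for a response body. It shows that the buffer size changes how many `read` calls are needed, not what gets copied.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {

    // Hypothetical helper: copies everything from in to out and returns the byte count.
    // bufferSize must be positive.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int bytesRead;
        // read() returns the number of bytes placed in the buffer, or -1 at end of stream.
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead); // write only the bytes actually read
            total += bytesRead;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10000]; // stand-in for a response body
        for (int bufferSize : new int[] {16, 4096}) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            long copied = copy(new ByteArrayInputStream(data), out, bufferSize);
            System.out.println("bufferSize " + bufferSize + " copied " + copied);
        }
    }
}
```

On Java 9 and later, `InputStream.transferTo(OutputStream)` performs the same kind of copy in a single call, which may be preferable when no progress reporting is needed.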

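Summary

As shown above, these two steps are all it takes to download a file with raw Java.
This post only downloaded a publicly accessible file through an HttpURLConnection object. If I get the chance later, I will also cover URLs that require a token, using methods such as setRequestProperty().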
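
As a rough preview of the token case, the sketch below sets a request header with `setRequestProperty()` before the request is sent. Everything specific here is an assumption rather than something from this post: the class `TokenRequest`, the helper `statusWithToken`, the `Authorization: Bearer` scheme, the token value, and the local stand-in server built with the JDK's `com.sun.net.httpserver.HttpServer`. A real server may expect a different header or scheme.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class TokenRequest {

    // Hypothetical helper: sends a GET with a bearer token and returns the status code.
    static int statusWithToken(String spec, String token) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(spec).openConnection();
        // Request headers must be set before the request is sent.
        conn.setRequestProperty("Authorization", "Bearer " + token);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Local stand-in server that accepts only one hard-coded token (assumption).
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/file", exchange -> {
            String auth = exchange.getRequestHeaders().getFirst("Authorization");
            int status = "Bearer secret-token".equals(auth) ? 200 : 401;
            exchange.sendResponseHeaders(status, -1); // -1: no response body
            exchange.close();
        });
        server.start();
        try {
            String spec = "http://127.0.0.1:" + server.getAddress().getPort() + "/file";
            System.out.println("with token: " + statusWithToken(spec, "secret-token"));
            System.out.println("wrong token: " + statusWithToken(spec, "nope"));
        } finally {
            server.stop(0);
        }
    }
}
```

Once the status is 200, the download itself proceeds exactly as in step 2.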


1 comment

TenaLee

May 12, 2021

Thanks for the great post.

When I tested the code with
String spec = "https://file-examples-com.github.io/uploads/2017/02/file-sample_100kB.doc";
there was no error, because this resource has no Content-Disposition header.
For a resource that does have a Content-Disposition header, however, the code fails.

Looking into it, the error comes from a mistake in extracting the filename string
from the Content-Disposition header, so a small change to the code would help.

To test it, change the spec variable to the resource below.
String spec = "https://i.picsum.photos/id/152/536/354.jpg?hmac=Vh-3tACtfo0tExdnZBiHdzcsxRIS0Q-a8GN1QSC0b3U&name=4";

This resource returns the header Content-Disposition = inline; filename="152-536x354.jpg".
The existing code that extracts the filename string, 152-536x354.jpg, looks like this:

fileName = disposition.substring(index + target.length() + 1); // 152-536x354.jpg"

The +1 is clearly meant to remove the leading quote, and it does, -> 152-536x354.jpg"
but the trailing quote is not removed, which causes an error when the file is saved.
You could pass an endIndex to drop the trailing quote, or change the code as shown below.

if (index != -1) {
  // Before
  // fileName = disposition.substring(index + target.length() + 1);  // 152-536x354.jpg" (only the leading quote removed)
  // After
  fileName = disposition.substring(index + target.length()); // "152-536x354.jpg"
  fileName = fileName.replaceAll("\"", ""); // 152-536x354.jpg (both quotes removed)
}

I also have one question.

final int BUFFER_SIZE = 4096;
int bytesRead;
byte[] buffer = new byte[BUFFER_SIZE];
while ((bytesRead = is.read(buffer)) != -1) {
  os.write(buffer, 0, bytesRead);
}

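When creating the byte array, you assigned a specific constant value, BUFFER_SIZE.
I would like to learn why you declared this value explicitly as a constant
and what criteria should be used to decide its size.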
