[Java] java.util.Scanner

Jerry·2021년 9월 12일

Java

language & framework study

목록 보기

23/28

java.util.Scanner

primitive 타입과 String 을 정규식 구분자(delimiter pattern)를 이용해 파싱할 수 있는 심플한 텍스트 스캐너입니다.
키보드 입력, 파일, 입력 스트림 등 입력 데이터를 처리하는데에 활용할 수 있습니다.

생성자

Scanner(File source)
파일 객체를 이용해 Scanner 생성
Scanner(File source, String charsetName)

charsetName : 파일의 인코딩 charset 선택
Scanner(InputStream source)
데이터 입력 스트림 객체를 이용해 스캐너 생성
Scanner(InputStream source, String charsetName)
charsetName : 입력스트림 바이너리의 인코딩 charset 선택
Scanner(Path source)
source 경로의 파일을 스캔

   Scanner scanner= null;
  try {
    scanner = new Scanner(Paths.get("./fileoutputtest.txt"));
    while(scanner.hasNext()){
      System.out.println(scanner.next());
    }
  } catch (IOException e) {
    e.printStackTrace();
  }
  finally {
    scanner.close();
  }

Scanner(Path source, String charsetName)
charsetName : source 경로의 파일을 읽을때 charset 선택
Scanner(Readable source)
Readable 인터페이스를 구현한 객체를 받아 스캔
- 대표적인 구현 클래스 : BufferedReader, CharArrayReader, CharBuffer, FileReader, FilterReader, InputStreamReader, LineNumberReader, PipedReader, PushbackReader, Reader, StringReader
Scanner(ReadableByteChannel source)
- ReadableByteChannel 의 하위 인터페이스 : ByteChannel, ScatteringByteChannel, SeekableByteChannel
- 구현 클래스 : DatagramChannel, FileChannel, Pipe.SourceChannel, SocketChannel
Scanner(ReadableByteChannel source, String charsetName)
charsetName : source 객체를 읽을때 charset 선택
Scanner(String source)
String 타입 데이터를 스캔

매소드

[String, Pattern 매칭]

Stream findAll(String patString)

Scanner로 읽은 데이터에서 patString으로 찾은 매칭 정보 객체를 Stream으로 반환
- [유사 매소드] Stream findAll(Pattern pattern)
  
  : Pattern 객체를 이용하여 매칭 결과를 Stream으로 반환

  // fileoutputtest.txt
  this is output Stream test
  this is writer test
  //
  try(Scanner scanner= new Scanner(Paths.get("./fileoutputtest.txt"))) {
      // 파일에 없는 패턴을 찾을 경우
      System.out.println("파일에 없는 단어로 매칭 테스트");
      Stream<MatchResult> unMatchedResults=scanner.findAll("thos is");
      System.out.println("matched result size : "+ unMatchedResults.count());
  
      System.out.println();
      // 파일에 있는 패턴을 찾을 경우
      System.out.println("파일에 있는 단어로 매칭 테스트");
      Stream<MatchResult> matchedResults=scanner.findAll("this is");
      // 매칭 결과 정보로 MatchResult 객체 2개 생성 확인 
      matchedResults.forEach(result -> System.out.println(result.toString()));
  } catch (IOException e) {
    e.printStackTrace();
  }

[output]

파일에 없는 단어로 매칭 테스트
matched result size : 0

파일에 있는 단어로 매칭 테스트
java.util.regex.Matcher $ImmutableMatchResult@e9e54c2 java.util.regex.Matcher$ ImmutableMatchResult@65ab7765

MatchResult match()

마지막으로 매칭했었던 결과를 리턴
String findWithinHorizon(String pattern, int horizon)

읽은 토큰을 제외하고 남은 데이터의 horizon 길이 내에 pattern에 해당하는 매칭이 있는지 찾아서, 해당되는 String을 리턴 (토큰 단위로 구분하여 찾지 않음(delimiter 무시) )

  try(Scanner scanner= new Scanner(Paths.get("./fileoutputtest.txt"))) {
      String result=scanner.findWithinHorizon("this is", 7);
      System.out.println(result);
      scanner.next();
      result=scanner.findWithinHorizon("this is", 7);
      System.out.println(result);
  } catch (IOException e) {
    e.printStackTrace();
  }

[유사 매소드] String findWithinHorizon(Pattern pattern, int horizon)
String findInLine(String pattern)
라인 단위로 패턴에 매칭되는 String을 찾아서 리턴
- [유사 매소드] String findInLine(Pattern pattern)

 try(Scanner scanner= new Scanner(Paths.get("./fileoutputtest.txt"))) {
     String result=null;
     String result2=null;
     while(scanner.hasNext()){
       result=scanner.findInLine("output");
       System.out.println("find output : " +result);
       result2=scanner.findInLine("writer");
       System.out.println("find writer : " +result2);
       System.out.println();
       scanner.nextLine();
     }
 } catch (IOException e) {
   e.printStackTrace();
 }

[output]

find output : output
find writer : null

find output : null
find writer : writer

[다음 토큰 확인]

boolean hasNext()
다음 토큰 존재 유무 리턴
- [유사 매소드]

    boolean    hasNext(String pattern)
    boolean    hasNext(Pattern pattern)
    boolean    hasNextBigDecimal()
    boolean    hasNextBigInteger()
    boolean    hasNextBigInteger(int radix)
    boolean    hasNextBoolean()
    boolean    hasNextByte()
    boolean    hasNextByte(int radix)
    boolean    hasNextDouble()
    boolean    hasNextFloat()
    boolean    hasNextInt()
    boolean    hasNextInt(int radix)
    boolean    hasNextLine()
    boolean    hasNextLong()
    boolean    hasNextLong(int radix)
    boolean    hasNextShort()
    boolean    hasNextShort(int radix)

[다음 토큰 읽기]

String next()

다음 토큰을 String 형태로 반환
- [유사 매소드]

    String next(String pattern)
    String next(Pattern pattern)
    BigDecimal nextBigDecimal()
    BigInteger nextBigInteger()
    BigInteger nextBigInteger(int radix)
    boolean    nextBoolean()
    byte   nextByte()
    byte   nextByte(int radix)
    double nextDouble()
    float  nextFloat()
    int    nextInt()
    int    nextInt(int radix)
    String nextLine()
    long   nextLong()
    long   nextLong(int radix)
    short  nextShort()
    short  nextShort(int radix)

[그 외]

void close()

스캐너 I/O 자원 반환
Scanner useDelimiter(String pattern)
스캐너 입력 스트림으로 읽은 데이터를 토큰화하여 나눌 구분자 패턴 설정
- [유사 매소드] Scanner useDelimiter(Pattern pattern)
  Pattern 객체 사용
Pattern delimiter()

스캐너 구분자 패턴 리턴
- 기본은 java whitespace를 사용 : Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR)
  
  non-breaking space ('\u00A0', '\u2007', '\u202F')는 제외.
- '\t', U+0009 HORIZONTAL TABULATION.
- '\n', U+000A LINE FEED.
- '\u000B', U+000B VERTICAL TABULATION.
- '\f', U+000C FORM FEED.
- '\r', U+000D CARRIAGE RETURN.
- '\u001C', U+001C FILE SEPARATOR.
- '\u001D', U+001D GROUP SEPARATOR.
- '\u001E', U+001E RECORD SEPARATOR.
- '\u001F', U+001F UNIT SEPARATOR.
Scanner useLocale(Locale locale)
- 스캐너의 locale을 설정
- locale 설정은 패턴 매칭 시에 나라별로 다를수 있는 숫자 포맷 등을 고려하기 위해 사용된다.
Locale locale()

설정된 locale 리턴
Scanner useRadix(int radix)
숫자를 몇진법으로 읽을 것인지 설정. 기본은 10진법.
int radix()

설정된 radix 리턴
Scanner reset()

Scanner의 설정을 초기화

  // Scanner.java
  public Scanner reset() {
    delimPattern = WHITESPACE_PATTERN; // 구분자 초기화
    useLocale(Locale.getDefault(Locale.Category.FORMAT)); // locale 설정 초기화
    useRadix(10); // 진법 초기화
    clearCache(); // 
    modCount++; // 이 스캐너의 상태가 변경된 횟수
    return this;
  }
  
  private void clearCaches() {
    hasNextPattern = null; // hasNext()에서 마지막으로 사용한 패턴
    typeCache = null; // 마지막으로 스캔된 값
  }