Hash in Java

Seop·2023년 9월 19일

자바는 다른 프로그래밍 언어와는 다르게 Hash를 자주 사용합니다.
그러면 어느 부분에서 사용되는지 한 번 알아보겠습니다!!

Hash?

해시 함수(hash function) 또는 해시 알고리즘(hash algorithm) 또는 해시함수알고리즘(hash函數algorithm)은 임의의 길이의 데이터를 고정된 길이의 데이터로 매핑하는 함수이다. 해시 함수에 의해 얻어지는 값은 해시 값, 해시 코드, 해시 체크섬 또는 간단하게 해시라고 한다

해시란 보통 입력값을 임의의 고정된 길이의 데이터로 바꿔주는 함수입니다.

위 그림과 같이 key 값으로 이름이 들어왔다면 이를 고정된 길이의 문자열/정수 등으로 바꿔주는 역할을 수행하는 함수입니다.

보통 key 값을 임의의 데이터로 변경하는 과정을 해싱한다고 표현하죠

해싱 알고리즘의 특징

해시 알고리즘, 특히 좋은 해시 알고리즘은 다음과 같은 특징들을 가지고 있습니다

해시 알고리즘은 어떤 데이터에 대해서도 빠르게 해싱을 수행할 수 있어야 한다
해시 알고리즘은 이미 생성된 메시지를 다시 재생성하지 말아야 한다
두 개의 다른 메시지에 대해서 해시 알고래즘은 같은 해시 값을 가져서는 안된다. 다시 말해, 해시 충돌을 피해야 한다
원문 메시지가 약간만 변하더라도 해시 값은 급격하게 바뀌어야 한다

여기서 하나 알고 넘어가야하는 개념이 하나 있습니다.
바로 해시 충돌입니다.

Hash Collide

해시 충돌이란 서로 다른 원문에 대한 해싱에 대해 같은 결과를 리턴해줄 때 발생합니다.
서로 다른 두 객체의 해시 값이 같은 상황이라고 할 수 있죠

일전에 봤던 그림입니다.
현재의 해시 함수는 John Smith와 Sandra Dee의 해싱의 결괏값이 같습니다.
이와 같은 경우 해시가 충돌한다고 합니다.

여러 해시 알고리즘

자바에서 사용하는 여러 해시 알고리즘들은 다음과 같습니다.

1. MD5 Algorithm

MD5 알고리즘은 널리 사용되는 암호 해시 함수입니다.
128비트(16바이트) 의 해시 값을 생성해 내는 함수입니다.

//import statements
import java.math.BigInteger;  
import java.security.NoSuchAlgorithmException;  
import java.security.MessageDigest;  
  
// A Java program that uses the MD5 to do the hashing  public class MD5 {  
public static String getMd5(String input) {  
    try {  
  
        // invoking the static getInstance() method of the MessageDigest class    
         // Notice it has MD5 in its parameter.    
MessageDigest msgDst = MessageDigest.getInstance("MD5");  
  
         // the digest() method is invoked to compute the message digest    
         // from an input digest() and it returns an array of byte    
byte[] msgArr = msgDst.digest(input.getBytes());  
  
         // getting signum representation from byte array msgArr    
BigInteger bi = new BigInteger(1, msgArr);  
  
         // Converting into hex value    
String hshtxt = bi.toString(16);  
  
         while (hshtxt.length() < 32) {  
            hshtxt = "0" + hshtxt;  
         }  
         return hshtxt;  
      }  
      // for handling the exception     
catch (NoSuchAlgorithmException abc) {  
         throw new RuntimeException(abc);  
      }  
   }  
  
   // main method code    
public static void main(String argvs[]) throws NoSuchAlgorithmException {  
      String str = "JavaTpoint";  
      String hash = getMd5(str);  
      str = "'JavaTpoint'";  
      System.out.println("The HashCode Generated for " + str + " is: " + hash);  
   }  
}

output

5f4dcc3b5aa765d61d8327deb882cf99

하지만 MD5 알고리즘은 보안상 취약하고, 해시 충돌이 빈번하게 일어나는 단점을 가지고 있습니다.

2. SHA Algorithms

SHA 계열 알고리즘은 MD5보다 개선된 알고리즘으로 뒤에 붙은 숫자를 기준으로 원하는 길이의 문자열을 생성해주는 해싱 알고리즘입니다.

// import statements  import java.security.NoSuchAlgorithmException;  
import java.math.BigInteger;  
import java.security.MessageDigest;  
import java.nio.charset.StandardCharsets;  
  
// A Java program to find the SHA-256 hash value  public class SHAExample {  
   public static byte[] obtainSHA(String s) throws NoSuchAlgorithmException {  
      // Static getInstance() method is invoked with the hashing SHA-256    
MessageDigest msgDgst = MessageDigest.getInstance("SHA-256");  
  
      // the digest() method is invoked    
      // to compute the message digest of the input    
      // and returns an array of byte    
return msgDgst.digest(s.getBytes(StandardCharsets.UTF_8));  
   }  
  
   public static String toHexStr(byte[] hash) {  
      // Converting the byte array in the signum representation    
BigInteger no = new BigInteger(1, hash);  
  
      // Converting the message digest into the hex value    
StringBuilder hexStr = new StringBuilder(no.toString(16));  
  
      // Padding with tbe leading zeros    
while (hexStr.length() < 32) {  
         hexStr.insert(0, '0');  
      }  
  
      return hexStr.toString();  
   }  
  
   // main method    
public static void main(String argvs[]) {  
      try {  
         System.out.println("The HashCode produced by SHA-256 algorithm for strings: ");  
  
         String str = "JavaTpoint";  
         String hash = toHexStr(obtainSHA(str));  
         System.out.println("\n" + str + " : " + hash);  
  
         str = "India is a great country.";  
         hash = toHexStr(obtainSHA(str));  
         System.out.println("\n" + str + " : " + hash);  
      }  
      // handling exception    
      // usually raised when some absurd algorithm is used for doing the hash work    
catch (NoSuchAlgorithmException obj) {  
         System.out.println("An exception is generated for the incorrect algorithm: " + obj);  
      }  
   }  
}

output

The HashCode produced by SHA-256 algorithm for strings:

JavaTpoint : f9142e5ca706378c1c7f9daf6782dcff8197ef1ecfd4075b63dae2f40186afa6

India is a great country. :e28335e1124e47ebf4c0b6cb87a34737b70d539241641c60c1969602a7b46cf9

좋아 보이지만 더 좋은 해싱 알고리즘이 존재합니다.
바로 PBKDF2, BCrypt, SCrypt입니다.

PBKDF2 사용 예제

SecureRandom random = new SecureRandom();
byte[] salt = new byte[16];
random.nextBytes(salt);
KeySpec spec = new PBEKeySpec(password.toCharArray(), salt, 65536, 128);
SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
byte[] hash = factory.generateSecret(spec).getEncoded();