Scraping means pulling information (data) off a website.
Scraping carelessly can get you sued, so be careful.
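robots.txt is a file in which the site you want to scrape lists the URIs of data it does not want collected.
In that file, find the User-agent group that applies to your client and stay away from the paths listed under Disallow.
e.g. https://finance.yahoo.com/robots.txt
e.g. https://finance.naver.com/robots.txt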
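To make the Disallow check concrete, here is a minimal sketch that reads robots.txt text and tests a path against the rules under `User-agent: *`. The class name `RobotsRules`, its method names, and the sample rules are all made up for illustration. It only does plain prefix matching and ignores `Allow` lines and wildcards, so treat it as a rough first check rather than a full robots.txt parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class RobotsRules {
    // Collects the Disallow paths listed under "User-agent: *".
    static List<String> disallowedForAll(String robotsTxt) {
        List<String> disallowed = new ArrayList<>();
        boolean inWildcardGroup = false;
        boolean lastWasUserAgent = false;
        for (String raw : robotsTxt.split("\\R")) {
            String line = raw.replaceFirst("#.*", "").trim(); // drop comments
            int colon = line.indexOf(':');
            if (colon < 0) continue; // blank or malformed line
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                // Consecutive User-agent lines belong to the same group.
                if (!lastWasUserAgent) inWildcardGroup = false;
                if (value.equals("*")) inWildcardGroup = true;
                lastWasUserAgent = true;
            } else {
                lastWasUserAgent = false;
                if (field.equals("disallow") && inWildcardGroup && !value.isEmpty()) {
                    disallowed.add(value);
                }
            }
        }
        return disallowed;
    }

    // Simplified rule: a path is allowed unless it starts with a Disallow prefix.
    static boolean isAllowed(String path, List<String> disallowed) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical robots.txt content, not copied from any real site.
        String sample = "User-agent: *\n"
                + "Disallow: /private/\n"
                + "\n"
                + "User-agent: Googlebot\n"
                + "Disallow: /\n";
        List<String> rules = disallowedForAll(sample);
        System.out.println(isAllowed("/private/data.html", rules)); // false
        System.out.println(isAllowed("/public/index.html", rules)); // true
    }
}
```

In practice the robots.txt text would first be downloaded from the target site, and the path of every URL you plan to request would be passed through `isAllowed` before connecting.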
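import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;

// The statements below go inside a method (for example, main).
String url = "https://search.naver.com/search.naver?where=view&sm=tab_jum&query=%EC%8A%A4%ED%81%AC%EB%9E%98%ED%95%91";
try {
    Connection connection = Jsoup.connect(url);
    // Set a User-Agent header to avoid a 403 response
    connection.userAgent("Mozilla/5.0");
    Document document = connection.get(); // send a GET request
    Elements elements = document.getElementsByClass("lst_total"); // the <ul> that holds the results
    Element ul = elements.get(0); // first matching <ul>; throws if the page has none
    for (Element li : ul.children()) {
        Elements a = li.getElementsByClass("total_tit"); // title link of each result
        String txt = a.text();
        System.out.println(txt);
    }
} catch (IOException e) {
    throw new RuntimeException(e);
}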
This scrapes only the titles from the Naver VIEW search results for "스크래핑" (scraping). Console output:
신한 쏠 비대면 서류제출 스크래핑 리얼리뷰
1) 내집마련 신혼부부 디딤돌대출 후기 (자격확인,주택금융공사 스크래핑, 서류준비)
[부동산정책] 특례보금자리론금리, 신청 방법, 확인사항 feat.스크래핑가능서류정리
[단독]"스크래핑 금지 땐 사업 차질"…마이데이터 관련 행정처분 유예 요청
스크래핑 진행으로 무서류대출가능한곳 찾아주셔서 감사합니다
[리스틀리] 웹 크롤링 툴 :: 클릭 몇 번으로 웹 데이터 엑셀 수집 - 웹스크래핑 ???
하나은행 단독 청년내일저축계좌 가입 자격조건 확인(모바일 스크래핑)
비트릭스, ‘스크린 스크래핑’을 활용한 원화 입출금 솔루션 제공
[신용대출]하나원큐 신용대출 한도조회 스크래핑 오류 해결법 총정리 (하나은행 모바일 앱)
광주은행 모바일프라임론 신용대출 후기와 스크래핑 조건
개인사업자신용대출서류 스크래핑해서 올라가는데요.
디딤돌스크래핑
특례보금자리론 신청 방법, 스크래핑 오류 대응 방법
특례보금자리론 다가구주택 신청 스크래핑, 감정평가 안내
스크래핑보드에 신용카드 연동
특례 보금자리론(전세보증금 반환 용도) 신청, 서류 스크래핑 오류 및 해결 방법
세무사랑 전자세금계산서 스크래핑 ㅣ 원클릭택스 부가세 실무
카드 매입 스크래핑 기능 질문드립니다.
세무사랑 프로그램으로 4대보험 자료 당겨와서 급여대장에 반영하기 (스크래핑)
2화. [대출후기1] 특례보금자리론, 대출까지 얼마나 걸릴까?/주택공사어플, 스크래핑, 온라인신청
특례보금자리론 스크래핑 및 서류 제출에 대한 모든것
RPA에서 스크래핑을 잘 활용하려면
RPA 스크래핑의 뜻
지능 스크래핑 도구 ScrapeStorm과 LISTLY의 기능 비교 분석
필요 서류, 주택금융공사, 신한은행, 스크래핑, 과정 요약, 잔금일, 매매 계약서, 신혼부부, 고정금리...
기웅정보통신, 스크래핑 기반 ‘보험금 자동청구’ 특허 취득
[세무사랑] 스크래핑(원클릭택스)
RPA 스크래핑 기능중에 이미지 추출 가능한가요?
토스 핀크 비상금대출 스크래핑 거절나면 어디로 알아봐야할까?
주택금융공사 소득스크래핑 미제출시
Process finished with exit code 0
It turns out there are really many ways to get Elements out of the Document fetched through Connection.
I can just pick whichever one fits the items I need when scraping! 🙈
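As a small illustration of those options, the sketch below pulls the same titles in three ways: by class name, by CSS selector, and by tag name. It parses an inline HTML string instead of calling Naver, and the markup, the class name `TitleSelectors`, and the method names are made up for the example. Only the class names `lst_total` and `total_tit` come from the snippet earlier in this post. It needs jsoup on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

class TitleSelectors {
    // Hypothetical markup that imitates the class names used earlier.
    static final String SAMPLE_HTML =
            "<ul class=\"lst_total\">"
          + "<li><a class=\"total_tit\" href=\"#1\">First title</a></li>"
          + "<li><a class=\"total_tit\" href=\"#2\">Second title</a></li>"
          + "</ul>";

    // 1) Lookup by class name, as in the original snippet.
    static List<String> byClass(Document doc) {
        return texts(doc.getElementsByClass("total_tit"));
    }

    // 2) CSS selector: every a.total_tit inside ul.lst_total.
    static List<String> byCssSelector(Document doc) {
        return texts(doc.select("ul.lst_total a.total_tit"));
    }

    // 3) Lookup by tag name, then filter on the class.
    static List<String> byTag(Document doc) {
        List<String> out = new ArrayList<>();
        for (Element a : doc.getElementsByTag("a")) {
            if (a.hasClass("total_tit")) out.add(a.text());
        }
        return out;
    }

    // Collects the text of each element into a list.
    private static List<String> texts(Elements elements) {
        List<String> out = new ArrayList<>();
        for (Element e : elements) out.add(e.text());
        return out;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        System.out.println(byClass(doc));
        System.out.println(byCssSelector(doc));
        System.out.println(byTag(doc));
    }
}
```

All three return the same two titles for this sample. On a real page the CSS selector version is usually the easiest to adjust when the markup changes.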