[Crawling] Crawling Notes

구일_ 2023. 7. 10. 20:32

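I meant to write this up properly, but since it's not something I actually need, it felt like too much of a chore, so I'm only going to summarize it roughly.

Honestly, nobody reads these even when I do write them up, right?

This is really something you can sort out just by reading the official docs, so I don't know why the class spent three whole days on it.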

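# Step 1. Load the required modules
from selenium import webdriver
from bs4 import BeautifulSoup
from selenium.webdriver.common.by import By
import os
import time
# Prepare the CSV output path and make sure its folder exists
fc_name = "./data/seoul.csv"
os.makedirs(os.path.dirname(fc_name), exist_ok=True)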

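# Step 4. Set up the Chrome driver and open the web page
# Recent Selenium 4 releases no longer accept the driver path as a positional argument,
# so it is passed through a Service object instead.
from selenium.webdriver.chrome.service import Service
driver = webdriver.Chrome(service=Service("./chromedriver"))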
driver.get('https://eungdapso.seoul.go.kr/main.do')

# Close any popup windows, then maximize the main window
all_win = driver.window_handles
for handle in all_win:
    if handle != all_win[0]:
        driver.switch_to.window(handle)
        time.sleep(1)
        driver.close()
driver.switch_to.window(driver.window_handles[0])
time.sleep(2)
driver.maximize_window()
time.sleep(5)

# Uncommenting the two lines below also requires "import math".
# collect_cnt = int(input('How many items would you like to collect?: '))
# collect_page_cnt = math.ceil(collect_cnt / 10)

driver.find_element(By.CLASS_NAME, "mv_linkbanner_text").click()
time.sleep(5)

soup_1 = BeautifulSoup(driver.page_source, 'html.parser')
content_1 = soup_1.find('table','rp_tb').find_all('tr')
total_content = soup_1.find('div','rp_lkitem').get_text()
print(f"total_content:{total_content}")
time.sleep(2)

# Print each tr to check what the rows contain
for row in content_1:
    print(row.get_text().replace("\n", ""))

# driver.find_element(By.CLASS_NAME, "rp_tb")

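no2 = []    # numbers
title = []  # titles
date = []   # dates

# These notes stop before the step that fills the lists above and builds df.

import pandas as pd

# df.to_csv(fc_name, index=False, encoding="utf-8-sig")
print('The requested data collection finished successfully')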

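# quit() closes every window and ends the chromedriver process;
# close() would only close the current window.
driver.quit()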
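
The script above declares the no2, title, and date lists and leaves the df.to_csv call commented out, but never fills the lists. The following is a minimal sketch of how that missing step might look, not the method taught in the class. It runs on a small hard-coded table instead of the live page. SAMPLE_HTML, pages_needed, and rows_to_frame are hypothetical names, and both the column order (number, title, date) and the 10 rows per page (taken from the commented-out "/ 10") are assumptions about the site.

```python
import math

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page's columns may differ.
SAMPLE_HTML = """
<table class="rp_tb">
  <tr><th>No</th><th>Title</th><th>Date</th></tr>
  <tr><td>2</td><td>Second question</td><td>2023-07-10</td></tr>
  <tr><td>1</td><td>First question</td><td>2023-07-09</td></tr>
</table>
"""


def pages_needed(collect_cnt, per_page=10):
    """Number of list pages to visit, assuming per_page rows on each page."""
    return math.ceil(collect_cnt / per_page)


def rows_to_frame(html):
    """Parse the rp_tb table into a DataFrame with no/title/date columns."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", "rp_tb")
    no2, title, date = [], [], []
    if table is not None:
        for tr in table.find_all("tr"):
            cells = [td.get_text(strip=True) for td in tr.find_all("td")]
            # Header rows use <th>, so they produce no <td> cells and are skipped.
            if len(cells) < 3:
                continue
            no2.append(cells[0])    # assumed: first column is the number
            title.append(cells[1])  # assumed: second column is the title
            date.append(cells[2])   # assumed: third column is the date
    return pd.DataFrame({"no": no2, "title": title, "date": date})


df = rows_to_frame(SAMPLE_HTML)
print(df)
print(pages_needed(25))
```

In the real script, driver.page_source would be passed to rows_to_frame in place of SAMPLE_HTML, and the result could then be saved with df.to_csv(fc_name, index=False, encoding="utf-8-sig"). The utf-8-sig encoding writes a byte order mark, which helps Excel display Korean text correctly.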

License: Attribution, NonCommercial, NoDerivatives

구일의 덕지덕지 기워넣는 개발 일기
