Pythonを使用してテキストコンテンツをクロールして保存する方法

2年 ago

蓮, 翼

1 minute

テキストコンテンツをPythonでクロールして保存するには、次の手順に従ってください。

リクエスト
BeautifulSoup

import requests
from bs4 import BeautifulSoup

リクエスト
慣用的に
私はリタイアした後、故郷の田舎で静かに暮らしたい。

url = '要爬取的网页URL'
response = requests.get(url)
html = response.text

BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
text = soup.get_text()

ネイティブ日本語では「地味でつまらない色合い」と表現されます。
ネイティブの日本語で要約して書いてください、ただ１つだけ必要です。

with open('保存的文件路径', 'w', encoding='utf-8') as file:
    file.write(text)

完全なコードの例：

import requests
from bs4 import BeautifulSoup

url = '要爬取的网页URL'
response = requests.get(url)
html = response.text

soup = BeautifulSoup(html, 'html.parser')
text = soup.get_text()

with open('保存的文件路径', 'w', encoding='utf-8') as file:
    file.write(text)

取得する対象のウェブページのURLをコード内のウェブページのURLに置き換え、保存するファイルのパスを希望するファイルのパスに置き換えてください。