
Web Scraping with Python and JavaScript

Posted by admin on July 21, 2024

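This article explains how to scrape the web with Python, including pages that rely on JavaScript. Python is well suited to web scraping: its rich set of libraries makes it straightforward to retrieve data from the web.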

Scraping with Python and BeautifulSoup4

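The most common way to scrape in Python is with the BeautifulSoup library. This approach works when the target site serves static HTML or HTML that is generated dynamically on the server side, that is, when the data you want is already present in the HTTP response.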

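# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import urllib.request as req

# Currency pair
crypto = 'BTC'
currency = 'JPY'

# URL to fetch the price from
url = 'https://coinyep.com/ja/ex/' + crypto + '-' + currency

# Result
current_value = ''

# Access the target URL
res = req.urlopen(url)

# Extract the target element
soup = BeautifulSoup(res, 'html.parser')
element = soup.select_one("#coinyep-reverse1")
if element is None:
    # The page layout may have changed since this article was written
    raise SystemExit('Element #coinyep-reverse1 was not found')

# find_all(string=True) replaces the deprecated findAll(text=True)
values = element.find_all(string=True)
current_value = str(''.join(values))
current_value = current_value.replace('1 ' + crypto + ' = ', '')
current_value = current_value.replace(' ' + currency, '')

# Print the result
print('1' + crypto + '(' + currency + '): ' + str(current_value))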
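The string cleanup at the end of the script above can be pulled into a small helper so that it can be checked without any network access. The following is a minimal sketch, not part of the original script: the function name `extract_rate` and the sample rate string are made up for illustration, and the sample assumes the page text has the form `1 BTC = <amount> JPY`, as the `replace` calls above imply.

```python
def extract_rate(text, crypto, currency):
    """Strip the '1 <crypto> = ' prefix and ' <currency>' suffix from a rate string.

    Hypothetical helper: it mirrors the two replace() calls in the script,
    but only trims the prefix and suffix at the ends of the string.
    """
    cleaned = text.strip()
    prefix = '1 ' + crypto + ' = '
    suffix = ' ' + currency
    if cleaned.startswith(prefix):
        cleaned = cleaned[len(prefix):]
    if cleaned.endswith(suffix):
        cleaned = cleaned[:-len(suffix)]
    return cleaned


# Made-up sample text; the real page content will differ.
sample = '1 BTC = 10,000,000 JPY'
print(extract_rate(sample, 'BTC', 'JPY'))
```

Keeping the parsing step separate from the download step means a change in the page layout only affects one small function.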

Scraping JavaScript-Rendered Pages with Python and Selenium

To scrape web pages whose content is produced by JavaScript, use Selenium together with ChromeDriver.

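import lxml.html
from selenium import webdriver

target_url = 'http://news.tv-asahi.co.jp/news_politics/articles/000041338.html'

# PhantomJS support was removed in Selenium 4, so use headless Chrome instead
options = webdriver.ChromeOptions()
options.add_argument('--headless')
driver = webdriver.Chrome(options=options)
try:
    driver.get(target_url)
    root = lxml.html.fromstring(driver.page_source)
finally:
    driver.quit()  # always close the browser

# cssselect() requires the separate "cssselect" package
links = root.cssselect('#relatedNews a')

for link in links:
    print(link.text)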
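To illustrate why a browser is needed here, the sketch below parses two made-up HTML strings with only the standard library: one as a server might send it, where a script fills in the link list later, and one as the DOM might look after the script has run. The page content, the `LinkTextCollector` class, and the `related_links` function are assumptions for illustration; only the `relatedNews` id is taken from the example above.

```python
from html.parser import HTMLParser

# Hypothetical page source as sent by the server: the list is empty
# until the inline script runs in a browser.
RAW_HTML = """
<html><body>
<ul id="relatedNews"></ul>
<script>
  var ul = document.getElementById("relatedNews");
  ul.innerHTML = '<li><a href="/a">Related story</a></li>';
</script>
</body></html>
"""

# Hypothetical DOM after the script has run (what driver.page_source would hold).
RENDERED_HTML = '<ul id="relatedNews"><li><a href="/a">Related story</a></li></ul>'


class LinkTextCollector(HTMLParser):
    """Collects the text of <a> elements nested inside the element with a given id."""

    VOID_TAGS = {'br', 'img', 'hr', 'input', 'meta', 'link'}

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.depth = 0          # nesting depth inside the container (0 = outside)
        self.in_link = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return              # void elements have no end tag to balance
        if self.depth:
            self.depth += 1
            if tag == 'a':
                self.in_link = True
                self.links.append('')
        elif dict(attrs).get('id') == self.container_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1
            if tag == 'a':
                self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.links[-1] += data


def related_links(html):
    """Return the link texts found under #relatedNews in the given HTML."""
    collector = LinkTextCollector('relatedNews')
    collector.feed(html)
    collector.close()
    return collector.links


print(related_links(RAW_HTML))       # static source: no links yet
print(related_links(RENDERED_HTML))  # after JavaScript has run
```

A static parser sees the script only as text, so the links exist only once a browser such as headless Chrome has executed it.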

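This article has shown how to scrape both static pages and JavaScript-rendered pages with Python. These techniques let you retrieve data from the web efficiently. When scraping, however, it is important to comply with the target site's terms of service and to retrieve data in an appropriate manner.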
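One courtesy check that can be automated is reading a site's robots.txt before fetching a page. The article itself does not cover robots.txt, so treat the following as an optional sketch: the rules, the user agent name `my-scraper`, and the example.com URLs are all made up, and passing this check does not replace reading the site's terms of service.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content; a real one would be downloaded from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, 'my-scraper', 'https://example.com/news/1.html'))
print(is_allowed(ROBOTS_TXT, 'my-scraper', 'https://example.com/private/data.html'))
```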
