requests is a very commonly used HTTP library in Python. It provides a clean and simple way to send GET or POST requests, and it is one of the core tools used for web scraping.
Installing requests
To install it, run the following in your terminal (Command Prompt on Windows):
pip install requests
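If you want to confirm the installation worked, one quick check (not required) is to import the library and print its version; the exact number depends on what pip installed:

import requests
print(requests.__version__)  # e.g. 2.31.0; yours may differ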
Sending a GET Request
You can send a GET request to a website using:
import requests
url = "http://www.example.com" # Make sure to include http:// or https://
response = requests.get(url)
Checking the Response Status Code
Use response.ok (or inspect response.status_code directly) to check whether the request succeeded:
if response.ok:
    print("Request succeeded!")
    print(response.text)
else:
    print(f"Request failed, status code: {response.status_code}")
Adding Headers
Why Add Headers?
In real web scraping tasks, some websites check the identity of the request to determine whether it’s coming from a real browser. If you send a request without proper headers, the site might reject it or return an error page.
Browsers automatically add headers like User-Agent, which tell the server, for example: “this request was sent from Chrome/Firefox”. But requests.get() doesn’t include these browser headers by default (it identifies itself as python-requests), so the website might suspect it’s a bot and block the request.
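You can see this for yourself by inspecting the headers attached to the request that was actually sent; the exact version string depends on the requests version you have installed:

import requests

response = requests.get("http://www.example.com")
print(response.request.headers)
# Typically something like:
# {'User-Agent': 'python-requests/2.31.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}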
To make your request look more like a real browser visit, you should include headers.
Example: Add User-Agent
import requests
url = "http://www.example.com"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36"
}
response = requests.get(url, headers=headers)

if response.ok:
    print("Request succeeded!")
    print(response.text)
else:
    print(f"Request failed, status code: {response.status_code}")
Other Common Headers (add as needed)
headers = {
    "User-Agent": "Mozilla/5.0",
    "Accept-Language": "zh-TW,zh;q=0.9",
    "Referer": "https://www.google.com"
}
These help simulate a “real user clicking through” and increase your chance of successfully retrieving data.
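If you plan to send several requests with the same headers, a Session can hold them so you don't repeat yourself. A small sketch (the URL is just a placeholder, and the timeout value is an arbitrary choice):

import requests

session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0",
    "Accept-Language": "zh-TW,zh;q=0.9",
    "Referer": "https://www.google.com"
})

# Every request made through this session carries the headers above
response = session.get("http://www.example.com", timeout=10)
print(response.status_code)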