
How To Use A Proxy In Python

How-Tos, Proxies, Python, Nov-02-2022, 5 min read



We often come across the term 'proxy' when working in the computer science field. When connected to the Internet, every computer gets a unique Internet Protocol (IP) address that identifies the computer and its geographic location. Whenever your computer needs information from the Internet, it sends out a request. The request goes to a target computer, which checks what kind of information is being asked for. If the target computer is allowed to share that information with our IP address, it sends it back. At times, we want to get information from the Internet without being identified. Such information is often blocked, but we can access it using a proxy, which acts as an intermediary between the client and the server machine.

Clients typically use a proxy server to browse web pages and request resources anonymously, since the proxy stands between the client computer and the Internet and masks the client's identity.

Proxy servers have become quite popular with the growing concern over online security and data theft. This raises the question: how is a proxy server connected to the security of our system? A proxy server adds an extra layer of security between our server and the external world, and this extra layer helps protect our system from a breach.

How To Use A Proxy In Python?

To use a proxy with Python requests, follow the steps below.

Import Requests

Import the requests package, a simple HTTP library. You can use this package to send requests easily, without manually adding query strings to your URLs. You can import requests with the following command:

import requests

Create a Proxies Dictionary

You need to create a proxies dictionary that defines the HTTP and HTTPS connections. You can give the dictionary variable any name, such as "proxies", that maps a protocol to a proxy URL. In addition, you need to set the url variable to the website you want to scrape.

proxies = {
    "http": "http://203.190.46.62:8080",
    "https": "https://111.68.26.237:8080",
}
url = "https://httpbin.org/ip"

Here, the dictionary defines proxy URLs for two separate protocols: HTTP and HTTPS.
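Before placing a proxy URL in the dictionary, it can help to sanity-check its format. Here is a minimal standard-library sketch; the helper name parse_proxy is our own, not part of requests:

```python
from urllib.parse import urlsplit

def parse_proxy(proxy_url):
    """Split a proxy URL into (scheme, host, port) and validate the basics."""
    parts = urlsplit(proxy_url)
    if parts.scheme not in ("http", "https", "socks5"):
        raise ValueError(f"unsupported proxy scheme: {parts.scheme}")
    if parts.hostname is None or parts.port is None:
        raise ValueError("proxy URL must include a host and a port")
    return parts.scheme, parts.hostname, parts.port

# The proxy address from the dictionary above.
print(parse_proxy("http://203.190.46.62:8080"))  # ('http', '203.190.46.62', 8080)
```

Catching a malformed proxy URL here is cheaper than debugging a silent connection failure later.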

Create a Response Variable

You need to create a response variable that uses one of the requests methods. The method takes two arguments:

  • The URL you created
  • The dictionary you defined

response = requests.get(url, proxies=proxies)
print(response.json())

The output shows the IP address of the proxy, confirming that the request was routed through it.

Requests Methods

There are a number of requests methods like:

  • GET – It retrieves information from a given server using a given URL. 
  • POST – This method requests that the given web server accepts the enclosed data in the body of the request message to store it.
  • PUT – It requests that the enclosed data gets stored under the given URL.
  • DELETE – This method sends a DELETE request to the given URL.
  • PATCH – This request method is supported by the HTTP protocol and makes partial changes to an existing resource. 
  • HEAD – It sends a HEAD request to the given URL when you do not need the file content and only want the HTTP headers or the status_code.

You can use the syntax below for the requests methods once the URL is specified. Here, our URL is the same one we used in the code above, i.e., https://httpbin.org/ip.

response = requests.get(url)
response = requests.post(url, data={"a": 1, "b": 2})
response = requests.put(url)
response = requests.delete(url)
response = requests.patch(url)
response = requests.head(url)
response = requests.options(url)
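These verbs can also be inspected with the standard library alone: urllib.request lets you construct a request object for any HTTP method without actually sending it. A small sketch, reusing the httpbin URL from above:

```python
import urllib.request

url = "https://httpbin.org/ip"

# Build (but do not send) one request object per HTTP method.
for method in ("GET", "POST", "PUT", "DELETE", "PATCH", "HEAD", "OPTIONS"):
    req = urllib.request.Request(url, method=method)
    print(req.get_method(), req.full_url)
```

Each of these request objects could be sent through a proxy-aware opener, just as each requests method above accepts the proxies argument.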

Proxy Sessions

If you want to scrape the data from websites that utilize sessions, you can follow the steps given below.

Step#01

Import the requests library.

import requests

Step#02

Create a session object by creating a session variable and setting it to requests.Session(). Then assign your proxies to session.proxies.

session = requests.Session()

session.proxies = {
   'http': 'http://10.10.10.10:8000',
   'https': 'http://10.10.10.10:8000',
}

url = 'http://mywebsite.com/example'

Step#03

Send the request through the session, which applies the session proxies automatically, and pass the URL as an argument.

response = session.get(url)
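If requests is not available, the standard library offers a rough equivalent of a session with fixed proxies: an opener built around ProxyHandler. This sketch only constructs the opener and does not send anything; the proxy address is the placeholder from the session example above:

```python
import urllib.request

# Map both schemes to the proxy, mirroring session.proxies above.
proxy_handler = urllib.request.ProxyHandler({
    "http": "http://10.10.10.10:8000",
    "https": "http://10.10.10.10:8000",
})
opener = urllib.request.build_opener(proxy_handler)

# opener.open(url) would now route every request through the proxy,
# much like session.get(url) does once session.proxies is set.
```

The design parallel is the point: both a requests Session and an OpenerDirector bundle connection settings once so that every subsequent request reuses them.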

Main Types Of Proxies

Let’s discuss the two essential types of proxies, i.e.:

  1. Static Proxies
  2. Rotating Proxies

Static Proxies

We can define static proxies as datacenter Internet Protocol addresses assigned via an Internet Service Provider (ISP) contract. They are designed to stay connected to one proxy server for a set amount of time. The name "static" implies that we can operate as a residential user with the same IP for as long as required.

In short, with the use of static proxies, we get the speed of datacenter proxies and the high anonymity of residential proxies. Furthermore, a static proxy allows us to avoid IP address rotation, making its use significantly simpler.

Unlike regular datacenter proxies, static IP services are not created using virtual machines. These proxies, also known as sticky IP addresses, look like genuine consumers to almost all websites.

Rotating Proxies

Rotating proxies can be defined as a feature that changes your IP address with every new request you send.

When we visit a website, we send a request that reveals a lot of data to the destination server, including our IP address. For instance, when we gather data using a scraper (for generating leads), we send many such requests. So when most requests come from the same IP, the destination server gets suspicious and bans it.

Therefore, we need a solution that changes the IP address every time we send a request. That solution is a rotating proxy. In other words, to avoid the needless hassle of building a scraper that rotates IPs itself, we can simply get a rotating proxy and let the provider handle the rotation.
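If you do rotate IPs yourself rather than relying on a provider, the core of the idea is just cycling through a pool of proxies. A minimal sketch; the proxy addresses are placeholders reused from earlier examples:

```python
from itertools import cycle

# Placeholder pool; in practice this would come from your proxy provider.
proxy_pool = cycle([
    "http://203.190.46.62:8080",
    "http://111.68.26.237:8080",
    "http://10.10.10.10:8000",
])

def next_proxies():
    """Return a fresh proxies dict for the next request."""
    proxy = next(proxy_pool)
    return {"http": proxy, "https": proxy}

# Each call hands back the next proxy in the pool, so every request
# (e.g. requests.get(url, proxies=next_proxies())) uses a different IP.
for _ in range(3):
    print(next_proxies())
```

A managed rotating proxy does exactly this on the provider's side, which is why it spares you the bookkeeping.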

Why Do We Need to Use Proxies?

Following are the reasons to use various types of proxies.

  • Social media managers appreciate proxies for letting them stick to a single server. If users constantly log in to their accounts by changing IP addresses, the social media platform will get suspicious and block their profile.
  • E-commerce sites might show different data to users from other locations and to returning visitors. Also, the server becomes alert if a buyer logs in to their account multiple times from various IP addresses. So we have to use proxies for online shopping.
  • We need proxies for manual marketing research when a specialist wants to check the required data through a user’s eyes from one location. 
  • Ad verification allows advertisers to check whether their ads are displayed on the right websites and seen by the right audiences. Constantly changing IP addresses lets them access many different websites and thus verify ads without IP blocks.
  • When accessed from specific locations, the same content can look different or may not be available. The use of the proxies allows us to access the necessary data regardless of its geo-location. 
  • We can use proxies for accessing data, speeding up the browsing speed as they have a good cache system.

Conclusion

So far, we have discussed how a proxy acts as an intermediary between the client and the server machine. Whenever you request information, your computer sends the request to the proxy, which forwards it to the target computer using a different IP address. That way, your IP address stays confidential. Furthermore, you can use proxies with Python's requests module and perform various actions as needed. If you need a static IP that combines the speed of datacenter proxies with the high anonymity of residential proxies, static proxies are the right choice. Conversely, rotating proxies offer advantages for testing and scraping.