www.reddit.com

In [9]:
import requests
from bs4 import BeautifulSoup
In [3]:
response = requests.get("https://www.reddit.com")
In [15]:
response.headers  # response headers; no User-Agent was specified in the request
Out[15]:
{'Content-Type': 'text/html; charset=UTF-8', 'x-ua-compatible': 'IE=edge', 'x-frame-options': 'SAMEORIGIN', 'x-content-type-options': 'nosniff', 'x-xss-protection': '1; mode=block', 'set-cookie': 'loid=0000000000039ki3sa.2.1496755065730.Z0FBQUFBQlpOcXQ1bTRvOWF5MFh1QWRXRld5MWRBZGgxd2JBMktRQVVXNWs4emNZbFp6VnBlSHRfaEJqTFF4Sk5GZnFhVDNtU1ZkXzhBTF94NXR1eE52ZjRIbFB0em5ZU1Z6Z0lwOEEtVExlQzZqbWJNbXNjOEJYcVE5blAzQTNxakVDMGp1R0lTWnY; Domain=reddit.com; Max-Age=63071999; Path=/; expires=Thu, 06-Jun-2019 13:17:45 GMT; secure, session_tracker=D8Sd42mFLzcGnu9MST.0.1496755065727.Z0FBQUFBQlpOcXQ1NmpiRnMzaS1WZ0FrcUNSUzNibDRIRG1CVVpGMzA4QnRpOFI0Z3M0U1M1ak5hVHBMOWUwb1A1Y2pESVluRzlzSUtaQXFBallPX0JselhoX1dzMXpVSGVQdGJ1RkItMDZVZDU0MkNSVmVKRzNNbFdfSzM0RUlRdVJnTjNTaFRNeWc; Domain=reddit.com; Max-Age=7199; Path=/; expires=Tue, 06-Jun-2017 15:17:45 GMT; secure, edgebucket=rYUFYisqbjNA434vRJ; Domain=reddit.com; Max-Age=63071999; Path=/;  secure', 'Content-Encoding': 'gzip', 'cache-control': 'max-age=0, must-revalidate', 'X-Moose': 'majestic', 'Strict-Transport-Security': 'max-age=15552000; includeSubDomains; preload', 'Content-Length': '25758', 'Accept-Ranges': 'bytes', 'Date': 'Tue, 06 Jun 2017 13:17:45 GMT', 'Via': '1.1 varnish', 'Connection': 'keep-alive', 'X-Served-By': 'cache-sjc3126-SJC', 'X-Cache': 'MISS', 'X-Cache-Hits': '0', 'X-Timer': 'S1496755066.686250,VS0,VE289', 'Vary': 'accept-encoding', 'Server': 'snooserv'}
In [16]:
response.headers['User-Agent']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-16-951dda1279cb> in <module>()
----> 1 response.headers['User-Agent']

/Users/ckn/anaconda/lib/python3.6/site-packages/requests/structures.py in __getitem__(self, key)
     52 
     53     def __getitem__(self, key):
---> 54         return self._store[key.lower()][1]
     55 
     56     def __delitem__(self, key):

KeyError: 'user-agent'
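The KeyError above is expected: User-Agent is a request header, so it belongs to the outgoing request and does not appear among the server's response headers. A minimal sketch, assuming a reasonably recent requests release, that prepares a request without sending it and reads the User-Agent that requests would send by default:

```python
import requests

# Build the request a default Session would send, without touching the network.
session = requests.Session()
prepared = session.prepare_request(requests.Request("GET", "https://www.reddit.com"))

# Header lookup is case-insensitive; the default value starts with "python-requests/".
default_user_agent = prepared.headers["User-Agent"]
print(default_user_agent)
```

After a real call, the same headers can be read from response.request.headers.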
In [4]:
response  # on this site, a request without a specified User-Agent fails (429 Too Many Requests)
Out[4]:
<Response [429]>
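Status 429 means Too Many Requests, and a server may add a Retry-After header saying how long to wait. The following is only a sketch of one way to react to it: get_with_retry, its parameters, and the injected fetch callable are hypothetical names, not part of requests, and only the numeric-seconds form of Retry-After is handled.

```python
import time


def get_with_retry(fetch, url, max_attempts=3, default_wait=1.0, sleep=time.sleep):
    """Call fetch(url) until the status is not 429 or the attempts run out.

    fetch is any callable returning an object with .status_code and .headers,
    for example functools.partial(requests.get, headers=request_headers).
    """
    response = None
    for attempt in range(1, max_attempts + 1):
        response = fetch(url)
        if response.status_code != 429:
            return response
        if attempt == max_attempts:
            break
        # Use Retry-After when it is a number of seconds,
        # otherwise fall back to exponential backoff.
        retry_after = response.headers.get("Retry-After", "")
        try:
            wait = float(retry_after)
        except ValueError:
            wait = default_wait * 2 ** (attempt - 1)
        sleep(wait)
    return response
```

Passing the fetch function in as an argument keeps the helper independent of any one HTTP library and lets it be exercised without network access.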
In [6]:
request_headers = {
    'User-Agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 '
                   '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'),
}
In [7]:
response = requests.get("https://www.reddit.com", headers=request_headers)
In [8]:
response
Out[8]:
<Response [200]>
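When several requests need the same header, one option is to set it once on a requests.Session instead of passing headers= on every call. A small sketch under that assumption; the request is only prepared here, not sent:

```python
import requests

request_headers = {
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
    ),
}

# Headers stored on the session are merged into every request it makes.
session = requests.Session()
session.headers.update(request_headers)

# Prepare (but do not send) a request to confirm which User-Agent would go out.
prepared = session.prepare_request(requests.Request("GET", "https://www.reddit.com"))
print(prepared.headers["User-Agent"])
```

With the session configured this way, session.get("https://www.reddit.com") would send the same User-Agent as the explicit headers= call.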
In [10]:
html = requests.get("https://www.reddit.com", headers=request_headers).text
soup = BeautifulSoup(html, 'html.parser')
In [11]:
# select <a> elements that are direct children of a <p> with class "title"
# and whose class attribute starts with "title"
tag_list = soup.select('p.title > a[class^=title]')
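To see what each part of the selector does, it can be tried on a small inline document. The markup below is hypothetical and only imitates the structure the selector targets; it is not copied from the site.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one link satisfies every part of the selector,
# the other three each break exactly one condition.
sample_html = """
<div>
  <p class="title"><a class="title may-blank" href="/a">Direct child, matches</a></p>
  <p class="title"><a class="thumbnail" href="/b">Class does not start with title</a></p>
  <p class="title"><span><a class="title" href="/c">Nested deeper, not a direct child</a></span></p>
  <p class="tagline"><a class="title" href="/d">Parent p lacks class title</a></p>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
matches = [tag.text for tag in soup.select("p.title > a[class^=title]")]
print(matches)
```

Only the first link is selected: `p.title` restricts the parent, `>` requires a direct child, and `[class^=title]` tests the start of the whole class attribute string.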
In [14]:
for idx, tag in enumerate(tag_list, 1):  # start numbering at 1
    print(idx, tag.text)
1 When it's just not your day
2 RIP Peter Sallis - Wallace and Gromit
3 Stephen Hawking announces he is voting Labour: 'The Tories would be a disaster' - 'Another five years of Conservative government would be a disaster for the NHS, the police and other public services'
4 "We are still in": Tech giants join growing alliance to honor Paris climate agreement
5 On a recent visit to Venice I was amused by the doorbell/intercoms that looked like retro SciFi robots.
6 fAmOuS sInGeR fUcKiNg fEaStS oN tHe CoRpSeS oF iT's OwN kInD
7 Game developers who have worked on terrible games, when and why did you realize the game was going to flop?
8 I wish I had a reason to post this, my dog is just adorable.
9 Found this gem in a small Indian food shack.
10 IamA (The Marine on the Rowing Machine) AMA!
11 It bothers me so much that there are so many songs I don't know about that I might like so much
12 You can access most of the world's knowledge on a £30 phone but people still spend 50p on the Sun everyday.
13 6 years later, even a small tornado's scar across town is still visible.
14 The faces of the press Corp while listening to Sean Spicer talk are priceless
15 Mayor of Nashville Knows How To Make America Great Again
16 Caroline Kennedy walks ahead while her father carries her doll (1960)
17 So everyone is gonna act like that didn't just happen
18 My uncle got these when they first came out before he died in a car accident. My grandma always told me how much he loved these and being a sneaker head my grandma gave them to me and wanted me to take care of them. I saw this Jordan magazine and saw the back and thought it was too cool not to share
19 Surf's up dude
20 Man slides into pool but never swims
21 2meirl4meirl
22 A group representing $6.2 trillion of the US economy says they're 'still in' the Paris climate agreement. Going by the name "We Are Still In," the coalition called itself "the broadest cross section of the American economy yet assembled in pursuit of climate action."
23 How about no
24 This harbor is extremely deep
25 Scientists reduce fear of death by using virtual reality to induce an out-of-body experience