Beautiful Soup

4 Notes
+ Remove tags from an element (March 7, 2020, 4:14 p.m.)

comments = soup.find_all('div', {'class': 'cmnt-text'})
for comment in comments:
    print(comment.get_text())
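To see what get_text() does with nested markup, here is a small self-contained sketch. The HTML string is made up for illustration and only reuses the 'cmnt-text' class from the note above; a real page will look different.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a real page (assumption, not from the notes).
html = """
<div class="cmnt-text">Great <b>post</b>, <a href="#">thanks</a>!</div>
<div class="cmnt-text">  Second <i>comment</i>  </div>
"""

soup = BeautifulSoup(html, "html.parser")

# get_text() concatenates every text node inside the element and drops the tags.
texts = [div.get_text() for div in soup.find_all("div", {"class": "cmnt-text"})]

for text in texts:
    # Surrounding whitespace from the source is preserved, so strip it if needed.
    print(repr(text.strip()))
```

Whitespace inside the element is kept as-is, which is why the second comment needs strip() before it is displayed or stored.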

+ Methods (March 7, 2020, 3:54 p.m.)

# find() returns the first match as a Tag (or None if nothing matches)
comment = soup.find('div', {'class': 'comment-user'})
print(type(comment))
# <class 'bs4.element.Tag'>
--------------------------------------------------------------------------
# find_all() returns every match as a ResultSet (a list subclass)
comment = soup.find_all('div', {'class': 'comment-user'})
print(type(comment))
# <class 'bs4.element.ResultSet'>
--------------------------------------------------------------------------
# .text gives the text content of the matched tag
question = soup.find('p', {'itemprop': 'text'}).text
--------------------------------------------------------------------------
# .get() reads an attribute value from a tag
image_url = image_tag.find('img').get('src')
--------------------------------------------------------------------------
# Custom (non-standard) tag names can be searched like any other tag
soup.find('app-comment-list')
--------------------------------------------------------------------------
comment_boxes = comments_placeholder.find_all('app-comment')
--------------------------------------------------------------------------
# Several attributes in the dict must all match
comment = comment_box.find('p', {'class': 'text', 'itemprop': 'text'})
--------------------------------------------------------------------------
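The calls above can be combined into one pass over a comment list. The sketch below is only an illustration: the HTML is invented, reusing the tag and class names from these notes, and it guards against find() returning None, which is the usual cause of AttributeError when chaining calls.

```python
from bs4 import BeautifulSoup
from bs4.element import ResultSet, Tag

# Hypothetical markup that reuses the tag and class names from the notes.
html = """
<app-comment-list>
  <app-comment>
    <div class="comment-user"><img src="/avatars/1.png"></div>
    <p class="text" itemprop="text">First comment</p>
  </app-comment>
  <app-comment>
    <div class="comment-user"></div>
    <p class="text" itemprop="text">Second comment</p>
  </app-comment>
</app-comment-list>
"""
soup = BeautifulSoup(html, "html.parser")

first_user = soup.find("div", {"class": "comment-user"})     # a single Tag
all_users = soup.find_all("div", {"class": "comment-user"})  # a ResultSet
missing = soup.find("div", {"class": "no-such-class"})       # None, no exception

rows = []
for box in soup.find("app-comment-list").find_all("app-comment"):
    text = box.find("p", {"class": "text", "itemprop": "text"}).get_text()
    img = box.find("img")
    # Guard against a missing <img> before reading its attribute.
    src = img.get("src") if img is not None else None
    rows.append((text, src))

print(type(first_user).__name__, type(all_users).__name__, missing)
print(rows)
```

Checking for None before calling .get() or .text keeps the loop from failing on comments that lack an image.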

+ Usages (March 7, 2020, 3:40 p.m.)

Imports used by the snippets below:
import requests
from bs4 import BeautifulSoup
-----------------------------------------------------------------------------------
From local file:
with open('source.html') as f:
    soup = BeautifulSoup(f, 'html.parser')
comments = soup.find('app-comment-list')
print(comments)
-----------------------------------------------------------------------------------
From URL:
response = requests.get(url='URL')
soup = BeautifulSoup(response.text, 'html.parser')
comments = soup.find('app-comment-list')
print(comments)
-----------------------------------------------------------------------------------
From URL, passing data as POST:
data = {'from_post': 1, 'to_post': 100}
response = requests.post(url='URL', json=data)
soup = BeautifulSoup(response.text, 'html.parser')
comments = soup.find('app-comment-list')
print(comments)
-----------------------------------------------------------------------------------
From URL using requests and a proxy:
params = {
    'timeout': 20,
    'verify': False,  # disables TLS certificate verification
    'proxies': {'https': 'https://192.168.1.17:8080'},
    'url': URL,
    'json': {},
}
response = requests.get(**params)
soup = BeautifulSoup(response.text, 'html.parser')
comments = soup.find('app-comment-list')
print(comments)
-----------------------------------------------------------------------------------
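Whatever the source, BeautifulSoup only needs the markup as a string or an open file object. The sketch below shows both paths without touching the network: the HTML string is made up and stands in for what response.text would hold after a request, and the file is a temporary one created just for the example.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup

# Hypothetical page content; in the URL cases this would be response.text.
html = (
    "<html><body><app-comment-list>"
    "<app-comment>Hi</app-comment>"
    "</app-comment-list></body></html>"
)

# Local file: write the sample to a temporary source.html, then parse the file object.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "source.html"
    path.write_text(html, encoding="utf-8")
    with open(path, encoding="utf-8") as fh:
        soup_from_file = BeautifulSoup(fh, "html.parser")

# String: parse the markup directly.
soup_from_text = BeautifulSoup(html, "html.parser")

comments_file = soup_from_file.find("app-comment-list")
comments_text = soup_from_text.find("app-comment-list")
print(comments_file)
print(comments_text)
```

The file is read in full when the BeautifulSoup object is built, so the soup stays usable after the with block closes the file.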

+ Installation (March 7, 2020, 3:39 p.m.)

pip install beautifulsoup4
or
apt-get install python3-bs4
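A quick way to confirm the install worked is to import the package and parse a trivial document. Note that the package is installed as beautifulsoup4 but imported as bs4. The markup here is just a throwaway smoke test.

```python
import bs4
from bs4 import BeautifulSoup

# Print the installed version of the package.
print(bs4.__version__)

# Parse a one-tag document with the built-in parser as a smoke test.
soup = BeautifulSoup("<p>ok</p>", "html.parser")
smoke_text = soup.p.get_text()
print(smoke_text)
```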