html - Printing accents and foreign characters using Beautiful Soup and Python


I am scraping artists from discogs.com. I am unable to get the artist names to appear as they do on the page, e.g. the artist Andrés appears as Andr\xe9s when I run my code.

Can anyone explain what I'm doing wrong?

    # Python 2 code, as posted in the question
    from bs4 import BeautifulSoup
    import requests
    import urllib2
    from itertools import chain
    import codecs

    headers = { 'User-Agent': 'Mozilla/5.0 (Windows NT 6.0; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0' }

    all_artists = []

    result_pages = 1 #446

    def load_artists():
        for page in xrange(1, result_pages+1):
            url = 'https://www.discogs.com/search/?sort=have%2cdesc&style_exact=house&genre_exact=electronic&decade=2010&page=' + str(page)
            r = requests.get(url, headers = headers)
            soup = BeautifulSoup(r.content.decode('utf-8'), 'html.parser')
            # collect the "title" attribute of every matching span
            [all_artists.append(tag["title"]) for tag in soup.select('div#search_results h5 span')]

    load_artists()

    # evaluating the list at the prompt displays the repr of each name
    all_artists

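You need to use Python 3, and you will no longer suffer from this. In Python 2, displaying a list shows the repr() of each unicode string, which escapes non-ASCII characters (so é becomes \xe9). The data itself is intact, and printing each name individually shows the accents.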
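To illustrate the difference, here is a minimal sketch, assuming Python 3.7 or later and a terminal that can display accented characters. The `all_artists` values are hard-coded stand-ins for the scraped titles, not real output from the Discogs page, and `listing` and `escaped` are names introduced only for this example.

```python
# Hard-coded stand-ins for the scraped "title" attributes (an assumption;
# no request is made to discogs.com here).
all_artists = ["Andr\xe9s", "Motor City Drum Ensemble"]

# Python 3: the repr of a list keeps printable non-ASCII characters as-is.
listing = repr(all_artists)
print(listing)

# ascii() produces the escaped form that Python 2 showed for unicode strings.
escaped = ascii(all_artists[0])
print(escaped)

# Printing each name uses str() rather than repr(), so the accents are
# shown directly (given a terminal encoding that supports them).
for name in all_artists:
    print(name)
```

If you have to stay on Python 2, looping over the list and printing each name, as in the last two lines, should avoid the escaped display as well.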

