I'm learning how to use Scrapy, but when I try to export to CSV I get a LookupError: unknown encoding: 'b'utf8''
I made an example with Stack Overflow (only for learning) that tries to scrape the first page of questions and export it to CSV. But I get an empty CSV, and the error in my terminal is:
2025-03-16 12:07:52 [scrapy.core.scraper] ERROR: Spider error processing <GET https://stackoverflow.com/questions> (referer: None)
Traceback (most recent call last):
  File "C:\Users\fvarelaa\AppData\Local\anaconda3\Lib\site-packages\scrapy\utils\defer.py", line 327, in iter_errback
    yield next(it)
          ^^^^^^^^
LookupError: unknown encoding: 'b'utf8''
2025-03-16 12:07:52 [scrapy.core.engine] INFO: Closing spider (finished)
2025-03-16 12:07:52 [scrapy.extensions.feedexport] INFO: Stored csv feed (0 items) in: video.csv
2025-03-16 12:07:52 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
from scrapy.item import Field
from scrapy.item import Item
from scrapy.spiders import Spider
from scrapy.selector import Selector
from scrapy.loader import ItemLoader


class Pregunta(Item):
    id = Field()
    pregunta = Field()
    descripcion = Field()


class StackOverFlowSpider(Spider):
    name = "MiPrimerSpider"
    custom_settings = {
        "USER_AGENT": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    }
    start_urls = ["https://stackoverflow.com/questions"]

    def parse(self, response):
        sel = Selector(response)
        preguntas = sel.xpath("//div[contains(@id,'questions')]//div[@class='s-post-summary js-post-summary']")
        for pregunta in preguntas:
            item = ItemLoader(item=Pregunta(), selector=pregunta)
            item.add_xpath('pregunta', './/h3/a/text()')
            item.add_xpath('descripcion', ".//div[contains(@class,'excerpt')]/text()")
            item.add_value('id', 1)
            yield item.load_item()

# scrapy runspider Intro_Scrapy.py -o video.csv
asked Mar 16 at 15:27 by Francisco Augusto Varela Aguir
edited Mar 18 at 17:11 by vassiliev
1 Answer
This is https://github.com/scrapy/parsel/issues/307, which is fixed in parsel 1.10.0, released on 2024-12-16.
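To check whether an environment is still on an affected parsel release, something like the following sketch may help. It reads the installed version with the standard library's importlib.metadata and compares it against 1.10.0. The helper names version_tuple and needs_upgrade are hypothetical, and the comparison is deliberately rough (it ignores pre-release suffixes), so treat it as an illustration rather than a complete version parser.

```python
from importlib import metadata

FIXED_IN = "1.10.0"  # parsel release named in the answer above


def version_tuple(version: str) -> tuple:
    """Turn '1.10.0' into (1, 10, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # rough handling: anything like '0rc1' ends the parse
        parts.append(int(piece))
    return tuple(parts)


def needs_upgrade(installed: str, fixed_in: str = FIXED_IN) -> bool:
    """True when the installed version predates the fixed release."""
    a, b = version_tuple(installed), version_tuple(fixed_in)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.10' compares equal to '1.10.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


if __name__ == "__main__":
    try:
        installed = metadata.version("parsel")
    except metadata.PackageNotFoundError:
        print("parsel is not installed in this environment")
    else:
        if needs_upgrade(installed):
            print(f"parsel {installed} predates {FIXED_IN}; try: pip install --upgrade parsel")
        else:
            print(f"parsel {installed} already includes the fix")
```

Versions are compared as integer tuples because a plain string comparison would rank "1.9.1" above "1.10.0". If the script reports an older release, upgrading parsel with whichever package manager owns the environment (pip, or conda for an Anaconda install like the one in the traceback) and re-running the spider should show whether the LookupError goes away.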