[bill][cragr] Unable to download document
With which module do you encounter problems?
CrAgr
Describe the problem you encounter
Exception when downloading a document (bill)
Paste here the stacktrace or error message you observe
2023-12-06 19:08:16,526:DEBUG:urllib3.connectionpool:3.6:connectionpool.py:456:_make_request https://www.credit-agricole.fr:443 "GET /ca-norddefrance/particulier/operations/documents/edocuments/_jcr_content.bam.pj.html/stb/collecteNI?zzzz&typeaction=telechargement HTTP/1.1" 200 None
encoding error : input conversion failed due to input error, bytes 0x81 0x82 0x83 0x84
encoding error : input conversion failed due to input error, bytes 0x81 0x82 0x83 0x84
I/O error : encoder error
2023-12-06 19:08:16,559:DEBUG:woob.core.bcall:3.6:bcall.py:92:backend_process <Backend ca>: Called function <bound method Application._do_complete of <woob.applications.bill.bill.AppBill object at 0x7f5293c56f50>> raised an error: XMLSyntaxError('Growing input buffer, line 1, column 1')
Bug(ca): Growing input buffer, line 1, column 1 (collecteNI?zzzz&typeaction=telechargement, line 1)
Traceback (most recent call last):
File "/home/xxxx/yyyy/woob/core/bcall.py", line 88, in backend_process
result = function(backend, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/woob/tools/application/base.py", line 348, in _do_complete
res = getattr(backend, function)(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/modules/cragr/module.py", line 245, in download_document
return self.browser.download_document(document)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/woob/browser/browsers.py", line 1121, in inner
return func(browser, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/modules/cragr/browser.py", line 1643, in download_document
response = self.open(document.url, params=params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/woob/browser/browsers.py", line 1022, in open
return super(PagesBrowser, self).open(callback=internal_callback, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/woob/browser/browsers.py", line 879, in open
return super().open(url, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/woob/browser/browsers.py", line 530, in open
response = self.session.send(preq,
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/woob/browser/sessions.py", line 161, in send
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/woob/browser/sessions.py", line 154, in func
return callback(self, resp)
^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/woob/browser/browsers.py", line 526, in inner_callback
return callback(response)
^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/woob/browser/browsers.py", line 1002, in internal_callback
response.page = url.handle(response)
^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/woob/browser/url.py", line 278, in handle
page = self.klass(self.browser, response, m.groupdict())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/woob/browser/pages.py", line 627, in __init__
super().__init__(*args, **kwargs)
File "/home/xxxx/yyyy/woob/browser/pages.py", line 199, in __init__
self.doc = self.build_doc(self.data)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxxx/yyyy/woob/browser/pages.py", line 742, in build_doc
doc = html.parse(io, parser, base_url=self.url)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/lxml/html/__init__.py", line 937, in parse
return etree.parse(filename_or_url, parser, base_url=base_url, **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "src/lxml/etree.pyx", line 3541, in lxml.etree.parse
File "src/lxml/parser.pxi", line 1896, in lxml.etree._parseDocument
File "src/lxml/parser.pxi", line 1916, in lxml.etree._parseMemoryDocument
File "src/lxml/parser.pxi", line 1803, in lxml.etree._parseDoc
File "src/lxml/parser.pxi", line 1144, in lxml.etree._BaseParser._parseDoc
File "src/lxml/parser.pxi", line 618, in lxml.etree._ParserContext._handleParseResultDoc
File "src/lxml/parser.pxi", line 728, in lxml.etree._handleParseResult
File "src/lxml/parser.pxi", line 657, in lxml.etree._raiseParseError
File "https://www.credit-agricole.fr/ca-norddefrance/particulier/operations/documents/edocuments/_jcr_content.bam.pj.html/stb/collecteNI?zzzz&typeaction=telechargement", line 1
lxml.etree.XMLSyntaxError: Growing input buffer, line 1, column 1
2023-12-06 19:08:16,640:DEBUG:woob.backend.ca.browser:3.6:browsers.py:1256:dump_state Stored cookies into storage
What are the steps to reproduce the problem?
Download a document from woob bill with module cragr
bill --debug -b ca download xxxxxxxxxxx_yyyyyyyyyy@cragr
What woob version are you using?
Output of the woob config --version command:
Woob config v3.6 Copyright(C) 2010-2023 Christophe Benz, Romain Bignon
What module version are you using?
Output of the woob config info MODULE_NAME command:
.------------------------------------------------------------------------------.
| Module cragr |
+-----------------.------------------------------------------------------------'
| Version | 202310191405
| Maintainer | Quentin Defenouillère <quentin.defenouillere@budget-insight.com>
| License | LGPLv3+
| Description | Crédit Agricole
| Capabilities | CapTransfer, CapProfile, CapDocument, CapCollection, CapBankTransferAddRecipient, CapBankTransfer, CapBank, CapCredentialsCheck, CapBankWealth
| Installed | yes
| Location | https://updates.woob.tech/3/main/cragr.tar.gz
| |
| Configuration | website: Caisse Régionale
| | login: Identifiant à 11 chiffres (default: )
| | password: Code personnel à 6 chiffres (default: )
'-----------------'
How did you install woob?
yay/pacman on arch
Additional info you'd like to mention
When downloading the document, the module seems to try to determine the document's encoding by parsing the response with an XML parser, and that parse fails (see the XMLSyntaxError in the traceback above).
Adding the following method to class SubscriptionsDocumentsPage in modules/cragr/document_pages.py seems to work around the problem:
def detect_encoding(self):
    return None
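To show why the override above might help, here is a minimal, self-contained sketch of the pattern. It does not import woob: `StandInPage` is a hypothetical stand-in for the real page base class, and its `detect_encoding` only imitates a detection step that chokes on non-text bytes (the actual woob logic may differ). Only the `SubscriptionsDocumentsPage` name and the `detect_encoding` override come from this report.

```python
class StandInPage:
    """Hypothetical stand-in for the page base class; NOT woob's real code."""

    def __init__(self, data: bytes):
        self.data = data
        # Assumption: the encoding is sniffed while the page is constructed.
        self.encoding = self.detect_encoding()

    def detect_encoding(self):
        # Imitates an encoding sniff that has to read the payload as text.
        # Raises UnicodeDecodeError when the payload is not valid UTF-8.
        self.data.decode("utf-8")
        return "utf-8"


class SubscriptionsDocumentsPage(StandInPage):
    def detect_encoding(self):
        # Workaround from the report: skip encoding detection entirely.
        return None


# The same bytes that appear in the "encoding error" log lines.
payload = bytes([0x81, 0x82, 0x83, 0x84])

try:
    StandInPage(payload)
except UnicodeDecodeError as exc:
    print("stand-in detection failed:", exc.reason)

page = SubscriptionsDocumentsPage(payload)
print("overridden detection returned:", page.encoding)
```

With the override in place the detection step never touches the payload, so the page object can be built even when the downloaded bytes are not text. Whether returning None is the right long-term fix for the module is a separate question for the maintainers.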
Edited by Matthieu Helleboid