[impotsgouvfrpar] download with a meaningful filename

Currently, no meaningful filename is extracted by the get_document method:

id: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxa@impotsgouvfrpar
url: /enp/Affichage_Document_PDF?idEnsua=xxxxxxxxxxxxxxxxxxxxxx
date: 2024-01-01
format: pdf
label: Impôt sur les revenus 2023 – Montant de l’avance de réductions et crédits d’impôt
type: notice
transactions: []
has_file: True

However, a meaningful filename is available when the file is downloaded, as can be seen by opening the URL directly in a browser. I wonder whether this name can be extracted with the current architecture. At the moment, the download is handled inside download_document with

  return self.browser.open(document.url).content

so I guess the filename is lost by the time the content reaches applications/bill/bill.py and its simple `write` on the buffer.

Two ideas for improvement:

  • extract the filename inside get_document with a HEAD request on the url
  • improve the download path so that the filename is carried along with the content instead of being dropped
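
For either idea, the filename would first have to be read from the response. Here is a minimal sketch of that step, assuming the server sends the name in a Content-Disposition header (I have not checked this; it is only what the browser behaviour suggests). The helper name filename_from_content_disposition is hypothetical and is not part of the existing code:

```python
import os
from email.message import Message


def filename_from_content_disposition(header_value, default=None):
    """Return the filename announced in a Content-Disposition header value.

    Hypothetical helper: it assumes the site exposes the name through
    this header, which has not been verified for impotsgouvfrpar.
    """
    if not header_value:
        return default

    # Let the standard library parse the header parameters,
    # including quoted values.
    msg = Message()
    msg["Content-Disposition"] = header_value
    filename = msg.get_filename()
    if not filename:
        return default

    # Keep only the last path component so that a server-supplied name
    # cannot point outside the download directory.
    return os.path.basename(filename) or default
```

The header value would presumably come from something like response.headers.get("Content-Disposition") on the object returned by self.browser.open(document.url), assuming it behaves like a requests response. Whether a HEAD request returns the same header as a GET on this site would still need to be checked.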

I would be happy to work on this, and I would welcome any feedback.