
[browser/filters] Add FileExtension and MimeType Filters

Hicham Kouz requested to merge hicham.kouz/woob:add_new_filters_mime_ext into master

Introduce two new filters, FileExtension and MimeType, which extract file extensions and resolve MIME types from file names or paths.

FileExtension Filter:

  • Extracts file extensions from file names or paths.
  • Supports optional validation of associated MIME types.
  • Handles inputs without dots or with multiple dots.
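
The behaviour listed above can be approximated with the standard library alone. The following is only a sketch, not the code from this merge request: the function name extract_extension, the _NO_DEFAULT sentinel, and the use of ValueError in place of FormatError are all assumptions made for illustration. It treats a name as a compound extension only when mimetypes reports a compression encoding (as for .tar.gz).

```python
import mimetypes
from pathlib import PurePosixPath

_NO_DEFAULT = object()  # hypothetical sentinel meaning "no default supplied"


def extract_extension(name, default=_NO_DEFAULT, validate_mime=False):
    """Hypothetical stand-in for FileExtension; not the code from this MR."""
    path = PurePosixPath(name)
    suffixes = path.suffixes  # e.g. ['.tar', '.gz'], or [] when there is no dot
    mime, encoding = mimetypes.guess_type(path.name)

    if not suffixes:
        ext = None
    elif encoding is not None and len(suffixes) >= 2:
        # Compressed archive such as 'file.tar.gz': keep both parts.
        ext = "".join(suffixes[-2:]).lstrip(".")
    else:
        ext = suffixes[-1].lstrip(".")

    # Fall back when nothing was found, or when validation is requested
    # and the standard library knows no MIME type for this name.
    if ext is None or (validate_mime and mime is None):
        if default is _NO_DEFAULT:
            raise ValueError(f"no recognized extension in {name!r}")
        return default
    return ext
```

With this sketch, extract_extension('path/to/file.tar.gz') gives 'tar.gz', and a name without a dot returns the default or raises when none is supplied.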

MimeType Filter:

  • Retrieves MIME types from file names, paths, or extensions.
  • Raises FormatError for unrecognized MIME types unless a default is provided.
  • Can be used to validate MIME types associated with FileExtension.
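
A MIME lookup with a default fallback might look roughly like the sketch below. It is not the filter added by this merge request: guess_mime, UnknownMimeType (standing in for FormatError), and _NO_DEFAULT are hypothetical names, and treating a dot-less input as a bare extension is an assumption about what "or extensions" means above.

```python
import mimetypes

_NO_DEFAULT = object()  # hypothetical sentinel meaning "no default supplied"


class UnknownMimeType(ValueError):
    """Hypothetical stand-in for the FormatError named in the description."""


def guess_mime(name, default=_NO_DEFAULT):
    """Return the MIME type for a file name, path, or bare extension."""
    basename = name.rsplit("/", 1)[-1]
    if "." not in basename:
        # Assumption: an input without a dot is a bare extension like 'pdf'.
        basename = "file." + basename

    mime, _encoding = mimetypes.guess_type(basename)
    if mime is not None:
        return mime
    if default is _NO_DEFAULT:
        raise UnknownMimeType(f"unrecognized MIME type for {name!r}")
    return default
```

Because the standard library strips a compression suffix before looking up the type, 'path/foo/invoices.tar.gz' resolves to 'application/x-tar' here, matching the docstring example.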
        MimeType
        """
        Get the MIME type from a file name or path.

        :param txt: The file name or path for which to determine the MIME type.
        :type txt: str
        :raises FormatError: If the MIME type is not recognized.

        >>> MimeType().filter('foo.pdf')
        'application/pdf'
        >>> MimeType().filter('path/foo/invoices.tar.gz')
        'application/x-tar'
        >>> MimeType(default='NAN').filter('foo.no')
        'NAN'
        """
        FileExtension
        """
        Get the file extension from a file name or path.

        :param txt: The file name or path for which to extract the file extension.
        :type txt: str
        :raises FormatError: If the file extension is not recognized.

        >>> FileExtension().filter('file.docx')
        'docx'
        >>> FileExtension().filter('path/to/file.tar.gz')
        'tar.gz'
        >>> FileExtension(default='NAN').filter('file_without_extension')
        'NAN'
        >>> FileExtension().filter('/home/user/Documents/report.pdf')
        'pdf'
        >>> FileExtension(default='UNKNOWN').filter('spreadsheet')
        'UNKNOWN'
        >>> FileExtension(default='UNKNOWN', validate_mime=True).filter('path/to/file.dfs')
        'UNKNOWN'
        >>> FileExtension(default='UNKNOWN', validate_mime=True).filter('file.jpg')
        'jpg'
        """

...
