{file:docs/Home.markdown} > **{file:docs/UnpackedArchive.markdown}**

Unpacked Archive
================

Since version 0.2.0, EPUB Parser can parse EPUB books from an unpacked archive, that is, a file system directory.

Let's parse the pretty comic Page Blanche:

    % tree page-blanche
    page-blanche
    ├── EPUB
    │   ├── Content
    │   │   ├── PageBlanche_Page_000.xhtml
    │   │   ├── PageBlanche_Page_001.xhtml
    │   │   ├── PageBlanche_Page_002.xhtml
    │   │   ├── PageBlanche_Page_003.xhtml
    │   │   ├── PageBlanche_Page_004.xhtml
    │   │   ├── PageBlanche_Page_005.xhtml
    │   │   ├── PageBlanche_Page_006.xhtml
    │   │   ├── PageBlanche_Page_007.xhtml
    │   │   ├── PageBlanche_Page_008.xhtml
    │   │   └── cover.xhtml
    │   ├── Image
    │   │   ├── PageBlanche_Page_001.jpg
    │   │   ├── PageBlanche_Page_002.jpg
    │   │   ├── PageBlanche_Page_003.jpg
    │   │   ├── PageBlanche_Page_004.jpg
    │   │   ├── PageBlanche_Page_005.jpg
    │   │   ├── PageBlanche_Page_006.jpg
    │   │   ├── PageBlanche_Page_007.jpg
    │   │   ├── PageBlanche_Page_008.jpg
    │   │   └── cover.jpg
    │   ├── Navigation
    │   │   ├── nav.xhtml
    │   │   └── toc.ncx
    │   ├── Style
    │   │   └── style.css
    │   └── package.opf
    ├── META-INF
    │   └── container.xml
    └── mimetype
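
The layout above has the two entries every EPUB container needs at its root: a `mimetype` file and `META-INF/container.xml`. As a minimal sketch, a directory could be checked for them before parsing. The helper name `unpacked_epub?` is hypothetical and not part of EPUB Parser, and stripping whitespace from the `mimetype` content is a lenient assumption of this sketch.

```ruby
# Hypothetical helper, not part of EPUB Parser: a rough sanity check that
# a directory looks like an unpacked EPUB before handing it to the parser.
def unpacked_epub?(dir)
  mimetype = File.join(dir, 'mimetype')
  container = File.join(dir, 'META-INF', 'container.xml')
  return false unless File.file?(mimetype) && File.file?(container)

  # The mimetype file of an EPUB container holds this media type.
  # Surrounding whitespace is tolerated here (an assumption of this sketch).
  File.read(mimetype).strip == 'application/epub+zip'
end
```

For the tree shown above, `unpacked_epub?('./page-blanche')` would be expected to return `true`, provided the `mimetype` file has the usual content.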

To load EPUB books from a directory, you first need to specify the file adapter via {EPUB::OCF::PhysicalContainer}:

    require 'epub/parser'
    
    EPUB::OCF::PhysicalContainer.adapter = :UnpackedDirectory

Then pass the directory path as the EPUB path:

    epub = EPUB::Parser.parse('./page-blanche')

Now you can handle the EPUB book as usual:

    epub.title # => "Page Blanche"
    epub.each_page_on_spine.to_a.length # => 10
    puts epub.nav.content_document.contents.map {|content| "#{File.basename(content.href.to_s)} ... #{content.text}"}
    # PageBlanche_Page_002.xhtml ... Dédicace
    # PageBlanche_Page_005.xhtml ... Commencer la lecture
    # => nil

Once {EPUB::OCF::PhysicalContainer.adapter} is set, it is used every time EPUB Parser parses a book, even when the book is a packaged EPUB file. Instead of setting the adapter globally, you can also specify an adapter for an individual parse by passing the keyword argument `container_adapter` to the `.parse` method:

    # From packaged file
    File.ftype './page-blanche.epub' # => "file"
    archived_book = EPUB::Parser.parse('./page-blanche.epub') # => EPUB::Book
    # From directory
    File.ftype './page-blanche' # => "directory"
    unpacked_book = EPUB::Parser.parse('./page-blanche', container_adapter: :UnpackedDirectory) # => EPUB::Book
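
When the same code has to accept both packaged files and directories, the per-call adapter can be chosen from the path type. The following is only a sketch: `parse_options_for` is a hypothetical helper, not part of EPUB Parser. It only builds the keyword arguments, using the `container_adapter` keyword and the `:UnpackedDirectory` adapter shown above.

```ruby
# Hypothetical helper, not part of EPUB Parser: build the keyword arguments
# for EPUB::Parser.parse depending on whether the path is a directory.
def parse_options_for(path)
  File.directory?(path) ? { container_adapter: :UnpackedDirectory } : {}
end

# Assumed usage, with the epub-parser gem loaded:
#   book = EPUB::Parser.parse(path, **parse_options_for(path))
```

This keeps the global adapter untouched, so packaged EPUB files continue to be read with the default adapter.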

Command-line tools
------------------

The command-line tools `epubinfo` and `epub-open` can also handle a directory as an EPUB book.

Executing `epubinfo`:

    $ epubinfo page-blanche
    Title:              Page Blanche
    Identifiers:        code.google.com.epub-samples.page-blanche
    Titles:             Page Blanche
    Languages:          fr
    Contributors:       Vincent Gros
    Coverages:          
    Creators:           Boulet, Bagieu Pénélope
    Dates:              2012-01-18
    Descriptions:       
    Formats:            
    Publishers:         éditions Delcourt
    Relations:          
    Rights:             This work is shared with the public using the Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license.
    Sources:            
    Subjects:           
    Types:              
    Unique identifier:  code.google.com.epub-samples.page-blanche
    Epub version:       3.0

Executing `epub-open`:

    $ epub-open page-blanche
    Enter "exit" to exit IRB
    irb: warn: can't alias bindings from irb_workspaces.
    irb(main):001:0> title
    => "Page Blanche"
    irb(main):002:0> exit

Note
----

Loading EPUB books from an unpacked directory is actually not recommended. Handling the files properly is too complex because of file name encoding issues: Unicode normalization forms (NFD, NFC, NFKD, NFKC, and the OS X-specific variant of NFD), IRI normalization such as percent-encoding, case sensitivity, and so on. Moreover, this is not a standardized way to load EPUB books. So, at least in the near future, there is no plan to support various environments.
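
To illustrate the normalization problem, here is a small sketch using only Ruby's standard `String#unicode_normalize`. The file name is made up for illustration, and `same_file_name?` is a hypothetical helper, not something EPUB Parser provides. It shows that two names that look identical can differ as strings, so a path written in the package document may fail to match the name stored on disk.

```ruby
# The same visible file name in two Unicode normalization forms.
nfc_name = "P\u00E9n\u00E9lope.xhtml"    # precomposed "é" (NFC)
nfd_name = "Pe\u0301ne\u0301lope.xhtml"  # "e" + combining acute accent (NFD)

# Hypothetical helper, not part of EPUB Parser: compare names after
# normalizing both to NFC and, optionally, ignoring case.
def same_file_name?(a, b, ignore_case: false)
  a = a.unicode_normalize(:nfc)
  b = b.unicode_normalize(:nfc)
  ignore_case ? a.casecmp?(b) : a == b
end

puts nfc_name == nfd_name                 # false: different code points
puts same_file_name?(nfc_name, nfd_name)  # true: equal after normalization
```

Even with such a comparison, percent-encoding and file system specifics remain to be handled, which is why the approach stays fragile.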

Of course, patches are always welcome.