= EPUB Parser

= {doctitle}

image:https://gitlab.com/KitaitiMakoto/epub-parser/badges/master/build.svg[link="https://gitlab.com/KitaitiMakoto/epub-parser/commits/master", title="pipeline status"]
image:https://badge.fury.io/rb/epub-parser.svg[link="https://gemnasium.com/KitaitiMakoto/epub-parser",title="Gem Version"]
image:https://gitlab.com/KitaitiMakoto/epub-parser/badges/master/coverage.svg[link="https://kitaitimakoto.gitlab.io/epub-parser/coverage/",title="coverage report"]

* https://kitaitimakoto.gitlab.io/epub-parser/file.Home.html[Homepage]
* https://kitaitimakoto.gitlab.io/epub-parser/[Documentation]
* https://gitlab.com/KitaitiMakoto/epub-parser[Source Code]
* https://kitaitimakoto.gitlab.io/epub-parser/coverage/[Test Coverage]

== INSTALLATION

----
gem install epub-parser
----

== USAGE

=== As a library

----
require 'epub/parser'

book = EPUB::Parser.parse('book.epub')
book.metadata.titles # => Array of EPUB::Publication::Package::Metadata::Title (main title, subtitle, etc.)
book.metadata.title # => Title string including all titles
book.metadata.creators # => Creators (authors)
book.each_page_on_spine do |page|
  page.media_type # => "application/xhtml+xml"
  page.entry_name # => "OPS/nav.xhtml", the entry name in the EPUB package (zip archive)
  page.read # => raw content document
  page.content_document.nokogiri # => Nokogiri::XML::Document. Same as Nokogiri.XML(page.read)
  # do something more
  #    :
end
book.cover_image # => EPUB::Publication::Package::Manifest::Item which represents cover image file
----

See {file:docs/Home.markdown} or the https://kitaitimakoto.gitlab.io/epub-parser/[API Documentation] for more info.
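
As a small illustration of the iteration API above, the following sketch collects the entry name and media type of every page on the spine. `spine_summary` is a hypothetical helper and not part of EPUB Parser; it relies only on `each_page_on_spine`, `entry_name` and `media_type` as shown in the example above.

```ruby
# Hypothetical helper, not part of EPUB Parser.
# Accepts any object that responds to #each_page_on_spine, such as the
# book returned by EPUB::Parser.parse.
def spine_summary(book)
  summary = []
  book.each_page_on_spine do |page|
    summary << { entry_name: page.entry_name, media_type: page.media_type }
  end
  summary
end

# Usage with a real book (needs the epub-parser gem and an EPUB file):
#   require 'epub/parser'
#   book = EPUB::Parser.parse('book.epub')
#   spine_summary(book).each do |row|
#     puts "#{row[:entry_name]}\t#{row[:media_type]}"
#   end
```

Because the helper only duck-types on the book, it can be exercised in tests with a simple stub instead of a real EPUB file.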

=== `epubinfo` command-line tool

The `epubinfo` tool extracts and shows the metadata of the specified EPUB book.

----
$ epubinfo ~/Documents/Books/build_awesome_command_line_applications_in_ruby.epub
Title:              Build Awesome Command-Line Applications in Ruby (for KITAITI MAKOTO)
Identifiers:        978-1-934356-91-3
Titles:             Build Awesome Command-Line Applications in Ruby (for KITAITI MAKOTO)
Languages:          en
Contributors:       
Coverages:          
Creators:           David Bryant Copeland
Dates:              
Descriptions:       
Formats:            
Publishers:         The Pragmatic Bookshelf, LLC (338304)
Relations:          
Rights:             Copyright © 2012 Pragmatic Programmers, LLC
Sources:            
Subjects:           Pragmatic Bookshelf
Types:              
Unique identifier:  978-1-934356-91-3
Epub version:       2.0
----

See {file:docs/Epubinfo.markdown} for more info.
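
A similar report can be produced from the library side. The sketch below formats two metadata fields in a layout loosely modelled on the `epubinfo` output above. `metadata_report` is a hypothetical helper, not part of EPUB Parser, and it assumes that each creator converts to a readable name via `to_s`.

```ruby
# Hypothetical helper, not part of EPUB Parser.
# Takes an object responding to #title and #creators (e.g. book.metadata)
# and returns a two-line report with labels padded to 20 columns.
# Assumption: each creator yields a readable name from #to_s.
def metadata_report(metadata)
  rows = {
    'Title'    => metadata.title,
    'Creators' => metadata.creators.map(&:to_s).join(', ')
  }
  rows.map { |label, value| format('%-20s%s', "#{label}:", value) }.join("\n")
end

# Usage with a real book (needs the epub-parser gem and an EPUB file):
#   require 'epub/parser'
#   puts metadata_report(EPUB::Parser.parse('book.epub').metadata)
```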

=== `epub-open` command-line tool

The `epub-open` tool provides an interactive shell (IRB) that helps you explore an EPUB book.

----
epub-open path/to/book.epub
----

IRB starts. `self` is the EPUB book, so you can call its methods directly.

----
title
=> "Title of the book"
metadata.creators
=> [Author 1, Author2, ...]
resources.first.properties
=> #<Set: {"nav"}> # This shows that the first resource of this book is the nav document
nav = resources.first
=> ...
nav.href
=> #<Addressable::URI:0x15ce350 URI:nav.xhtml>
nav.media_type
=> "application/xhtml+xml"
puts nav.read
<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
    :
    :
    :
</html>
=> nil
exit # Enter "exit" to end the session
----

See {file:docs/EpubOpen.markdown} for more info.

=== `epub-cover` command-line tool

The `epub-cover` tool extracts the cover image from an EPUB book.

----
% epub-cover childrens-literature.epub
Cover image output to cover.png
----

See {file:docs/EpubCover.adoc} for details.
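
The same result can be approximated from the library using `cover_image` and `read`. The sketch below is not how `epub-cover` itself is implemented. `save_cover` is a hypothetical helper, and it assumes that `cover_image` may return `nil` when no cover is declared and that the item's `entry_name` ends with a usable file name.

```ruby
# Hypothetical helper, not part of EPUB Parser.
# Assumptions: book.cover_image may be nil when the book declares no cover,
# and the cover item's entry_name ends with a usable file name.
def save_cover(book, dir = '.')
  cover = book.cover_image
  return nil if cover.nil?

  path = File.join(dir, File.basename(cover.entry_name))
  File.binwrite(path, cover.read) # write the raw bytes without transcoding
  path
end

# Usage with a real book (needs the epub-parser gem and an EPUB file):
#   require 'epub/parser'
#   path = save_cover(EPUB::Parser.parse('childrens-literature.epub'))
#   puts "Cover image output to #{path}" if path
```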

== DOCUMENTATION

Documentation is available on the https://kitaitimakoto.gitlab.io/epub-parser/file.Home.html[homepage].

If you installed EPUB Parser with the `gem` command, you can also generate the documentation yourself (the https://rubygems.org/gems/rubygems-yardoc[rubygems-yardoc] gem is required):

----
$ gem install epub-parser
$ gem yardoc epub-parser
...
Files:          33
Modules:        20 (   20 undocumented)
Classes:        45 (   44 undocumented)
Constants:      31 (   31 undocumented)
Methods:       292 (   88 undocumented)
52.84% documented
YARD documentation is generated to:
/path/to/gempath/ruby/2.2.0/doc/epub-parser-0.2.0/yardoc
----

The path to the generated documentation (`/path/to/gempath/ruby/2.2.0/doc/epub-parser-0.2.0/yardoc` here) is shown at the end.

Generating the documentation from a source checkout with YARD is also possible:

----
$ git clone https://gitlab.com/KitaitiMakoto/epub-parser.git
$ cd epub-parser
$ bundle install --path=deps
$ bundle exec rake doc:yard
...
Files:          33
Modules:        20 (   20 undocumented)
Classes:        45 (   44 undocumented)
Constants:      31 (   31 undocumented)
Methods:       292 (   88 undocumented)
52.84% documented
----

The documentation will then be available in the `doc` directory.

== REQUIREMENTS

* Ruby 2.3.0 or later

== SIMILAR EFFORTS

* https://github.com/skoji/gepub[gepub] - a generic EPUB library for Ruby
* https://github.com/chdorner/epubinfo[epubinfo] - Extracts metadata information from EPUB files. Supports EPUB2 and EPUB3 formats.
* https://github.com/kmuto/review[ReVIEW] - ReVIEW is an easy-to-use digital publishing system for books and ebooks.
* https://github.com/takahashim/epzip[epzip] - epzip is EPUB packing tool. It's just only doing 'zip.' :)
* https://github.com/jugyo/eeepub[eeepub] - EeePub is a Ruby ePub generator
* https://gitlab.com/KitaitiMakoto/epub-maker[epub-maker] - This library supports making and editing EPUB books based on this EPUB Parser library
* https://gitlab.com/KitaitiMakoto/epub-cfi[epub-cfi] - EPUB CFI library extracted from this EPUB Parser library.

If you find other gems, please tell me or send a pull request.

== RECENT CHANGES

=== 0.3.9

* [BUG FIX] Set {EPUB::Metadata::DCMES#lang} properly from xml:lang attribute
* Change default XML backend from REXML to Nokogiri

=== 0.3.8

* [REFACTORING] Add {EPUB::Parser::NokogiriAttributeWithPrefix} and use `Nokogiri::XML::Node#attribute_with_prefix` instead of `EPUB::Parser::Utils#extract_attribute`
* Set default value of the `detect_encoding` argument of {EPUB::Publication::Package::Manifest::Item#read} to false
* Make XML library switchable between REXML and Nokogiri
* Make REXML the default XML backend

=== 0.3.7

* Strip leading and trailing white spaces from identifiers
* Change home page and documentation from rubydoc.info to GitLab Pages
* Make {EPUB::Book::Features#cover_image Book::Features#cover_image} return EPUB 2 cover image if EPUB 3's not available
* Add `epub-cover` command-line tool. See {file:docs/EpubCover.adoc} for details.

See {file:CHANGELOG.adoc} for older changelogs and details.

== TODOS

* Consider implementing an IRI feature instead of using Addressable
* EPUB 3.2
* Help features for `epub-open` tool
* Vocabulary Association Mechanisms
* Implementing navigation document and so on
* Media Overlays
* Content Document
* Digital Signature
* Handle encodings other than UTF-8

== DONE

* Simple inspect for `epub-open` tool
* Using a zip library instead of the `unzip` command, which has a security issue
* Modify methods around fallback to see `bindings` element in the package
* Content Document(only for Navigation Documents)
* Fixed Layout
* Vocabulary Association Mechanisms(only for itemref)
* Archive library abstraction
* Extracting and organizing common behavior from some classes to modules
* Multiple rootfiles
* Abstraction of XML parser (making it possible to use REXML, Ruby's standard bundled XML library)

== LICENSE

This library is distributed under the terms of the MIT License.
See {file:MIT-LICENSE} file for more info.