OpenXML (docx without word/styles.xml)

Created by: Anonymous

Original issue 453 created by andriy.luts...@crowdin.com on 2015-04-07T12:20:37.000Z:

According to ECMA-376, 4th Edition
(and also https://msdn.microsoft.com/en-us/library/aa982683%28v=office.12%29.aspx),
the style definitions part (word/styles.xml) is optional.
This file is usually present in documents generated by LO/OO/MSO, but some applications do not create it.
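
To confirm that a given docx really lacks the styles part, the package can be inspected as a plain zip archive. The following is a minimal sketch using only java.util.zip; the class and method names (DocxPartCheck, hasEntry) are hypothetical and are not part of Okapi.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Okapi: reports whether a zip-based
// package (such as a .docx) contains an entry with the given name.
public class DocxPartCheck {

    public static boolean hasEntry(Path zipPath, String entryName) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            // getEntry returns null when no entry with that name exists
            return zip.getEntry(entryName) != null;
        }
    }
}
```

For the attached document, DocxPartCheck.hasEntry(path, "word/styles.xml") would be expected to return false, while the same call with "word/document.xml" would return true.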

As a result of processing the attached docx file, we obtain a zip file that contains only [Content_Types].xml.

Proposed fix for this issue:
In the file
okapi/okapi/filters/openxml/src/main/java/net/sf/okapi/filters/openxml/OpenXMLFilter.java, in the method nextInZipFile(), inside the condition if(nZipType==MSWORD):

line 694:

if (sEntryName.equals("word/styles.xml"))

change to:

if (sEntryName.equals("word/document.xml"))

and lines 704-705:

(sEntryName.equals("[Content_Types].xml") || // but don't do Content_Types
sEntryName.equals("word/styles.xml"))) // and styles a second time

change to:

(sEntryName.equals("[Content_Types].xml") || sEntryName.equals("word/document.xml")))

I can't judge the quality of my "hack", but after this change the attached document was processed successfully.
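
A possible variant of the change above would keep word/styles.xml as the part handled first when it exists and fall back to word/document.xml only when it is missing. The sketch below is not Okapi code. It assumes, based on the "a second time" comment, that the two conditions select one part to handle first and then skip that part on the later pass. The class and method names (FirstPartChooser, firstPartToProcess, skipInLaterPass) are hypothetical.

```java
import java.util.Set;

// Hypothetical sketch, not taken from OpenXMLFilter.java.
public class FirstPartChooser {

    static final String STYLES = "word/styles.xml";
    static final String DOCUMENT = "word/document.xml";
    static final String CONTENT_TYPES = "[Content_Types].xml";

    // Prefer the styles part when the package has one; otherwise use the
    // main document part, which is what the hack above hard-codes.
    public static String firstPartToProcess(Set<String> entryNames) {
        return entryNames.contains(STYLES) ? STYLES : DOCUMENT;
    }

    // Mirrors the second condition: skip [Content_Types].xml and whichever
    // part was already handled first.
    public static boolean skipInLaterPass(String entryName, String firstPart) {
        return entryName.equals(CONTENT_TYPES) || entryName.equals(firstPart);
    }
}
```

Whether this fits the actual control flow of nextInZipFile() would need to be checked against the source.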

Tested on okapi's development branch.
