INTEGRATION: CWS lo8 (1.1.2); FILE ADDED

2005/06/08 16:27:17 lo 1.1.2.1: restructuring of project and fix for #i44847#
2005-10-24 15:05:15 +00:00 · 2005-10-24 15:05:15 +00:00 · e20f83fad9
commit e20f83fad9
parent eca52043e3
2 changed files with 282 additions and 0 deletions
--- a/xmerge/source/aportisdoc/java/org/openoffice/xmerge/converter/xml/sxw/aportisdoc/package.html
+++ b/xmerge/source/aportisdoc/java/org/openoffice/xmerge/converter/xml/sxw/aportisdoc/package.html
@ -0,0 +1,263 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
+<!--
+ #  The Contents of this file are made available subject to the terms of
+ #  either of the following licenses
+ #
+ #         - GNU Lesser General Public License Version 2.1
+ #         - Sun Industry Standards Source License Version 1.1
+ #
+ #  Sun Microsystems Inc., October, 2000
+ #
+ #  GNU Lesser General Public License Version 2.1
+ #  =============================================
+ #  Copyright 2000 by Sun Microsystems, Inc.
+ #  901 San Antonio Road, Palo Alto, CA 94303, USA
+ #
+ #  This library is free software; you can redistribute it and/or
+ #  modify it under the terms of the GNU Lesser General Public
+ #  License version 2.1, as published by the Free Software Foundation.
+ #
+ #  This library is distributed in the hope that it will be useful,
+ #  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ #  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ #  Lesser General Public License for more details.
+ #
+ #  You should have received a copy of the GNU Lesser General Public
+ #  License along with this library; if not, write to the Free Software
+ #  Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ #  MA  02111-1307  USA
+ #
+ #
+ #  Sun Industry Standards Source License Version 1.1
+ #  =================================================
+ #  The contents of this file are subject to the Sun Industry Standards
+ #  Source License Version 1.1 (the "License"); You may not use this file
+ #  except in compliance with the License. You may obtain a copy of the
+ #  License at http://www.openoffice.org/license.html.
+ #
+ #  Software provided under this License is provided on an "AS IS" basis,
+ #  WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
+ #  WITHOUT LIMITATION, WARRANTIES THAT THE SOFTWARE IS FREE OF DEFECTS,
+ #  MERCHANTABLE, FIT FOR A PARTICULAR PURPOSE, OR NON-INFRINGING.
+ #  See the License for the specific provisions governing your rights and
+ #  obligations concerning the Software.
+ #
+ #  The Initial Developer of the Original Code is: Sun Microsystems, Inc.
+ #
+ #  Copyright: 2000 by Sun Microsystems, Inc.
+ #
+ #  All Rights Reserved.
+ #
+ #  Contributor(s): _______________________________________
+ #
+ #
+ --><html>
+<head>
+<title>org.openoffice.xmerge.converter.xml.sxw.aportisdoc package</title>
+</head>
+
+<body bgcolor="white">
+
+<p>Provides the tools for doing the conversion of StarWriter XML to
+and from AportisDoc format.</p>
+
+<p>It follows the {@link org.openoffice.xmerge} framework for the conversion process.</p>
+
+<p>Since it converts to/from a Palm application format, these converters
+follow the <a href=../../../../converter/palm/package-summary.html#streamformat>
+<code>PalmDB</code> stream format</a> for writing out to the Palm sync client or
+reading in from the Palm sync client.</p>
+
+<p>Note that <code>PluginFactoryImpl</code> also provides a
+<code>DocumentMerger</code> object, i.e. {@link org.openoffice.xmerge.converter.xml.sxw.aportisdoc.DocumentMergerImpl DocumentMergerImpl}.
+This functionality was derived from its superclass
+{@link org.openoffice.xmerge.converter.xml.sxw.SxwPluginFactory
+SxwPluginFactory}.</p>
+
+<h2>AportisDoc pdb format - Doc</h2>
+
+<p>The AportisDoc pdb format is widely used by different Palm applications,
+e.g. QuickWord, AportisDoc Reader, MiniWrite, etc.  Note that some
+of these applications put tweaks into the format.  The converters will only
+support the default AportisDoc format, plus some very minor tweaks to accommodate
+other applications.</p>
+
+<p>The text content of the format is plain text, i.e. there are no styles
+or structures.  There is no notion of lists, list items, paragraphs,
+headings, etc.  The format does have support for bookmarks.</p>
+
+<p>For most Doc applications, the default character encoding supported is
+the extended ASCII character set, i.e. ISO-8859-1.  StarWriter XML is in
+UTF-8 encoding scheme.  Since UTF-8 encoding scheme covers more characters,
+converting UTF-8 strings into extended ASCII would mean that there can be
+possible loss of character mappings.</p>
+
+<p>Using JAXP, XML files can be parsed and read in as Java <code>String</code>s
+which is in Unicode format, there is no loss of character mapping from UTF-8
+to Java Strings.  There is possible loss of character mapping in
+converting Java <code>String</code>s to ASCII bytes.  Java characters that
+cannot be represented in extended ASCII are converted into the ASCII
+character '?' or x3F in hex digit via the <code>String.getBytes(encoding)</code>
+API.</p>
+
+<h2>SXW to DOC Conversion</h2>
+
+<p>The <code>DocumentSerializerImpl</code> class implements the
+<code>org.openoffice.xmerge.DocumentSerializer</code>.
+This class specifically provides the conversion process from a given
+<code>SxwDocument</code> object to DOC formatted records, which are
+then passed back to the client via the <code>ConvertData</code> object.</p>
+
+<p>The following XML tags are handled. [Note that some may not be implemented yet.]</p>
+<ul>
+<li>
+    <p>Paragraphs <tt>&lt;text:p&gt;</tt> and Headings <tt>&lt;text:h&gt;</tt></p>
+
+    <p>Heading elements are classified the same as paragraph
+    elements since both have the same possible elements inside.
+    Their main difference is that they refer to different types
+    of style information, which is outside of their element tags.
+    Since there are no styles on the DOC format, headings should
+    be treated the same way a paragraph is converted.</p>
+
+    <p>For paragraph elements, convert and transfer text nodes
+    that are essential.  Text nodes directly contained within paragraph
+    nodes are such.  There are also a number of elements that
+    a paragraph element may contain.  These are explained in their
+    own context.</p>
+
+    <p>At the end of the paragraph, an EOL character is added by
+    the converter to provide a separation for each paragraph,
+    since the Doc format does not have a notion of a paragraph.</p>
+</li>
+<li>
+    <p>White spaces <tt>&lt;text:s&gt;</tt> and Tabs <tt>&lt;text:tab-stop&gt;</tt></p>
+
+    <p>In SXW, normally 2 or more white-space characters are collapsed into
+    a single space character.  In order to make sure that the document
+    content really contains those white-space characters, there are special
+    elements assigned to them.</p>
+
+    <p>The space element specifies the number of spaces are in it.
+    Thus, converting it just means providing the specific number of spaces
+    that the element requires.</p>
+
+    <p>There is also the tab-stop element.  This is a bit tricky.  In a
+    StarWriter document, tab-stops are specified by a column position.
+    A tab is not an exact number of space, but rather a specific column
+    positioning.  Say, regular tab-stops are set at every 5th column.
+    At column 4, if I hit a tab, it goes to column 5.  At column 1, hitting
+    a tab would put the cursor at column 5 as well.  SmartDoc and AporticDoc
+    applications goes by columns for the ASCII tab character. The only problem
+    is that in StarWriter, one could specify a different tab-stop, but not
+    in most of these Doc applications, at least I have not seen one.
+    Solution for this is just to go with the converting to the ASCII tab
+    character and not do anything for different tab-stop positioning.</p>
+</li>
+<li>
+    <p>Line breaks <tt>&lt;text:line-break&gt;</tt></p>
+
+    <p>To represent line breaks, it is simpliest to just put an ASCII LF
+    character.  Note that the side effect of this is that an end of paragraph
+    also contains an ASCII LF character.  Thus, for the DOC to SXW conversion,
+    line breaks are not distinguishable from specifying the end of a
+    paragraph.</p>
+</li>
+<li>
+    <p>Text spans <tt>&lt;text:span&gt;</tt></p>
+
+    <p>Text spans contain text that have different style attributes
+    from the paragraphs'.  Text spans can be embedded within another
+    text span.  Since it is purely for style tagging, we only needed
+    to convert and transfer the text elements within these.</p>
+</li>
+<li>
+    <p>Hyperlinks <tt>&lt;text:a&gt;</tt>
+
+    <p>Convert and transfer the text portion.</p>
+</li>
+<li>
+    <p>Bookmarks <tt>&lt;text:bookmark&gt;</tt> <tt>&lt;text:bookmark-start&gt;</tt>
+    <tt>&lt;text:bookmark-end&gt;</tt> [Not implemented yet]</p>
+
+    <p>In SXW, bookmark elements are embedded inside paragraph elements.
+    Bookmarks can either mark a text position or a text range. <tt>&lt;text:bookmark&gt;</tt>
+    marks a position while the pair <tt>&lt;text:bookmark-start&gt;</tt> and
+    <tt>&lt;text:bookmark-end&gt;</tt></p> marks a text range.  The DOC format only
+    supports bookmarking a text position.  Thus, for the conversion,
+    <tt>&lt;text:bookmark&gt;</tt> and  <tt>&lt;text:bookmark-start&gt;</tt> will both mark
+    a text position.</p>
+</li>
+<li>
+    <p>Change Tracking <tt>&lt;text:tracked-changes&gt;</tt>
+    <tt>&lt;text:change*&gt;</tt> [Not implemented yet]</p>
+
+    <p>Change tracking elements are not supported yet on the current
+    OpenOffice XML filters, will have to watch out on this.  The text
+    within these elements have to be interpreted properly during the
+    conversion process.</p>
+</li>
+<li>
+    <p>Lists <tt>&lt;text:unordered-list&gt;</tt> and
+    <tt>&lt;text:ordered-lists&gt;</tt></p>
+
+    <p>A list can only contain one optional <tt>&lt;text:list-header&gt;</tt>
+    and one or more <tt>&lt;text:list-item&gt;</tt> elements.</p>
+
+    <p>A <tt>&lt;text:list-header&gt;</tt> contains one or more paragraph
+    elements.  Since there are no styles, the conversion process does not
+    do anything special for list headers, conversion for the paragraphs
+    within list headers are the same as explained above.</p>
+
+    <p>A <tt>&lt;text:list-item&gt;</tt> may contain one or more of paragraphs,
+    headings, list, etc.  Since the Doc format does not support any list
+    structure, there will not be any special handling for this element.
+    Conversion for elements within it shall be applied according to the
+    element type.  Thus, lists with paragraphs within it will result in just
+    plain paragraphs.  Sublists will not be identifiable.  Paragraphs in
+    sublists will still appear.</p>
+</li>
+<li>
+    <p><tt>&lt;text:section&gt;</tt></p>
+
+    <p>I am not sure what this is yet, will need to investigate more on this.</p>
+</li>
+</ul>
+<p>There may be other tags that will still need to be addressed for this conversion.</p>
+
+<p>Refer to {@link org.openoffice.xmerge.converter.xml.sxw.aportisdoc.DocumentSerializerImpl DocumentSerializerImpl}
+for details of implementation. It uses <code>DocEncoder</code> class to do the encoding
+part.</p>
+
+<h2>DOC to SXW Conversion</h2>
+
+<p>The <code>DocumentDeserializerImpl</code> class implements the
+<code>org.openoffice.xmerge.DocumentDeserializer</code>. It is 
+passed the device document in the form of a <code>ConvertData</code> object.
+It will then create a <code>SxwDocument</code> object from the conversion of
+the DOC formatted records.</p>
+
+<p>The text content of the Doc format will be transferred as text.  Paragraph
+elements will be formed based on the existence of an ASCII LF character.  There
+will be at least one paragraph element.</p>
+
+<p>Bookmarks in the Doc format will be converted to the bookmark element
+<tt>&lt;text:bookmark&gt;</tt> [Not implemented yet].</p>
+
+
+<h2>Merging changes</h2>
+
+<p>As mentioned above, the <code>DocumentMerger</code> object produced by
+<code>PluginFactoryImpl</code> is <code>DocumentMergerImpl</code>.
+Refer to the javadocs for that package/class on its merging specifications.
+</p>
+
+<h2>TODO list</h2>
+
+<p><ol>
+<li>Investigate Palm's with different character encodings.</li>
+<li>Investigate other StarWriter XML tags</li>
+</ol></p>
+
+</body>
+</html>
--- a/xmerge/source/bridge/antcall.txt
+++ b/xmerge/source/bridge/antcall.txt
@ -0,0 +1,19 @@
+r:\btools\apache-ant-1.6.1\bin\ant 
+-Dprj=../.. 
+-Dprjname= 
+-Ddebug=off 
+-Doptimize=on  
+-Dtarget=xmrg_bridge 
+-Dsolar.update=on 
+-Dout=../..\wntmsci10.pro 
+-Dinpath=wntmsci10.pro 
+-Dproext=".pro"  
+-Dsolar.bin=Y:\so-cwsserv03\lo8\SRC680\wntmsci10.pro\bin.m105 
+-Dsolar.jar=Y:\so-cwsserv03\lo8\SRC680\wntmsci10.pro\bin.m105 
+-Dsolar.doc=Y:\so-cwsserv03\lo8\SRC680\wntmsci10.pro\doc.m105  
+-Dcommon.jar=Y:\so-cwsserv03\lo8\SRC680\common.pro\bin.m105 
+-Dcommon.doc=Y:\so-cwsserv03\lo8\SRC680\common.pro\doc.m105  
+-f build.xml  
+-emacs
+
+Buildfile: build.xml