INTEGRATION: CWS lo8 (1.1.2); FILE ADDED
2005/06/08 16:27:17 lo 1.1.2.1: restructuring of project and fix for #i44847#
This commit is contained in:
parent
eca52043e3
commit
e20f83fad9
2 changed files with 282 additions and 0 deletions
|
@ -0,0 +1,263 @@
|
|||
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
|
||||
<!--
|
||||
# The Contents of this file are made available subject to the terms of
|
||||
# either of the following licenses
|
||||
#
|
||||
# - GNU Lesser General Public License Version 2.1
|
||||
# - Sun Industry Standards Source License Version 1.1
|
||||
#
|
||||
# Sun Microsystems Inc., October, 2000
|
||||
#
|
||||
# GNU Lesser General Public License Version 2.1
|
||||
# =============================================
|
||||
# Copyright 2000 by Sun Microsystems, Inc.
|
||||
# 901 San Antonio Road, Palo Alto, CA 94303, USA
|
||||
#
|
||||
# This library is free software; you can redistribute it and/or
|
||||
# modify it under the terms of the GNU Lesser General Public
|
||||
# License version 2.1, as published by the Free Software Foundation.
|
||||
#
|
||||
# This library is distributed in the hope that it will be useful,
|
||||
# but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||
# Lesser General Public License for more details.
|
||||
#
|
||||
# You should have received a copy of the GNU Lesser General Public
|
||||
# License along with this library; if not, write to the Free Software
|
||||
# Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
||||
# MA 02111-1307 USA
|
||||
#
|
||||
#
|
||||
# Sun Industry Standards Source License Version 1.1
|
||||
# =================================================
|
||||
# The contents of this file are subject to the Sun Industry Standards
|
||||
# Source License Version 1.1 (the "License"); You may not use this file
|
||||
# except in compliance with the License. You may obtain a copy of the
|
||||
# License at http://www.openoffice.org/license.html.
|
||||
#
|
||||
# Software provided under this License is provided on an "AS IS" basis,
|
||||
# WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
|
||||
# WITHOUT LIMITATION, WARRANTIES THAT THE SOFTWARE IS FREE OF DEFECTS,
|
||||
# MERCHANTABLE, FIT FOR A PARTICULAR PURPOSE, OR NON-INFRINGING.
|
||||
# See the License for the specific provisions governing your rights and
|
||||
# obligations concerning the Software.
|
||||
#
|
||||
# The Initial Developer of the Original Code is: Sun Microsystems, Inc.
|
||||
#
|
||||
# Copyright: 2000 by Sun Microsystems, Inc.
|
||||
#
|
||||
# All Rights Reserved.
|
||||
#
|
||||
# Contributor(s): _______________________________________
|
||||
#
|
||||
#
|
||||
--><html>
|
||||
<head>
|
||||
<title>org.openoffice.xmerge.converter.xml.sxw.aportisdoc package</title>
|
||||
</head>
|
||||
|
||||
<body bgcolor="white">
|
||||
|
||||
<p>Provides the tools for doing the conversion of StarWriter XML to
|
||||
and from AportisDoc format.</p>
|
||||
|
||||
<p>It follows the {@link org.openoffice.xmerge} framework for the conversion process.</p>
|
||||
|
||||
<p>Since it converts to/from a Palm application format, these converters
|
||||
follow the <a href="../../../../converter/palm/package-summary.html#streamformat">
|
||||
<code>PalmDB</code> stream format</a> for writing out to the Palm sync client or
|
||||
reading in from the Palm sync client.</p>
|
||||
|
||||
<p>Note that <code>PluginFactoryImpl</code> also provides a
|
||||
<code>DocumentMerger</code> object, i.e. {@link org.openoffice.xmerge.converter.xml.sxw.aportisdoc.DocumentMergerImpl DocumentMergerImpl}.
|
||||
This functionality was derived from its superclass
|
||||
{@link org.openoffice.xmerge.converter.xml.sxw.SxwPluginFactory
|
||||
SxwPluginFactory}.</p>
|
||||
|
||||
<h2>AportisDoc pdb format - Doc</h2>
|
||||
|
||||
<p>The AportisDoc pdb format is widely used by different Palm applications,
|
||||
e.g. QuickWord, AportisDoc Reader, MiniWrite, etc. Note that some
|
||||
of these applications put tweaks into the format. The converters will only
|
||||
support the default AportisDoc format, plus some very minor tweaks to accommodate
|
||||
other applications.</p>
|
||||
|
||||
<p>The text content of the format is plain text, i.e. there are no styles
|
||||
or structures. There is no notion of lists, list items, paragraphs,
|
||||
headings, etc. The format does have support for bookmarks.</p>
|
||||
|
||||
<p>For most Doc applications, the default character encoding is
the extended ASCII character set, i.e. ISO-8859-1. StarWriter XML uses
the UTF-8 encoding scheme. Since UTF-8 covers far more characters,
converting UTF-8 strings into extended ASCII can lose any character that
has no ISO-8859-1 equivalent.</p>
|
||||
|
||||
<p>Using JAXP, XML files can be parsed and read in as Java <code>String</code>s,
which are in Unicode format, so there is no loss of character mapping from UTF-8
to Java <code>String</code>s. There is possible loss of character mapping in
converting Java <code>String</code>s to extended ASCII bytes. Java characters that
cannot be represented in extended ASCII are converted into the ASCII
character '?' (0x3F in hexadecimal) via the <code>String.getBytes(encoding)</code>
API.</p>
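<p>The character loss described above can be illustrated with a minimal sketch.
The class and method names are hypothetical and not part of the converter. The
sketch assumes the <code>getBytes(Charset)</code> overload rather than
<code>getBytes(encoding)</code>, because its replacement behaviour is documented.</p>

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; it is not part of the xmerge sources.
class Latin1EncodingSketch {

    /** Encodes text as ISO-8859-1; unmappable characters become '?' (0x3F). */
    static byte[] toLatin1(String text) {
        // getBytes(Charset) substitutes the charset's default replacement
        // byte for every character that ISO-8859-1 cannot represent.
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute) exists in ISO-8859-1, U+20AC (euro sign) does not.
        byte[] bytes = toLatin1("caf\u00e9 \u20ac5");
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim());
    }
}
```

<p>In this sketch the accented letter survives as a single byte, while the euro
sign is written out as 0x3F and cannot be recovered on the way back.</p>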
|
||||
|
||||
<h2>SXW to DOC Conversion</h2>
|
||||
|
||||
<p>The <code>DocumentSerializerImpl</code> class implements the
|
||||
<code>org.openoffice.xmerge.DocumentSerializer</code> interface.
|
||||
This class specifically provides the conversion process from a given
|
||||
<code>SxwDocument</code> object to DOC formatted records, which are
|
||||
then passed back to the client via the <code>ConvertData</code> object.</p>
|
||||
|
||||
<p>The following XML tags are handled. [Note that some may not be implemented yet.]</p>
|
||||
<ul>
|
||||
<li>
|
||||
<p>Paragraphs <tt>&lt;text:p&gt;</tt> and Headings <tt>&lt;text:h&gt;</tt></p>
|
||||
|
||||
<p>Heading elements are classified the same as paragraph
|
||||
elements since both have the same possible elements inside.
|
||||
Their main difference is that they refer to different types
|
||||
of style information, which is outside of their element tags.
|
||||
Since there are no styles in the DOC format, headings are
converted the same way as paragraphs.</p>
|
||||
|
||||
<p>For paragraph elements, only the essential text nodes are converted
and transferred, i.e. the text nodes directly contained within the
paragraph node. A paragraph element may also contain a number of other
elements. These are explained in their own context.</p>
|
||||
|
||||
<p>At the end of the paragraph, an EOL character is added by
|
||||
the converter to provide a separation for each paragraph,
|
||||
since the Doc format does not have a notion of a paragraph.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>White spaces <tt>&lt;text:s&gt;</tt> and Tabs <tt>&lt;text:tab-stop&gt;</tt></p>
|
||||
|
||||
<p>In SXW, normally 2 or more white-space characters are collapsed into
|
||||
a single space character. In order to make sure that the document
|
||||
content really contains those white-space characters, there are special
|
||||
elements assigned to them.</p>
|
||||
|
||||
<p>The space element specifies the number of spaces it represents.
|
||||
Thus, converting it just means providing the specific number of spaces
|
||||
that the element requires.</p>
|
||||
|
||||
<p>There is also the tab-stop element. This is a bit tricky. In a
StarWriter document, tab-stops are specified by a column position.
A tab is not an exact number of spaces, but rather a specific column
position. Say regular tab-stops are set at every 5th column.
At column 4, hitting a tab moves the cursor to column 5. At column 1, hitting
a tab would put the cursor at column 5 as well. SmartDoc and AportisDoc
applications go by columns for the ASCII tab character. The only problem
is that StarWriter allows a different tab-stop to be specified, whereas
most of these Doc applications do not appear to, at least none has been seen
that does. The solution is simply to convert to the ASCII tab
character and not do anything for different tab-stop positioning.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>Line breaks <tt>&lt;text:line-break&gt;</tt></p>
|
||||
|
||||
<p>To represent line breaks, it is simplest to just put an ASCII LF
character. Note that the side effect of this is that the end of a paragraph
is also marked by an ASCII LF character. Thus, for the DOC to SXW conversion,
line breaks are indistinguishable from the end of a
paragraph.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>Text spans <tt>&lt;text:span&gt;</tt></p>
|
||||
|
||||
<p>Text spans contain text whose style attributes differ
from the enclosing paragraph's. Text spans can be embedded within another
text span. Since they exist purely for style tagging, only the text
within them needs to be converted and transferred.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>Hyperlinks <tt>&lt;text:a&gt;</tt></p>
|
||||
|
||||
<p>Convert and transfer the text portion.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>Bookmarks <tt>&lt;text:bookmark&gt;</tt> <tt>&lt;text:bookmark-start&gt;</tt>
<tt>&lt;text:bookmark-end&gt;</tt> [Not implemented yet]</p>
|
||||
|
||||
<p>In SXW, bookmark elements are embedded inside paragraph elements.
Bookmarks can either mark a text position or a text range. <tt>&lt;text:bookmark&gt;</tt>
marks a position while the pair <tt>&lt;text:bookmark-start&gt;</tt> and
<tt>&lt;text:bookmark-end&gt;</tt> marks a text range. The DOC format only
supports bookmarking a text position. Thus, for the conversion,
<tt>&lt;text:bookmark&gt;</tt> and <tt>&lt;text:bookmark-start&gt;</tt> will both mark
a text position.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>Change Tracking <tt>&lt;text:tracked-changes&gt;</tt>
<tt>&lt;text:change*&gt;</tt> [Not implemented yet]</p>
|
||||
|
||||
<p>Change tracking elements are not yet supported by the current
OpenOffice XML filters, so this needs to be watched. The text
within these elements has to be interpreted properly during the
conversion process.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p>Lists <tt>&lt;text:unordered-list&gt;</tt> and
<tt>&lt;text:ordered-list&gt;</tt></p>
|
||||
|
||||
<p>A list can contain at most one optional <tt>&lt;text:list-header&gt;</tt>
and one or more <tt>&lt;text:list-item&gt;</tt> elements.</p>
|
||||
|
||||
<p>A <tt>&lt;text:list-header&gt;</tt> contains one or more paragraph
elements. Since there are no styles, the conversion process does not
do anything special for list headers. Conversion of the paragraphs
within list headers is the same as explained above.</p>
|
||||
|
||||
<p>A <tt>&lt;text:list-item&gt;</tt> may contain one or more paragraphs,
headings, lists, etc. Since the Doc format does not support any list
structure, there will not be any special handling for this element.
Conversion for elements within it shall be applied according to the
element type. Thus, lists with paragraphs within them will result in just
plain paragraphs. Sublists will not be identifiable. Paragraphs in
sublists will still appear.</p>
|
||||
</li>
|
||||
<li>
|
||||
<p><tt>&lt;text:section&gt;</tt></p>
|
||||
|
||||
<p>It is not yet clear how this element should be handled; this needs further investigation.</p>
|
||||
</li>
|
||||
</ul>
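<p>The element handling listed above can be sketched as a recursive DOM walk.
This is only an illustration under stated assumptions, not the code in
<code>DocumentSerializerImpl</code>: the class and method names are hypothetical,
elements are matched by their qualified names with the <code>text:</code> prefix,
and the space count is assumed to be carried in a <code>text:c</code> attribute
that defaults to 1. Bookmarks and change tracking are left out.</p>

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration class; it is not part of the xmerge sources.
class PlainTextSketch {

    /** Parses an XML string with JAXP and flattens its content to plain text. */
    static String toPlainText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), false, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    /** Visits the children of a node; text is kept only inside paragraphs or headings. */
    static void walk(Node node, boolean inParagraph, StringBuilder out) {
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                if (inParagraph) {
                    out.append(child.getNodeValue());
                }
                continue;
            }
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            switch (child.getNodeName()) {
                case "text:p":
                case "text:h":
                    // Headings are treated like paragraphs; an LF ends each one.
                    walk(child, true, out);
                    out.append('\n');
                    break;
                case "text:s": {
                    // Assumed attribute: text:c holds the number of spaces.
                    String c = ((Element) child).getAttribute("text:c");
                    int count = c.isEmpty() ? 1 : Integer.parseInt(c);
                    for (int n = 0; n < count; n++) {
                        out.append(' ');
                    }
                    break;
                }
                case "text:tab-stop":
                    out.append('\t');
                    break;
                case "text:line-break":
                    out.append('\n');
                    break;
                default:
                    // Spans, hyperlinks, lists, list items: only their content matters.
                    walk(child, inParagraph, out);
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<office:body xmlns:office=\"o\" xmlns:text=\"t\">"
                + "<text:p>a<text:s text:c=\"3\"/>b<text:tab-stop/>c</text:p>"
                + "</office:body>";
        System.out.print(toPlainText(xml));
    }
}
```

<p>Because list containers simply fall through to their children, a paragraph
inside a nested list comes out as an ordinary LF-terminated line, which matches
the behaviour described for sublists.</p>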
|
||||
<p>There may be other tags that will still need to be addressed for this conversion.</p>
|
||||
|
||||
<p>Refer to {@link org.openoffice.xmerge.converter.xml.sxw.aportisdoc.DocumentSerializerImpl DocumentSerializerImpl}
|
||||
for details of implementation. It uses the <code>DocEncoder</code> class to do the encoding
|
||||
part.</p>
|
||||
|
||||
<h2>DOC to SXW Conversion</h2>
|
||||
|
||||
<p>The <code>DocumentDeserializerImpl</code> class implements the
|
||||
<code>org.openoffice.xmerge.DocumentDeserializer</code> interface. It is
passed the device document in the form of a <code>ConvertData</code> object.
It will then create an <code>SxwDocument</code> object from the conversion of
|
||||
the DOC formatted records.</p>
|
||||
|
||||
<p>The text content of the Doc format will be transferred as text. Paragraph
|
||||
elements will be formed based on the existence of an ASCII LF character. There
|
||||
will be at least one paragraph element.</p>
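<p>A minimal sketch of the paragraph rule above, assuming the decoded Doc text is
already available as a Java <code>String</code>. The class and method names are
hypothetical, and treating a trailing LF as the terminator of the last paragraph
(rather than the start of an extra empty one) is an assumption of this sketch.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; it is not part of the xmerge sources.
class ParagraphSplitSketch {

    /** Splits plain text into paragraph texts at each ASCII LF character. */
    static List<String> splitParagraphs(String text) {
        List<String> paragraphs = new ArrayList<>();
        int start = 0;
        int lf;
        while ((lf = text.indexOf('\n', start)) >= 0) {
            paragraphs.add(text.substring(start, lf));
            start = lf + 1;
        }
        // Keep any text after the last LF, and guarantee at least one paragraph.
        if (start < text.length() || paragraphs.isEmpty()) {
            paragraphs.add(text.substring(start));
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        for (String p : splitParagraphs("first\nsecond\n")) {
            System.out.println("[" + p + "]");
        }
    }
}
```

<p>Each returned string would become the content of one paragraph element. Empty
input still yields a single empty paragraph.</p>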
|
||||
|
||||
<p>Bookmarks in the Doc format will be converted to the bookmark element
|
||||
<tt>&lt;text:bookmark&gt;</tt> [Not implemented yet].</p>
|
||||
|
||||
|
||||
<h2>Merging changes</h2>
|
||||
|
||||
<p>As mentioned above, the <code>DocumentMerger</code> object produced by
|
||||
<code>PluginFactoryImpl</code> is <code>DocumentMergerImpl</code>.
|
||||
Refer to the Javadoc for that class for its merging specification.
|
||||
</p>
|
||||
|
||||
<h2>TODO list</h2>
|
||||
|
||||
<ol>
<li>Investigate Palm devices with different character encodings.</li>
<li>Investigate other StarWriter XML tags.</li>
</ol>
|
||||
|
||||
</body>
|
||||
</html>
|
19
xmerge/source/bridge/antcall.txt
Normal file
|
@ -0,0 +1,19 @@
|
|||
r:\btools\apache-ant-1.6.1\bin\ant
-Dprj=../..
-Dprjname=
-Ddebug=off
-Doptimize=on
-Dtarget=xmrg_bridge
-Dsolar.update=on
-Dout=../..\wntmsci10.pro
-Dinpath=wntmsci10.pro
-Dproext=".pro"
-Dsolar.bin=Y:\so-cwsserv03\lo8\SRC680\wntmsci10.pro\bin.m105
-Dsolar.jar=Y:\so-cwsserv03\lo8\SRC680\wntmsci10.pro\bin.m105
-Dsolar.doc=Y:\so-cwsserv03\lo8\SRC680\wntmsci10.pro\doc.m105
-Dcommon.jar=Y:\so-cwsserv03\lo8\SRC680\common.pro\bin.m105
-Dcommon.doc=Y:\so-cwsserv03\lo8\SRC680\common.pro\doc.m105
-f build.xml
-emacs

Buildfile: build.xml
|