The First Asian Forum for Information Technology (1st AFIT)



Introduction to
Asian Document Style Standardization for Information Interchange (DocSII)

This text or its updated version is posted on
http://www.y-adagio.com/public/committees/docsii/refs/int_docsii.htm.


Yushi Komachi

Panasonic/MGCS, Shimomeguro, Tokyo Japan
email: komachi@y-adagio.com

2002-11-08


1. Overview of DocSII
1.1 Human-readable document and its interchange
1.2 Interchange format
1.3 Focus on "structuring language form with a style specification"
2. DocSII, What to do (more clarification of DocSII activity)
3. DocSII web page and Workshop
3.1 Background
3.2 Scope
3.3 Activity plan
3.4 Workshop 1
4. Some understandings and trials
4.1 Latinization (changing to Latin-alphabet style/layout)
4.2 Conventional Document style
4.3 Trial
5. Conclusion
References

NOTE Red items are included in the old version.

1. Overview of DocSII

1.1 Human-readable document and its interchange

The considerable efforts of Asian character-encoding experts and of ISO/IEC JTC1/SC2 have made the interchange of multilingual character strings feasible.

Character strings make up the logical elements of a document, e.g., a title, an emphasized key word, or a note. To make elements easy to read and understand, some are rendered with specific styles or layouts, e.g., in a large bold font, in an italic font, or with indentation.

See Example 1

These human-readable documents are what is interchanged between originator and recipient.

1.2 Interchange Format

Document interchange that includes styles can be carried out in three types of format:

a) final form (non-revisable), e.g., PDF, PostScript, or SPDL

b) editing tool's form (revisable), e.g., Word format or RTF

NOTE: This form depends on the editing tool and is in many cases proprietary.

c) structuring language form with a style specification by style language (revisable), e.g., [SGML + DSSSL] or [XML + XSL].

NOTE: HTML or XHTML is a simple particular case of this form, where style specification is fixed and implemented within a browser.

Here we focus on case c), where interchanged documents remain revisable, complicated document styles/layouts are supported, and the rendered image is also preserved.

See Example 2 (Japanese text with complicated styles/layouts)
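
As a rough illustration of form c), the sketch below keeps the logical structure and the style specification in separate objects, so that the same revisable document can be rendered under different styles. It is written in Python rather than DSSSL or XSL only for brevity. The element names (`title`, `keyword`, `note`), the two style tables, and the text markers that stand in for fonts and indentation are all assumptions made for this example and are not part of any DocSII deliverable.

```python
import xml.etree.ElementTree as ET

# A structured document: logical elements only, no style information.
DOCUMENT = """
<doc>
  <title>Document interchange</title>
  <para>Styles are kept <keyword>separate</keyword> from structure.</para>
  <note>This note is rendered with indentation.</note>
</doc>
"""

# Two hypothetical style specifications for the same structure.
# Each maps an element name to a (prefix, suffix) pair that stands in
# for real formatting such as bold fonts or indentation.
STYLE_A = {
    "title": ("== ", " =="),
    "keyword": ("*", "*"),
    "note": ("    NOTE: ", ""),
}
STYLE_B = {
    "title": ("# ", ""),
    "keyword": ("_", "_"),
    "note": ("> ", ""),
}


def render_inline(element, style):
    """Render an element's text and its inline children as one string."""
    prefix, suffix = style.get(element.tag, ("", ""))
    parts = [element.text or ""]
    for child in element:
        parts.append(render_inline(child, style))
        parts.append(child.tail or "")
    return prefix + "".join(parts).strip() + suffix


def render(xml_text, style):
    """Render each top-level block element on its own line."""
    root = ET.fromstring(xml_text.strip())
    return "\n".join(render_inline(block, style) for block in root)


print(render(DOCUMENT, STYLE_A))
print(render(DOCUMENT, STYLE_B))
```

Swapping `STYLE_A` for `STYLE_B` changes the rendering without touching the document itself, which is the property that makes this form both revisable and style-preserving.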

1.3 Focus on "structuring language form with a style specification"

Interchange form c) is very flexible and is appropriate for a number of document applications, e.g., e-government or e-commerce. However, writing an actual style specification in a style language requires a high level of expertise in both the style language and document style/layout.

DocSII (Asian Document Style Standardization for Information Interchange) was established by CICC in 2002 as a new project to solve this problem, particularly in the Asian document environment.


2. DocSII, What to do

(More clarification of DocSII activity)

(1) DocSII develops and standardizes a [style language library] described by an existing style language, e.g., DSSSL or XSL.

NOTE:

The style language library is similar to a macro library for a programming language. It makes it feasible to write actual style specifications for structured documents, and it contributes to document interchange that preserves rendered page layouts.

(2) DocSII studies existing document styles/layouts, in particular the major styles/layouts actually used to present Asian documents. The style language library is developed on the basis of this study.
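
The macro-library analogy can be sketched in Python as follows. The functions `heading`, `emphasis`, and `indented_note`, and the property names they return, are hypothetical and chosen only for illustration; they are not taken from ISO/IEC TR 19758 or from any DocSII library. The point is only that a style specification written against such a library names the intended style and leaves the low-level properties to the library.

```python
# Hypothetical style library: each function expands, like a macro,
# into the low-level formatting properties that a style language
# would otherwise require the author to write out by hand.

def heading(level):
    """Return properties for a heading; deeper levels get smaller fonts."""
    sizes = {1: 18, 2: 14, 3: 12}
    return {
        "font-size-pt": sizes.get(level, 10),
        "font-weight": "bold",
        "space-before-pt": 12,
    }


def emphasis():
    """Return properties for an emphasized key word."""
    return {"font-posture": "italic"}


def indented_note(indent_pt=24):
    """Return properties for a note rendered with indentation."""
    return {"start-indent-pt": indent_pt, "font-size-pt": 9}


# A style specification written with the library: one short line per
# logical element, with no low-level properties spelled out.
STYLE_SPECIFICATION = {
    "title": heading(1),
    "section-title": heading(2),
    "keyword": emphasis(),
    "note": indented_note(),
}


def properties_for(element_name):
    """Look up the expanded properties for a logical element."""
    return STYLE_SPECIFICATION.get(element_name, {})


print(properties_for("title"))
print(properties_for("note"))
```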

(3) DocSII recognizes that document styles/layouts are actually employed depending upon

and that they are changing year by year.

(4) DocSII will submit the developed style language library to ISO/IEC JTC1 for international multilingual document interchange preserving rendered page layouts.

NOTE:
Major document styles/layouts for Japanese/English documents have been studied by Japanese experts, and a DSSSL library for those styles/layouts has been submitted to ISO as ISO/IEC TR 19758.

3. DocSII Web page and Workshop

As the first step of its activity, DocSII established its web page and held its first workshop in Tokyo in September 2002. The web page is located at

http://www.y-adagio.com/public/committees/docsii/index.htm

The following subclauses (3.1 through 3.4) are based on the discussions in the workshop.


3.1 Background

(1) Today we can interchange documents over the Internet or through web services. Those documents can contain a variety of languages (multilingual documents) by using internationally approved character codes or character entities.

(2) Structures of those documents can be described by internationally approved description languages, e.g., SGML[1], XML[2] or HTML[3].

(3) Human-readable documents have to be rendered with an appropriate document style/layout. That style/layout should be preserved even when the documents are interchanged.

(4) For document interchange with style/layout, HTML and CSS[4] have been employed. However, they restrict the available formatting objects for the sake of simplicity.

(5) Style specification and interchange beyond HTML/CSS capability can be performed by using style languages, e.g., DSSSL[5] or XSL[6].

(6) However, writing an actual style specification in those languages requires sufficient knowledge of style/layout rules, which depend heavily on the cultural background of the region, country, society or group within which the documents are distributed.

(7) Today's document interchange that preserves style/layout, in particular international interchange, faces this difficulty.


3.2 Scope

DocSII intends to solve the problem: special expertise on document style/layout is required to describe actual style specification using a style language.

DocSII collects styles/layouts employed in Asian countries and systematically classifies the formatting objects.

DocSII creates a style language library. (Using the library, style specification can be carried out without particular expertise in style/layout rules and style languages.)

DocSII submits the library to ISO, requesting international approval. (An internationally approved library will benefit many more document users around the world.)


3.3 Activity Plan

step 1

To collect document styles/layouts employed in Asian countries.

To extract formatting objects from the rules and systematically classify the formatting objects.

step 2

To create a style language library. (This will be done by several experts in style specification languages, who will work in the background of DocSII.)

step 3

To submit the library to ISO, requesting international approval.

NOTE: An existing example of a style language library is ISO/IEC TR 19758[7].


3.4 Workshop 1

3.4.1 Agenda

Workshop 1 was held in accordance with the following agenda:

Date/Time                      Item
Sept.17/10:00                  Opening
Sept.17/10:15                  Chair remark
Sept.17/10:30                  Presentation by delegates
Sept.17/14:30 through 17:00    Discussions
Sept.18/10:00                  Discussions
Sept.18/11:30                  Review of summary
Sept.18/12:30                  Closing

3.4.2 Participants

There were participants from six countries:

3.4.3 Summary

All the participants accepted the following summary:

1. DocSII recognizes that the scope covers:

2. DocSII researches the composition styles and layouts of each country. As a first step, DocSII requests each delegation to research the following formatting objects and to report, in a form similar to clause 4 of ISO/IEC TR 19758, by the end of November 2002 (actual examples are requested where possible):

3. DocSII instructs its secretariat to create a mailing list so that members can hold discussions efficiently.

4. DocSII appoints the delegates to DocSII Workshop 1 as regular members and requests them to notify appropriate experts about the DocSII scope.

5. DocSII requests the members to begin domestic discussions on traditional document styles and layouts.


4. Some understandings and trials

During the discussions at the workshop and at other expert meetings, the following points (4.1 and 4.2) became clear:


4.1 Latinization (changing to Latin-alphabet style/layout)

Today's Asian documents are frequently created with computerized systems. Accordingly, their styles/layouts are based on the formatting functionality of editing/formatting software whose original versions were developed by US or European experts.

This amounts to a "latinization" of Asian document styles.

If such a latinized document style is accepted by many Asian document users, it becomes a new Asian document style. It could itself be regarded as an aspect of document culture.


4.2 Conventional Document style

Conventional document styles have been developed in each country by a number of typographers, based on their experience in typesetting and printing. They may be closely related to the font, printing, and publishing culture of that country.

Some conventional document styles cannot be supported by existing latinized systems. Nevertheless, these styles are still required.

Style languages express a document style specification as a combination of minimal style elements. This results in
- a requirement for a style language library,
but also in
- the capability to support conventional document styles as well as latinized styles.


4.3 Trial

To move forward with the action items agreed at the first workshop, some Japanese experts are collecting examples of document styles/layouts that are specific to Asia or that are not supported by existing simple formatters.

They are shown below:

  1. Dimension splitter (table composition)
  2. Enclosure (for emphasizing)
  3. Line spanning (in English/Thai mixture)
  4. Interrupted underline
  5. Vertical heading in horizontal composition

We hope these examples will raise your interest in document styles/layouts and in DocSII.


5. Conclusion

Experts on DocSII topics are welcome to join the DocSII discussions. They are invited to contact the DocSII secretariat, Ms. Miki Emura (emura@net.cicc.or.jp), at CICC[8].


References