|
|
DilloUsersBug Tracker
[currently broken]
Developers
New Developer
Documentation * Naming&Coding Source repository Dpi1 Spec CSS Spec Authors Security contact Related |
CSS in DilloFinished
Pending
We are following this implementation plan 20081022 Developers with CSS expertise are greatly appreciated! Everything below this line is many years old. There is an ancient prototype which includes this text as documentation. See README in the tarball for more details.
===================================
Cascading Style Sheets (Overview)
===================================
Sebastian Geerken <s.geerken@ping.de>
Apr 2002: first version, posted to the developer's list
Oct 2003: revised and extended, while implementing phase 1, adopted
as developer's documentation
Nov 2003: small corrections
About
=====
This is an overview of the implementation of CSS in dillo, without
details on the internals of the single modules, they are described in
separate documents.
This text does furthermore not cover the problems how to render
particular attributes, since this is the task of the Dw module. Most
attributes can already be rendered, some of them (e.g floats, fixed
positions) have to be implemented, which can be done bit by bit.
Modus Operandi
==============
The implementation of CSS is splitted into two phases. After phase 2,
dillo will hopefully support full CSS, and it should be simple to add
interesting features like XML/CSS parsing. Phase 1 concentrates on the
general structure, and will have the following limitations:
1. There will be a distinction between simple and complex CSS
properties. A complex attribute is, in this context, defined as
an attribute, which change (after it has already been rendered)
will make structural changes in the widget tree necessary,
e.g. when changing the attribute "display" from "inline" to
"block", some words of the parent DwPage will have to be
replaced by a newly created DwPage, which furthermore will
contain these replaced words. On the other hand, changing simple
attributes like fonts, colors etc. will at most involve
recalculating sizes. [1]
In phase 1, the Doctree module will support the construction of
elements with complex properties, but no changes. Since HTML
rendering and CSS processing is done in in asynchronous way (see
below), this means that complex properties are only allowed in
the user agent and the user stylesheet.
2. The CSS module will only handle style sheets with very simple
selectors, we will first focus on a fast implementation.
Supported selectors only regard the element in question, and do
not support attributes, only (HTML) classes.
In phase 2, dillo will get a complete CSS engine, either written
from scratch, or an already existing one (e.g. RCSS by Raph
Levien).
The User's View
===============
An important goal is asynchronous HTML/CSS parsing: when the HTML
parser reads a <LINK> tag referring to an external style sheet, it
continues to render the document _without_ the style sheet, while the
style sheet (if it is not already in the cache) is retrieved and
parsed parallel to this, and then applied on the current rendered part
of the document. If the time difference is large enough, the user will
notice a sudden change of colors, fonts, etc., but will be able to
read the content with less delay time (which may be several seconds).
Overview
========
The following diagram shows the associations between the data
structures, and there multiplicities, for parsing HTML documents.
Worth to notice is that for every document (represented by SgmlDoc),
there is one document tree (Document) and one CSS context.
+------------+ +-------------+
| CssContext |< - - -| Css_doctree |
+------------+ +-------------+
1 ^ ^ |
| , - - - - - - - -' `- - - - - - - - - - - .
1 | | V
+---------+ 1 1 +-----------------+
| SgmlDoc | ------------------------------> | DoctreeDocument |
+---------+ +-----------------+
^ 1 | 1
| |
| 0..1 | *
+------------+ 1 * +-----------+ 0..1 1 +----------------+ 0..1
| SgmlParser | -----> | SgmlState | -------> | DocTreeElement |---.
+------------+ +-----------+ +----------------+ |
. . * | |
/_\ /_\ `-------'
| |
| |
+------------+ +-----------+
| HtmlParser | | HtmlState |
+------------+ +-----------+
| 0..1
|
V 1
+--------+
| HtmlLB |
+--------+
For more information about the SGML and the HTML parser, see
"SGML.txt". In short, the purposes:
- HtmlParser exists as long as the HTML document is parsed, and is
inherited from SgmlParser, which contains all data for the
general SGML/XML parsing, while HtmlParser adds some HTML
specific data.
- SgmlDoc exists as long as a document is shown (partly, when the
SgmlParser still exists, or fully, after the SgmlParser has
already been destroyed). CSS operations are done related to this
structure, since CSS documents may be applied even if the
document has already been retrieved fully, and so the SgmlParser
does not exist anymore.
- HtmlLB exists as long as SgmlDoc exists. It adds some more data
like links and forms.
The way how CssContext works in detail, is described in "CSS.txt"; its
purpose, and details on the other structures are described below.
A further role plays the module "Css_doctree", which provides no data
structures, but provides some functions to prepare CSS values for the
document tree.
CssValue and DwStyle
====================
There are two ways style attributes are handled: CssProperty/CssValue
and DwStyle. The first is used by the CSS module: CssProperty is an
enumeration type, and CssValue a union, which represents values
exactly in the way they have been parsed (i.e. the value may only
depend on the property itself, not e.g. on the context, see below).
Both, the document tree, and the dillo widget, use the structure
DwStyle, which is created by the module "css_doctree", when the
attributes for an element are evaluated. The representation of values
differs generally from CssValue, a particular property is in one of
the following categories (for the exact terminology, see [CSS2]
chapter 6.1):
1. Absolute values are represented directly. Examples are absolute
lengths. The value "auto" is handled the same way.
2. Some relative values are immediately computed, this may depend
on attributes of the parent element. Examples are relative line
heights, i.e. "line-height: 150%" will be computed into an
absolute (pixel) value.
3. Other relative sizes are represented this way in DwStyle,
examples are relative widths and heights.
Whether a specific attribute falls into category 2 or 3, is determined
by two factors:
1. If the attribute value is independent of certain values, which
changes affect only the level of Dw (important: window size),
they can, for simplicity, put into category 2. Otherwise, they
must belong to 3. The latter may not be inherited, for the
reason, see next point.
2. Since only *computed* values may be inherited, attributes,
which values are inherited, may not be part of category 2,
since Dw will not be able to handle them correctly.
The Document Tree
=================
The document tree has two purposes:
1. representation of the document structure, needed (in phase 2)
for the evaluation of CSS selectors, and
2. near-complete encapsulation of the dillo widget.
The SGML parser accesses only the document tree, and not anymore Dw,
only the HTML parser must, in some cases, refer to Dw directly. The
interface is similar to a small subset of the Document Object Model
(see [DOM2]) [2], and provides methods for the following purposes:
1. construction of nodes (mainly elements and text), adding them to
other nodes,
2. examining the structure (e.g. for evaluating CSS selectors),
3. assigning style attributes,
4. drawing, and
5. changing the pseudo class.
Some notes about the latter three points: The document tree is in most
cases able to construct and access the Dw structures simply by style
attributes. E.g., if the attribute "display" has the value "table", it
"knows" that it must create a DwTable and add it to the DwPage
associated with the parent node. This way, the SGML/HTML parser may be
simplified, much functionality can be replaced by a user-agent-defined
style sheet, as in [CSS2] appendix A.
Since this is not in all cases possible, two back-doors are kept open:
1. It is possible to add a special type of element to the tree,
with a specified DwWidget. The <img> tag is processed this
way.
2. Both, DwStyle and CssProperty/CssValue, are extended by
non-standard attributes, when necessary. For better distinction,
they are preceded by "x_". Examples are "x_link" and "x_colspan".
An element may have a pseudo class, which is used in the style
evaluation (see below). Changing this is e.g. done when the user
clicks on a not yet visited link, the pseudo class then switches from
"link" to "visited".
Changes in Dw
-------------
(This is only relevant for phase 2.)
Dw will certainly change for CSS, but some restrictions, which are
inherent to the basic design, will not attempted to be overcome, since
this would make Dw over-complicated. Instead the complexity is
apportioned on both modules, Dw and the document tree. Some
restrictions to consider:
- The allocation (this is the space a widget a widget allocates at
a time) of a dillo widget is always rectangular, so that e.g. an
inline section cannot be represented as a widget, but only as a
part of a widget. (Furthermore, a dillo widget is rather complex,
so that the number of widgets should be limited.)
- The allocation of a child widget must be a within the allocation
of the parent widget. In some cases, this leads to a widget tree,
which order differs from the document tree; e.g. since the HTML
document snippet
<ul>
<li>Some text.</li>
<li>
<div style="float:right; width=50%">Some longer text, so
that the effect described in this passage can be
demonstrated.
</div>
Some more and longer text.</li>
<li>Final text.</li>
</ul>
may be rendered like this:
- - - - - - - - - - - - - - - - - - - - - - - - - .
| * Some text.
* Some more and - - - - - - - - - - - - -.|
| longer text. |Some longer text, so that
* Final text. the effect described ||
` - - - - - - - - - - - |above in this passage can '
be demonstrated. |
` - - - - - - - - - - - -
Dw will render this as a DwPage (for the <ul>, which contains
three DwListItem's (for the <li>'s), and the <div> section will
be a DwPage, too. However, due to the restriction mentioned
above, this DwPage cannot be a child of the second DwListItem,
since the allocation of the DwPage exceeds the one of the
DwListItem. Instead, it must be a child of a widget higher in the
tree.
(Notice that floats are not implemented yet.)
- Dw handles some state changes quite well, namely changes of
dimensions (e.g. sizes of images) and style changes of words.
More complex changes (see remarks on "complex attributes" in
"Modus Operandi" above), as they will be necessary with CSS,
affect the widget tree itself. Dw may be extended by operations
making this conversion possible at all, but switching is
generally the task of the document tree.
The CSS Context (and the Css_doctree Module)
============================================
The CSS module is responsible for parsing style sheets, and evaluating
CSS selectors ([CSS2] chapter 5). The current design will change when
switching from phase 1 to phase 2, but some of the functions are only
accessible by the Css_doctree submodule ("p_" prefix), which will
absorb most design changes.
The evaluation function (within Css_doctree) gets the element node,
and the default attributes (DwStyle) as argument, and returns a
DwStyle with the values described in the section "CssValue and
DwStyle". The caller is responsible to determine the default
attributes (those which are not changed, if no rule is found), either
by setting them to default values, or copying them from the parent
element.
A "context" may combine several style documents, from different
origins (see [CSS2] chapter 6.4), so that the parser always adds rules
to a context. There is no need for an incremental parser, instead, a
document is always parsed as a whole (see below).
In some cases, it is necessary to add element-specific rules to the
context, either for evaluating the "style" attribute, or to process
(mostly deprecated) HTML elements and attributes.
Pseudo Elements and Generated Content
=====================================
(Except list items, content generation is planned for phase 2, so most
of this is irrelevant for phase 1.)
Content is generated in two cases:
1. if the ":before" and ":after" pseudo elements are used ([CSS2]
chapter 5.12.3), and
2. for list items.
Content generation is the task of the document tree. For dealing with
":before" and ":after", one document element refers to three DwStyle:
(i) two from the evaluation of ":before" and ":after", and
(ii) one actual style.
The actual style may be affected by the pseudo class of the element.
(There are many things missing for pseudo elements, they have to be
specified, and some may not be implemented in dillo.).
What Actually Happens in Different Situations
=============================================
Adding Elements to the Tree
---------------------------
This happens after the parser has read an opening tag:
1. The HTML parser evaluates the element attributes to create one
or two new, element-specific rules, and inserts them into the
CSS context.
2. The SGML parser adds a new element to the current document
element.
3. From the CSS context, the style is determined, based on the
style of the parent, where some attributes are set to default
values.
4. This style is attached to the new element, and the element is
drawn (e.g. the appropriate DwWidget methods are called).
Handling the <STYLE> Element
----------------------------
The content of <STYLE> is written into the stash. When </STYLE> is
read, following is done:
1. The HTML parser passes the stash content to the CSS parser,
which inserts the new rules into the CSS context.
2. The styles for the whole document are recalculated, and the
document is redrawn.
Handling the <LINK> Element
---------------------------
An external style sheet is read by a special cache client, which
writes the content into a buffer, and is associated with the document
tree and the CSS context. If the data has been fully retrieved, the
process is similar to <STYLE>, described above.
What is important is to preserve the order the documents have been
specified in the document, e.g. for
<link href="style1.css" rel="stylesheet" type="text/css">
<style>
/* style sheet 2 */
</style>
it is likely that the content of <style> will be processed earlier,
although it has been specified later in the HTML document. This is
done by assigning numbers to the style sheet documents, which are
increased each time <link> or <style> is processed.
CSS/XML Parsing
===============
An interesting feature is handling XML documents and render them in a
way defined by CSS, and specified in the XML document by the
<?xml-stylesheet?> processing instruction (see [XML-SS]). This will be
done by a new XML parser, which will be based on the current SGML
parser, and handle generic XML documents, for which no cache client
has been written in dillo.
Notice that this will not make sense, before phase 2 of the document
tree has been finished, since XML/CSS rendering will, unlike HTML/CSS
rendering, depend largely on complex attributes in author stylesheets.
Miscellaneous Notes
===================
This section is a reminder for things having to be considered, but do
not fit to anywhere else.
* (Only relevant for phase 2.) It is likely that style changes in the
document tree will affect the widget structure to an amount that
makes some external references (e.g. iterators) undefined. E.g.,
this may happen with the "find text" function:
1. The document, or a part of the document, has been loaded, but
not yet an external CSS style sheet referred by this document.
2. The user chooses "find text", inserts a search test, and
presses "OK". This will initialize the find text state,
instantiate an iterator, moving this iterator to the first
occurrence of the text in the document, and highlight text.
3. Before the user presses "OK", to find the next occurrence, the
style sheet has been fully retrieved, and is applied on the
document. In some cases, this may make it necessary to change
large parts of the current widget tree.
4. As soon as the user presses the "OK" button in the find text
dialog, the iterator is tried to be moved to the next
occurrence of the search text, but now it contains wild
pointers, and dillo will crash!
This case must at least be caught, so that the find text state can
be reset (and so finding text starts from the beginning, discarding
the current position). Better would be a way to "convert" the
iterators, so that their effective state remains. Details on the
latter depend on the detailed design of the document tree.
Footnotes
=========
[1] FWIW, a complete list of supported attributes, both simple and
complex. These simple attributes are already fully implemented:
background-color, border-spacing, color, font, font-family,
font-size, font-style, font-weight, text-align, vertical-align,
width.
These simple attributes are only partly implemented (at the time
when this was written):
border-{top|right|bottom|left}-color,
border-{top|right|bottom|left}-style,
border-{top|right|bottom|left}-width, text-decoration, height.
And these (still simple) ones not at all:
background-attachment, background-image, background-position,
background-repeat, border-collapse, clip, font-size-adjust,
font-stretch, font-variant, letter-spacing, max-height,
max-width, min-height, min-width, outline-color, outline-style,
outline-width, overflow, text-indent, text-shadow.
Two attributes are partly simple: A simple implementation of
'list-style-type' is only possible for 'disc', 'circle' and
'square', textual values (numbers etc.) are complex. Furthermore,
'white-space' can only implemented in a simple way for the values
'normal' and 'nowrap', not for 'pre'. (These attributes are
already both implemented, in the way, how simple/complex
attributes are planned for phase 1.)
These simple attributes make only sense in conjunction with
complex attributes:
clear, bottom, left, line-height, marker-offset, right, top.
This is the list of complex attributes:
caption-side, content, counter-increment, counter-reset,
cursor, direction, display, empty-cells, float,
list-style-image, list-style-position, marks, position, quotes,
text-transform, unicode-bidi, visibility, word-spacing, z-index.
Only 'display' is already implemented in the planned, limited way.
And, for completeness, dillo won't implement these, because they
cover other aspects dillo does not support:
azimuth, cue, cue-after, cue-before, elevation, pause,
pause-after pause-before, pitch, pitch-range, play-during,
richness, speak, speak-header speak-numeral, speak-punctuation,
speech-rate, stress, voice-family, volume, orphans, page,
page-break-after, page-break-before, page-break-inside, size,
widows.
Finally, a special case is 'table-layout', which will be ignored
by dillo, because of its way to render table. ('table-layout' only
describes the algorithm, not the result.)
[2] In the future, it may be, that the document tree will be fully DOM
compliant. However, now, some compromises are made in regard to
the interfaces, e.g. there are no text nodes, but instead the
caller (the SGML parser) has to split the text into words and
spaces.
References
==========
[CSS2] Cascading Style Sheets, level 2, CSS2 Specification
http://www.w3.org/TR/1998/REC-CSS2-19980512
[DOM2] Document Object Model (DOM) Level 2 Core Specification
http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113
[XML-SS] Associating Style Sheets with XML documents Version 1.0
http://www.w3.org/1999/06/REC-xml-stylesheet-19990629/
|