Commit ae465b8d authored by anton

ANS Forth extension proposals etc.

parent 0bc30d48
<title>Buffering clarification</title>
<h3><a name="buffering">Buffering clarification</a></h3>
[<a href="proposals.html">Other proposals</a>]
<h4>Problem</h4>
The standard does not specify whether buffering is allowed for words
dealing with the user input and output devices, and the file
words. The existence of FLUSH-FILE indicates that buffering is allowed
to some extent.
<h4>Proposal</h4>
No buffering is allowed for the user input device. No buffering is
allowed for the user output device if it is a terminal. Buffering is
allowed for the user output device if it is not a terminal (e.g., a
file), and for the file wordset. The amount of buffering is system-defined.
<pre>
OUTFILE-ID ( -- file-id ) file-ext
</pre>
file-id is the file identifier of the current user output device.
<h4>Remarks</h4>
Allowing buffering for a non-terminal user output device may increase
the performance of filters.
<p>Flushing the user output device can be achieved with
<pre>
OUTFILE-ID FLUSH-FILE ...
</pre>
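<p>As a minimal sketch, the flush can be wrapped in a helper word.
The names <code>flush-output</code> and <code>prompt</code> are
hypothetical, and the sketch assumes a system that provides the
proposed OUTFILE-ID. FLUSH-FILE returns an ior, which is passed to
THROW here:
<pre>
\ flush-output is a hypothetical helper, not part of the proposal
: flush-output ( -- )
  outfile-id flush-file  \ -- ior
  throw ;                \ a non-zero ior is reported as an exception

: prompt ( -- )
  ." working..." flush-output ;  \ make the text visible before a long computation
</pre>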
<h4>Experience</h4>
Gforth up to 0.3.0 used no buffering for KEY, line buffering for
output to a terminal, and block buffering for output to
non-terminals. Several users complained about the buffering of the
user output device (they used a terminal), so starting with 0.4.0 we
use no buffering for the user output device if it is a
terminal. Nobody has complained about the file buffering.
<p>Gforth has implemented OUTFILE-ID since 0.2, and it is in use
(mainly internally).
<p>Experiments with a filter we had on Gforth under Linux showed no
significant performance advantage for buffering the user output
device. We should repeat this experiment with a more I/O-intensive
filter (the one we measured produced only 94KB of output).
<p>Another experiment with Gforth under Linux-Alpha resulted in a
worst-case slow-down factor of 25 for turning off buffering. The
benchmark used in this experiment was:
<pre>
: foo
  10000000 0 +do
    [char] x emit
  loop ;
</pre>
<hr><a href="/anton/">Anton Ertl</a>
\ example implementation and test cases
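: defer ( "name" -- )
  create ['] abort ,     \ a deferred word aborts until an action is set
does> ( ... -- ... )
  @ execute ;

: defer@ ( xt1 -- xt2 )  \ xt2 is the current action of deferred word xt1
  >body @ ;

: defer! ( xt2 xt1 -- )  \ make xt2 the action of deferred word xt1
  >body ! ;

: is  \ interpretation: ( xt "name" -- )
  state @ if
    POSTPONE ['] POSTPONE defer!
  else
    ' defer!
  then ; immediate

: action-of  \ interpretation: ( "name" -- xt )
  state @ if
    POSTPONE ['] POSTPONE defer@
  else
    ' defer@
  then ; immediate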
\ test cases
require test/tester.fs
{ defer defer1 -> }
{ : is-defer1 is defer1 ; -> }
{ : action-defer1 action-of defer1 ; -> }
{ ' * ' defer1 defer! -> }
{ 2 3 defer1 -> 6 }
{ ' defer1 defer@ -> ' * }
{ action-of defer1 -> ' * }
{ action-defer1 -> ' * }
{ ' + is defer1 -> }
{ 1 2 defer1 -> 3 }
{ ' defer1 defer@ -> ' + }
{ action-of defer1 -> ' + }
{ action-defer1 -> ' + }
{ ' - is-defer1 -> }
{ 1 2 defer1 -> -1 }
{ ' defer1 defer@ -> ' - }
{ action-of defer1 -> ' - }
{ action-defer1 -> ' - }
<title>Exception proposal</title>
<h3><a name="exception">EXCEPTION</a></h3>
[<a href="proposals.html">Other proposals</a>]
<h4>Problem</h4>
Libraries cannot introduce throw values, because they don't know which
values are used by other libraries or the application.
<p>The system does not know how to report a new exception.
<h4>Proposal</h4>
<pre>
EXCEPTION ( c-addr u -- n ) exception
</pre>
n is a previously unused THROW value in the range
{-4095...-256}. Consecutive calls to EXCEPTION return consecutive
decreasing numbers.
<p>The system may use the string denoted by c-addr u when reporting
that exception (if it is not caught).
<p>After a marker is executed or a word is forgotten that was defined
before EXCEPTION was called, THROWing n is an ambiguous condition.
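<p>A minimal sketch of how such a word could be approximated in
standard Forth follows. It is not the Gforth implementation; the
variable name <code>next-exception</code> is an assumption, and the
sketch simply discards the message string instead of keeping it for
error reports:
<pre>
variable next-exception  -256 next-exception !  \ next unused THROW value

: exception ( c-addr u -- n )
  2drop                               \ this sketch does not keep the message
  next-exception @                    \ -- n
  dup -4095 -255 within 0=            \ n outside {-4095...-256}?
  abort" no free exception numbers"
  -1 next-exception +! ;              \ consecutive calls count downwards
</pre>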
<h4>Typical Use</h4>
<pre>
s" Out of GC-managed memory" EXCEPTION CONSTANT gc-out-of-memory
...
... gc-out-of-memory THROW ...
</pre>
<h4>Remarks</h4>
The restriction to values in the range {-4095...-256} ensures that
existing standard programs continue to work.
<p>The requirement to return consecutive decreasing THROW values makes it
possible to check for whole classes of exceptions with WITHIN:
<pre>
... CATCH ?DUP IF
  DUP lib-last-exception lib-first-exception 1+ WITHIN IF
    ... \ deal with exceptions from lib
  ELSE
    THROW \ pass on the exceptions that we don't know how to handle
  THEN
THEN
</pre>
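<p>The range check can also be factored into a word of its own. In the
following sketch the constants get literal values only to keep the
example self-contained; a real library would obtain them from
EXCEPTION, and the name <code>lib-exception?</code> is hypothetical:
<pre>
-256 constant lib-first-exception  \ assumed: first value the library received
-258 constant lib-last-exception   \ assumed: last value the library received

: lib-exception? ( n -- flag )
  lib-last-exception lib-first-exception 1+ within ;
</pre>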
<p>The ambiguous condition after forgetting allows systems to reclaim
exception numbers and the memory taken up by the strings on
forgetting. Systems are not required to do this.
<p>Andrew Haley also voiced concerns about the interaction with
multitasking systems. This proposal can be integrated with
multitasking systems in several ways:
<ul>
<li>Exception numbers can simply be treated like dictionary space: Each
task/user gets a piece of the available range and allocates numbers
within this space independently.
<li>Each task maintains its own mapping of exceptions. The same
exception number could be used for different purposes in different
tasks, but exception number usage would be restricted to the task and
could not be used across tasks.
</ul>
<h4>Experience</h4>
<code>EXCEPTION</code> has been implemented in Gforth since before release
0.4.0.
<p>An approximation in ANS Forth is included in the Gforth compat library
which is also included in the <a
href="../garbage-collection.zip">garbage collector</a>.
<p><code>EXCEPTION</code> is used in the <a
href="../garbage-collection.zip">garbage collector</a>.
<h4>Comments</h4>
Michael L. Gassanenko on experience:
<pre>
I used:
CREATE not-ready
... IF not-ready THROW ...
BTW, the system(s) that I have to work on have no built-in CATCH .
</pre>
Peter Knaggs:
<pre>
another part of your "library" model, most useful.
</pre>
<hr><a href="/anton/">Anton Ertl</a>
<title>Extension queries</title>
<h3>Extension queries</h3>
[ <a href="rfds.html">RfDs/CfVs</a> | <a href="proposals.html">Other proposals</a> ]
<!--
<h4>Poll standings</h4>
See <a href="#voting">below</a> for voting instructions.
<h5>Systems</h5>
<pre>
[ ] conforms to ANS Forth.
[ ] already implements the proposal in full since release [ ]:
[ ] implements the proposal in full in a development version:
[ ] will implement the proposal in full in release [ ].
[ ] will implement the proposal in full in some future release.
There are no plans to implement the proposal in full in [ ].
[ ] will never implement the proposal in full:
</pre>
<h5>Programmers</h5>
<pre>
[ ] I have used (parts of) this proposal in my programs:
[ ] I would use (parts of) this proposal in my programs if the systems
I am interested in implemented it:
[ ] I would use (parts of) this proposal in my programs if this
proposal was in the Forth standard:
[ ] I would not use (parts of) this proposal in my programs.
</pre>
<h5>Informal results</h5>
-->
<h4>Problem</h4>
How does a program know whether the system it runs on supports one of
the extensions that ran through the RfD/CfV process, so that the
program can implement the extension itself or work around its absence?
<h4>Proposal</h4>
If the string passed to <code>ENVIRONMENT?</code> starts with "X:",
<code>ENVIRONMENT?</code> returns false if the system does not
implement the extension indicated by the query string in full, or if
there is no such extension that has gone to a CfV.
<p>For an extension from the <a href="rfds.html">list of CfVs</a>,
take the linked-to filename, delete the ".html", and prepend "X:" to
construct a query string for the extension.
<p>If the system implements the extension, <code>ENVIRONMENT?</code> may
return true (without additional values) or false.
<h4>Typical Use</h4>
<pre>
S" X:deferred" ENVIRONMENT? 0= [IF]
... \ reference implementation of the deferred words proposal
[THEN]
</pre>
<h4>Remarks</h4>
<h5>Why allow returning false when the system supports the extension?</h5>
Returning false when the system supports the extension will usually be
safer than returning true when the system does not support the
extension; in the former case the program will be slower, or have
degraded features; in the latter case the program will usually fail in
unpredictable ways.
<p>Therefore, systems must not return true for extensions that have
not yet gone to a CfV (the proposal for the extension could still
change).
<p>So, if a system happens to already support the extension, it will
have to report false on queries for the extension at least from the
time when the proposal goes to a CfV until the time that an update of
the system with updated extension queries is released.
<p>Moreover (and possibly more importantly), this feature means that
systems whose implementors have never heard of (or ignore) RfDs and
CfVs will work correctly for extension queries (as long as they don't
support any queries starting with "X:" on their own), so a program
written to cope with this specification will usually work correctly
even on such systems.
<h5>Why not let ENVIRONMENT? return a flag and true, like for wordset
queries?</h5>
This proposal is easier to use. What is the point of returning an
extra flag? "Yes, we have heard of that extension, but no, we have
not implemented it"? That is not useful information to have; what
should a program do with it?
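<p>For comparison, a standard wordset query leaves either false or a
flag and true, so a program has to normalize the result before using
it, whereas an "X:" query can be used directly. A sketch of such a
normalizing word (the name <code>wordset?</code> is hypothetical):
<pre>
: wordset? ( c-addr u -- flag )
  environment? 0= if false then ;  \ unknown query: supply the missing flag
</pre>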
<h5>Why the "X:" prefix?</h5>
This will hopefully ensure that there is no naming conflict with any
existing environmental query of any system; it also reserves a part of
the environmental query name space (by requiring a false result for
anything that has not gone to a CfV), without consuming all of it.
<p>If you know of any name conflict of the "X:" prefix with an
existing system and have a better suggestion for a prefix, let me
know.
<h5>What about extension proposals that have not (yet) gone to a
CfV?</h5>
If you want to introduce queries for them, do it with a different
prefix.
<h5>Why not include extension proposals that have not (yet) gone to a
CfV?</h5>
They may still change before they go to a CfV, so it would not be
clear if the system and the querying program refer to the same
proposal.
<h5>Implementation and Tests</h5>
<ul>
<li><a href="reference-implementations/extension-query.fs">Reference implementation</a> (easy: empty file).
<li><a href="tests/extension-query.fs">Tests</a>
</ul>
<h4>Experience</h4>
All ANS Forth systems I know implement this proposal in a minimal way
(answer all queries with false). None implement it in a non-minimal
way. No programs have used the proposal yet.
<h4>Change history</h4>
<dl>
</dl>
<h4>Comments</h4>
<!--
<h4><a name="voting">Voting instructions</a></h4>
Fill out the appropriate ballot(s) below and mail it/them to me
<anton@mips.complang.tuwien.ac.at>. Your vote will be published
(including your name (without email address) and/or the name of your
system) here. You can vote (or change your vote) at any time by
mailing to me, and the results will be published here.
<p>Note that you can be both a system implementor and a programmer, so
you can submit both kinds of ballots.
<h4>Ballot for systems</h4>
If you maintain several systems, please mention the systems separately
in the ballot. Insert the system name or version between the
brackets. Multiple hits for the same system are possible (if they do
not conflict).
<pre>
[ ] conforms to ANS Forth.
[ ] already implements the proposal in full since release [ ].
[ ] implements the proposal in full in a development version.
[ ] will implement the proposal in full in release [ ].
[ ] will implement the proposal in full in some future release.
There are no plans to implement the proposal in full in [ ].
[ ] will never implement the proposal in full.
</pre>
If you want to provide information on partial implementation, please
do so informally, and I will aggregate this information in some way.
<h4>Ballot for programmers</h4>
Just mark the statements that are correct for you (e.g., by putting an
"x" between the brackets). If some statements are true for some of
your programs, but not others, please mark the statements for the
dominating class of programs you write.
<pre>
[ ] I have used (parts of) this proposal in my programs.
[ ] I would use (parts of) this proposal in my programs if the systems
I am interested in implemented it.
[ ] I would use (parts of) this proposal in my programs if this
proposal was in the Forth standard.
[ ] I would not use (parts of) this proposal in my programs.
</pre>
If you feel that there is closely related functionality missing from
the proposal (especially if you have used that in your programs), make
an informal comment, and I will collect these, too. Note that the
best time to voice such issues is the RfD stage.
-->
<hr><a href="/anton/">Anton Ertl</a>
<title>FAST-EXECUTE proposal</title>
<h3><a name="fast-execute">FAST-EXECUTE</a></h3>
[<a href="proposals.html">Other proposals</a>]
<h4>Problem</h4>
The stack-effect of EXECUTE usually cannot be determined
statically. In the context of an optimizing compiler this creates a
significant performance problem: no register allocation can be
performed across an EXECUTE or across any word that may call EXECUTE
directly or indirectly.
<p>The frequent invocation of EXECUTE in object-oriented programs makes
it important to avoid this cost.
<h4>Proposal</h4>
<pre>
FAST-EXECUTE core-ext
</pre>
<h5>Interpretation:</h5>
Interpretation semantics for this word are undefined.
<h5>Compilation: ( u1 u2 u3 u4 -- )</h5>
Append the run-time semantics given below to the current definition.
<h5>Run-time: ( u1*x u3*r xt -- u2*x u4*r )</h5>
Remove xt from the stack and perform the semantics identified by
it. Other stack effects are due to the word EXECUTEd. An ambiguous
condition exists if xt does not have the stack effect ( u1*x u3*r --
u2*x u4*r ).
<h4>Typical Use</h4>
<pre>
... ['] + [ 2 1 0 0 ] FAST-EXECUTE ...
</pre>
<h4>Remarks</h4>
This word does not introduce new functionality. It can be implemented
(without the performance-enhancing effect) on standard systems with
<pre>
: FAST-EXECUTE 2DROP 2DROP POSTPONE EXECUTE ; IMMEDIATE
</pre>
We can therefore wait safely until we have more experience with this
word before adopting it.
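<p>A small usage sketch with that fallback definition (the word
<code>add</code> is only an illustration): the four counts are
consumed at compile time, and at run time the word behaves like
EXECUTE.
<pre>
: FAST-EXECUTE 2DROP 2DROP POSTPONE EXECUTE ; IMMEDIATE

: add ( n1 n2 -- n3 )
  ['] + [ 2 1 0 0 ] FAST-EXECUTE ;  \ + takes 2 cells and returns 1, no FP items

3 4 add .  \ prints 7
</pre>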
<h4>Experience</h4>
none.
<h4>Comments</h4>
Michael L. Gassanenko:
<pre>
I think, the new standard must have a section of "experimental words",
as FORTH-83 did, and this proposal can go only to this section.
</pre>
Peter Knaggs:
<pre>
I can see very much why you would want this, especially
for optimising compilers. However, using the assume command from my
stack algebra will have the same effect. (I must write up a version of
that for JFAR.)
</pre>
<hr><a href="/anton/">Anton Ertl</a>
<title>Forth 200x</title>
<h1>Forth 200x</h1>
<h2>The short story</h2>
A new standards process (Forth 200x) for updating the '94 standard is
underway. It will produce a formal standards document; proposals for
changes to the '94 standard should run through the <a
href="rfds.html">RfD/CfV process</a> before being discussed at the
standards meeting. There is now a <a
href="http://groups.yahoo.com/group/forth200x/">mailing list</a> for
RfDs/CfVs and other issues related to the Forth 200x effort. The next
standards meeting will be held on the day before EuroForth 2005, i.e.,
on Oct 20th, 2005 in Santander (Spain). It has not been decided
whether an official standards body (like ISO) will be involved.
<h2>The long story</h2>
At EuroForth 2004 we had a workshop <em>Forth 2005</em> about an
update of the Forth standard. There's a <a
href="http://www.complang.tuwien.ac.at/anton/euroforth2004/photos/img_1824.jpg">picture
of the blackboard (1.3MB)</a> that summarizes the main points. The
participants decided to take some votes, so you see some vote results.
Here's the decoded (and more bandwidth-friendly) form:
<ul>
<li>Should such an effort be done at all? Most people seemed to like
the idea.
<li>Should the new effort only deal with existing practice, or also
with new ideas?
<li>Should we use the <a
href="rfds.html">RfD/CfV
process</a> to produce semiformal proposals for changes to the
standard before we decide on the new standard? (Vote: 13 yes, 0 no, 2
abstain.)
<li>It turned out that a number of participants do not read Usenet;
therefore a public moderated mailing list (with a public archive) was
proposed, and Peter Knaggs volunteered as moderator (14Y:0N:1A). The
mailing list was created right away: <a
href="http://groups.yahoo.com/group/forth200x/">Forth200x</a>, and
moderation currently happens by getting approved as a member of the
mailing list (only members are allowed to post). You can become a
member right away by sending a request to
forth200x-subscribe@yahoogroups<tt>.</tt>com or via the <a
href="http://groups.yahoo.com/group/forth200x/">mailing list
homepage</a>. I am not yet sure if and how an RfD can be processed in
parallel in comp.lang.forth and in the mailing list, but I will try it
at least for those RfDs that I do.
<li>Should we run the standard through a standards body like ANSI,
ISO, IEEE, etc.? If so, which one? Opinions were divided on that,
but most seemed to agree that we should get going and possibly create
a new standards document first, and deal with a standards body later
(if at all). It was proposed to defer answering the question for 1
year (12Y:0N:3A). One argument against involving standards bodies is
that they want to have an exclusive copyright on the document, so that
even the developers of the standard lose the right to copy and
continue to develop it.
<li>Document format questions and a standards editor. Some people
favoured starting with the HTML version of the standard and sticking
with that format, others favoured MS Word (which many strongly
opposed), some proposed using LaTeX. One argument for Word was that
it supports change bars which supposedly other document formats don't.
Finally someone pointed out that the editor of the standards document
has to be comfortable with the document format. Anton Ertl
volunteered as editor (15Y:0N:0A). One problem with that is that I
was also volunteered and approved as chairman of the effort, but I
guess that can be resolved before the editor role becomes active.
<li>Should there be a standards meeting? Where and when? We decided
to have a standards meeting 1 day before the next EuroForth
(9Y:0N:7A), i.e., on Oct 20th, 2005 in Santander (Spain). The
standards meeting should only deal with proposals that have run
through the RfD/CfV process.
<li>Anton Ertl was nominated chairman, and should inform the Forth
community (in the form of the various known formal and informal
groups) of this effort.
</ul>
<hr><a href="/anton/">Anton Ertl</a>
parse-word
case insensitivity
number prefixes
separate FP stack
{ (locals), fp locals, buffer locals
[defined] [undefined]
required
directory handling for included and required
0 for NIL
S\" .\"
Using TAB, CR, LF, FF in source code
feature requests (detect/ask for these extensions or Forth 200x)
ALIAS
<title>Input source restoration proposal</title>
<h3><a name="input-source-restoration">restoration of the input source specification on THROW</a></h3>
[<a href="proposals.html">Other proposals</a>]
<h4>Problem</h4>
The standard makes THROW responsible for restoring the input source
specification. In combination with QUERY, this leads to the following
anomaly:
<pre>
: foo query 1 throw ;
: bar ['] foo catch ;
</pre>
According to the standard, the 1 THROW should restore the input source
specification in effect before the execution of CATCH. If FOO did not
execute a THROW (or 0 THROW), then the input source specification
should not be restored.
<h4>Proposal</h4>
Remove the requirement to restore the input source specification from
the definition of THROW.
<p>Add the following requirement to the definition of LOAD,
INCLUDE-FILE, INCLUDED, EVALUATE etc.: catch exceptions, restore the
input source specification, and throw the exceptions.
<h4>Remarks</h4>
Alternatively, CATCH could be required to restore the input source
specification, irrespective of whether there was a THROW or not. This
would also eliminate the anomaly.
<h4>Experience</h4>
Most systems implementing THROW already implement the proposed
behaviour. I have never heard of anyone complaining about that,
neither from Gforth users nor otherwise. This indicates that no
existing programs would be affected by the change.
<p>AFAIK, there exist no experiences with the alternative.
<h4>Comments</h4>
Michael L. Gassanenko:
<pre>
> Proposal
> Remove the requirement to restore the input source specification from the
> definition of THROW.
I agree. AFAIK, people (including myself) do implement CATCH and
THROW without input source restoration, because if your program
controls live hardware, why should you spend time and memory on >IN etc.?
</pre>
<hr><a href="/anton/">Anton Ertl</a>
<title>Output redirection proposal</title>
<h3><a name="output-redirection">Output redirection</a></h3>
[<a href="proposals.html">Other proposals</a>]
<h4>Problem</h4>
Many words exist for convenient output to the user output device
(e.g., ., .S, F.). Replicating this functionality for output to files
is very cumbersome and a lot of work.
<h4>Proposal</h4>
<pre>
redirect-output ( ... file-id xt -- ... ) file-ext
</pre>
Set the user output device to file-id, EXECUTE xt, restore the old
user output device. If an exception is THROWn during the execution of
xt, the old user output device is restored, and the exception is
THROWn onwards.
<h4>Typical Use</h4>
<pre>
... ( r ) report-file @ ['] f. redirect-output ...
</pre>
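<p>The CATCH-based restoration described above can be sketched as
follows. The value <code>out-fid</code> is a hypothetical stand-in for
the system's notion of the current user output device; a real
implementation would have to change the device that EMIT and TYPE use:
<pre>
0 value out-fid  \ hypothetical: models the current user output device

: redirect-output ( ... file-id xt -- ... )
  out-fid >r          \ remember the old output device
  swap to out-fid     \ switch to file-id, leaving xt on the stack
  catch               \ run xt; -- ... 0 | ... n
  r> to out-fid       \ restore the old device in both cases
  throw ;             \ pass a caught exception on
</pre>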
<h4>Remarks</h4>
The syntax is CATCH-like to ensure proper restoration under all
circumstances (including THROWs). A more convenient syntax should be
found for CATCH as well as redirect-output.
<p>Michael Gassanenko's point about output to memory is a good
one. Either we address this by adding words for creating fids for
memory buffers, or we change this proposal to take an xt ( c-addr u --
) instead of a fid.
<p>Why not simply have a variable for the output file, similar to
BASE? Providing varying bases through a variable BASE was a mistake; I
am sure no Forth programmer will have trouble reciting a story where
BASE led to problems. A similar design mistake for output redirection
would cause more trouble (e.g., some bug causes a THROW while output
is redirected -> the user does not even know that something happened).
<p>STDOUT is used in Gforth for the default OUTFILE-ID (i.e., the
standard output at the start of the system).
<h4>Experience</h4>
none
<h4>Comments</h4>
Michael L. Gassanenko:
<pre>
xt... It smells LISP, and I do not like LISP smell in Forth.
maybe,
&lt;fid&gt; redirect-output N>R .... NR> restore-output
would be better than
: aux14 ." xt=" MYVAR @ U. ;
... &lt;fid&gt; ['] aux14 redirect-output
?
What about redirecting output to strings? (IMO, it would be more useful)
</pre>
Michael L. Gassanenko again, about redirection to memory:
<pre>
What happens if the memory buffer being the current output device
overflows?
</pre>
Peter Knaggs:
<pre>
REDIRECT-OUTPUT
OUTFILE-ID
Not so sure about these two. Why not define a STDIN and STDOUT words
which provide the standard file-id for terminal I/O and redefine all I/O
words to be file based. Thus redirecting output would simply be a case
of changing the output file id, similar to BASE if you like.
Note that I did suggest something along these lines to the committee
back in '90 but it was rejected.
</pre>