Commit 8b7ac24a authored by Radford Neal

Changes for release of 2019-01-11

parent 7bab621b
configure
NEWS
NEWS.pdf
File added
Revision: 58871
Last Changed Date: 2019-00-00
Last Changed Date: 2019-01-11
The R release pqR is based on, unchanged except for adding the README
in mods, and the 00 file within it, containing this text.
Updates R-admin.texi to minimally say where pqR can be obtained.
This mod updates the copyright notices at the start of many
source files.
Replaced the section of "future directions" in R-ints.texi with a stub
for pqR, with no content at the moment.
Move NEWS to ONEWS, ONEWS to OONEWS, OONEWS to OOONEWS. Create new
NEWS.Rd with initial content for pqR.
Changes to display pqR version information, including separating the
pqR version from the version of R that pqR is based on. Also updates
some copyright notices, and changes bug reporting address.
New version of function Rprofmem (and related Rprofmemt) implemented.
See NEWS entry.
Miscellaneous code cleanups, including the following:
o Cleaned up inconsistencies in checks for arity of primitives in eval.c
Removed checks from functions implementing language features "repeat",
"while", and "function", consistent with other language features (lack
of checks does not cause a crash --- missing arguments just appear
to be NULL). Changed the check in do_set to the standard form
using checkArity, with names.c changed to make the arity be 2 rather
than -1.
o The "spare" bit in sxpinfo is renamed to "misc", and the documentation
in the code and in R-ints.texi is changed to reflect this, and to
document that this bit is actually used.
o The documentation before do_seq in seq.c is changed to be correct
(seq.int is no longer SPECIAL), and the incorrect reference to
seq.int in R-ints.texi is removed.
o The fixup_NaRm function defined in summary.c is moved to match.c,
where it belongs. It is now properly declared in Rinternals.h,
rather than the definition in summary.c being surreptitiously
referenced as an extern from logic.c.
o Defined an isRaw macro globally for consistency with other such macros,
deleting several local definitions of this.
o Fixed problem with recompilation with byte compilation not enabled.
o Fixed (by a kludge, not a proper fix) a bug in the "tre" package that
shows up when WCHAR_MAX doesn't fit in an "int". The kludge reduces
WCHAR_MAX to fit, but really the "int" variables ought to be bigger.
(This problem shows up on a Raspberry Pi running Raspbian.) Also fixed
a going-past-end-of array bug (that probably never happened).
o Code in R-2.15.0 exists for maintaining a cache of primitive objects,
but this code forgets to ever actually enter a primitive into the cache.
This is now done (in mkPRIMSXP).
o Updated R-admin to discuss derived but distributed files (eg configure).
See also entries in NEWS for other mods that merit description there.
The "inspect" .Internal function was changed to show some details of
pairlist nodes, if SHOW_PAIRLIST_NODES is defined as 1 in inspect.c.
It also shows length and truelength (the hash) for CHARSXP nodes.
Optionally displays details of promises, and now handles R_UnboundSymbol
correctly.
Finally, it no longer produces output with tabs (spaces instead).
Rinternals.h now has a #define for R_OMP_FIRSTPRIVATE_VARS, which
contains a comma-separated list of variables that should usually be
included in the firstprivate part of an OMP parallel construction,
since they are used in macros such as NA_REAL.
The only use of this in the interpreter itself gets deleted by a later
mod, but this mod is retained anyway...
New pnamedcnt primitive function for printing named field of object.
See NEWS entry for details.
Eliminated non-deterministic aspect of testing of random number
generators. Also prints more details on failure.
See NEWS entry for details.
This mod defines a cons_with_tag function that creates a CONS cell
given its CAR, CDR, and TAG fields. This makes for clearer, more
concise, and faster code when CONS cells with TAG set need to be created.
Defines several functions for use within the R interpreter.
New functions copy_string_elements and copy_vector_elements are now
defined. Compared to using SET_STRING_ELT and SET_VECTOR_ELT, these
allow copying of multiple elements without error checks on every
element, and sometimes without old-to-new checks on every element.
A new function copy_elements is also defined, for copying elements in
any sort of vector (duplicating non-atomic elements, and using
copy_string_elements).
A new function set_elements_to_NA_or_NULL is defined for doing that.
See also the NEWS item on a bug fix.
New functions are defined for finding exact or partial matches, to
replace the existing pmatch, psmatch, and other matching functions
(but pmatch and psmatch are retained, in case anyone uses them).
These functions return 0, -1, or +1 for no match, partial match, and
exact match, allowing any subset of these conditions to be easily
checked for in one comparison. There is therefore no need for an
"exact" argument as in psmatch. The new functions should usually be
faster as well (the old psmatch uses strcmp for exact matches, which
might be able to use special machine instructions if they exist, but
since most calls will be for short strings, and early failure to match
is likely the most common result, this is unlikely to provide a
benefit once extra procedure call overhead is accounted for.)
Versions are provided for matching a string to a string, an SEXP to an
SEXP, or a string to an SEXP, called ep_match_strings, ep_match_exprs,
and ep_match_string_expr.
Calls of pmatch are replaced by calls of ep_match_exprs in various
places as part of this mod, and these functions are also used in later
mods (eg, in matchArgs and in do_subset3).
Speedup from using these functions is mentioned in a NEWS entry.
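The three-way return convention described above can be sketched as
follows. This is a minimal illustration, not the pqR source: the name
ep_match_sketch is hypothetical, and it is assumed here that the first
argument is the full name and the second is the string that may
abbreviate it.

```c
#include <stddef.h>

/* Hypothetical sketch of a three-way matcher (not the pqR code).
   Returns +1 for an exact match, -1 if 'given' is a proper prefix of
   'full' (a partial match), and 0 for no match. */
static int ep_match_sketch(const char *full, const char *given)
{
    size_t i = 0;
    while (given[i] != '\0') {
        if (full[i] != given[i])
            return 0;   /* early failure; also covers 'full' ending first */
        i++;
    }
    return full[i] == '\0' ? 1 : -1;
}
```

With this convention, a caller that accepts exact or partial matches
tests for a nonzero result, one that wants only exact matches tests for
a result greater than zero, and one that wants only partial matches
tests for a result less than zero.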
Two changes to how expressions are evaluated.
First, a facility has been introduced for an expression to be
evaluated in a context in which a "variant result" is allowed - eg, if
the result will be ignored anyway (expression is evaluated only for
side effects), a null result might be allowed. This is done by
introducing an "evalv" function that is like "eval" but with an extra
parameter saying what variant results are permissible. This facility
is used for later modifications, with some symbols defined here in
anticipation of these modifications.
Second, calling of primitive functions has been speeded up by copying
relevant information (eg, arity) from the table defining primitives
(in names.c) to fields in the SEXP for the primitive. This saves
table access computations and also division and remainder operations
to get at the information in the "eval" field in names.c, which is
encoded as decimal digits.
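The idea of decoding the table entry once can be sketched as below.
This is a toy model under stated assumptions, not the pqR data
structures: the struct and field names are hypothetical, and the code
assumes only that "eval" packs three flags as the hundreds, tens, and
units digits of a decimal number.

```c
/* Toy table entry: 'eval' packs three flags as decimal digits (e.g. 211). */
struct fun_entry { const char *name; int eval; int arity; };

/* Toy primitive object with the decoded fields cached at creation time,
   so that later calls need no table lookup, division, or remainder. */
struct prim_obj {
    int print_digit;     /* hundreds digit of eval */
    int internal_digit;  /* tens digit of eval     */
    int eval_args_digit; /* units digit of eval    */
    int arity;
};

static struct prim_obj make_prim(const struct fun_entry *e)
{
    struct prim_obj p;
    p.print_digit     = (e->eval / 100) % 10;
    p.internal_digit  = (e->eval / 10) % 10;
    p.eval_args_digit = e->eval % 10;
    p.arity           = e->arity;
    return p;
}
```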
A procedure SET_PRIMFUN in memory.c was surreptitiously changing the
function pointer for a primitive via the function pointer access
macro, PRIMFUN. A SET_PRIMFUN macro now does this properly.
The types used for pointers for C functions implementing primitives
have been made safer, taking account of C99's specification that one
can convert between all types that are pointers to functions without
loss of information, but not necessarily between a pointer to a
function and a pointer to void.
Code in saveload.c (for loading old workspaces?) creates a primitive
directly, bypassing the mkPRIMSXP procedure. This seems unwise, since
creation via mkPRIMSXP is apparently needed to ensure protection of
primitives. Whatever is going on there should not be affected by this
modification, however.
R-ints.texi has been updated to document evalv and variants.
Two related changes.
Created a promiseArgsWithValues function that calls promiseArgs and
then sets the values of the promises created, and a promiseArgsWith1Value
function that does the same except setting only the value for the
first promise. Code to do these things appears in several places, so
creating these functions cleans things up (and is needed for later
mods).
The promiseArgsWithValues and promiseArgsWith1Value functions are not
entirely equivalent to the previous code, which set the values of what
it took to be promises without checking that they actually were
promises. Since promiseArgs doesn't always create a promise for every
argument (it doesn't when the argument is R_MissingArg), this doesn't
seem safe, though there seem to be no examples where a bug actually
arises. The promiseArgsWithValues and promiseArgsWith1Value silently
skip setting the value for arguments that aren't promises, as will be
necessary when missing arguments do arise.
Also, a problem is fixed with the DispatchOrEval function in eval.c.
Without this fix, some subtle things go wrong with existing features
in 2.15.0, and more serious things go wrong with some later pqR mods.
The issue is that when DispatchOrEval is called with argsevald set to 1
(which indicates that arguments have already been evaluated) and it
dispatches to a method for an object, it passes on
these argument values without putting them in promises along with the
unevaluated arguments. Because of this, a method that attempts to
deparse an argument will not work correctly. It seems possible that
there might also be some other bad effects of not having these
promises.
Here is an illustration:
> a <- 0
> class(a) <- "fred"
> seq.fred <- function (x, y) deparse(substitute(y))
> seq(a,1+2)
[1] "1 + 2"
> seq.int(a,1+2)
[1] "3"
Both "seq" and "seq.int" dispatch to seq.fred, but seq.int calls
DispatchOrEval, which doesn't pass on a promise with the unevaluated
argument. After the fix, seq.int does the same as seq. This example
is now tested in tests/eval-etc.R.
Also fixed some formatting in DispatchOrEval, and improved the
documentation for R_possible_dispatch to explain its features used in
this fix.
If ENABLE_ISNAN_TRICK is defined when pqR is configured (by including
-DENABLE_ISNAN_TRICK in CFLAGS), the ISNAN macro is changed to be
faster for many common cases. This change relies on the same result
being produced when casting NaN, -NaN, NA, and -NA to integer, which
is true on Intel systems, but not on SPARC systems. A fatal error is
produced if this is seen to not be true (in which case the define of
ENABLE_ISNAN_TRICK should of course be removed).
Reorder RCNTXT structure and code in begincontext to maybe make saving
a context faster.
Introduced a facility for making a local copy of R_NilValue, and
potentially other globals. This is done with LOCAL_COPY(R_NilValue).
Access to the local copy may be faster, partly because the compiler
will know that it isn't being modified.
Lookup of symbols defined in the base environment has been sped up by
flagging symbols that have a base environment definition recorded in
the global cache. This allows the definition to be retrieved quickly
without looking in the hash table. In particular, this speeds up
basic operations such as "+", "<-", "if", and "length".
An "eval" of an expression that evaluates to itself (usually a
constant in an expression) has been made faster by quickly checking
for a self-evaluating value by a shift and mask operation, and if the
expression is self-evaluating, returning it immediately, without the
overhead of things like checking for stack overflow. Depending on the
machine and the compiler, it's possible that the subsequent switch on
expression type for non-self-evaluating expressions will also be
faster, due to it having many fewer cases.
Lookup of some builtin/special function symbols (eg, '+' and 'if') has
been sped up by allowing fast bypass of non-global environments that
do not contain (and have never contained) one of these symbols. The
symbols that are special for this purpose are specified in InitNames
in names.c.
Defines versions of getAttrib that allow faster attribute search when
it is known that no special processing is needed. Plus other minor
speedups.
Various inlined procedures were changed to be more efficient.
Several sets of types, represented by 32-bit words with "1" bits
corresponding to included types, are now defined in Rinternals.h. These
allow fast testing for set membership with if ((set >> type) & 1) ...
This is used heavily in the inlined functions, but may also be used
elsewhere.
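A minimal sketch of such a type set follows. The type codes and the
names TYPE_BIT, NUMERIC_SET, and in_type_set are hypothetical, chosen
for illustration; only the (set >> type) & 1 test is taken from the
description above.

```c
#include <stdint.h>

/* Toy type codes (hypothetical values, each below 32). */
enum { T_NIL = 0, T_SYM = 1, T_LIST = 2,
       T_LGL = 10, T_INT = 13, T_REAL = 14, T_STR = 16 };

/* A set of types is a 32-bit word with bit 'type' set for each member. */
#define TYPE_BIT(t)  ((uint32_t)1 << (t))
#define NUMERIC_SET  (TYPE_BIT(T_LGL) | TYPE_BIT(T_INT) | TYPE_BIT(T_REAL))

/* Membership test: one shift and one mask, with no chain of comparisons. */
static int in_type_set(uint32_t set, int type)
{
    return (set >> type) & 1;
}
```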
The "length" function was un-inlined, since it's fairly long.
Numerous occurrences of code like for (i = 0; i < length(...); i++)
... were replaced by code that doesn't call length many times,
sometimes by saving the result of one call of length, sometimes by
replacing length with LENGTH. (Though note that LENGTH doesn't work
for R_NilValue!)
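The effect of saving the result of one call can be seen in a toy
version. Everything here (toy_vec, toy_length, and the call counter) is
hypothetical and stands in for the interpreter's own types; the NULL
check in toy_length loosely mirrors the fact that length, unlike
LENGTH, handles R_NilValue.

```c
#include <stddef.h>

struct toy_vec { size_t len; const double *data; };

static unsigned long length_calls = 0;   /* counts calls, for illustration */

static size_t toy_length(const struct toy_vec *v)
{
    length_calls++;
    return v == NULL ? 0 : v->len;
}

/* Before: toy_length is called for every test of the loop condition. */
static double sum_slow(const struct toy_vec *v)
{
    double s = 0;
    for (size_t i = 0; i < toy_length(v); i++) s += v->data[i];
    return s;
}

/* After: the length is fetched once and saved. */
static double sum_fast(const struct toy_vec *v)
{
    double s = 0;
    size_t n = toy_length(v);
    for (size_t i = 0; i < n; i++) s += v->data[i];
    return s;
}
```

For a vector of n elements the first form makes n+1 calls and the
second makes one.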
Detailed speed-up of the "install" function for installing a new (or
old) symbol. Also put in (currently disabled) code for seeing how
many symbols there are, for tuning purposes.
The matchArgs function, used in the interpreter to match formal and
actual arguments when calling functions has been sped up, and given a
new interface.
One interface change allows the formal arguments to either be given as
a list SEXP (as before), or as an array of C strings, along with a
count of how many strings are in the array. (If formals are given by
C strings, the SEXP for the formals list parameter should be NULL,
whereas if the formals are given by a list, the pointer for the C
strings should be NULL and their count should be 0.)
Numerous calls of matchArgs are changed to use the interface with an
array of C strings (for example, in the code implementing rep and
seq.int). These calls were previously preceded by creation of a list
with calls to "install" for all the formal argument names. Using the
new interface is cleaner and considerably faster.
A second interface change is that if the formals are given by a list
SEXP, tags for the arguments are attached to the actuals list by
matchArgs. Places where matchArgs is called are changed to no longer
do this themselves. (Doing this in matchArgs is both cleaner and
faster.)
The new code is also faster in ways unrelated to these interface
changes.
Finally, 38 calls of check1arg(args,call,"x") were replaced with calls
of a new macro check1arg_x(args,call) that should be faster.
Parentheses are made faster by making them SPECIAL. Also, curly
brackets pass on the eval variant to the last expression, and pass
VARIANT_NULL for earlier expressions.
Values of forced promises no longer have NAMED always set to 2.
Instead NAMED for an object is incremented when it becomes the value
of a promise.
The creation of argument lists for closures is sped up by avoiding an
unnecessary allocation of a CONS cell, in the same way as was done in
my Sep 2010 patch for evalList, which was incorporated into 2.12.0 and
later versions of R. Also, now uses cons_with_tag in all these routines.
PROTECT, UNPROTECT, etc. have been made mostly macros in most of the
files in src/main. This applies only to files that include Defn.h
after defining the symbol USE_FAST_PROTECT_MACROS. If this is
defined, macros PROTECT2 and PROTECT3 for protecting two or three
objects at once are also defined.
This change speeds up numerous operations.
Some binary and unary arithmetic operations have been sped up by, when
possible, using the space holding one of the operands to hold the
result, rather than allocating new space. Though primarily a speed
improvement, for very long vectors avoiding this allocation could
avoid running out of space.
Global constants R_ScalarLogicalNA, R_ScalarLogicalTRUE, and
R_ScalarLogicalFALSE have been created, and the interpreter's
ScalarLogical function now returns one of these rather than allocate
new space for every logical value.
To avoid problems with an external C or Fortran routine changing one
of these values (with an incorrect specification of DUP=FALSE even
though it modifies the argument), the values of these constants are
checked after the return of an external function called with .C or
.Fortran, and if they have changed, their values are reset and an
error is signalled.
The bytecode interpreter sets up a similar set of logical constants.
That facility should be merged with this one (perhaps by just calling
the ScalarLogical function in the bytecode interpreter).
Various places in coerce.c were changed to use ScalarLogical rather
than allocate logical values themselves. This is both cleaner and now
more efficient given the change above.
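The shared-constant scheme and the check after an external call can be
sketched as below. This is a toy model, not the pqR code: plain int
variables stand in for the R_ScalarLogical objects, all names are
hypothetical, and the check is reduced to returning a flag where the
interpreter would signal an error.

```c
#include <limits.h>

#define TOY_NA_LOGICAL INT_MIN   /* NA stored as INT_MIN, as for R logicals */

/* Shared one-element "logical vectors" (stand-ins for the global constants). */
static int toy_true  = 1;
static int toy_false = 0;
static int toy_na    = TOY_NA_LOGICAL;

/* Return a pointer to a shared constant instead of allocating. */
static int *toy_scalar_logical(int x)
{
    if (x == TOY_NA_LOGICAL) return &toy_na;
    return x ? &toy_true : &toy_false;
}

/* Called after external code returns: reset the constants, and report
   (by returning 1) whether any had been changed. */
static int toy_check_constants(void)
{
    int bad = toy_true != 1 || toy_false != 0 || toy_na != TOY_NA_LOGICAL;
    toy_true = 1;
    toy_false = 0;
    toy_na = TOY_NA_LOGICAL;
    return bad;
}
```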
Several primitive functions that can generate integer sequences (":",
seq.int, seq_len, and seq_along) will now sometimes not generate an
actual sequence, but rather just a description of its start and end
points. This is not visible to users (except in time and space
savings), but allows for speed up (with other mods) of primitive
operations such as "for" loops and indexing of vectors.
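A sequence held only as its end points might look like the following
toy version. The struct and function names are hypothetical, and the
sketch handles only increasing sequences; how pqR represents such a
description internally is not stated here.

```c
/* Toy description of the integer sequence from:to (from <= to assumed). */
struct toy_range { long from, to; };

static long toy_range_length(struct toy_range r)
{
    return r.to - r.from + 1;
}

/* Element i (0-based) is computed on demand; no vector is allocated. */
static long toy_range_elt(struct toy_range r, long i)
{
    return r.from + i;
}

/* A "for" loop over the range needs only the two end points. */
static long toy_sum_range(struct toy_range r)
{
    long s = 0;
    for (long i = r.from; i <= r.to; i++) s += i;
    return s;
}
```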
The basic sexprec structure for objects is modified here, to allow for
a number of future modifications. The new scheme is documented in
R-ints.texi.
Procedures copy_1_string, copy_2_strings, and copy_3_strings are now
defined in utils.c, and used in many places in the interpreter. These
procedures concatenate 1, 2, or 3 strings, checking for overflow of
the destination space. They are faster and less error-prone than the
various code sequences they replace (often involving strlen and sprintf).
This gives significant speed-ups for some operations such as calling
S3 methods.
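A bounded concatenation of this kind can be sketched as follows. The
name toy_copy_2_strings, its signature, and its return convention are
assumptions made for illustration; the actual procedures in utils.c may
differ in all three.

```c
#include <stddef.h>

/* Hypothetical sketch: concatenate s1 and s2 into dest, which has room
   for 'size' bytes including the terminating null. Returns 1 on success,
   or 0 if the result would not fit, in which case dest holds a truncated,
   null-terminated prefix. Assumes size >= 1. */
static int toy_copy_2_strings(char *dest, size_t size,
                              const char *s1, const char *s2)
{
    const char *parts[2] = { s1, s2 };
    size_t used = 0;
    for (int k = 0; k < 2; k++) {
        for (const char *p = parts[k]; *p != '\0'; p++) {
            if (used + 1 >= size) {   /* no room for this char plus the null */
                dest[used] = '\0';
                return 0;
            }
            dest[used++] = *p;
        }
    }
    dest[used] = '\0';
    return 1;
}
```

Building an S3 method name such as "print.fred" from "print." and
"fred" is one call, with the overflow check done in the same pass as
the copy.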
Speed up character translation a bit sometimes, by doing operations
only when they are actually needed.
Many calls of vmaxget and vmaxset are replaced by macros VMAXGET and
VMAXSET that do the same thing faster.
Removes a call of R_isMissing in the interpreter's evalList function.
This check, done for every argument to a builtin primitive that is a
symbol, is slow, and appears to serve only to produce an error message
that is slightly different (and sometimes less informative) than
simply letting the symbol be evaluated.
Extensive cleanup in bind.c and coerce.c.
Simple cases of "c" with no names (or names ignored), no conversion,
and no recursion are done more quickly.
The copy_elements procedure is now used where appropriate.
A complete set of XFromY functions is now present in coerce.c (some
were missing). A copy_numeric_or_string_elements procedure is now
defined in coerce.c, which uses these functions.
Also, fixed a bug where excess warning messages may be produced on
conversion to RAW. See NEWS entry.
Access via the $ operator to lists, pairlists, and environments has
been sped up. The speedup comes mainly from (a) avoiding the overhead
of calling DispatchOrEval if there are no complexities, (b) passing on
the field to extract as a symbol, or a name, or both, as available,
and then converting only as necessary, (c) using the new ep_match
functions instead of the previous local pstrmatch procedure, and (d)
not translating a string multiple times.
An error reporting bug in $ was also fixed. See NEWS entry.
Fixes the "debug" facility. See NEWS item.
Also cleans up code, and propagates evalv variant to branches of "if".
Logical operations and relational operators have been sped up in
simple cases, and use the new facility for producing a scalar logical
result without allocating new storage. Relational operators have also
been substantially speeded up for long vectors. Relational operators
are reduced to either EQOP or LTOP to avoid repetitive code, which
then makes it reasonable to specially treat equal length operands and
operands of length 1.
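One way to reduce six relational operators to an equality kernel and a
less-than kernel is to swap the operands, negate the result, or both.
The sketch below shows this for scalar integers. The names are
hypothetical, NA and NaN handling is ignored, and whether pqR performs
the reduction in exactly this way is an assumption.

```c
enum toy_relop { TOY_EQ, TOY_NE, TOY_LT, TOY_GT, TOY_LE, TOY_GE };

/* The two kernels that actually compare. */
static int toy_eq(int a, int b) { return a == b; }
static int toy_lt(int a, int b) { return a < b; }

/* All six operators expressed through the two kernels, by optionally
   swapping operands and optionally negating the result. */
static int toy_compare(enum toy_relop op, int a, int b)
{
    switch (op) {
    case TOY_EQ: return  toy_eq(a, b);
    case TOY_NE: return !toy_eq(a, b);
    case TOY_LT: return  toy_lt(a, b);
    case TOY_GT: return  toy_lt(b, a);
    case TOY_LE: return !toy_lt(b, a);
    case TOY_GE: return !toy_lt(a, b);
    }
    return 0;
}
```

With only two kernels, writing separate loops for equal-length
operands and for an operand of length 1 means a handful of loops
rather than one set per operator.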
Speeds up extraction and replacement of subsets of vectors or
matrices, by speeding up the creation of the vector of indexes used.
Often avoids a duplication and eliminates a second scan of the
subscript vector for zero subscripts, folding it into a previous scan
at no additional cost. String subscripts are handled more efficiently
by not creating a vector of indexnames when it is not needed, and by
other detailed code improvements.
The previous code duplicated a vector of indexes when it seems
unnecessary. Duplication was for two reasons: first, to handle the
situation where the index vector is itself being modified in a replace
operation, and second, so that any attributes can be removed, which is
helpful only for string subscripts, given how the routine to handle
them returns information via an attribute. Duplication for the second
reason can easily be avoided. The first reason for duplication is
sometimes valid, but can usually be avoided by, first, only doing it
if the subscript is to be used for replacement rather than extraction,
and second, only doing it if the NAMED field for the subscript isn't
zero.
Also removes two layers of procedure call overhead (passing seven
arguments, so not trivial) that seemed to be doing nothing.
Extending lists and character vectors by assigning to an index past
the end, deleting list items by assigning NULL, and concatenation of
character vectors with "c" have all been sped up. This is partially
from use of copy_string_elements and copy_vector_elements. Another
gain comes from handling deletion of a contiguous block specially.
Speeds up extraction of subsets with "[", as detailed in NEWS entries.
The BLAS routines supplied with R were modified to improve the
performance of the routines DGEMM (matrix-matrix multiply) and DGEMV
(matrix-vector multiply). Also, proper propagation of NaN, Inf,
etc. is done now.
These routines are probably still not as fast as those in a more
sophisticated BLAS, but will be of benefit to users who do not
install a different BLAS.
Improves the performance of the uniform random number generation
routines (which are also used as the base for other generators), and
removes an unnecessary limitation. See NEWS entry for details.
The previous code was also rather messy - global references were mixed
with references by argument to the same variables, sometimes concealed
by macro definitions, and the seed was often referenced by pointers
which actually always pointed to the same location, which was also in
some places referenced directly.
A bit of previous code that assumed R integers are exactly 32 bits was
changed to assume only that an R integer is at least 32 bits in size.
Speeds up "any" and "all" by detailed code improvements.
The R code for as.data.frame.matrix has been sped up a bit. (Other
mods also have the effect of speeding up this function.)
Includes the matprod library from github.com/radfordneal/matprod,
and uses it for the %*% operator when doing so is specified by the
mat_mult_with_BLAS option. See help("%*%") and help(options) for
details.
The NEWS item for this is a stub, since it will be combined with
that for a later mod.
I previously proposed a simple patch speeding up squaring, which the R
core team did not adopt. Instead, in R-2.12.0 they introduced an
inline function R_POW, that checks specially for a power of 2, and
does it as a multiply, otherwise calling the R_pow function. R_pow
proceeds to check again for a power of 2 at the beginning, and then
check again for a power of 2 just before calling the C pow function.
The R_pow function also contains a check for a power of 0.5, but it is
disabled for recent versions of gcc, to bypass a bug that according to
a comment existed at one time. Note that the inline R_POW function
will not necessarily be actually inlined by the compiler, and that in
any case the check for a power of 2 is done over again for every
element of a vector being raised to that power, even if the power is a
scalar.
In this new patch, if the power is a scalar, I check for it being 2,
1, 0, or -1, and if so handle it specially. Otherwise, I call R_pow,
which I changed to not bother checking for powers of 2, and to
actually check for a power of 0.5. (Since the relevant code has
changed, any buggy compilers still extant probably will compile the
new code OK; if not, they probably compile lots of stuff incorrectly,
since there is nothing unusual in the new code.)
For non-scalar powers, I use R_POW, but change it to a macro, so that
it will definitely be inlined, and have it check for powers of 2 and
1. (If one is going to do this check when the power is a vector, it
makes sense to tailor it to something other than a vector of powers
that are all the same, since this doesn't seem like a common case.
Some powers of 1 and some of 2 seems plausible in some statistical
applications.) The macro also allows for the power to be an integer,
slightly speeding up some integer^integer operations.
The speed improvement from this patch depends a lot on the machine
architecture and the compiler. On machines where memory is much
slower than the processor, checking for a power of 2 every time may
mostly overlap with a memory fetch or store operation, but one
would not expect this to always be the case.
See also the NEWS item on this.
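The scalar-power case described above can be sketched as follows. This
is an illustration only: toy_pow_scalar is a hypothetical name, and
passing the general fallback as a function pointer (where the patch
calls R_pow) is a choice made to keep the sketch self-contained.

```c
#include <stddef.h>

/* Compute x[i]^p for a scalar power p, writing to out. The special
   cases are tested once, outside the loops, rather than once per
   element. 'general' is a caller-supplied fallback (for example the C
   library's pow) used for all other powers. */
static void toy_pow_scalar(const double *x, double *out, size_t n, double p,
                           double (*general)(double, double))
{
    size_t i;
    if (p == 2.0)       for (i = 0; i < n; i++) out[i] = x[i] * x[i];
    else if (p == 1.0)  for (i = 0; i < n; i++) out[i] = x[i];
    else if (p == 0.0)  for (i = 0; i < n; i++) out[i] = 1.0;
    else if (p == -1.0) for (i = 0; i < n; i++) out[i] = 1.0 / x[i];
    else                for (i = 0; i < n; i++) out[i] = general(x[i], p);
}
```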
Rewrote the internal rowSums and colSums functions to be faster. Also
changed the R rowSums and colSums functions that call the internal
functions so that they treat the common case where the array is a matrix
specially, with less overhead.
Code for the sum and prod functions has been changed to move some
checks outside the inner summation loops. The effect depends on the
extent to which the unnecessary checks overlap memory fetch
operations, but one would expect significant speed-ups with some
machines/compilers.
I had previously proposed this modification to sum and prod before
R-2.12.0. That patch was not adopted by the R core team, though they
did swap the order of checking for NA/NaN and checking the na.rm
option, which avoids the worst inefficiency of the previous code in
R-2.11.1.
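Moving a check outside the summation loop can be sketched as below.
Which checks were actually moved is not spelled out here, so the
sketch assumes the na.rm flag is one of them; the function names are
hypothetical, and x == x is used as a portable test that is false only
for NaN (which includes NA in R's representation).

```c
#include <stddef.h>

/* Before: the narm flag is tested for every element. */
static double toy_sum_checked(const double *x, size_t n, int narm)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        if (!narm || x[i] == x[i])
            s += x[i];
    return s;
}

/* After: the flag is tested once, selecting one of two simpler loops. */
static double toy_sum_hoisted(const double *x, size_t n, int narm)
{
    double s = 0.0;
    size_t i;
    if (narm) {
        for (i = 0; i < n; i++)
            if (x[i] == x[i]) s += x[i];
    } else {
        for (i = 0; i < n; i++) s += x[i];
    }
    return s;
}
```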
The speed of the transpose (t) function has been improved, when