Commit b07b1f75 authored by Sergio Costas

Updated documentation

parent 0681bebb
@@ -20,9 +20,10 @@ but they aren't used during compilation.
It can be useful for projects where RUST is not feasible, like
code for microcontrollers, kernel drivers...
## INSTALLATION
To use CRUST, you need *flex*, *bison*, and *python 3*.
After downloading the source code, just run:

    sudo ./setup.py install

@@ -32,244 +33,10 @@ library with the parser, and install the static analyzer.
## USAGE
### Preparing the source code
First, each C source file must include the file **crust.h**.
This file ensures that the tags are removed at compile time
during the C Preprocessor pass.
Then, each C source file must be annotated with the CRUST tags.
The available tags are:
* __crust__ : specifies that this is a CRUST element, and must
be tracked by the CRUST static analyzer. It can be applied only to
single-indirection pointers.
* __crust_borrow__ : when used with an argument in a function
definition, it means that the pointer is lent to that function,
so the calling function retains the ownership. When used with the
return value of a function, it means that the calling function won't
receive the ownership; it is retained by the called function.
* __crust_not_null__ : specifies that an argument in a function or
the return value will never be NULL.
* __crust_recycle__ : when a function returns a CRUST pointer, and
has at least one CRUST argument, it is possible to mark one (and
only one) of those CRUST arguments as *recyclable*. That means that
the memory block passed as that argument will be reused for the
return value if available. In other words: if the *recycle* argument
is not NULL, the return value will never be NULL, but if the *recycle*
argument is NULL, then the return value can be NULL.
* __crust_alias__ : an alias variable is a variable that can point
to the same block pointed to by another CRUST-type variable, in which
case it will have exactly the same state as the main pointer. But
it can also point to a block not pointed to by any other CRUST pointer.
It is useful in FOR loops that iterate over a linked list of CRUST
elements without freeing them.
* __crust_disable__ : disables the error messages. Useful when there
is no way of implementing something without violating the rules.
* __crust_enable__ : after a __crust_disable__ statement, enables
the error messages again. If *disable* tags are nested, as many
*enable* tags as *disable* tags are needed to re-enable the messages.
* __crust_full_enable__ : enables the error messages without taking
into account the disable counter (three *disable* tags require three
*enable* tags to re-enable the messages, but only one *full_enable*
tag).
* __crust_no_0__ : by default, when analyzing FOR and WHILE loops,
CRUST presumes that the code inside may never run (if the exit
condition is already met before the loop starts). If it is known for
sure that the loop will run at least once, but CRUST can't infer it,
this tag can be appended to the loop to specify that the
code inside must be evaluated at least once.
* __crust_debug__ : shows the status of the analysis. Useful when
writing unit tests, to ensure that the pointers have the right
status.
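
As a rough illustration of how some of these tags might look in a source
file, the sketch below annotates a tiny counter API. The type `counter_t`,
the functions `counter_new`, `counter_get` and `counter_free`, and the exact
position of each tag are assumptions made for this example. The empty macro
definitions only stand in for **crust.h** so that the fragment compiles on
its own; a real project would include the header instead.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for crust.h (assumption): the real header removes the tags
   during the preprocessor pass, so empty macros look the same to the
   compiler. A real project would use: #include "crust.h" */
#define __crust__
#define __crust_borrow__
#define __crust_not_null__

/* Hypothetical CRUST-tracked type: a single-indirection pointer. */
typedef __crust__ struct counter {
    int value;
} *counter_t;

/* The caller receives the ownership of the returned block (may be NULL). */
counter_t counter_new(int start) {
    counter_t c = malloc(sizeof *c);
    if (c != NULL) {
        c->value = start;
    }
    return c;
}

/* The block is only borrowed, and the argument is declared as never NULL,
   so the caller keeps the ownership after the call. */
int counter_get(__crust_borrow__ __crust_not_null__ counter_t c) {
    assert(c != NULL);
    return c->value;
}

/* No borrow tag: the ownership moves into this function, which frees it. */
void counter_free(counter_t c) {
    free(c);
}
```

A caller would typically create a counter with `counter_new`, read it any
number of times through `counter_get`, and finally hand it over to
`counter_free`.
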
### Calling CRUST
To check a source file, just call CRUST with:

    crust file_name.c

If there are headers located in other places, it is possible to
specify them from the command line using the **-I** option.
Example: to include the headers located at */usr/share/a_program/includes*
when analyzing the file *source.c*, just use:

    crust -I/usr/share/a_program/includes source.c

It is possible to use as many *-I* options as needed, one per
directory.
It is also possible to make definitions from the command line
using the **-D** option. In the previous case, if we want to
define DEBUG_ALL, it can be done with:

    crust -I/usr/share/a_program/includes -DDEBUG_ALL source.c

It is possible to specify several source files, or even to use
wildcards:

    crust "src/*.c"

The quotes are needed to prevent BASH from expanding the wildcards itself.
It is also possible to specify filenames that must be skipped (useful
when using wildcards, but there are files that should not be analyzed).
This can be done with the **-e** option:

    crust "src/*.c" -esourcefile.c

This will process all files whose names end with *.c* except *sourcefile.c*.
## MANAGEMENT MODEL
Before reading this, it is strongly recommended to read the chapter
"understanding ownership" from the RUST tutorial, because the model
used in CRUST is mainly the same.
https://doc.rust-lang.org/beta/book/second-edition/ch04-00-understanding-ownership.html
It can also be useful to check the **unitest** folder, which contains all the
tests, with comments that explain why something is right or wrong.
CRUST presumes that every CRUST-type pointer can be in one of the
following states:
* UNINITIALIZED: when it was defined but still hasn't been assigned
to a block
* NULL: when it is known for sure that it points to NULL
* NOT_NULL: when it is known for sure that it points to a block
* NOT_NULL_OR_NULL: when it is initialized but it is unknown if it
points to NULL or to a block
* FREED: when the ownership has been passed to another variable or
function, so it is presumed that the block has been freed and this
is a dangling pointer.
It also mandates that each memory block must be pointed to by one, and
only one, CRUST variable (aliases and global variables are the
exceptions).
CRUST follows all the possible execution branches for each of the
functions in the source file. When it starts analyzing a function, each of
its arguments is set to NOT_NULL_OR_NULL, unless it is tagged
with __crust_not_null__.
Every time a CRUST variable is assigned inside the code (for example, when
a function's return value is assigned to it), the assignment is allowed only if
its current status is UNINITIALIZED, NULL or FREED. If it is NOT_NULL
or NOT_NULL_OR_NULL, it is considered an error, because that will result
in a memory leak.
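
The assignment rule can be illustrated with a small fragment. This is only a
sketch: `item_t`, `item_new`, `item_free` and `replace_item` are hypothetical
names, the empty `__crust__` macro stands in for **crust.h**, and the comments
describe what the rule above implies rather than actual CRUST output.

```c
#include <assert.h>
#include <stdlib.h>

#define __crust__   /* stand-in for crust.h (assumption) */

typedef __crust__ struct item {
    int id;
} *item_t;

/* Returns a new block owned by the caller, or NULL if malloc fails. */
item_t item_new(int id) {
    assert(id >= 0);           /* negative values are reserved for errors */
    item_t it = malloc(sizeof *it);
    if (it != NULL) {
        it->id = id;
    }
    return it;
}

/* Takes the ownership of the block and releases it. */
void item_free(item_t it) {
    free(it);
}

/* Returns the id of the second item, or -1 if it could not be allocated. */
int replace_item(void) {
    item_t it;                 /* UNINITIALIZED */
    int id = -1;

    it = item_new(1);          /* allowed: it was UNINITIALIZED */
    /* A second "it = item_new(2);" at this point would break the rule:
       it is NOT_NULL_OR_NULL, so the first block could be leaked. */
    item_free(it);             /* ownership passed: it is now FREED */
    it = item_new(2);          /* allowed again: it was FREED */
    if (it != NULL) {
        id = it->id;
        item_free(it);
    }
    return id;
}
```
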
If a function returns a CRUST value, it is also an error to not store it
in a CRUST variable, because it also results in a memory leak. The only
exception is when the return value is marked as *borrowed*, in which case
it must be stored in a CRUST *borrowed* variable.
Every time a CRUST variable is passed as an argument to a function, the
ownership is passed too, so, from that point, it will be in the FREED
state inside the calling function until it is assigned a new block,
because it is assumed that the block has been freed in the called
function. There is an exception, and it is when the argument in the
called function is marked as *borrowed*. In that case, the called
function can't free the block, but can modify it, because the
ownership is retained in the caller function.
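
The difference between passing the ownership and lending a block can be
sketched as follows. All the names (`message_t`, `message_new`,
`message_grow`, `message_consume`, `demo`) are hypothetical, the empty macros
stand in for **crust.h**, and the state comments follow the rules described
above rather than real CRUST output.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for crust.h (assumption): the tags expand to nothing. */
#define __crust__
#define __crust_borrow__

typedef __crust__ struct message {
    int length;
} *message_t;

message_t message_new(int length) {
    assert(length >= 0);
    message_t m = malloc(sizeof *m);
    if (m != NULL) {
        m->length = length;
    }
    return m;
}

/* Borrowed argument: the block can be modified here, but not freed. */
void message_grow(__crust_borrow__ message_t m, int extra) {
    if (m != NULL) {
        m->length += extra;
    }
}

/* Untagged argument: this function receives the ownership and frees it. */
int message_consume(message_t m) {
    int length = (m != NULL) ? m->length : 0;
    free(m);
    return length;
}

int demo(void) {
    message_t m = message_new(10);   /* NOT_NULL_OR_NULL */
    message_grow(m, 5);              /* borrowed: m keeps its state */
    int total = message_consume(m);  /* ownership passed: m is now FREED */
    /* Calling message_grow(m, 1) here would be an error: m is FREED. */
    return total;
}
```
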
Trying to use a CRUST variable that is in UNINITIALIZED or FREED status is
an error, whether it is passed as an argument in a function call or accessed
through indirection (in the case of a pointer to a struct), because,
in the case of FREED status, that block has already been freed (it is a
dangling pointer).
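
The rules described so far can be condensed into a small state model. The
code below is not taken from CRUST; it is a minimal sketch with hypothetical
names (`pointer_state`, `can_assign_block`, `can_use`, `after_call`) that
encodes only what the text above states.

```c
#include <assert.h>
#include <stdbool.h>

/* The five states a CRUST-type pointer can be in (hypothetical names). */
typedef enum {
    STATE_UNINITIALIZED,
    STATE_NULL,
    STATE_NOT_NULL,
    STATE_NOT_NULL_OR_NULL,
    STATE_FREED
} pointer_state;

/* A new block can be assigned only when no block can be lost. */
bool can_assign_block(pointer_state s) {
    return s == STATE_UNINITIALIZED || s == STATE_NULL || s == STATE_FREED;
}

/* The variable can be passed to a function or accessed through
   indirection only if it is neither UNINITIALIZED nor FREED. */
bool can_use(pointer_state s) {
    return s != STATE_UNINITIALIZED && s != STATE_FREED;
}

/* State of the caller's variable after it is passed as an argument:
   a borrowed parameter leaves it untouched, any other one takes the
   ownership and leaves the variable FREED. */
pointer_state after_call(pointer_state s, bool parameter_is_borrowed) {
    assert(can_use(s));   /* passing an unusable variable is already an error */
    return parameter_is_borrowed ? s : STATE_FREED;
}
```

For instance, a variable in NOT_NULL state that is passed to a function
without the *borrow* tag ends in FREED state, and from then on it can be
assigned a new block but not used.
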
Global variables are a special case: since it is not possible to know
their state (because it can be changed outside the current function), they
are assumed to be in NOT_NULL_OR_NULL status by default, and it is checked
at exit that they are in NOT_NULL, NULL or NOT_NULL_OR_NULL state
(never in FREED or UNINITIALIZED state). Assigning a CRUST variable to a global
variable is allowed, and the variable will not be marked as freed, but the
global variable will work like an alias variable (that is: freeing the
block pointed to by a local variable that was copied to a global variable will result
in the global variable being freed too). It is possible to have several global
variables pointing to the same block during the life of the function.
When the execution reaches the end of the function, all variables are checked.
It is an error to reach the end with CRUST variables in NOT_NULL or
NOT_NULL_OR_NULL state. There is an exception: when a local variable has been
copied to a global variable, it does not need to be freed. But at the end
of the execution, each block pointed to by global variables must be pointed to by
only one global variable (it is an error to reach the end of the function
and have two or more global variables pointing to the same block).
It is possible to change the state of a global variable using
*__crust_set_null__(variable_name)* to set it to NULL, or
*__crust_set_not_null__(variable_name)* to set it to NOT_NULL. The latter
is useful when a global variable is known to be NOT_NULL inside a function:
putting the tag at the beginning of that function will remove the undesired
messages. The former is provided mainly for completeness.
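
A possible use of *__crust_set_not_null__* is sketched below. The names
`config_t`, `global_config`, `config_init` and `config_verbosity` are
hypothetical, the empty macros stand in for **crust.h**, and writing the tag
as a statement followed by a semicolon is an assumption about its syntax.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for crust.h (assumption): the tags expand to nothing. */
#define __crust__
#define __crust_set_not_null__(variable)

typedef __crust__ struct config {
    int verbosity;
} *config_t;

/* Global CRUST pointer: assumed NOT_NULL_OR_NULL inside each function. */
config_t global_config;

/* Allocates the global block on first use; returns 0 on success. */
int config_init(int verbosity) {
    if (global_config == NULL) {
        global_config = malloc(sizeof *global_config);
    }
    if (global_config == NULL) {
        return -1;
    }
    global_config->verbosity = verbosity;
    return 0;
}

/* Meant to be called only after a successful config_init(), so the
   programmer knows the global is NOT_NULL here and states it with the tag. */
int config_verbosity(void) {
    __crust_set_not_null__(global_config);
    assert(global_config != NULL);
    return global_config->verbosity;
}
```
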
CRUST understands some comparisons, so this code:

    if (crust_variable == NULL) {
        crust_variable = function();
    }

will never produce an error, because if *crust_variable* has a NOT_NULL_OR_NULL
status before the IF statement, the code evaluation will branch in two: one
branch where it has the NULL status (which will evaluate the call to the function), and
another where it has the NOT_NULL status, which will evaluate only the code after the **if**
statement.
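
The same pattern inside a complete function could look like the sketch
below. The names `node_t`, `node_new`, `node_free` and
`node_value_or_default` are hypothetical, the empty macros stand in for
**crust.h**, and the comments describe the two branches as explained above,
not real CRUST output.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for crust.h (assumption): the tags expand to nothing. */
#define __crust__
#define __crust_not_null__

typedef __crust__ struct node {
    int value;
} *node_t;

node_t node_new(int value) {
    node_t n = malloc(sizeof *n);
    if (n != NULL) {
        n->value = value;
    }
    return n;
}

/* Takes the ownership of a block that is declared as never NULL. */
void node_free(__crust_not_null__ node_t n) {
    assert(n != NULL);
    free(n);
}

/* Takes the ownership of n; if no node is given, a default one is built. */
int node_value_or_default(node_t n, int fallback) {
    int value = fallback;

    /* n is an untagged argument, so it starts as NOT_NULL_OR_NULL. */
    if (n == NULL) {
        /* Branch where n is NULL: assigning a new block is allowed. */
        n = node_new(fallback);
    }
    /* The branch where n is NOT_NULL skips the assignment and joins here. */
    if (n != NULL) {
        value = n->value;
        node_free(n);
    }
    return value;
}
```
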
Finally, when the evaluation reaches the end of the function, all the
CRUST-type variables are checked, and it is an error for them to be in NOT_NULL or
NOT_NULL_OR_NULL status (unless they are *borrowed*), because that would lead again
to memory leaks.
# CURRENT STATUS
The code is still in *alpha* status. The C parser is quite complete, but there
are still some obscure cases that aren't handled yet (they can be found in the
bison file, *c99.y*, marked with a call to *show_error*). If you receive
a message like:

    Undefined statement at line...
    Please, contact the author

just paste the code that generated the error and send it to the author.
## GNU extensions
The parser recognizes the following non-standard statements (it doesn't use them; it just
doesn't fail if they are present in the code):
* #pragma
* __builtin_va_list
* __signed__
* __extension__
* __prog__
* __restrict
* __inline
It also recognizes the following syntax extensions (but they are handled as stubs):
* statements inside parentheses
* ellipsis syntax in CASE statements ( CASE n1 ... n2: )
* __attribute__ (...)
* __asm__ [XXXX] (...)
* asm [XXXXX] (...)
* __alignof__(...)
* __typeof__(...)
* __builtin_offsetof(...)
For detailed documentation, check the *doc* folder.
# CONTACTING THE AUTHOR
Sergio Costas Rodriguez
rastersoft@gmail.com
http://www.rastersoft.com