Commit a209c892 authored by Jonas Termansen's avatar Jonas Termansen

Fix POSIX comformance issues in sort(1).

Fix -C disabling checking rather than checking quietly. Fix sort(1) exiting 1 on certain errors, as POSIX requires sort(1) to only exit if the input wasn't sorted when -c. Fix -o opening the output file for truncation before all the input has been read, as POSIX requires allowing -o to be an input file. POSIX requires sort(1) to handle input errors by either erroring with no output, or by erroring and sorting the input read so far. Change the current behavior of continuing to the next file to simply failing hard on the first input error. Don't increment the last line number on the end of the standard input. Report -c/-C as incompatible with -o. Exit unsuccessfully on any output errors. Update to current coding conventions and add documentation while here.
parent 53592a6e
......@@ -82,6 +82,7 @@ memstat.1 \
pager.1 \
passwd.1 \
readlink.1 \
sort.1 \
uname.1 \
SBINS=\
......
.Dd April 8, 2018
.Dt SORT 1
.Os
.Sh NAME
.Nm sort
.Nd sort lines of text
.Sh SYNOPSIS
.Nm
.Op Fl CcmruVz
.Op Fl o Ar path
.Ar
.Sh DESCRIPTION
.Nm
reads lines of text from the standard input and writes the lines in sorted order
to the standard output.
If files are specified, the input is the concatenated content of the files read
in sequential order.
The
.Ar file
path can be set to
.Sq -
to specify the standard input.
The lines are compared according to the current locale's collating rules.
.Pp
The options are as follows:
.Bl -tag -width "12345678"
.It Fl c, \-check, \-check=diagnose-first
Check whether the input is already sorted.
If a line is out of order (or an equal line is found if
.Fl u ) ,
write an error describing which line was out of
order and exit 1.
.It Fl C, \-check=quiet, \-check=silent
Same as
.Fl c ,
but write no error to the standard output about the input being out order.
.It Fl m, \-merge
Merge the presorted input files into a sorted output.
.It Fl o Ar path , Fl \-output Ns = Ns Ar path
After reading the full input; write the output to the file at
.Pa path
(creating it if it does not already exist, discarding its previous contents if
it already existed).
The output file can be one of the input files.
This option is incompatible with
.Fl C
and
.Fl c .
.It Fl r , \-reverse
Compare the lines in reverse order.
.It Fl u , \-unique
Don't write a line if it is equal to the previous line.
.It Fl V , \-version-sort
Sort according to the version string, per
.Xr strverscmp 3 .
.It Fl z , \-zero-terminated
Lines are delimited with the NUL byte (0) instead of the newline byte (10).
.El
.Sh IMPLEMENTATION NOTES
In the event of an input error,
.Nm
will write an error to the standard error and exit unsuccessfully.
.Pp
.Nm
reads the whole input into memory, rather than storing intermediate sorting
steps in the filesystem, and requires enough memory to store a copy of the whole
input.
.Sh ENVIRONMENT
.Bl -tag -width "LC_COLLATE"
.It Dv LANG
The default locale for locale variables that are unset or null.
.It Dv LC_ALL
Overrides all the other locale variables if set.
.It Dv LC_COLLATE
Compare the input according to this locale's collating rules using
.Xr strcoll 3 .
.El
.Sh EXIT STATUS
.Nm
will exit 0 on success, exit 1 if the input was out of order when
.Fl C
or
.Fl c ,
or exit 2 (or higher) otherwise.
.Sh EXAMPLES
Read lines from the standard input and write them in sorted order to the
standard output:
.Bd -literal
sort < input > output
.Ed
.Pp
Read lines from the three specified files (where the second happens to be the
standard input) and write them in sorted to the standard output:
.Bd -literal
grep pattern lines.txt | sort foo - bar -o output.txt
.Ed
.Pp
Sort the input file if it isn't already sorted:
.Bd -literal
if sort -C file; [ $? = 1 ]; then
sort file -o file
fi
.Ed
.Pp
Remove duplicate lines from the input by sorting it and removing lines equal to
the previous line:
.Bd -literal
sort -u
.Ed
.Sh SEE ALSO
.Xr cat 1 ,
.Xr comm 1 ,
.Xr join 1 ,
.Xr uniq 1 ,
.Xr qsort 3 ,
.Xr strcoll 3 ,
.Xr strverscmp 3
.Sh STANDARDS
.Nm
is standardized in
.St -p1003.1-2008 ,
which is currently partially implemented in this implementation of
.Nm .
.Pp
The
.Fl V
and
.Fl z
options, as well as the long options, are extensions also found in GNU
coreutils.
.Pp
As an extension, the
.Fl C
and
.Fl c
options support multiple input files.
.Sh BUGS
The
.St -p1003.1-2008
options
.Fl b ,
.Fl d ,
.Fl f ,
.Fl i ,
.Fl k ,
.Fl n ,
and
.Fl t
are not currently implemented.
.Pp
The
.Fl m
option is not currently taken advantage of to speed up the sorting, rather the
presorted input files are sorted all over again.
This diff is collapsed.
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment