Commit 40f241ba authored by Rachel Wil Sha Singh

Binary Search Tree WIP

parent 4e16d394
\documentclass[a4paper,12pt,oneside]{book}
\usepackage[utf8]{inputenc}
\newcommand{\laTopic} {Binary Search Trees}
\newcommand{\laTitle} {Rachel's Data Structures Notes}
\newcounter{question}
\renewcommand{\chaptername}{Topic}
\usepackage{../../rachwidgets}
\usepackage{../../rachdiagrams}
\title{}
\author{Rachel Singh}
\date{\today}
\pagestyle{fancy}
\fancyhf{}
\lhead{\laTopic \ / \laTitle}
\chead{}
\rhead{\thepage}
\rfoot{\tiny \thepage\ of \pageref{LastPage}}
\lfoot{\tiny Rachel Singh, last updated \today}
\renewcommand{\headrulewidth}{2pt}
\renewcommand{\footrulewidth}{1pt}
\newtoggle{standalone}
\toggletrue{standalone}
\begin{document}
\begin{titlepage}
\centering{}
\sffamily{
\textbf{
{\fontsize{2cm}{3cm}\selectfont Rachel's Data Structures Notes} ~\\~\\
{\fontsize{2cm}{3cm}\selectfont \laTopic}
}
}
\begin{figure}[h]
\begin{center}
\includegraphics[width=12cm]{../images/binary-search-tree.png}
\end{center}
\end{figure}
\sffamily{
\textbf{
An overview compiled by Rachel Singh
}
}
\vspace{1cm} \small
This work is licensed under a \\ Creative Commons Attribution 4.0 International License. ~\\~\\
\includegraphics{../images/cc-by-88x31.png}
~\\~\\
Last updated \today
\end{titlepage}
\tableofcontents
%--------------------------------------------------------------------%
%--------------------------------------------------------------------%
\chapter{Binary Search Trees}
\section{Introduction to Binary Search Trees}
\begin{center}
\includegraphics[width=6cm]{images/tree-binarysearchtree}
\end{center}
A binary search tree is a type of data structure that keeps the data
it contains \textbf{ordered}. The ordering process happens when
a new piece of data is entered by finding a location for the data
that adheres to the ordering rules. With a binary search tree,
\textbf{smaller values} are stored to the left and \textbf{larger values}
are stored to the right.
This means that when we're searching for data and land at a node,
we can figure out whether to traverse \textit{left} or \textit{right}
by comparing that node's value to the value we're searching for.
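
As a minimal sketch of that left/right decision, assume a stripped-down node that stores only an integer key. The names \texttt{MiniNode} and \texttt{FindKey} are made up for this illustration and are not part of the tree class built later in these notes:

```cpp
#include <cassert>

// Hypothetical stripped-down node, for illustration only.
struct MiniNode
{
    int key;
    MiniNode* ptrLeft;
    MiniNode* ptrRight;
};

// Walk down from ptrCurrent, going left for smaller keys and right for
// larger keys, until the key is found or we run out of nodes.
bool FindKey( const MiniNode* ptrCurrent, int searchKey )
{
    while ( ptrCurrent != nullptr )
    {
        if ( searchKey == ptrCurrent->key ) { return true; }
        if ( searchKey < ptrCurrent->key )  { ptrCurrent = ptrCurrent->ptrLeft; }
        else                                { ptrCurrent = ptrCurrent->ptrRight; }
    }
    return false;
}
```

Each comparison rules out one whole side of the current node, so the search never has to look at both children.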
\newpage
%--------------------------------------------------------------------%
%--------------------------------------------------------------------%
\section{Architecture of a Binary Search Tree}
\begin{center}
\includegraphics[width=14cm]{images/linkedlistdiagram}
\end{center}
With a \textbf{Linked List}, we need to implement a Node structure
and a LinkedList class, where the Node stores the data and the LinkedList
provides an interface for users to add, remove, and search for data
and stores a pointer to the \textbf{first} and \textbf{last} elements.
\vspace{1cm}
\begin{center}
\includegraphics[width=14cm]{images/binarysearchtreediagram}
\end{center}
Similarly for a \textbf{Binary Search Tree}, we need another type of Node
to store the data, as well as the BinarySearchTree structure that
acts as an interface and keeps a pointer to the \textbf{root node}.
\newpage
We might also think of a Binary Search Tree as being ordered based
on the data's \textbf{key} - some sort of unique identifier we
assign to each node - and then containing additional data (a value)
within the node.
~\\
For example, if we create our Node with two templated types like this:
\begin{center}
\includegraphics[width=7cm]{images/bstnode}
\end{center}
our \textbf{key} could be a unique lookup (e.g., ``employee ID''),
and the \textbf{data}/value could be another structure that
stores more employee data (name, department, etc.)
\begin{center}
\includegraphics[width=14cm]{images/bst-employees}
\end{center}
What's the significance of the hierarchy in this tree? Nothing,
really - the point of a binary search tree is that we're assuming
the \textbf{keys} we will be pushing into the tree will be in
a somewhat \textbf{random order}, and using the BST structure
will help keep things ordered and somewhat faster to search through.
\newpage
\subsection{BinarySearchTreeNode in C++:}
\begin{lstlisting}[style=code]
template <typename TK, typename TD>
class Node
{
public:
    Node()
    {
        ptrLeft = nullptr;
        ptrRight = nullptr;
    }

    Node( TK newKey, TD newData )
    {
        key = newKey;
        data = newData;
        ptrLeft = nullptr;
        ptrRight = nullptr;
    }

    ~Node()
    {
        if ( ptrLeft != nullptr ) { delete ptrLeft; }
        if ( ptrRight != nullptr ) { delete ptrRight; }
    }

    Node<TK, TD>* ptrLeft;
    Node<TK, TD>* ptrRight;
    TD data;
    TK key;
};
\end{lstlisting}
The node I've written here contains a \textbf{key}, which nodes will
be ordered by, and \textbf{data}, which can contain more
information. ~\\
As with any structure utilizing \textbf{pointers}, the pointers
should be initialized to \texttt{nullptr} in any constructors. ~\\
The \textbf{destructor} here will trigger the deletion of any
child nodes, creating a chain reaction to clean up the entire
tree if the root node is deleted.
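
To make that chain reaction visible, here is a small sketch using a simplified, non-templated stand-in for the node. \texttt{CountingNode}, its static counter, and \texttt{DeleteSmallTree} are hypothetical names added only so the cascade can be observed:

```cpp
#include <cassert>

// Simplified stand-in for the Node class (illustration only).
struct CountingNode
{
    explicit CountingNode( int newKey ) : key( newKey ) {}

    ~CountingNode()
    {
        destroyedCount++;
        // delete on a nullptr is a no-op, so no null check is needed here.
        delete ptrLeft;
        delete ptrRight;
    }

    int key;
    CountingNode* ptrLeft = nullptr;
    CountingNode* ptrRight = nullptr;

    static int destroyedCount;  // hypothetical counter, only to observe the cascade
};

int CountingNode::destroyedCount = 0;

// Builds a three-node tree, deletes only the root, and reports how many
// destructors ran as a result.
int DeleteSmallTree()
{
    CountingNode::destroyedCount = 0;
    CountingNode* ptrRoot = new CountingNode( 5 );
    ptrRoot->ptrLeft = new CountingNode( 3 );
    ptrRoot->ptrRight = new CountingNode( 9 );
    delete ptrRoot;
    return CountingNode::destroyedCount;
}
```

Calling \texttt{DeleteSmallTree()} returns 3: one \texttt{delete} on the root cleans up both children as well. One trade-off of this design is that the cleanup is itself recursive, so a very tall tree means a deep chain of destructor calls.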
\newpage
\subsection{BinarySearchTree in C++:}
\begin{lstlisting}[style=code]
template <typename TK, typename TD>
class BinarySearchTree
{
public:
    BinarySearchTree();
    ~BinarySearchTree();

    // Basic functionality
    void Push( const TK& newKey, const TD& newData );
    bool Contains( const TK& key );
    TD& GetData( const TK& key );
    void Delete( const TK& key );

    // Traversal functions
    string GetInOrder();
    string GetPreOrder();
    string GetPostOrder();

    // Additional functionality
    TK& GetMinKey();
    TK& GetMaxKey();
    int GetCount();
    int GetHeight();

private:
    // (more here)

private:
    Node<TK, TD>* m_ptrRoot;
    int m_nodeCount;
};
\end{lstlisting}
A Binary Search Tree, just like other data structures, can offer more
functionality than this, or less if needed. ~\\
There are additional \textbf{private methods} that would be implemented.
This declaration is just showing the \textbf{public (interface) methods}
and the \textbf{private member variables}. I will talk about the
private helper methods in depth later.
%--------------------------------------------------------------------%
%--------------------------------------------------------------------%
\newpage
\section{Efficiency of a Binary Search Tree}
The Binary Search Tree ends up being a good compromise between
choosing \textbf{faster random access but slow search/insert/delete} (like with a dynamic array)
and \textbf{faster inserts/deletes but slow search/access} (like with a linked list).
~\\
On average, the Binary Search Tree ends up being slower than $O(1)$ (constant) but
faster than $O(n)$ (linear) for all of its operations:
\begin{center}
\begin{tabular}{| l || c | c | c | c |} \hline
\textbf{Structure} & \textbf{Random access} & \textbf{Search} & \textbf{Insert} & \textbf{Delete}
\\ \hline
\textbf{Dynamic Array} & $O(1)$ & $O(n)$ & $O(n)$ & $O(n)$
\\ \hline
\textbf{Linked List} & $O(n)$ & $O(n)$ & $O(1)$ & $O(1)$
\\ \hline
\textbf{Binary Search Tree} & $O(\log n)$ & $O(\log n)$ & $O(\log n)$ & $O(\log n)$
\\ \hline
\end{tabular}
\end{center}
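Why is this? By the nature of its tree structure, as we traverse
the tree we're essentially \textbf{cutting out half the tree}
each time we choose to go \textit{left} or \textit{right}.
Halving the nodes to search \textit{each step} gives us
the opposite of exponential growth: a logarithmic function.
This only holds while the tree stays reasonably balanced. If the keys
arrive in sorted order, every new node goes to the same side, the tree
degenerates into a linked list, and these operations become $O(n)$.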
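
To put numbers on the halving idea, the sketch below counts how many times a node count can be cut in half before nothing is left. \texttt{HalvingSteps} is a hypothetical helper, and it models the ideal case where every step down the tree discards half of the remaining nodes:

```cpp
#include <cassert>

// Counts how many times nodeCount can be halved before nothing is left.
// This models the ideal case of a balanced tree, where each step down
// discards half of the remaining nodes.
int HalvingSteps( int nodeCount )
{
    int steps = 0;
    while ( nodeCount > 0 )
    {
        nodeCount /= 2;
        steps++;
    }
    return steps;
}
```

\texttt{HalvingSteps( 1000 )} is 10 and \texttt{HalvingSteps( 1000000 )} is 20, so making the tree a thousand times larger only doubles the number of steps.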
\begin{figure}[h]
\centering
\begin{subfigure}{.3\textwidth}
\centering
Constant growth ~\\ $O(1)$ ~\\~\\
\resizebox{\textwidth}{!}{%
\begin{tikzpicture}
\begin{axis}[xmin=0,ymin=0,xmax=5,ymax=2,xticklabel=\empty,yticklabel=\empty,]
\addplot[red] coordinates
{(0,0.5) (20,0.5)};
\end{axis}
\end{tikzpicture}
}
\end{subfigure}%
\begin{subfigure}{.3\textwidth}
\centering
Logarithmic growth ~\\ $O(\log n)$ ~\\~\\
\resizebox{\textwidth}{!}{%
\begin{tikzpicture}
\begin{axis}[xmin=0,ymin=0,xmax=5,ymax=2,xticklabel=\empty,yticklabel=\empty]
\addplot[red] {log10(x)};
\end{axis}
\end{tikzpicture}
}
\end{subfigure}%
\begin{subfigure}{.3\textwidth}
\centering
Linear growth ~\\
$O(n)$ ~\\~\\
\resizebox{\textwidth}{!}{%
\begin{tikzpicture}
\begin{axis}[xmin=0,ymin=0,xmax=5,ymax=5,xticklabel=\empty,yticklabel=\empty,]
\addplot[red] coordinates
{(0,0) (5,5)};
\end{axis}
\end{tikzpicture}
}
\end{subfigure}
\end{figure}
%--------------------------------------------------------------------%
%--------------------------------------------------------------------%
%--------------------------------------------------------------------%
%--------------------------------------------------------------------%
\newpage
\section{The Binary Search Tree and Recursion}
Many of the BinarySearchTree functions will be \textbf{recursive},
starting at the root node and recursing down. Because of this,
the function that actually does the work for Push (and for many other operations) would look like this:
\begin{lstlisting}[style=code]
void RecursivePush(
    TK newKey,
    TD newData,
    Node<TK, TD>* ptrCurrent );
\end{lstlisting}
However, we don't want the user \textit{outside of the BinarySearchTree}
to have to pass one of the tree's nodes into Push. They shouldn't even
have access to any \texttt{Node} objects...
\begin{lstlisting}[style=code]
myTree.Push( 'a', "apple", ???? ); // What do I pass in?
\end{lstlisting}
~\\ That's why we have the \textbf{public Push function}...
\begin{lstlisting}[style=code]
void Push( TK newKey, TD newData );
\end{lstlisting}
~\\ And a \textbf{private RecursivePush function}...
\begin{lstlisting}[style=code]
void RecursivePush( TK newKey, TD newData,
                    Node<TK, TD>* ptrCurrent );
\end{lstlisting}
~\\ Where the user calls the \textbf{public Push} and that function
makes the first call to \textbf{RecursivePush}, passing in the
root node to begin operations on.
\begin{lstlisting}[style=code]
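void Push( TK newKey, TD newData )
{
    // Simplified: the duplicate-key and empty-tree checks
    // are covered in the Push section later on.
    RecursivePush( newKey, newData, m_ptrRoot );
}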
\end{lstlisting}
\newpage
\subsection{Private recursive functions}
\begin{itemize}
\item \texttt{void Push( TK newKey, TD newData )} calls ~\\
\texttt{ RecursivePush( newKey, newData, m\_ptrRoot ); }
\item \texttt{bool Contains( TK key )} calls ~\\
\texttt{ return RecursiveContains( key, m\_ptrRoot ); }
\item \texttt{string GetPreOrder()} calls ~\\
\texttt{ return RecursiveGetPreOrder( m\_ptrRoot ); }
\item \texttt{string GetInOrder()} calls ~\\
\texttt{ return RecursiveGetInOrder( m\_ptrRoot ); }
\item \texttt{string GetPostOrder()} calls ~\\
\texttt{ return RecursiveGetPostOrder( m\_ptrRoot ); }
\item \texttt{Node<TK,TD>* FindNode( TK key )} calls ~\\
\texttt{ return RecursiveFindNode( key, m\_ptrRoot ); }
\item \texttt{TK GetMaxKey()} calls ~\\
\texttt{ return RecursiveGetMaxKey( m\_ptrRoot ); }
\item \texttt{TK GetMinKey()} calls ~\\
\texttt{ return RecursiveGetMinKey( m\_ptrRoot ); }
\item \texttt{int GetHeight()} calls ~\\
\texttt{ return RecursiveGetHeight( m\_ptrRoot ); }
\end{itemize}
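
As a sketch of how two of these public/private pairs might fit together, the example below uses a simplified tree with integer keys and no data. \texttt{KeyNode}, \texttt{KeyTree}, and \texttt{BuildSample} are hypothetical names, and the space-separated output of \texttt{GetInOrder} is an assumption, since these notes do not specify the string format:

```cpp
#include <cassert>
#include <string>

// Simplified stand-in: integer keys only, no data payload (illustration only).
struct KeyNode
{
    explicit KeyNode( int newKey ) : key( newKey ) {}
    ~KeyNode() { delete ptrLeft; delete ptrRight; }
    int key;
    KeyNode* ptrLeft = nullptr;
    KeyNode* ptrRight = nullptr;
};

class KeyTree
{
public:
    ~KeyTree() { delete m_ptrRoot; }

    // Public wrappers: the caller never sees a node pointer.
    bool Contains( int key ) const { return RecursiveContains( key, m_ptrRoot ); }
    std::string GetInOrder() const { return RecursiveGetInOrder( m_ptrRoot ); }

    // Hypothetical helper so the sketch has data without a full Push.
    void BuildSample()
    {
        delete m_ptrRoot;
        m_ptrRoot = new KeyNode( 5 );
        m_ptrRoot->ptrLeft = new KeyNode( 3 );
        m_ptrRoot->ptrRight = new KeyNode( 9 );
    }

private:
    bool RecursiveContains( int key, const KeyNode* ptrCurrent ) const
    {
        if ( ptrCurrent == nullptr )  { return false; }  // ran out of nodes
        if ( key == ptrCurrent->key ) { return true; }
        if ( key < ptrCurrent->key )  { return RecursiveContains( key, ptrCurrent->ptrLeft ); }
        return RecursiveContains( key, ptrCurrent->ptrRight );
    }

    // Left subtree, then this node, then right subtree: keys come out sorted.
    std::string RecursiveGetInOrder( const KeyNode* ptrCurrent ) const
    {
        if ( ptrCurrent == nullptr ) { return ""; }
        return RecursiveGetInOrder( ptrCurrent->ptrLeft )
             + std::to_string( ptrCurrent->key ) + " "
             + RecursiveGetInOrder( ptrCurrent->ptrRight );
    }

    KeyNode* m_ptrRoot = nullptr;
};
```

In this sketch the recursive helpers treat a null pointer as the stopping case, so the public wrappers can pass the root along without checking it first.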
% Delete functionality
\newpage
\section{Functionality of a Binary Search Tree}
\subsection{The Constructor}
The constructor of the BinarySearchTree should set the \texttt{m\_ptrRoot}
pointer to \texttt{nullptr} and initialize the \texttt{m\_nodeCount} to 0.
\vspace{1cm}
\subsection{The Destructor}
The destructor of the BinarySearchTree will check to see if
\texttt{m\_ptrRoot} is not null - if it's not null, we will free
that memory with the \texttt{delete} command, which runs the root
\texttt{Node}'s destructor and cleans up the rest of the tree.
\vspace{1cm}
\subsection{Push}
Within \textbf{Push}, first check to see if the tree already
contains a node with the given \texttt{newKey} by calling the
\texttt{Contains} method. If that key is already present,
I would throw an exception (for this design, we are assuming
the keys are unique identifiers). ~\\
If the key is \textit{not} already in the tree, then
we are concerned with two scenarios:
\begin{enumerate}
\item The \texttt{m\_ptrRoot} is nullptr.
\item The \texttt{m\_ptrRoot} \textit{is not nullptr}.
\end{enumerate}
If the root is null, this is where we put our new node and set up its data:
\begin{lstlisting}[style=code]
m_ptrRoot = new Node<TK,TD>( newKey, newData );
m_nodeCount++;
\end{lstlisting}
If the root is already storing some data, then we call
\texttt{RecursivePush}, passing forward the \texttt{newKey},
\texttt{newData}, and the \texttt{m\_ptrRoot} as the starting point.
\newpage
\subsubsection{RecursivePush}
Within the RecursivePush function, we need to be concerned with several scenarios:
\begin{enumerate}
\item The \texttt{newKey} is \textbf{less than} the \texttt{ptrCurrent->key}, ~\\
and \texttt{ptrCurrent->ptrLeft} \textbf{is} \texttt{nullptr}: ~\\
Store the new data here.
\item The \texttt{newKey} is \textbf{less than} the \texttt{ptrCurrent->key}, ~\\
and \texttt{ptrCurrent->ptrLeft} \textbf{IS NOT} \texttt{nullptr}: ~\\
Recurse left.
\item The \texttt{newKey} is \textbf{greater than} the \texttt{ptrCurrent->key}, ~\\
and \texttt{ptrCurrent->ptrRight} \textbf{is} \texttt{nullptr}: ~\\
Store the new data here.
\item The \texttt{newKey} is \textbf{greater than} the \texttt{ptrCurrent->key}, ~\\
and \texttt{ptrCurrent->ptrRight} \textbf{IS NOT} \texttt{nullptr}: ~\\
Recurse right.
\end{enumerate}
Setting up the new node is a matter of just allocating space and incrementing the \texttt{m\_nodeCount}:
\begin{lstlisting}[style=code]
// Storing data to the left of the current pointer
ptrCurrent->ptrLeft = new Node<TK,TD>(newKey, newData);
m_nodeCount++;
\end{lstlisting}
And recursing just requires passing through the same key and data, but a new node to look at:
\begin{lstlisting}[style=code]
// Recurse left
RecursivePush( newKey, newData, ptrCurrent->ptrLeft );
\end{lstlisting}
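
Putting the pieces of Push and RecursivePush together, one possible sketch looks like the following. \texttt{SketchNode} and \texttt{SketchTree} are hypothetical stand-ins for the classes in these notes, \texttt{std::invalid\_argument} is an assumed choice of exception type, and \texttt{Contains} is written as a simple loop here just to keep the example short:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

template <typename TK, typename TD>
struct SketchNode
{
    SketchNode( TK newKey, TD newData ) : key( newKey ), data( newData ) {}
    ~SketchNode() { delete ptrLeft; delete ptrRight; }
    TK key;
    TD data;
    SketchNode<TK, TD>* ptrLeft = nullptr;
    SketchNode<TK, TD>* ptrRight = nullptr;
};

template <typename TK, typename TD>
class SketchTree
{
public:
    ~SketchTree() { delete m_ptrRoot; }

    void Push( const TK& newKey, const TD& newData )
    {
        // Assumption: duplicate keys are reported with std::invalid_argument.
        if ( Contains( newKey ) )
        {
            throw std::invalid_argument( "Key already exists in the tree" );
        }

        if ( m_ptrRoot == nullptr )
        {
            // Scenario 1: empty tree, the new node becomes the root.
            m_ptrRoot = new SketchNode<TK, TD>( newKey, newData );
            m_nodeCount++;
        }
        else
        {
            // Scenario 2: start the recursion at the root.
            RecursivePush( newKey, newData, m_ptrRoot );
        }
    }

    bool Contains( const TK& key ) const
    {
        const SketchNode<TK, TD>* ptrCurrent = m_ptrRoot;
        while ( ptrCurrent != nullptr )
        {
            if      ( key < ptrCurrent->key ) { ptrCurrent = ptrCurrent->ptrLeft; }
            else if ( ptrCurrent->key < key ) { ptrCurrent = ptrCurrent->ptrRight; }
            else                              { return true; }
        }
        return false;
    }

    int GetCount() const { return m_nodeCount; }

private:
    void RecursivePush( const TK& newKey, const TD& newData,
                        SketchNode<TK, TD>* ptrCurrent )
    {
        assert( ptrCurrent != nullptr );  // Push handles the empty tree

        if ( newKey < ptrCurrent->key )
        {
            if ( ptrCurrent->ptrLeft == nullptr )
            {
                // Free spot on the left: store the new data here.
                ptrCurrent->ptrLeft = new SketchNode<TK, TD>( newKey, newData );
                m_nodeCount++;
            }
            else
            {
                RecursivePush( newKey, newData, ptrCurrent->ptrLeft );
            }
        }
        else
        {
            // Equal keys were rejected in Push, so newKey is greater here.
            if ( ptrCurrent->ptrRight == nullptr )
            {
                ptrCurrent->ptrRight = new SketchNode<TK, TD>( newKey, newData );
                m_nodeCount++;
            }
            else
            {
                RecursivePush( newKey, newData, ptrCurrent->ptrRight );
            }
        }
    }

    SketchNode<TK, TD>* m_ptrRoot = nullptr;
    int m_nodeCount = 0;
};
```

With a \texttt{SketchTree<char, std::string>}, pushing \texttt{'m'}, \texttt{'a'}, and \texttt{'z'} leaves a count of 3, and pushing \texttt{'a'} a second time throws instead of adding a fourth node.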
\newpage
\subsection{GetNodeWithKey}
\subsection{GetMinKey}
\subsection{GetMaxKey}
\subsection{GetHeight}
\end{document}
\contentsline {chapter}{\numberline {1}Binary Search Trees}{2}%
\contentsline {section}{\numberline {1.1}Introduction to Binary Search Trees}{2}%
\contentsline {section}{\numberline {1.2}Architecture of a Binary Search Tree}{3}%
\contentsline {subsection}{\numberline {1.2.1}BinarySearchTreeNode in C++:}{5}%
\contentsline {subsection}{\numberline {1.2.2}BinarySearchTree in C++:}{6}%
\contentsline {section}{\numberline {1.3}Efficiency of a Binary Search Tree}{7}%
\contentsline {section}{\numberline {1.4}The Binary Search Tree and Recursion}{8}%
\contentsline {subsection}{\numberline {1.4.1}Private recursive functions}{9}%
\contentsline {section}{\numberline {1.5}Functionality of a Binary Search Tree}{10}%
\contentsline {subsection}{\numberline {1.5.1}The Constructor}{10}%
\contentsline {subsection}{\numberline {1.5.2}The Destructor}{10}%
\contentsline {subsection}{\numberline {1.5.3}Push}{10}%
\contentsline {subsubsection}{RecursivePush}{11}%
\contentsline {subsection}{\numberline {1.5.4}GetNodeWithKey}{12}%
\contentsline {subsection}{\numberline {1.5.5}GetMinKey}{12}%
\contentsline {subsection}{\numberline {1.5.6}GetMaxKey}{12}%
\contentsline {subsection}{\numberline {1.5.7}GetHeight}{12}%
\documentclass[a4paper,12pt,oneside]{book}
\usepackage[utf8]{inputenc}
\newcommand{\laTopic} {Binary Search Trees}
\newcommand{\laTitle} {Rachel's Data Structures Notes}
\newcounter{question}
\renewcommand{\chaptername}{Topic}
\usepackage{../../rachwidgets}
\usepackage{../../rachdiagrams}
\title{}
\author{Rachel Singh}
\date{\today}
\pagestyle{fancy}
\fancyhf{}
\lhead{\laTopic \ / \laTitle}
\chead{}
\rhead{\thepage}
\rfoot{\tiny \thepage\ of \pageref{LastPage}}
\lfoot{\tiny Rachel Singh, last updated \today}
\renewcommand{\headrulewidth}{2pt}
\renewcommand{\footrulewidth}{1pt}
\newtoggle{standalone}
\toggletrue{standalone}
\begin{document}