Skip to content

New search algorithm for comment visibility

In !473 (closed) there was a proposal for a new algorithm to get the thread visibilities of an entire comment tree without calling a function for every comment in the tree.

The current algorithm is best case linear and worst case quadratic. The best case is from when Comment::isThreadVisible returns immediately when the comment is visible. This is a constant time operation done for all comments in the tree, hence it is linear in the number of comments. The quadratic case is from when the comment is the parent of a long chain of comments that are all not visible, except for the last comment, which is visible.

The current algorithm uses a forward search to determine whether a single comment is visible. The algorithm in the proposal, which also would be best case linear in the number of comments, suggests a backward search to determine all of the visible comments. It had the same issue with the search path, and it would be worse case quadratic when a sizable portion of the number of comments share the same parent.

The new algorithm uses both of those approaches. A forwards pass and a backwards pass. Both passes visit each comment in the tree once, hence it will be linear in the number of comments.

The forwards pass is a breadth-first-search. This will use a new CommentTree wrapper class to have a backreference to its parent and its visibility will have an initial value from isVisibleToUser.

The backwards pass is a reverse level order traversal of the comment tree. If the comment is visible, it sets its parent visibility to true. The backwards pass also builds the return array which is a map of int to bool. With a comment id, accessing this map should be constant time. It maps the comment id to its thread visibility to the user.

Thread visibility to the user is defined by two cases. During the search, either the comment is visible to the user or a descendant of the comment is visible to the user. The new algorithm has a different definition, but the correctness of the new algorithm is shown by recursively applying this definition.

Make a note of the backwards pass of the algorithm. When a comment is visible, it sets its parent visibility to true. And because this is a reverse level order traversal, all the children of a parent must be visited before the parent. This is equivalent to the parent having at least one visible child.

Now, pick one of children that was visible and recursively apply this definition. The child is either has children or not. If it has no children, pick another child. If no other child of the original parent has any children, then the algorithm is correct because these were all the descendants of the parent. Hence, it is the same as saying the parent had a visible descendant.

If the child does have a child, because the child was visible, it was either visible to the user or it had at least one child that was. Pick one of its children. Applying once more the definition will show that an arbitrary depth can be reached, which is equivalent to saying the original parent had a visible descendant.

There was a minor bug in Comment::isThreadVisibleToUser which caused soft deleted comments to be visible when they had no visible descendants. The bug was fixed, and it was because the logic for a voter is different than the logic for a search path.

I have added a test which should confirm that running the new algorithm once on a top level comment is the equivalent of running isThreadVisibleToUser on every comment in the tree. I have run phpunit on CommentTest and have received:

OK (69 tests, 110 assertions)

I am requesting to merge this into a test branch because I want to know if the automated builds will work and confirm this test.

After these changes, the Submission entity will require an update. For linear view, it will get all comments and filter through Comment::isVisibleToUser with $search = true. For nested view, get all top level comments and call Comment::getAllThreadVisibilitiesForUser which will produce a map for every top level comment. The comment template will then use this map. These changes will be done later.

Merge request reports