Skip to content

draft: Advanced topic filtering

Daniel Woznicki requested to merge dwoznicki/tildes:topic-filtering into master

This MR is a proof of concept for more finely grained topic listing filtering.

What's in this draft

The initial draft is a simple /advanced_search page which provides more filtering options via the query string. All filters are repeatable and can be mixed together. Available query string options include:

  • group_ids: Only show topics with any of these group IDs.
  • no_group_ids: Only show topics without any of these group IDs.
  • tags: Only show topics with any of these tags. PostgreSQL ltree formatting is supported. For example, foo.* will match every tag with root foo.
  • all_tags: Only show topics with all of these tags. For example, all_tags=hello&all_tags=world will only show topics tagged as both hello and world. PostgreSQL ltree formatting is supported.
  • not_tags: Only show topics without any of these tags. PostgreSQL ltree formatting is supported.
  • user_ids: Only show topics posted by any of these users.

My goal was to handle all the use cases laid out in #359.

Benchmarks

I did a bit of limited benchmarking to make sure performance looked reasonable. To check latency, I queried a set of randomly generated filter queries 5 times each and tracked the average. Benchmarks were run with 1,000,000 randomly generated topics (see !147) in the database. All benchmarks were run on my personal Macbook.

MacBook Pro (Retina, 13-inch, Late 2013)
Processor 2.6 GHz Dual-Core Intel Core i5
Memory 8 GB 1600 MHz DDR3

After a couple runs, it looks like the average latency for these queries is around 250ms. More complex queries take longer than I'd like, so I may look into this more, depending on how things move forward.

Click for a sample benchmarking run.
url = https://localhost:4443/advanced_search?tags=professor&all_tags=benefit&user_ids=1612&user_ids=1655&user_ids=1631&user_ids=1639, num_queries = 5, avg = 156ms, max = 171ms, min = 137ms
url = https://localhost:4443/advanced_search?not_tags=throughout&not_group_ids=1476&tags=wife&user_ids=1683&tags=tend&group_ids=1452, num_queries = 5, avg = 296ms, max = 414ms, min = 235ms
url = https://localhost:4443/advanced_search?all_tags=unit&all_tags=nearly, num_queries = 5, avg = 140ms, max = 174ms, min = 121ms
url = https://localhost:4443/advanced_search?not_group_ids=1383, num_queries = 5, avg = 205ms, max = 303ms, min = 153ms
url = https://localhost:4443/advanced_search?user_ids=1735&tags=read&not_group_ids=1427&tags=commercial&user_ids=1669&user_ids=1676, num_queries = 5, avg = 355ms, max = 375ms, min = 337ms
url = https://localhost:4443/advanced_search?group_ids=1407, num_queries = 5, avg = 240ms, max = 446ms, min = 155ms
url = https://localhost:4443/advanced_search?group_ids=1446&all_tags=responsibility&all_tags=former&all_tags=expert&all_tags=road, num_queries = 5, avg = 118ms, max = 133ms, min = 103ms
url = https://localhost:4443/advanced_search?tags=discover&tags=owner&tags=financial&not_group_ids=1427&group_ids=1476&not_group_ids=1456, num_queries = 5, avg = 426ms, max = 465ms, min = 392ms
url = https://localhost:4443/advanced_search?tags=lead&not_tags=these&not_group_ids=1395&tags=magazine&user_ids=1708, num_queries = 5, avg = 356ms, max = 381ms, min = 333ms
url = https://localhost:4443/advanced_search?not_group_ids=1404&user_ids=1633&group_ids=1474&group_ids=1422&tags=available, num_queries = 5, avg = 236ms, max = 267ms, min = 209ms
url = https://localhost:4443/advanced_search?group_ids=1462&all_tags=option&user_ids=1703&group_ids=1435&not_tags=kid, num_queries = 5, avg = 178ms, max = 206ms, min = 160ms
url = https://localhost:4443/advanced_search?group_ids=1475, num_queries = 5, avg = 143ms, max = 157ms, min = 130ms
url = https://localhost:4443/advanced_search?not_tags=scientist&user_ids=1662&all_tags=suddenly&tags=meeting&tags=anyone, num_queries = 5, avg = 229ms, max = 245ms, min = 219ms
url = https://localhost:4443/advanced_search?not_group_ids=1414, num_queries = 5, avg = 226ms, max = 383ms, min = 165ms
url = https://localhost:4443/advanced_search?all_tags=seek, num_queries = 5, avg = 702ms, max = 893ms, min = 533ms
url = https://localhost:4443/advanced_search?all_tags=statement.*&tags=or, num_queries = 5, avg = 339ms, max = 916ms, min = 145ms
url = https://localhost:4443/advanced_search?group_ids=1419&group_ids=1429&all_tags=together, num_queries = 5, avg = 290ms, max = 374ms, min = 222ms
url = https://localhost:4443/advanced_search?all_tags=ten.*&group_ids=1387&all_tags=voice, num_queries = 5, avg = 152ms, max = 263ms, min = 116ms
url = https://localhost:4443/advanced_search?user_ids=1610&group_ids=1466, num_queries = 5, avg = 191ms, max = 281ms, min = 150ms
url = https://localhost:4443/advanced_search?not_tags=official&not_group_ids=1386&tags=show&not_tags=right, num_queries = 5, avg = 605ms, max = 686ms, min = 489ms
url = https://localhost:4443/advanced_search?group_ids=1442, num_queries = 5, avg = 177ms, max = 218ms, min = 149ms
url = https://localhost:4443/advanced_search?not_tags=federal&not_group_ids=1396&all_tags=enter&tags=realize&tags=rich.*, num_queries = 5, avg = 203ms, max = 355ms, min = 143ms
url = https://localhost:4443/advanced_search?group_ids=1460&not_tags=want.*&all_tags=so&all_tags=argue&not_tags=Mr, num_queries = 5, avg = 126ms, max = 140ms, min = 111ms
url = https://localhost:4443/advanced_search?tags=similar&tags=piece, num_queries = 5, avg = 531ms, max = 729ms, min = 461ms
url = https://localhost:4443/advanced_search?not_tags=professor&tags=when&not_tags=thus&group_ids=1405, num_queries = 5, avg = 299ms, max = 366ms, min = 258ms
url = https://localhost:4443/advanced_search?tags=war&all_tags=lot&all_tags=without&group_ids=1431, num_queries = 5, avg = 235ms, max = 308ms, min = 197ms
url = https://localhost:4443/advanced_search?user_ids=1788&not_tags=ability, num_queries = 5, avg = 328ms, max = 440ms, min = 268ms
url = https://localhost:4443/advanced_search?group_ids=1400&all_tags=south, num_queries = 5, avg = 233ms, max = 278ms, min = 203ms
url = https://localhost:4443/advanced_search?not_tags=rule&not_tags=herself&all_tags=special, num_queries = 5, avg = 379ms, max = 413ms, min = 343ms
url = https://localhost:4443/advanced_search?all_tags=establish&all_tags=report&not_tags=action&all_tags=consumer&not_group_ids=1403&not_tags=thus, num_queries = 5, avg = 122ms, max = 133ms, min = 103ms
url = https://localhost:4443/advanced_search?user_ids=1714, num_queries = 5, avg = 168ms, max = 189ms, min = 151ms
url = https://localhost:4443/advanced_search?tags=final.*&tags=claim, num_queries = 5, avg = 246ms, max = 602ms, min = 152ms
url = https://localhost:4443/advanced_search?user_ids=1668&all_tags=here, num_queries = 5, avg = 290ms, max = 461ms, min = 196ms
url = https://localhost:4443/advanced_search?all_tags=degree&tags=all, num_queries = 5, avg = 230ms, max = 273ms, min = 204ms
url = https://localhost:4443/advanced_search?group_ids=1462&not_tags=really&all_tags=vote&all_tags=reach.*&tags=remain&all_tags=camera, num_queries = 5, avg = 181ms, max = 207ms, min = 159ms
url = https://localhost:4443/advanced_search?not_tags=get&all_tags=most&not_tags=then&user_ids=1754, num_queries = 5, avg = 237ms, max = 311ms, min = 193ms
url = https://localhost:4443/advanced_search?tags=include&all_tags=everything&not_group_ids=1414, num_queries = 5, avg = 188ms, max = 230ms, min = 171ms
url = https://localhost:4443/advanced_search?all_tags=treat&all_tags=team&all_tags=education&not_group_ids=1466, num_queries = 5, avg = 110ms, max = 116ms, min = 100ms
url = https://localhost:4443/advanced_search?all_tags=memory.*&all_tags=go&group_ids=1439&not_group_ids=1442&user_ids=1741, num_queries = 5, avg = 146ms, max = 177ms, min = 114ms
url = https://localhost:4443/advanced_search?tags=but&group_ids=1479, num_queries = 5, avg = 296ms, max = 377ms, min = 225ms
url = https://localhost:4443/advanced_search?tags=no&not_group_ids=1480&not_tags=model&user_ids=1700&not_group_ids=1440&not_tags=already, num_queries = 5, avg = 259ms, max = 273ms, min = 241ms
url = https://localhost:4443/advanced_search?all_tags=product&all_tags=market&tags=energy&group_ids=1429&not_tags=officer&tags=animal, num_queries = 5, avg = 204ms, max = 400ms, min = 135ms
url = https://localhost:4443/advanced_search?not_tags=everyone&all_tags=charge&all_tags=travel, num_queries = 5, avg = 213ms, max = 237ms, min = 181ms
url = https://localhost:4443/advanced_search?not_tags=activity&not_group_ids=1388, num_queries = 5, avg = 256ms, max = 314ms, min = 185ms
url = https://localhost:4443/advanced_search?tags=black&all_tags=side&all_tags=important&not_tags=color&tags=report, num_queries = 5, avg = 208ms, max = 314ms, min = 146ms
url = https://localhost:4443/advanced_search?not_tags=list&tags=top&user_ids=1718&not_group_ids=1452, num_queries = 5, avg = 178ms, max = 211ms, min = 162ms
url = https://localhost:4443/advanced_search?group_ids=1481&tags=truth&all_tags=dark, num_queries = 5, avg = 145ms, max = 177ms, min = 133ms
url = https://localhost:4443/advanced_search?not_tags=parent&group_ids=1425, num_queries = 5, avg = 148ms, max = 160ms, min = 135ms
url = https://localhost:4443/advanced_search?all_tags=especially&all_tags=agree.*, num_queries = 5, avg = 140ms, max = 149ms, min = 129ms
url = https://localhost:4443/advanced_search?not_group_ids=1388&all_tags=project&not_tags=star, num_queries = 5, avg = 438ms, max = 640ms, min = 312ms
url = https://localhost:4443/advanced_search?user_ids=1711&user_ids=1770, num_queries = 5, avg = 366ms, max = 553ms, min = 219ms
url = https://localhost:4443/advanced_search?not_group_ids=1435&not_group_ids=1479&not_tags=reflect, num_queries = 5, avg = 321ms, max = 367ms, min = 291ms
url = https://localhost:4443/advanced_search?all_tags=finish&user_ids=1651, num_queries = 5, avg = 262ms, max = 351ms, min = 204ms
url = https://localhost:4443/advanced_search?user_ids=1638&not_group_ids=1474&tags=control, num_queries = 5, avg = 230ms, max = 260ms, min = 183ms
url = https://localhost:4443/advanced_search?group_ids=1393&not_tags=which&not_tags=wish, num_queries = 5, avg = 149ms, max = 181ms, min = 132ms
url = https://localhost:4443/advanced_search?not_group_ids=1480&not_tags=manage&not_group_ids=1477&tags=new, num_queries = 5, avg = 399ms, max = 501ms, min = 341ms
url = https://localhost:4443/advanced_search?user_ids=1805, num_queries = 5, avg = 294ms, max = 415ms, min = 207ms
url = https://localhost:4443/advanced_search?user_ids=1779&not_group_ids=1425&group_ids=1420&user_ids=1775&group_ids=1405&not_tags=city, num_queries = 5, avg = 261ms, max = 293ms, min = 230ms
url = https://localhost:4443/advanced_search?user_ids=1682&all_tags=themselves&group_ids=1464&all_tags=kid&tags=pay, num_queries = 5, avg = 314ms, max = 608ms, min = 212ms
url = https://localhost:4443/advanced_search?tags=success&not_group_ids=1383&all_tags=appear&tags=woman, num_queries = 5, avg = 198ms, max = 238ms, min = 173ms
url = https://localhost:4443/advanced_search?not_group_ids=1475&not_group_ids=1383&all_tags=away&not_tags=agreement&group_ids=1400, num_queries = 5, avg = 179ms, max = 188ms, min = 164ms
url = https://localhost:4443/advanced_search?tags=development&not_tags=often, num_queries = 5, avg = 173ms, max = 211ms, min = 151ms
url = https://localhost:4443/advanced_search?not_tags=out&tags=travel&user_ids=1623&group_ids=1468&group_ids=1461, num_queries = 5, avg = 211ms, max = 262ms, min = 173ms
url = https://localhost:4443/advanced_search?not_tags=certain&all_tags=each&not_tags=dog&not_group_ids=1410&all_tags=tax&user_ids=1677, num_queries = 5, avg = 108ms, max = 117ms, min = 99ms
url = https://localhost:4443/advanced_search?not_tags=risk&all_tags=against&group_ids=1470, num_queries = 5, avg = 226ms, max = 324ms, min = 165ms
url = https://localhost:4443/advanced_search?tags=fight&user_ids=1652&group_ids=1460, num_queries = 5, avg = 136ms, max = 157ms, min = 119ms
url = https://localhost:4443/advanced_search?group_ids=1394&not_tags=throw&tags=fund&user_ids=1712&not_tags=or.*, num_queries = 5, avg = 352ms, max = 391ms, min = 295ms
url = https://localhost:4443/advanced_search?not_group_ids=1440, num_queries = 5, avg = 305ms, max = 447ms, min = 156ms
url = https://localhost:4443/advanced_search?group_ids=1424&tags=chair&user_ids=1701&group_ids=1479, num_queries = 5, avg = 196ms, max = 214ms, min = 180ms
url = https://localhost:4443/advanced_search?group_ids=1401&tags=wear&group_ids=1433&user_ids=1659&all_tags=close&tags=power.*, num_queries = 5, avg = 223ms, max = 254ms, min = 205ms
url = https://localhost:4443/advanced_search?user_ids=1753&not_group_ids=1389&group_ids=1445&all_tags=use.*&all_tags=put&tags=protect, num_queries = 5, avg = 157ms, max = 173ms, min = 146ms
url = https://localhost:4443/advanced_search?not_tags=information&tags=final&not_tags=impact&tags=decade&group_ids=1405, num_queries = 5, avg = 298ms, max = 310ms, min = 289ms
url = https://localhost:4443/advanced_search?tags=or&not_tags=site&tags=general, num_queries = 5, avg = 231ms, max = 305ms, min = 182ms
url = https://localhost:4443/advanced_search?not_group_ids=1463&not_tags=create&all_tags=may, num_queries = 5, avg = 651ms, max = 701ms, min = 541ms
url = https://localhost:4443/advanced_search?not_tags=art&group_ids=1415&tags=daughter&all_tags=when&not_tags=own, num_queries = 5, avg = 104ms, max = 146ms, min = 90ms
url = https://localhost:4443/advanced_search?not_group_ids=1461&all_tags=allow&not_tags=bad, num_queries = 5, avg = 166ms, max = 175ms, min = 156ms
url = https://localhost:4443/advanced_search?not_group_ids=1470&group_ids=1408&user_ids=1718&not_tags=beyond&not_group_ids=1438, num_queries = 5, avg = 163ms, max = 336ms, min = 108ms
url = https://localhost:4443/advanced_search?not_group_ids=1405&tags=enjoy&not_tags=land&user_ids=1672, num_queries = 5, avg = 227ms, max = 259ms, min = 207ms
url = https://localhost:4443/advanced_search?not_group_ids=1466&not_group_ids=1449&group_ids=1420&user_ids=1684&tags=remember.*&all_tags=catch, num_queries = 5, avg = 213ms, max = 315ms, min = 162ms
url = https://localhost:4443/advanced_search?not_tags=shoulder&not_group_ids=1472&user_ids=1710&user_ids=1704&user_ids=1663, num_queries = 5, avg = 201ms, max = 294ms, min = 166ms
url = https://localhost:4443/advanced_search?not_tags=public&not_group_ids=1381&group_ids=1422, num_queries = 5, avg = 336ms, max = 501ms, min = 237ms
url = https://localhost:4443/advanced_search?not_group_ids=1388&not_tags=fine&all_tags=later&not_group_ids=1403, num_queries = 5, avg = 482ms, max = 690ms, min = 347ms
url = https://localhost:4443/advanced_search?all_tags=many&user_ids=1697&tags=show&tags=just&tags=pay&group_ids=1394, num_queries = 5, avg = 321ms, max = 458ms, min = 247ms
url = https://localhost:4443/advanced_search?not_group_ids=1400&user_ids=1644&user_ids=1629, num_queries = 5, avg = 172ms, max = 188ms, min = 149ms
url = https://localhost:4443/advanced_search?user_ids=1629&not_tags=prepare, num_queries = 5, avg = 325ms, max = 632ms, min = 155ms
url = https://localhost:4443/advanced_search?all_tags=probably&not_group_ids=1452, num_queries = 5, avg = 445ms, max = 614ms, min = 345ms
url = https://localhost:4443/advanced_search?user_ids=1628&user_ids=1753&group_ids=1424&not_tags=ground, num_queries = 5, avg = 385ms, max = 652ms, min = 271ms
url = https://localhost:4443/advanced_search?user_ids=1775&all_tags=thank&not_group_ids=1441&user_ids=1742&all_tags=join&not_tags=a, num_queries = 5, avg = 243ms, max = 474ms, min = 168ms
url = https://localhost:4443/advanced_search?user_ids=1684&all_tags=focus, num_queries = 5, avg = 328ms, max = 461ms, min = 244ms
url = https://localhost:4443/advanced_search?not_group_ids=1414&all_tags=compare, num_queries = 5, avg = 165ms, max = 191ms, min = 150ms
url = https://localhost:4443/advanced_search?group_ids=1454&tags=radio&tags=end&not_group_ids=1473, num_queries = 5, avg = 308ms, max = 335ms, min = 291ms
url = https://localhost:4443/advanced_search?group_ids=1394&all_tags=cold&all_tags=year, num_queries = 5, avg = 154ms, max = 185ms, min = 141ms
url = https://localhost:4443/advanced_search?user_ids=1633&user_ids=1745&tags=away&user_ids=1656&user_ids=1665, num_queries = 5, avg = 313ms, max = 527ms, min = 250ms
url = https://localhost:4443/advanced_search?user_ids=1630&tags=life&tags=city&tags=against&tags=positive&group_ids=1420, num_queries = 5, avg = 104ms, max = 126ms, min = 88ms
url = https://localhost:4443/advanced_search?tags=also, num_queries = 5, avg = 513ms, max = 609ms, min = 457ms
url = https://localhost:4443/advanced_search?user_ids=1659&not_tags=finally&not_group_ids=1436, num_queries = 5, avg = 256ms, max = 401ms, min = 188ms
url = https://localhost:4443/advanced_search?not_group_ids=1384, num_queries = 5, avg = 187ms, max = 279ms, min = 148ms
url = https://localhost:4443/advanced_search?group_ids=1418&not_tags=control&tags=still&not_group_ids=1388, num_queries = 5, avg = 616ms, max = 834ms, min = 362ms
url = https://localhost:4443/advanced_search?all_tags=bit&group_ids=1425&not_group_ids=1474, num_queries = 5, avg = 458ms, max = 904ms, min = 292ms
url = https://localhost:4443/advanced_search?all_tags=next&not_group_ids=1468&not_tags=social, num_queries = 5, avg = 419ms, max = 550ms, min = 315ms


avg = 263.42049853515624, errs = 0

Actual implementation

Before I continue working, I'd like some feedback on what direction we'd like this feature to take. Keeping it as advanced search options is pretty easy, but other comments I've seen (e.g. #359 and here) suggest people are thinking of it more as a customizable feed. I'll propose a few options to get the ball rolling.

Option 1: Advanced search

In addition to text search, there could be an "Advanced search" link in the sidebar. Clicking would take you to a page with a search form containing fields for each filter option (e.g. Tags, Not tags, etc.). After filling out the form, a user would click a search button and be taken the search results page.

Option 2: Custom topic listings

In addition to the home page listing and group listings, users could create their own custom topic listings based on these or other filters. These listings could be named, and users could access them from... somewhere. Perhaps topic listings could be combined, so that a user could have their "Pizza cooking" topics and "Geocaching" topics appear in the same feed.

Option 3: Custom home page

Users could customize the topics that appear on the home page for their account. This is like a first class version of option 2. It also strikes me as more Reddit-esque.

Feel free to suggest more options, a different querying implementation, etc.

@cfabbro This seems like the kind of thing you might be interested in.

Edited by Daniel Woznicki

Merge request reports