Sometimes, the humble, un-exciting solutions have the largest impact.
HealthCare.gov has a service called Find Local Help that allows people to search for one-on-one help to enroll in healthcare. This vital search had problems sustaining more than a small set of searches at a time.
These location searches would take anywhere between 2-6 seconds to return results, and searching certain locations would trigger alerts for the development team.
After some digging, we found the location search didn’t use an index, and that we could improve how the database generated the index. We did this by using query plans and R-Tree spatial indexes.
Because SQL is a declarative language, we don’t see how a database searches records. Behind the scenes, a database uses the query planner to weigh the performance cost of database activities, and use the fastest possible way to return results. In practice, this may mean that if you add an index, your database might not use it, and you have to check.
To inspect the “query plan”, use DESCRIBE or EXPLAIN. This gives a summary of which indexes might get used, temporary tables held, and other useful data for improving performance.
For example, in mysql,
describe might output something like this:
+------------+----------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +------------+----------+------+-----+---------+----------------+ | Id | int(11) | NO | PRI | NULL | auto_increment | | Name | char(35) | NO | | | | | Country | char(3) | NO | UNI | | | | District | char(20) | YES | MUL | | | | Population | int(11) | NO | | 0 | | +------------+----------+------+-----+---------+----------------+
explain should get used often when working with MySQL database. Sometimes, a database doesn’t always do what you expect it to do. Query plans contextualizes how your database behaves and provides key clues for debugging.
If you’ve worked with relational databases before, you may have familiarity with B/B+ tree, which often get used for representing filesystems. R-Trees, or “Rectangle Trees”, which map tree representations of data to bounding rectangles of a spatial surface.
Traversing the tree (or cube, if you look at the diagram above) amounts to inhabiting a more specific spatial area (a rectangle or box) where your point may lie. In our case the rectangle represents your local town or city, and the point the address of an assister.
R-trees allow the database builds and maintains an efficient set of rectangles for the points you provide.
Example graphics courtesy of Wikipedia.
We’ve worked with location data for a long time. Watch our CTO Paul Smith’s presentation on R-trees from 2012.
After our small team deployed these updates, we saw response times drop and stabilize. Location search response times dropped from an average of 4 seconds per response, to 150 milliseconds per response.
You can see this impact in the chart above after the grey “deploy” bar. People will have a more responsive experience, and opens up future enhancements to the user interface and the search experience.
This work represents the competent engineering work that Ad Hoc brings to government services. Interested in helping citizens have better experiences on government websites? We’re hiring.