Think about a standard, pre-parsed format that content providers could supply
Since CoreNLP (or any automated system) will never analyse sentences perfectly, content providers could supply pre-processed data with at least part-of-speech (POS) tags and, optionally, further grammatical information. This would make glossing more accurate in many situations.
An example use case would be an HTML header that the browser extension uses instead of trying to parse the text in the page itself. The text elements (spans, divs, whatever) could have IDs that make replacement easy and accurate when the extension runs. It would also let the system do batch processing more effectively in the browser context.
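To make the idea concrete, here is a rough sketch of what such a payload and its consumer might look like. Everything in it is an assumption for discussion: the field names (`version`, `segments`, `text`, `pos`, `gram`), the segment IDs, the use of Penn Treebank tags, and the toy gloss table are all hypothetical, and a real extension would do the replacement in the DOM rather than in Python. The point is only that a gloss lookup keyed on (word, POS) can tell apart senses that a plain word lookup cannot.

```python
import json

# Hypothetical payload a content provider could embed in the page header.
# Keys under "segments" are the IDs of the spans/divs holding the text.
PAYLOAD = json.loads("""
{
  "version": 1,
  "segments": {
    "s1": [
      {"text": "Please", "pos": "UH"},
      {"text": "book", "pos": "VB", "gram": {"mood": "imperative"}},
      {"text": "a", "pos": "DT"},
      {"text": "table", "pos": "NN"}
    ],
    "s2": [
      {"text": "The", "pos": "DT"},
      {"text": "book", "pos": "NN"},
      {"text": "arrived", "pos": "VBD"}
    ]
  }
}
""")

# Toy gloss table keyed by (lowercased word, POS tag); purely illustrative.
GLOSSES = {
    ("book", "VB"): "to reserve",
    ("book", "NN"): "a written work",
}


def gloss_segment(tokens, glosses):
    """Return the segment text with a gloss appended to each known token."""
    out = []
    for tok in tokens:
        gloss = glosses.get((tok["text"].lower(), tok["pos"]))
        out.append(f'{tok["text"]} ({gloss})' if gloss else tok["text"])
    # Joining on spaces is enough for this example; punctuation is ignored.
    return " ".join(out)


def gloss_page(payload, glosses):
    """Gloss every segment in one pass, keyed by element ID."""
    return {
        seg_id: gloss_segment(tokens, glosses)
        for seg_id, tokens in payload["segments"].items()
    }


if __name__ == "__main__":
    for seg_id, text in gloss_page(PAYLOAD, GLOSSES).items():
        print(seg_id, "->", text)
```

Because the result is keyed by element ID, the extension could process the whole page in one batch and then swap each element's content by ID, without running a parser on the page text at all.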
This would also be very useful in batch-processing contexts such as e-books or subtitles.