Server-side render one or more Vue applications directly in Ruby (on Rails), serve the result embedded as HTML in the response and automatically hydrate the application when loaded.
GitLab has a heterogeneous frontend stack. Some parts are implemented with jQuery, some are Rails views (HAML) + JavaScript bundled as assets, and some parts are written as Vue applications. Even if we get rid of the last jQuery occurrences, it is highly unlikely that we can move towards a uniform frontend stack in a reasonable amount of time. This has direct implications on how we serve client-side executed source code (JavaScript). In particular, we lack the ability to stream the rendered UI directly to the browser.
Servers (Apache, nginx) and browsers work in a streaming fashion: the browser takes content from the HTTP response body as it arrives and immediately parses and renders the included (HTML) markup (stream-based rendering). This means that a browser can display visual elements to the user before the full response is loaded.
In comparison, by rendering a UI on the client side, we accept additional roundtrips and deferred compilation and rendering. For a view that is built as a Vue application, this results in:
No visual feedback before the dependencies (JavaScript bundles) are loaded, parsed and rendered
No real above the fold optimization (you always render the full view as a single chunk and insert into DOM)
Proposal
For the longest time, I thought about our heterogeneous frontend setup as a problem, but I think we could instead make it our superpower. The feedback from various software developers at GitLab is that it can take longer to build a Vue application than to do the same in a simple Rails view. In other words, many developers prefer to have an option when it comes to choosing a stack to work with (Vue or Rails views). As much as I have tried to invalidate this assessment in various conversations with colleagues, it is simply a reality in GitLab's codebase and organization.
Without a major strategic shift towards thinking of GitLab as a platform (uniform API strategy → uniform frontend strategy → Vue), we need to think about how we can provide the best experience to users and developers for both worlds. As Rails views are well-supported by the framework, this leaves us with the support for Vue applications.
In December 2021, the release of a Rails "HTML over the wire" library (Hotwire) has inspired many developers to question their thinking about how to build frontends. And indeed, it also gave me the idea for what I'd like to call UI over the wire.
Based on an experiment, I actually have trouble seeing complex, stateful applications being built entirely with e.g. Turbo and Stimulus.
Besides the technical feasibility, we also have an enormous codebase that would need to be rewritten (which is, as everything, doable, but not desirable). But, there is one concept from Hotwire/Turbo that is particularly interesting: Decomposing views into frames. If we take a look at how we integrate Vue applications into GitLab today, there are some conceptual similarities with Turbo Frames. Most often, we just "mount" a Vue application into a certain position within the DOM tree. Turbo does the same, but follows a more dynamic approach (it can append, replace, delete, …). It also adds another concept by supporting live views via streaming-based content section(s). Turbo Streams operate in a push based manner and server-side state changes are pushed to the client. The Turbo specification does not state what you send to the client, as long as it is HTML and attached to a Turbo Frame or Stream ID.
UI over the wire is similar to what Turbo does. It will allow us to take an existing Vue application and, instead of manually mounting it on the client side, serve the pre-rendered application as HTML directly to the target Turbo frame and rehydrate it afterwards.
We have previously shied away from server-side rendering due to its enormous complexity and implications on infrastructure (a Node.js-like reverse proxy for all requests, authentication, …). Instead of rendering the full application, we could provision V8 Isolates as dedicated rendering containers for Vue during a regular request/response cycle. In the same way we render a partial with ERB or HAML today, we would render a full Vue application into a buffer (String) and serve it to the client (a sketch follows the list below).
Fewer client-server roundtrips to fetch data: Due to being embedded into the Ruby process, we have access to the same resources (Model, Database, GraphQL Resolver) and can directly fetch data from the source. No network roundtrips!
Full control over the JavaScript runtime: The runtime lives in the same Ruby process as the rest of the application.
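To make the partial analogy concrete, here is a minimal sketch of what a view-level integration could look like. Everything in it (the VueSSR module, the vue_app helper) is hypothetical and only illustrates the intended ergonomics:

```ruby
# Hypothetical view helper: renders the server-side bundle of a Vue app
# inside a V8 isolate and embeds the result like any other partial.
module VueAppHelper
  def vue_app(name, props: {})
    html = VueSSR::App.render(name, props: props) # blocking render to a String
    # Wrap the pre-rendered HTML in a target element the client bundle can
    # find and hydrate after it loads.
    content_tag(:div, html.html_safe, data: { vue_app: name, props: props.to_json })
  end
end

# Usage in an ERB view:
#   <%= vue_app("editor", props: { project_id: @project.id }) %>
```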
Decomposition
This is similar to our existing codebase. We usually mount a Vue application to a certain anchor within the DOM, e.g.
```js
const element = document.getElementById("editor-app");
mount(Editor, element);
```
Most modern JavaScript (Vue) applications access data via APIs. In our case this is either Apollo using GraphQL or arbitrary HTTP requests (REST, …). When rendering on the server, the application still needs to access those APIs to fetch the data required for rendering. This adds network time to the SSR process, which might not be desirable because it increases latency, might cause issues, etc.
In a controlled render runtime, we can intercept network calls (e.g. intercept all GraphQL requests), and immediately forward the query to the Rails GraphQLController. Due to the almost direct database access, the overall roundtrip time for rendering the app should decrease.
V8 orchestrates isolates: lightweight contexts that group variables with the code allowed to mutate them. You could even consider an isolate a “sandbox” for your function to run in.
A single runtime can run hundreds or thousands of isolates, seamlessly switching between them. Each isolate’s memory is completely isolated, so each piece of code is protected from other untrusted or user-written code on the runtime. Isolates are also designed to start very quickly. Instead of creating a virtual machine for each function, an isolate is created within an existing environment. This model eliminates the cold starts of the virtual machine model.
In comparison to VMs, Docker containers, or Node.js applications, V8 Isolates offer a more lightweight compute platform that has a better memory footprint (if done right) and fewer cold-start problems. A single runtime can handle many, many, many isolates (think tabs in a browser). This is like building our own Node.js, deno, or bun (without the APIs), but with Ruby (and Rails) as the primary container. In other words, the runtime lifecycle is fully controlled by the Ruby process. Adding isolates and IO is done strictly through a Ruby interface.
Implement a JavaScript renderer with V8 Isolates
Ruby allows us to extend its capabilities with native extensions. The same way we maintain connections to Postgres or MySQL, we can also spin up a V8 runtime and provide many isolates to render Vue applications on demand (like we render ERB or HAML templates).
Prototype
This is a proof of concept that utilizes Deno's Rust V8 bindings to create a Ruby gem that is able to read, parse, and compile a JavaScript file and print the result.
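Conceptually, the round trip of the PoC looks like this (the Ruby API is illustrative; the actual gem surface may differ):

```ruby
# Create an isolate, load and compile a JavaScript file, and print whatever
# the script evaluates to. Class and method names are hypothetical.
isolate = VueSSR::Isolate.new
result = isolate.eval(File.read("hello.js"))
puts result
```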
V8 isolates potentially allow fine-grained resource control (CPU + memory). For example:
It is possible to have a dedicated thread per Isolate, so that slow code executed in one Isolate does not block the execution of another.
Isolates that exceed a defined memory limit can be evicted.
Cold-start problem
V8 implements an optimizing compiler. During execution, code paths are tracked and compiled into native code at runtime. Until the application is compiled to native code, rendering can take longer. This is known as the cold-start problem. If this turns out to be a real-life problem with SSR, we can mitigate the effects in various ways (a configuration sketch follows the list):
Pre-warm the application with a synthetic request (render cycle)
Use V8 Snapshots to persist the compiled state of an application and load the binary at runtime
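As a sketch, both mitigations could surface as boot-time configuration. None of these keys exist today; they only illustrate the idea:

```ruby
# Hypothetical initializer for the render runtime.
VueSSR.configure do |config|
  # Load a pre-built V8 snapshot per application instead of compiling the
  # bundle from source on the first render.
  config.snapshot_dir = Rails.root.join("public/assets/ssr/snapshots")

  # Issue a synthetic render at boot so the optimizing compiler has already
  # seen the hot code paths before the first real request arrives.
  config.prewarm_apps = %w[editor work_items]
end
```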
Distribution
In order for Isolates to be able to render a Vue application, we must provide all the dependencies (npm packages) to the application. As we already bundle JavaScript with Webpack, we can also produce bundles for server-side rendering and load them at server startup time.
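A rough sketch of that startup step, assuming a hypothetical preload API and a conventional bundle directory:

```ruby
# config/initializers/ssr_bundles.rb (illustrative)
# Load and compile every server-side bundle once at boot, so per-request
# renders only execute already-compiled applications.
Rails.application.config.after_initialize do
  Dir[Rails.root.join("public/assets/ssr/*.js")].each do |bundle|
    VueSSR::Runtime.preload(bundle) # hypothetical API
  end
end
```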
There are probably plenty of ways to optimize load times: reduce bundle size, above-the-fold optimization, lazy-loaded images, preloaded JavaScript bundles, offline/service workers. None of them solves the fundamental issue of the UI not being rendered immediately.
@slashmanov I spoke to @ntepluhina earlier today and I had this rough idea of using V8 Isolates in conjunction with Turbo Frames/Streams. As she told me to talk to you, because you are already working on streaming in #101, I decided to just write it down so that it does not get lost, but I am not actively working on this.
This looks interesting. However, as much as I would like for us to have better SSR with Vue, I don't think we can proceed with this unless we make all the components we want to render universal (i.e. renderable in both browser and server contexts), which is not a trivial task considering the size of the GitLab codebase. It's one of the reasons we don't have streaming Vue components yet, even though it would yield massive user experience improvements.
That's an excellent point, and indeed a major challenge of migrating the existing client-side rendered app to SSR.
I wonder if one of the approaches could be a compatibility patch that would stub missing globals in the JS environment. What do you think?
Another challenge that I can think of is conditional rendering based on client-side information (for example, screen size). Some of this info is impossible to get server-side, so it can cause flickering or other unintended behavior due to discrepancies between the server-side rendered markup and the markup generated after hydration.
I wonder if one of the approaches could be a compatibility patch that would stub missing globals in the JS environment. What do you think?
I wish it was that simple! 😅
For example, we can't replace dompurify at the moment, which is heavily used in gitlab-ui's v-safe-html directive. We can complete a lot of stuff on the client after hydration, but there are things, like security, that cannot be avoided on the server. I am investigating this, but I don't think there are low-hanging fruits there. We're mostly left with either complex or slow solutions.
There's another issue of passing i18n and cookies through the SSR app (we use globals for that now). It's also complicated and would require some changes to avoid shared state.
Another challenge that I can think of is conditional rendering based on client-side information (for example, screen size). Some of this info is impossible to get server-side, so it can cause flickering or other unintended behavior due to discrepancies between the server-side rendered markup and the markup generated after hydration.
That depends on your approach to UI. If it allows hiding and showing small bits with media queries, then it's a non-issue. For other cases there are prediction techniques, Client Hints, or simply cookies. But I don't think we are at the stage yet to consider this an issue unless we fix the universal-components one first.
@slashmanov I would even go so far as to say that this needs some real investment beforehand to build the necessary render runtime for the render target. But even then, we would only progressively roll out "new apps", or maybe migrate a single (small) existing one. The advantage of this approach (and I believe yours as well) is that it actually allows us to do this in small steps. This is in contrast to a holistic SSR approach at the application level, where you need to convert everything at once (or at least per route). To my knowledge, this RFC and yours are the closest to making this a reality. A new area (like work items) might be a suitable contender for this.
In general, I hope we can keep discussing SSR proposals, because it is inevitable that we need it. It is a factual shortcoming of our platform that we don't stream the UI to clients. I believe V8 Isolates are a particularly interesting approach, due to our Ruby server landscape and Ruby's C <-> FFI capabilities. I believe Isolates are therefore better to embed than e.g. a Node.js sidecar or a dedicated render server (https://github.com/airbnb/hypernova). Furthermore, I also think it is about time to make Vue a first-class citizen in our stack and not a mere appendage.
I think the Foundations group is the closest one. I've tried to implement something similar to this but only using Rails SSR; I thought you might be interested in this as well.
Yeah, I actually tried the playground you created, and it demos the issue perfectly. I really believe my RFC is just an extension to yours by thinking a bit more about the scalability issue of rendering (using a more efficient render container and adding some DX features by "borrowing" ideas from Hotwire).
I tried to read through this proposal (success) and to understand it (partial success) -- thus I have some questions. If this is ever fully implemented, then:
I still write my components in Vue just as today. Yes?
I get rid of the plain JS glue that links a Vue app to the root element in the rendered template, as my controller contains APPLICATION = UI.load("./javascript/my_app.vue"). Yes?
Behind the scenes, server-side rendering is enabled for my Vue component. I, the component author, do not have to set up anything. Correct?
Currently, one page could have multiple Vue apps. Would APPLICATION = UI.load("./javascript/my_app.vue") constrain us and require upfront refactoring?
Someday we could think about Hotwire, this proposal is for SSR'ing existing Vue components. Yes?
@sri19 thanks for these questions. I will answer here, but also update the RFC accordingly.
I still write my components in Vue just as today. Yes?
Yes, this does not change anything in our Vue development workflow. It only allows us to "mark" a section in the layout as a "render target". This proposal replaces the need for a manual mount with getElementById with a more declarative approach.
I believe this to be an almost negligible change from a Vue perspective.
But, as @slashmanov already mentioned, we would need to make sure that Vue components that are used (server-side rendered) can actually be rendered on the server (often these are small things, like proper use of global variables, timers, etc.).
I get rid of the plain JS glue that links a Vue app to the root element in the rendered template, as my controller contains APPLICATION = UI.load("./javascript/my_app.vue"). Yes?
Yes, but as you pointed out, code ergonomics are not finalized. The general idea is that you "mount" an app by declaring its place in the view instead of mounting it imperatively.
This idea could be extended to even dynamically "replace" a client-side app with a "push" action from the server (i.e., Turbo Streams for apps). I have linked the "Decomposing views with frames" section from the Hotwire/Turbo handbook to reference the general idea of decomposition. I think we already do that today, but we don't provide an easy-to-use framework for it. And, very importantly, this proposal is agnostic about a view being a single Vue application or many Vue applications (this properly reflects the reality of our frontend codebase).
Behind the scenes, server-side rendering is enabled for my Vue component. I, the component author, do not have to set up anything. Correct?
Yes! Given that the components can actually be rendered on the server side (there are a few caveats), this is exactly how it should work. Server-side rendering would get first-class support and would be the default. Teams can decide when to opt in though, so there is no need to migrate existing applications right away; it could be done step by step.
Currently, one page could have multiple Vue apps. Would APPLICATION = UI.load("./javascript/my_app.vue") constrain us and require upfront refactoring?
No. I have adapted the example to better reflect the intent, and additionally give some ideas about multi-target rendering in the snippet.
What I'd suggest though is to not load the application on every request/response cycle, to allow some clever "caching" of the to-be-rendered instance. We could obviously also do this by memoizing by application, state, and environment (current_user, etc.) and hide this implementation detail from the user.
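To make this concrete, the controller-level ergonomics I have in mind might look roughly like this (UI.load, the render targets, and the process-level memoization are all illustrative, not a final API):

```ruby
class ProjectsController < ApplicationController
  # Load (and compile) the apps once per process, not per request; the
  # constants act as memoized handles to the compiled applications.
  EDITOR_APP  = UI.load("./javascript/editor_app.vue")
  SIDEBAR_APP = UI.load("./javascript/sidebar_app.vue")

  def show
    # Multiple apps per page: each render call fills a named target in the
    # layout, mirroring how several Vue apps are mounted on one page today.
    render ui: {
      "editor-app"  => EDITOR_APP.render(props: { project_id: params[:id] }),
      "sidebar-app" => SIDEBAR_APP.render(props: { current_user_id: current_user.id })
    }
  end
end
```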
Someday we could think about Hotwire, this proposal is for SSR'ing existing Vue components. Yes?
Given that we follow pretty much the principles of Hotwire/Turbo, e.g. using a declarative approach to "reserve a space" for the app and using controller actions to render a specific format (ui_stream), we are probably compatible, but we need to think about how we could integrate with Turbo so that it is a seamless experience for all parts (including Turbo Streams → dynamically adding/replacing/evicting apps).
I fully agree on the necessity to simplify our rendering stack & data exchange between FE & BE. I am also very supportive of your idea of how to tackle it. You mentioned a few times in your video that "this is complex, but can be solved". So my thoughts on this matter are about how we could iterate towards a solution. I hope it's not too obvious and is somewhat useful to the discussion:
How we currently render Vue Applications
Right now we create:
<some-path>/index.html.erb - just to render the mounting element
<some-path>/index.js - just to initialize the Vue app and potentially pass down data
<SomeHelper>Class.rb - aggregating data
<some-path>/app.vue - actual Vue app
This structure has multiple problems
Data gets passed through in multiple ways and can only be tested via expensive Feature Specs
We duplicate the same code a lot (might be fine, but indicates that it can be simplified)
Too many files for just rendering
How to simplify
With the introduction of ViewComponent, I was wondering why we can't create a VueAppViewComponent that takes the name of a Vue component and renders it by passing down the correct data. We could even pre-load certain GraphQL queries if we want to and pass them to Apollo.
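A rough sketch of that idea (assuming the ViewComponent gem; the query pre-loading and the data contract are hypothetical):

```ruby
class VueAppComponent < ViewComponent::Base
  def initialize(name:, props: {}, preload_queries: [])
    @name = name
    @props = props
    @preload_queries = preload_queries
  end

  def call
    # Run GraphQL queries server-side and hand the results to Apollo as
    # initial cache state (illustrative; error handling omitted).
    state = @preload_queries.to_h { |query| [query, GitlabSchema.execute(query).to_h] }

    tag.div(id: "js-#{@name}",
            data: { props: @props.to_json, apollo_cache: state.to_json })
  end
end

# Usage in any ERB/HAML view:
#   <%= render VueAppComponent.new(name: "editor", props: { project_id: 1 }) %>
```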
Yeah, I wasn't aware that we're exploring ViewComponent, but creating components is very much in line with the goals of decomposition in general.
This structure has multiple problems
Data gets passed through in multiple ways and can only be tested via expensive Feature Specs
We duplicate the same code a lot (might be fine, but indicates that it can be simplified)
Too many files for just rendering
I agree, but I don't necessarily think you can reduce the number of files that are in play. We might be able to reduce the number of files that need to be written manually though, e.g. mounting the Vue app is a repetitive task which most likely can be automated (either using the proposed tag helpers in this RFC, or components, or whatever).
You mentioned a few times in your video that "this is complex, but can be solved". So my thoughts on this matter are about how we could iterate towards a solution. I hope it's not too obvious and is somewhat useful to the discussion:
To be clear, I am mostly referring to the task of creating a custom/dedicated JS runtime for rendering, but your point still stands :-)
Thanks a lot for compiling this RFC and the video presentation @lamportsapprentice !
It's great to see us exploring options to better integrate Vue apps in Rails and improve the performance of our web application. At a glance, I think your idea has a lot of potential and is definitely worth exploring further.
I have a couple of questions which maybe you can help me with, as I don't know in detail how Rails rendering works:
Can Rails 'wait' until the JS part renders? It would be great to have an option to define data fetching in one place only. This way, irrespective of whether the page is rendered server-side or via a client-side navigation, we can be sure it uses the same data-fetching logic. It'll help with maintainability and simplify the mental model people have to use.
Can rails hoist parts of the rendering output (e.g. scripts, css links, etc.) to the top of the document so downloading of the application bundle starts sooner?
@andrei.zubov Tried to answer as well as I can. Let me know if this does not clarify it.
Can Rails 'wait' until the JS part renders? It would be great to have an option to define data fetching in one place only. This way, irrespective of whether the page is rendered server-side or via a client-side navigation, we can be sure it uses the same data-fetching logic. It'll help with maintainability and simplify the mental model people have to use.
I think this is actually how I would implement the first iteration of the render runtime: a blocking call to the V8 Isolate which eventually returns after the renderToString promise resolves. It is a JavaScript runtime after all, so there should not be that big of a difference in how we render compared to a Node.js SSR implementation.
```js
// Render function in JavaScript
function render(): Promise {
  return renderToString(app).then((html) => {
    return html;
  });
}
```

```ruby
# Invoke render function from Ruby/V8
runtime.run_blocking("render")
```
Can rails hoist parts of the rendering output (e.g. scripts, css links, etc.) to the top of the document so downloading of the application bundle starts sooner?
Rails can't do that for us, but the JavaScript runtime + Vue probably can. I would consider this a potential future improvement where we either return a stream instead of a string, or we could also implement various callbacks to populate and send headers, preload tags, etc. sooner. I have added an example from an old Node.js SSR implementation of mine.
A streaming-based render approach with chunk extraction:

```js
import React from 'react';
import path from 'path';
import { renderToNodeStream } from 'react-dom/server';
import { ServerStyleSheet } from 'styled-components';
import { ChunkExtractor } from '@loadable/server';
import { create as createCache } from '../cache/memory';
import { create as createCacheStream } from '../cache/stream';

const cache = createCache();

function tplStart() {
  return `
<!DOCTYPE html>
<html lang="en">
<head>
  <meta content="text/html; charset=utf-8" http-equiv="Content-type"/>
  <meta content="width=device-width, initial-scale=1" name="viewport"/>
  <meta content="Whether you've just run your first 5 km or even the whole marathon distance, Spurtli helps you to create a memory that will last a lifetime." name="description"/>
  <meta property="og:title" content="Spurtli"/>
  <meta property="og:description" content="Generative design with fitness data"/>
  <meta property="og:image" content="https://pbs.twimg.com/media/D-a_r77XUAA7T7r.png:large"/>
  <meta property="og:url" content="https://www.spurtli.com/about"/>
  <meta name="twitter:title" content="Spurtli"/>
  <meta name="twitter:description" content="Generative design with fitness data"/>
  <meta name="twitter:image" content="https://pbs.twimg.com/media/D-a_r77XUAA7T7r.png:large">
  <meta name="twitter:card" content="summary_large_image">
  <title>Spurtli – Generative design with fitness data.</title>
</head>
<body>
  <div id="app">`;
}

function tplEnd(client, chunkExtractor) {
  return `
  </div>
  <script>
    window.__APOLLO_STATE__ = '${JSON.stringify(client.extract())}';
  </script>
  <script src="https://js.stripe.com/v3/"></script>
  ${chunkExtractor.getScriptTags()}
  <!-- Matomo -->
  <script type="text/javascript">
    var _paq = window._paq || [];
    _paq.push(['trackPageView']);
    _paq.push(['enableLinkTracking']);
    (function () {
      var u = "//analytics.conc.at/";
      _paq.push(['setTrackerUrl', u + 'matomo.php']);
      _paq.push(['setSiteId', '1']);
      var d = document, g = d.createElement('script'), s = d.getElementsByTagName('script')[0];
      g.type = 'text/javascript'; g.async = true; g.defer = true; g.src = u + 'matomo.js';
      s.parentNode.insertBefore(g, s);
    })();
  </script>
  <!-- End Matomo Code -->
</body>
</html>
`;
}

function normalizedPath(request) {
  const { path } = request;
  return !path || path === '/' ? '/about' : path;
}

export function ssr(req, res) {
  const requestPath = normalizedPath(req);

  // TODO: fix client code, no polyfill!
  global.location = requestPath;
  global.localStorage = {
    getItem() {},
    setItem() {},
  };
  global.window = {
    ...global,
  };
  // TODO - END

  res.status(200);
  res.set('Content-Type', 'text/html');

  if (cache.has(requestPath)) {
    const body = cache.get(requestPath);
    return res.send(body);
  }

  const cacheStream = createCacheStream(cache, requestPath);
  cacheStream.pipe(res);
  cacheStream.write(tplStart());

  // extract chunks
  const clientStats = path.resolve('dist/client/loadable-stats.json');
  const clientOutputPath = path.resolve('dist/client');
  const clientExtractor = new ChunkExtractor({
    statsFile: clientStats,
    entrypoints: ['web'],
    outputPath: clientOutputPath,
  });
  const serverStats = path.resolve('dist/server/loadable-stats.json');
  const serverOutputPath = path.resolve('dist/server');
  const serverExtractor = new ChunkExtractor({
    statsFile: serverStats,
    entrypoints: ['web'],
    outputPath: serverOutputPath,
  });

  // props
  const context = {};

  // component
  const { default: App, client } = serverExtractor.requireEntrypoint('web');
  const jsx = clientExtractor.collectChunks(<App context={context} location={requestPath} />);

  const sheet = new ServerStyleSheet();
  const stream = sheet.interleaveWithNodeStream(renderToNodeStream(jsx));
  stream.on('end', () => {
    cacheStream.end(tplEnd(client, clientExtractor));
  });
  // keep cacheStream open; tplEnd is written in the 'end' handler above
  // (the original passed { end: 'false' }, a truthy string, which would
  // close the stream prematurely)
  stream.pipe(cacheStream, { end: false });
}
```
That's also one of my main concerns with this proposal: Rails isn't really a streaming-first framework. It has streaming support, but it's very limited (I've tried it myself). Having a Node or any-other-JS-runtime server in front could allow us to create a much better request and streaming flow. For example, we could launch concurrent requests at once and stream as we complete the fetches. We won't be able to have that with Rails as far as I know.
This begs the question: if we remove Rails templates that can directly reference controller data and (eventually) replace everything with Turbo Frames, why would we need Rails as our SSR engine in the first place?
@slashmanov thanks for bringing up a dedicated SSR service/proxy as an option. I have spent some time on the idea, and I am actually coming from a similar place. I originally thought about a dedicated render service, but pivoted due to a set of constraints (multiple Vue apps, legacy views, …). Now, I actually believe that the embedded renderer is a better approach. Please find my explanation below.
That's also one of my main concerns with this proposal: Rails isn't really a streaming-first framework. It has streaming support, but it's very limited (I've tried it myself).
I think we should be careful with such an assessment. A stream is just one form of I/O and Ruby is very flexible. We can always write into a buffer, and flush the buffer continuously to the client, or utilize the built-in stream capabilities in Rails (https://api.rubyonrails.org/classes/ActionController/Streaming.html). Ruby's IO capabilities are quite abstract (https://ruby-doc.org/core-3.1.2/IO.html) and intended for general purpose use (we stream to stdout a lot after all).
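For illustration, the built-in variant looks roughly like this. ActionController::Live and response.stream are real Rails APIs; the chunked isolate render is an assumption:

```ruby
class EditorController < ApplicationController
  include ActionController::Live

  def show
    response.headers["Content-Type"] = "text/html"
    # Flush the static head immediately, then stream the SSR result chunk by
    # chunk as the isolate produces it (render_stream is hypothetical).
    response.stream.write(render_to_string(partial: "layouts/head"))
    VueSSR::App.render_stream("editor") do |chunk|
      response.stream.write(chunk)
    end
  ensure
    response.stream.close
  end
end
```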
For example, we could launch concurrent requests at once and stream as we complete the fetches. We won't be able to have that with Rails as far as I know.
I do not think that concurrency/parallelism is a fundamental problem of Ruby, but rather a concern of the template engine. The biggest issue here would be the partials in a template, which are rendered sequentially (render Vue app 1, then render Vue app 2). This obviously adds up with the number of Vue applications per template (the more, the slower). There are probably a few things to mitigate this problem (a render-pool sketch follows the list):
SSR of a Vue app must be fast: as close as possible to the time it takes to render any ERB or HAML partial
We should provide as much state as possible directly to the render instance (the least possible amount of fetching data from APIs)
Pre-warm and pre-compile the app and use a render pool
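A sketch of the render-pool item above, using the connection_pool gem; the isolate API is an assumption:

```ruby
require "connection_pool"

# Each checkout returns a pre-warmed isolate with the bundle already
# compiled, so concurrent requests do not render strictly one after another.
RENDER_POOL = ConnectionPool.new(size: 8, timeout: 0.25) do
  isolate = VueSSR::Isolate.new
  isolate.eval(File.read(Rails.root.join("public/assets/ssr/editor.js")))
  isolate
end

def render_editor(props)
  RENDER_POOL.with do |isolate|
    isolate.call("render", props.to_json) # hypothetical entrypoint
  end
end
```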
Having a Node or any-other-JS-runtime server in front could allow us to create a much better request and streaming flow.
I specifically wrote this RFC to avoid using a dedicated Node.js process. I try to compile a list of reasons for that below:
The complexity of operating such a render service is high (like really, really high), and we already operate one (the Rails monolith)
Due to how authentication works, every request would proxy through this service (cookies)
It adds another network hop for every page (including all Rails pages, API calls, etc.)
Severely adds to infrastructure cost (it has to proxy each request and does more than rendering)
Last, but not least, it would further increase the Rails/Vue gap instead of bringing it closer together
This begs the question: if we remove Rails templates that can directly reference controller data and (eventually) replace everything with Turbo Frames, why would we need Rails as our SSR engine in the first place?
I am not sure if I fully understand this part, so I will try my best to answer. Please correct me if my assumptions are incorrect:
Eventually all Rails views are going to be Vue apps, getting data/state via the GraphQL (REST) API
I don't think this is ever going to be true. Emails, Devise Authentication Flows, Legacy Views will be around for a very long time.
Turbo Frames are different from Rails Views
Turbo Frames are actually embedded into the whole Rails stack. They are used for decomposition of the app, and you could even decide to build a full SPA (Single-Page Application) with them. But, after all, they just provide the same Rails Model/Controller/View pattern.
Final thoughts
A quite important reason to integrate a JavaScript renderer in Rails is that we get a clear migration path without any significant drawback. We eventually get the advantage of pipelining multiple views (and apps) at once. The latter is something that is close to impossible to solve with a proxy service.
I think we should be careful with such an assessment. A stream is just one form of I/O and Ruby is very flexible. We can always write into a buffer, and flush the buffer continuously to the client, or utilize the built-in stream capabilities in Rails (https://api.rubyonrails.org/classes/ActionController/Streaming.html). Ruby's IO capabilities are quite abstract (https://ruby-doc.org/core-3.1.2/IO.html) and intended for general purpose use (we stream to stdout a lot after all).
I am not an expert on Ruby's capabilities, and if it's as good as Node's streams support, that's actually great news! Though I've never heard how Ruby deals with backpressure and stream piping, which Node has out of the box. I would really love to learn more about it in detail if someone could share this knowledge.
I do not think that concurrency/parallelism is a fundamental problem of Ruby, but rather a concern of the template engine. The biggest issue here would be the partials in a template, which are rendered sequentially (render Vue app 1, then render Vue app 2). This obviously adds up with the number of Vue applications per template (the more, the slower). There are probably a few things to mitigate this problem:
I really like that we share the same concerns! I also think that sequential rendering will be much slower than what Vue 3 offers out of the box (async rendering), mainly because of the network waterfall effect.
SSR of a Vue app must be fast: as close as possible to the time it takes to render any ERB or HAML partial
Vue 3 SSR is really fast, but I think a more important thing here is caching which Rails has out of the box. We'll have to outperform Rails cache with this approach.
We should provide as much state as possible directly to the render instance (the least possible amount of fetching data from APIs)
I think otherwise, because in that case we would have to know in advance which data we need to pull. With Vue 3 streaming we can request that data as we go, in parallel, which might depend on other data during rendering.
Pre-warm and pre-compile the app and use a render pool
I am not sure if the proposed architecture could address that. I agree that load balancing is crucial when it comes to SSR. If we separate Rails and Node instances load balancing becomes much more efficient and we can better leverage our resources.
I specifically wrote this RFC to avoid using a dedicated Node.js process. I try to compile a list of reasons for that below:
The complexity of operating such a render service is high (like really, really high), and we already operate one (the Rails monolith)
This is the discussion I want to have! I am not certain that it's actually that high. I have experience running SSR for 1 million monthly users just fine. Yes, it requires more discipline in how you write your code (avoid memory leaks, shared state, etc.), but other than that I don't see major issues with maintaining a Node server. I would love to be corrected if that's not the case.
Due to how authentication works, every request would proxy through this service (cookies)
If we want to outperform Rails rendering at the moment I think this is the way to go: to delegate all of the requests to our Vue app so we can make them in parallel. I am aware of the cookies problem, it can be solved with an AsyncLocalStorage.
It adds another network hop for every page (including all Rails pages, API calls, etc.)
I agree in the sense that there's a bigger network overhead with the Node approach. But I don't think it's so severe that it will significantly affect performance.
Severely adds to infrastructure cost (it has to proxy each request and does more than rendering)
I agree that in the case where we only had CSR and now we'll have SSR, it will increase the load on the server. But I don't agree that it will add a severe infra cost, because we'll be outsourcing the load from the Rails server, which would act only as an API backend in most cases (assuming we migrate most of the hot-path pages to Vue). I also think that with proper caching it should lower the load on the servers, but that's another topic.
Last, but not least, it would further increase the Rails/Vue gap instead of bringing it closer together
Rails has a different philosophy on how you should write frontend code. We're basically going against it by using Vue and rendering most of the page with JavaScript. Rails suggests that we use templates only and update the interface over the network, which is not sustainable in the long run and offers very poor UX, but is very easy to write. I think it's not a bad thing that we would move away from the Rails approach and apply modern techniques which would allow us to have excellent performance with a modern DX.
I don't think this is ever going to be true. Emails, Devise Authentication Flows, Legacy Views will be around for a very long time.
Of course, but there's also a way to combine these two approaches. We can always request just a partial view of the page from Rails (with empty layout for example). We can also fetch this partial view in parallel with rendering the 'app shell' (header and sidebars), which could also be cached. The only complex thing here is routing, which would always stay in Rails just because of its size.
Just one point on the overall theme before answering in detail: I believe forwarding from Rails to V8 vs. Node.js proxying to Rails is conceptually similar, but we have a Rails setup in production, not a Node.js one.
Vue 3 SSR is really fast, but I think a more important thing here is caching which Rails has out of the box. We'll have to outperform Rails cache with this approach.
The beauty of having a Vue app rendered as a partial is that we can utilize the Rails cache for free (where it makes sense; for highly dynamic views it probably does not make too much sense).
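A minimal sketch of that, assuming a hypothetical VueSSR::App.render call; everything that could pollute the result across requests goes into the cache key:

```ruby
def cached_vue_app(name, props:)
  # current_user and the props make the key state-dependent, so two users
  # never share a cached render result.
  key = ["ssr", name, current_user&.cache_key, props.hash]
  Rails.cache.fetch(key, expires_in: 5.minutes) do
    VueSSR::App.render(name, props: props)
  end
end
```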
I think otherwise, because in that case we would have to know in advance which data we need to pull. With Vue 3 streaming we can request that data as we go, in parallel, which might depend on other data during rendering.
I agree in the sense that this is not trivial. But, and this could be an extension to the RFC, in a custom and embedded runtime, we have full control over the runtime and can intercept network calls from the render instance. For example, if we intercept GraphQL calls, we could directly forward them to the GraphQL controller. In other words, we save network calls and utilize the advantage of having "direct" access to data. I assume there is probably no other concept that can beat that in terms of latency.
I am not sure if the proposed architecture could address that. I agree that load balancing is crucial when it comes to SSR. If we separate Rails and Node instances load balancing becomes much more efficient and we can better leverage our resources.
I am thinking about this in a slightly different way. Every Vue application can be compiled ahead of time and pre-warmed with arbitrary data in a separate isolate. You most likely need to render per request to avoid cross-request state pollution (otherwise we cache in Rails).
The real question is how many isolates per app we need for each runtime instance (this would be the render pool). The crucial part is to avoid OOM'ing the process (the JS runtime), and we need to monitor and evict a V8 Isolate if it takes too long, becomes stale, etc.
Furthermore, I expect a strong correlation between the resource requirements for a page (route) and the resources required for rendering it. In other words, resource requirements should be fairly predictable, and I think as long as infrastructure can do capacity planning, we are good. We already have a decent amount of pods; we can just add the load 1:1 to the existing processes. IIRC we already load ~800M+ into memory today, so adding another 10, 20, 30, 50MB is not going to be the biggest of issues here.
This is the discussion I want to have! I am not certain that it's actually that high. I have experience running SSR for 1 million monthly users just fine. Yes, it requires more discipline in how you write your code (avoid memory leaks, shared state, etc.), but other than that I don't see major issues with maintaining a Node server. I would love to be corrected if that's not the case.
Strong disagree on this point. It is a ton of complexity for a project the size of GitLab (writing a proper reverse proxy is a project of its own; bypassing non-authenticated requests completely is a load-balancer configuration horror), and the killer argument here is that you want to avoid complexity.
I am a strong believer in the following "rule": Before introducing a new service, its expected benefits must outweigh the existing solution by 10x to be worth it.
My personal experience with a dedicated SSR service in a large-scale deployment is quite bad. It was (is) an absolute operational nightmare. A few examples I can recall off the top of my head:
What if SSR fails? Are you able to detect it via monitoring? And, if there is monitoring, is someone watching it? (this was an actual issue, and it had nothing to do with the engineers not being capable)
What if there is a bug in Node.js? The blast radius of a forever-on service is different from that of easy-to-evict sidecars. I remember a very specific Intl memory leak that OOM-killed pods over the weekend, because there were no deployments. It took us days to debug, and it is nothing that you can easily debug in a browser either.
Upgrading Node.js might be harder for us than bumping the V8 version of an isolated render instance.
The amount of custom middleware needed to meet the expectations and requirements of the app was humongous. I don't even remember all the security fixes over time, but tl;dr: an absolute nightmare.
If we want to outperform Rails rendering at the moment I think this is the way to go: to delegate all of the requests to our Vue app so we can make them in parallel. I am aware of the cookies problem, it can be solved with an AsyncLocalStorage.
Just to clarify, outperforming Rails rendering is not one of the goals of this RFC. I actually think rendering views in Rails is fast and fine. Proxying every request through a Node.js cluster makes the application worse for at least the parts that do not use Vue. We also increase the probability of outages due to the compound SLAs of the Node.js service + Rails (P(Node.js) * P(Rails) = P(Compound)).
I really only want to add the capability to SSR a Vue app without any regression in the existing system. A failing SSR render in the embedded render runtime would not cause the page to fail; it would just be slower (aka the same as today).
For the cookie discussion I would need to dig deeper into my memory. I only recall lots of weird and complex cookie-dependent routing configs, cookie rewriting, tons of edge cases, etc.
I agree that in the case where we only had CSR and now we'll have SSR, it will increase the load on the server. But I don't agree that it will add a severe infra cost, because we'll be outsourcing the load from the Rails server, which would act only as an API backend in most cases (assuming we migrate most of the hot-path pages to Vue). I also think that with proper caching it should lower the load on the servers, but that's another topic.
I can't recall the overhead/ratio, but in a comparable stack (React + Node.js + Rails), also a large scale SaaS product, the infrastructure cost for added pods was at least a relevant factor (you also need Redis for session management). Don't get me wrong, the V8 runtime also adds compute/memory cost, but it would do way less work than the Node.js cluster (less work = cheaper).
Of course, but there's also a way to combine these two approaches. We can always request just a partial view of the page from Rails (with empty layout for example). We can also fetch this partial view in parallel with rendering the 'app shell' (header and sidebars), which could also be cached. The only complex thing here is routing, which would always stay in Rails just because of its size.
😀 Ha, as I wrote at the beginning, we are speaking the same language. The only difference is that your proposal makes the Node.js cluster the authoritative instance, and in my case it is Rails. The basic idea is the same.
One additional argument, a non-technical one: I think it is easier to get consensus for extending the capabilities of an existing system than for introducing a completely new one.
praise: Thanks so much for this RFC and the video. It is very well laid out and the video was very well put together and easy to follow.
disclaimer: I don't have personal experience with this type of SSR. I have used Next/Nuxt before but that is the extent of my SSR experience.
Overall this seems like a very promising RFC. The added benefit of standardizing how to render Vue applications in Rails and pass data to them is great.
As you mentioned, it sounds like there could be some complex issues to solve and a few gotchas when implementing SSR components. But it also sounds like we could make this opt-in and slowly start to migrate components to use SSR, so that seems promising.
Based on my understanding of this RFC I think it would be worth looking into further and creating a reference implementation in the GitLab codebase (I didn't see one yet, please correct me if there is one) that engineers could play around with.
Based on my understanding of this RFC I think it would be worth looking into further and creating a reference implementation in the GitLab codebase (I didn't see one yet, please correct me if there is one) that engineers could play around with.
Because building a runtime is not trivial, I created a PoC to demo the creation of a V8 Isolate via Ruby, compile some JavaScript, and return the result back to Ruby. The PoC uses Ruby + Rust + Rusty V8 bindings + a pre-compiled V8 binary and can be found here: https://gitlab.com/lamportsapprentice/isorun
The same way we maintain connections to Postgres or MySQL, we could also spin up a V8 runtime and provide many isolates to render Vue applications on demand (like we render ERB or HAML templates).
question: Do we have memory & cpu performance comparison of Rails rendering HAML vs. Rails rendering a Vue app with v8 isolate?
For me, this measurement is critical and could make or break the proposed architecture. I get that UI-over-the-wire could improve the end-to-end and client-perceived performance, but we have to be careful not to put too much load on the monolithic BE bottleneck. If I recall correctly, Rails comes with some clever HAML caching which makes it pretty speedy 🤔...
@pslaughter I think I answered a similar question above, but I am trying to go a bit deeper here.
Do we have memory & cpu performance comparison of Rails rendering HAML vs. Rails rendering a Vue app with v8 isolate?
In general, you can't compare rendering a JavaScript application with a template engine:
First, the capability to render a JavaScript application directly on the server does not exist yet, so it is a net-new feature (at least in our stack). I think the more valid comparison is how this holds up vs. a Node.js cluster for SSR. In terms of complexity, template engines and a full JavaScript runtime are also quite different beasts.
It is obviously true that, the same way you need to load Rails, bootstrap dependencies (initializers), and maintain some permanent application state on your heap, you also need to do this with the JavaScript render instance. So we indeed need to maintain a thread pool or other concurrency infrastructure (the PoC uses tokio), load the actual JavaScript files, compile them, keep the result in memory, etc. V8, the library, takes about ~45M. Add the compiled JS files, overhead, etc., and it might be up to 100M (exact numbers can only be figured out in actual use, so let's just use this one).
If we take a look at the memory footprint of our Rails app, we load about ~800M into the process. You could try to look at the HAML engine in isolation for an arbitrarily complex app and measure the memory footprint, but I think that would not be a valid comparison, because that is not the way you're ever going to use it.
The only thing that really matters is capacity planning. In other words, are the resources taken by the render target in any way predictable or not? This question we can answer with a clear yes! We can monitor the heap, and we have full control over timeouts.
Production setup expectations (a configuration sketch follows the list):
Cut rendering after some time (e.g. 250ms) and just serve the client JS file as we do now
Evict the V8 Isolate if it uses more than X MB of memory
Monitor per application memory usage (instrumentation + alert)
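Expressed as configuration, those guardrails might look like this (all keys are illustrative, not an existing isorun API):

```ruby
Isorun.configure do |config|
  config.render_timeout = 0.25              # cut rendering after 250ms, fall back to the client JS
  config.max_isolate_memory = 64.megabytes  # evict isolates that exceed the limit
  config.on_render = ->(app, duration, heap_bytes) do
    # feed instrumentation/alerting with per-application render metrics
    Rails.logger.info("ssr app=#{app} duration=#{duration}s heap=#{heap_bytes}")
  end
end
```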
Those parameters are configurable, and in comparison to a Node.js service, we can actually control them directly at runtime.
Regarding CPU resources, this is a more interesting one. As you are probably aware, V8 has an optimizing compiler that tracks code paths and compiles them into native code at runtime. This is also true for this runtime implementation. It is going to be as efficient as rendering in the browser.
This still leaves us with one more problem, the cold start. If it turns out to be an actual problem in production (it might not be), we can utilize V8 snapshots to create a pre-warmed binary of every JavaScript application (or even pre-compile it with the asset pipeline ahead of time).
If I recall correctly, Rails comes with some clever HAML caching which makes it pretty speedy
As this is a low-level render target (like any other partial), it can utilize the same caching mechanisms in Rails. As with other templates (HAML/ERB), this only works as long as your results are not vulnerable to cross-request state pollution. The thing that matters here is the cache key: the result of a (V8) rendered partial can be memoized based on arbitrarily complex and state-dependent hash keys (authentication & current_user, page context, …). We can therefore benefit from the same infrastructure we already use in Rails.
That's really cool! This example is executed in the controller, right?
Conceptually, can it be done in a generic way, so that any GraphQL query from the client side app would be routed to be executed internally without manual steps in every controller?
@andrei.zubov This example is actually completely hacked together, but what you describe is the idea behind it. Instead of routing HTTP requests through the network, we intercept the request and forward it to where it would end up if we sent it via the network. We therefore save the network roundtrip, which is a significantly higher tax than bridging directly from JavaScript to Ruby (via Rust).
Example
Pay attention to `const raw = await Deno.core.ops.op_app_send("fetch", optionsJSON);`. This is where the magic happens: it directly calls the block defined for vm.render.
Ah, ok got it, thanks @lamportsapprentice🙏
Do I understand it correctly that we need to inject different versions of ApolloClient into client-side and server-side apps, as the fetch logic would differ between SSR and client-side rendering?
Yeah, I think this would be inevitable. That said, there are definitely ways to streamline this by providing packages that can just figure out what environment they are running in and route the correct way. I think this is a DX concern and not a technical limitation (hopefully :-)).
No, not at all, I agree! 😄 It's indeed inevitable for most SSR implementations anyway, so we would just need to find a solution that would provide a good DX without affecting client-side performance 👍
I just thought initially, that it would be possible to plug into, say fetch, on a VM level, but that's probably too much to ask for 😄
Great work anyway!
@andrei.zubov Yeah, this is actually a great point. I originally had an idea going in the direction you describe. I wanted to intercept all network calls (by overriding window.fetch), but then I faced an issue where requests that I cannot directly process in Ruby need to be literally proxied to the network again. It somewhat reduces the ability to have granular control over fetch calls in general. This is why I ended up using a customized HttpLink and let the developer decide how to handle specific fetch calls. I think the final product should provide an npm package that might be used like `import { SSRLink } from "@isorun/rails/apollo"`.
@lamportsapprentice do you know if there's a way to launch the V8 Isolate SSR in parallel with Rails' template rendering and then just insert the result into the Rails output? I think it should give us a boost to the TTFB of pages that use this way of rendering Vue apps.
@slashmanov I don't think there is built-in support for Futures/Deferreds in either HAML or ERB (I could be wrong), which would probably be a prerequisite for it. But we could build something that hooks into the render process and does some post-processing: essentially render HAML/ERB/JS in parallel (as you describe) and inject after the fact (see the sketch below). I would like to measure the effect of the waterfall method before pivoting, due to the added complexity.
It would be great to have futures for this (like batch loaders in GraphQL, or Promises in JavaScript). To directly answer the V8 question: we can easily fork multiple threads in Rust, so this wouldn't be a problem.
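A rough sketch of that post-processing approach from Ruby (thread-based here for illustration; Isorun::App.render is a hypothetical blocking call, and a real implementation would likely keep the parallelism on the Rust side):

```ruby
# Inside a controller action: kick off both isolate renders in parallel.
apps = {
  "editor-app"  => Thread.new { Isorun::App.render("editor") },
  "sidebar-app" => Thread.new { Isorun::App.render("sidebar") },
}

# Render the HAML/ERB template with placeholder comments only, then inject
# the SSR results after the fact.
html = render_to_string(:show)
apps.each do |target_id, thread|
  html.sub!("<!-- #{target_id} -->", thread.value)
end
```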
@lamportsapprentice do you know if we can keep the global scope shared and always running for these V8 instances? I am asking because we rely heavily on the global scope in our current code. The most common case is i18n, which expects translations to be present on the global scope prior to rendering anything. I am a bit afraid that spinning up multiple V8 instances could become slow because of that: we'd have to transfer lots of data (basically all available translations; it's about 80MB of JS at the moment).
Another thing is text sanitisation: we also rely heavily on dompurify, which itself relies on DOM presence. There's an RFC to move that to a Worker-based solution, but it's not progressing at the moment. This could be a huge blocker for us in terms of widespread adoption of this solution. If we could pass a Ruby method to the V8 instance that would do the sanitisation part for us, that'd be really great!
do you know if we can keep the global scope shared and always running for these V8 instances?
Hm, yeah, good point. I am actually doing something like that: I expose a render function that is called again and again, and on a per-thread basis the global context/state is persisted.
Besides the prototype working like that, I am not sure yet what is going to be the best solution for this particular problem. Besides the performance implications, shared state can also be dangerous in a production environment. One bad call would result in subsequent requests being doomed, and we probably wouldn't even notice (+ no auto-recovery).
Fortunately, V8 uses a shortcut to speed things up: just like thawing a frozen pizza for a quick dinner, we deserialize a previously-prepared snapshot directly into the heap to get an initialized context. On a regular desktop computer, this can bring the time to create a context from 40 ms down to less than 2 ms. On an average mobile phone, this could mean a difference between 270 ms and 10 ms.
In comparison to the browser or Node.js, we can also create custom snapshots and therefore efficiently initialize the runtime environment based on some predefined criteria (e.g. i18n/translations being loaded directly into the heap).
In general, we can easily pre-warm instances, even without snapshots, so we can avoid all sorts of cold-start problems if we have to.
Another thing is text sanitisation: we also rely heavily on dompurify, which itself relies on DOM presence.
Interesting question. I haven't looked into how many of the Web APIs are supported by deno_core (this is the Rust lib I use to utilize V8), but it could be that it "just works", or that there is no chance it's ever going to work. I could give it a shot with some test script if you have anything at hand?
I have somewhat solved the JS <-> Rust <-> Ruby integration by now, by releasing the GVL and re-claiming it after I am done with the call. But it is only used to intercept fetch calls from the Apollo HttpLink at the moment. This method could easily be extended to other actions if we want (aka call a Rust function or a Ruby one).
```ruby
# initializer for isorun
Isorun.configure do
  # when the JavaScript application sends a message to ruby, we can decide to
  # respond to a given action and the arguments provided by the action
  #
  # @example
  #   on_app_send do |action, args|
  #     case action
  #     when "fetch"
  #       { data: { testField: "Hello from isorun" } }.to_json
  #     else
  #       ""
  #     end
  #   end
  on_app_send do |action, args|
    url, options = JSON.parse!(args).with_indifferent_access.values_at(:url, :options)
    url = URI.parse(url)

    case action
    when "fetch"
      session = ActionDispatch::Integration::Session.new(Rails.application)
      session.host!("localhost:3000")
      session.process(options[:method], url.path, params: JSON.parse!(options[:body]))
      session.response.body
    else
      ""
    end
  end
end
```
The most tricky part I see is more of an architectural one. If we get into the habit of customizing frontend code too much to make SSR work, it adds maintenance burden + multiple files + is prone to human error. As we have full control over the runtime, we could be smart and provide a lot of the things we specifically need for GitLab to work (the runtime does not need to be general purpose; it can be heavily optimized, aka using custom snapshots).
@slashmanov Just adding to my previous answer and related to #95 (closed). We could also use web workers in the runtime. It is actually possible to spawn them. I found some examples in the deno_core/deno_runtime examples and source code, but I haven't tried yet.
Yeah, nice idea 👍. I haven't thought about this too much, even though I noticed that there are some WASM-related announcements on the Ruby roadmap itself.
Do you happen to know if someone can escape the WASM sandbox (i.e. do some networking, etc.)? The examples suggest that stuff like this "just works".
It would be great to also do a comparison of runtime characteristics between wasmer and V8 for an application. In theory, the WASM stuff should outperform everything. I'll see if I can somehow bundle a WASM binary from what I have. The APIs of wasmer and isorun are somewhat similar, so I can probably easily exchange the call in one of my demos.
Hm… I have read a bit more about it, and there might be a few caveats. javy relies on QuickJS (https://bellard.org/quickjs/) to execute our JavaScript code.
The Shopify Functions team created javy, a toolchain that compiles a JS VM to Wasm and embeds your JS in the Wasm module. The engine that javy relies on is QuickJS, a small JavaScript VM that is fully ES2015 compliant.
We would still need to provide all the Web APIs we need (in addition to QuickJS), so this is probably the same effort as writing a custom runtime (I rely on deno_core and deno_runtime for isorun).
A short summary from a discussion w/ the Foundations team:
SSR probably has relatively low impact on time to first meaningful paint and time to interactive when (slow) API calls are part of the process. A good response time for SSR is < 100ms. This leaves us with using SSR for more or less static rendering, which is less interesting.
Building the Rust native extension is a prerequisite for going to production, and we currently can't build deno for aarch64-linux.
Shared-state extraction and merge is a missing feature of isorun. It can be solved, but needs some extra consideration to be general purpose (probably a very GitLab-specific issue).
The effort to build a proper production-ready runtime goes beyond that of a pet project. I have decided to open-source the code (https://github.com/eliias/isorun) and do a pre-release of the gem (https://rubygems.org/gems/isorun) with multi-platform binaries.
Next steps:
I will not implement a GitLab page for this. If someone wants to pick this up, I am happy to provide support and share all things learned.
I will maintain it as a pet project, but will not pursue GitLab integration for now.
Closing the RFC, as it has been widely discussed and can be re-opened/re-visited when it is a better time.