Verified Commit e8999500 authored by MrMan

Add way more

parent 1c7d3ce4
......@@ -5,18 +5,18 @@
# Roadmap #
- What is a backend?
- Past
- Past (AKA the "distant" past)
- C programs
- Java & Servlets
- CGI & LAMP
- Common Gateway Interface
- The LAMP stack
- Reverse Proxies
- Present (AKA, the "recent past")
- The Present (AKA the very recent past)
- VMs
- PaaS
- Containers
- Container Orchestration
- Future (AKA, now and after now)
- ???
- The Future?
- How does one keep up?
# Disclaimer #
......@@ -26,7 +26,7 @@ If you notice some inaccuracies, write them down/keep them in mind until the end
# What is a backend?
Backend can refer to a lot of things, but we'll be mostly focusing on backends as used by/created for serving web/mobile traffic, in the "three tier architecture" style. We can call this a "web" server, but as we'll soon see, lately the middle layer of the three tier architecture concerns itself with serving *data* most of all.
Backend can refer to a lot of things, but we'll be mostly focusing on backends as used by/created for serving web/mobile traffic, in the "three tier architecture" style.
## In interview question form ##
......@@ -34,15 +34,15 @@ Backend can refer to a lot of things, but we'll be mostly focusing on backends a
## From a frontend engineer's perspective ##
"Where can I pull the data?"
"Where can I get the data?"
## From a devops engineer's perspective ##
"How many instances do you need?"
A backend looks like a lot of things to different parts of a team but it has to do one very specific thing -- **answer requests from clients**.
# A Backend (in Python) #
# A minimal example, in Python #
A backend may look like many things to many people, but it must do one very specific thing -- **answer requests sent by clients over the internet**.
\tiny
......@@ -64,12 +64,12 @@ while True:
client_connection.close()
```
\normalsize
\scriptsize
Building A Web Server, Part 1 (https://ruslanspivak.com/lsbaws-part1)
Building A Web Server, Part 1 - https://ruslanspivak.com/lsbaws-part1
# Our first definition #
# A working definition #
Fundamentally, a backend needs to:
......@@ -81,24 +81,28 @@ Fundamentally, a backend needs to:
Put simply, we need to **receive a request, and return a response**
If that sounds too easy to be true -- you're right, it is. We're atop many layers of abstractions that enable the code to be relatively simple:
Too easy to be true? It is. Many layers of abstraction sit below this code and allow it to be relatively simple:
- Python runtime
- Operating System
- Sockets
- Networks, TCP/IP, Ethernet, etc
- HTTP, TCP/IP, Ethernet, Networking, etc
# The Past #
# Past #
## How have people solved this problem in the past? ##
## What did backends look like in the "distant" past? ##
# The Past - C programs (~1990) #
CERN's `httpd` was the first web server; it's written in C.
CERN's `httpd` is considered one of the first web servers.
C was one of relatively few options for writing software back then -- writing a backend in C in this day and age is very dangerous -- while C is very powerful, much of that power is raw, and is very easy to use in an unsafe manner.
- It's written in C (there were relatively few choices)
- C is powerful
- C is fast
- C is unsafe
- You probably shouldn't use C to write your web server in <year after 1990>
We'll pretend this server worked much like (as the fundamentals are the same) the Python web server we had above, no need to look at C code today.
Let's avoid looking at C code today and pretend this server worked much like the previous example.
\scriptsize
https://en.wikipedia.org/wiki/CERN_httpd
......@@ -113,7 +117,7 @@ We've achieved the basic functionality required of any backend -- **receive a re
# The Past - Java & Servlets (~199x) #
Java rose as a safe-yet-performant alternative to C, and people started using it to write web servers. Here's a current example:
Java rose as a safer, reasonably performant, cross-platform alternative to C, and people started using it to write web servers. Here's a current example:
\tiny
......@@ -151,20 +155,22 @@ https://en.wikipedia.org/wiki/Java_servlet
Building servers in Java enables:
- Memory safe web servers
- Modular, pluggable web servers
- More modular, cross-platform server programs
- Making web server development more approachable for developers
The ability to use a safer, memory-managed language like Java increases the velocity of everyone hacking on the early internet.
The ability to use a safer, memory-managed language like Java makes it easier to build things on the early internet.
At this point, backends can **receive a request and return responses in a safe-yet-performant manner**, along with access to Java's easy-to-use rich ecosystem of libraries.
# The Past - CGI (~199x) #
# The Past - Common Gateway Interface (~1997) #
New technology enables us to abstract away the web server itself:
Write *any* program that receives information about a web request through `stdin` and environment variables. Write the webpage you want to render to `stdout`.
- Performant, modular web servers (ex. Apache)
- Common Gateway Interface (CGI)
- New "scripting" languages (PHP, Python, Perl) to write programs in
In this era we get to witness:
- The rise of performant, modular web servers (ex. Apache)
- New, "lighter" languages like PHP, Python, Perl can easily power web sites
- `.cgi` showing up in URLs all over the internet
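The contract above (request details in through environment variables and `stdin`, page out through `stdout`) can be sketched as a tiny handler program. This is a minimal illustration in Python, not code from any particular server: `build_response` and the `name` query parameter are hypothetical names chosen for the demo, while `REQUEST_METHOD`, `QUERY_STRING`, and `CONTENT_LENGTH` are standard CGI environment variables.

\tiny

```python
#!/usr/bin/env python3
"""Minimal CGI-style handler (illustrative sketch, not production code)."""
import html
import os
import sys
from urllib.parse import parse_qs


def build_response(environ, body):
    """Turn CGI inputs (environment mapping + request body) into a response string."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical query parameter used only for this demo.
    name = query.get("name", ["world"])[0]
    page = (
        "<html><body>"
        f"<h1>Hello, {html.escape(name)}!</h1>"
        f"<p>Method: {html.escape(method)}, body length: {len(body)}</p>"
        "</body></html>"
    )
    # A CGI response is header lines, a blank line, then the document.
    return "Content-Type: text/html\n\n" + page


def main():
    # The web server passes request metadata through environment variables ...
    length = int(os.environ.get("CONTENT_LENGTH") or 0)
    # ... and the request body (if any) through stdin.
    body = sys.stdin.read(length) if length > 0 else ""
    sys.stdout.write(build_response(os.environ, body))


if __name__ == "__main__":
    main()
```

\normalsize

Run by hand with something like `QUERY_STRING="name=Ada" python3 handler.py`, the script prints a header, a blank line, and the page, which is exactly what the outer web server relays to the client.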
PHP (1994) is what happens when we optimize a language for web development, and just sprinkle in some dynamic behavior:
......@@ -182,32 +188,40 @@ PHP (1994) is what happens when we optimize a language for web development, and
```
# The Past - CGI (continued) #
So what does our flow look like now?
What does our journey to receive a request and produce a response look like now?
- A request comes in to the Apache webserver (C)
- Apache looks through its configuration to find a handler
- The CGI protocol is used to call the relevant script/program
- The results of the script/program are sent to the user as a response
- Apache finds the relevant handler program
- CGI is used to call the handler program
- The results of the program are sent to the user as a response
We've got at least 3 things to worry about:
- Apache
- CGI's Protocol
- The handler program
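The flow above can be sketched from the outer server's side. The snippet below is a simplified stand-in for what a CGI-capable server does once it has found a handler; it is not Apache's implementation. `call_handler`, `HANDLER_SOURCE`, and `demo` are hypothetical names for this example, and the environment variables are the standard CGI ones.

\tiny

```python
import os
import subprocess
import sys
import tempfile
import textwrap


def call_handler(command, method, query_string, body=b""):
    """Run a handler program roughly the way a CGI-capable server would."""
    # Simplification: a real server builds a restricted environment
    # instead of copying its own.
    env = dict(os.environ)
    env.update({
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": method,
        "QUERY_STRING": query_string,
        "CONTENT_LENGTH": str(len(body)),
    })
    # Request body goes in on stdin; the response comes back on stdout.
    result = subprocess.run(
        command, input=body, env=env, capture_output=True, check=True
    )
    # The handler's output is header lines, a blank line, then the document.
    raw_headers, _, document = result.stdout.partition(b"\n\n")
    return raw_headers.decode(), document.decode()


# A throwaway handler program used only for the demo below.
HANDLER_SOURCE = textwrap.dedent('''\
    import os
    print("Content-Type: text/plain")
    print()
    print("query=" + os.environ["QUERY_STRING"])
''')


def demo():
    """Write the demo handler to disk and invoke it like a server would."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "handler.py")
        with open(path, "w") as handle:
            handle.write(HANDLER_SOURCE)
        return call_handler([sys.executable, path], "GET", "page=1")
```

\normalsize

Calling `demo()` returns the handler's headers and document as two strings. Note that every request starts a fresh process, which is one of the costs of this model.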
# What did we gain? #
The emergence of Apache and the CGI pattery is yet another step change in ease of use:
The emergence of Apache and the CGI pattern is yet another step change in ease of use:
- Even more modular web servers
- The ability to easily use new, better programming languages
- The ability to easily use new, more focused programming languages
Backends we write with the support of CGI can **focus on business logic (leaving request/response wrangling to an "outer" web server)**, while achieving all our previous goals.
Backends we write with the support of CGI can **focus more on business logic**, leaving request/response wrangling to an "outer" web server, while achieving all our previous goals.
# The Past - LAMP (2000+) #
One of the biggest step changes in productivity after CGI was the discovery/adoption of the LAMP stack:
One of the biggest step changes in productivity after CGI was the assembly/adoption of the LAMP stack:
- **L**inux - free, easy to use server operating system
- **L**inux - free, customizable server operating system
- **A**pache - capable, modularized web server
- **M**ySQL - advanced application data management
- **P**HP - low-friction application runtime
- **P**HP - low-friction programming language built for the web
This paradigm is/was *dominant* -- it reportedly accounted for over 50% of internet traffic at one point.
This paradigm is/was **dominant** -- it reportedly accounted for over 50% of internet traffic at one point.
Example: WordPress
\scriptsize
https://en.wikipedia.org/wiki/LAMP_(software_bundle)
......@@ -217,42 +231,76 @@ https://en.wikipedia.org/wiki/LAMP_(software_bundle)
With the adoption of LAMP:
- Writing complex applications is now *much* easier
- More companies are able to enter the web sphere
- Adoption of LAMP and the web skyrockets
- More individuals/companies are able to get on the web
- Linux, MySQL, PHP, and open source in general benefit greatly from the adoption/support
Backends we write now can **more easily render webpages, perform complex data operations, and run cheaply on commodity hardware**, while achieving all our previous goals.
# The Past - Reverse Proxies (2004+) #
NGINX was written with the explicit goal of outperforming Apache
C10K problem
NGINX's worker thread & event loop driven approach allowed massive scale with limited resources, but it achieves this by doing *less* (it doesn't handle dynamic content).
http://radar.oreilly.com/2006/08/programming-language-trends.html
"C10K" - 10,000 concurrent connections
Processes aren't really the past, per se -- in the end *some* process is running *somewhere*, but the idea that you don't need
Apache was fast, but didn't do so well with concurrent connections, which became a problem as more people joined the internet.
NGINX config example with upstream backends
NGINX was written with the explicit goal of outperforming Apache and solving the C10K problem.
NGINX's worker process & asynchronous event loop driven approach (versus Apache's thread-per-request\*) allowed massive scale with limited resources (one core), but it achieves this by doing *less* (it doesn't handle dynamic content).
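The event loop idea can be sketched with Python's `asyncio`: one thread juggles many open connections by switching between them whenever one is waiting on the network. This is only an illustration of the general technique, not of NGINX's internals (which are written in C); `handle`, `fetch`, and `demo` are hypothetical names.

\tiny

```python
import asyncio


async def handle(reader, writer):
    """Serve one connection; every await yields to the other connections."""
    # Consume the request line and headers up to the blank line.
    while (await reader.readline()) not in (b"\r\n", b"\n", b""):
        pass
    body = b"hello\n"
    head = b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\nConnection: close\r\n\r\n" % len(body)
    writer.write(head + body)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def fetch(port):
    """A tiny client: send one GET and read until the server closes."""
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
    await writer.drain()
    data = await reader.read()
    writer.close()
    await writer.wait_closed()
    return data


async def demo(clients=50):
    """Start the server on a free port and hit it with concurrent clients."""
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        return await asyncio.gather(*(fetch(port) for _ in range(clients)))
```

\normalsize

`asyncio.run(demo(50))` opens 50 connections at once and collects every reply, with the server and all clients sharing a single thread.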
\scriptsize
\* https://www.digitalocean.com/community/tutorials/apache-vs-nginx-practical-considerations
# The Past - Reverse Proxies (continued) #
NGINX example:
\tiny
```nginx
http {
upstream myproject {
server 127.0.0.1:8000 weight=3;
server 127.0.0.1:8001;
server 127.0.0.1:8002;
server 127.0.0.1:8003;
}
server {
listen 80;
server_name www.domain.com;
location / {
proxy_pass http://myproject;
}
}
}
```
\normalsize
**Paradigm shift:** programs bound to `localhost` (AKA `127.0.0.1`) on different non-reserved ports, multiplexed by NGINX.
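One way to picture the multiplexing: the proxy keeps a list of upstream addresses and picks one for each incoming request. The sketch below uses smooth weighted round-robin selection as an illustrative technique; that choice is an assumption made for this example, not a claim about how NGINX is implemented. `UpstreamPicker` is a hypothetical name, and the addresses and weights mirror the config above (servers without an explicit `weight` count as 1).

\tiny

```python
class UpstreamPicker:
    """Smooth weighted round-robin over a set of upstream addresses."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.total = sum(self.weights.values())
        # Running score per upstream; the highest score gets the next request.
        self.current = {address: 0 for address in self.weights}

    def pick(self):
        # Every upstream earns its weight on each round ...
        for address, weight in self.weights.items():
            self.current[address] += weight
        # ... the current leader is chosen ...
        chosen = max(self.current, key=self.current.get)
        # ... and pays back the total so the others get their turn.
        self.current[chosen] -= self.total
        return chosen


picker = UpstreamPicker({
    "127.0.0.1:8000": 3,
    "127.0.0.1:8001": 1,
    "127.0.0.1:8002": 1,
    "127.0.0.1:8003": 1,
})
```

\normalsize

Each call to `picker.pick()` returns the address that should receive the next request; over a full cycle the heavier upstream is chosen in proportion to its weight, without sending it several requests in a row.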
\scriptsize
https://www.nginx.com/resources/wiki/start/topics/examples/loadbalanceexample/
# What did we gain? #
With the adoption of reverse proxies we gained:
- Relatively massive scale
- A simpler model for multiplexing applications
- Simplification of the role of the "outer" web server (Apache -> NGINX)
This might seem like a little thing, but for every way that reverse proxying is *simpler* than CGI, more new developers can join in and more programming languages can be used.
- Simplification of the "outer" web server (Apache vs. NGINX)
The emphasis on reverse proxying instead of CGI spurred investment in better software libraries/ecosystems for various languages.
The barrier to entry for a new language is now an ecosystem that contains *at least* the means to listen on `localhost` at some port and speak HTTP. Many new languages have this in their *standard libraries*.
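As an illustration of how low that barrier is, here is a minimal sketch of a backend built only on Python's standard library, suitable for sitting behind a reverse proxy. `Handler`, `make_backend`, and `fetch_once` are hypothetical names, and port 8000 is just an example value.

\tiny

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Receive a request, return a response.
        body = f"hello from {self.path}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def make_backend(port=8000):
    """Bind to localhost only; the reverse proxy is the public face."""
    return ThreadingHTTPServer(("127.0.0.1", port), Handler)


def fetch_once(path="/ping"):
    """Start a backend on a free port, request one path, then shut down."""
    server = make_backend(port=0)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        with urllib.request.urlopen(f"http://127.0.0.1:{port}{path}") as response:
            return response.status, response.read()
    finally:
        server.shutdown()
        server.server_close()
```

\normalsize

For a long-running process, `make_backend(8000).serve_forever()` would match the first `server` entry in an `upstream` block like the one shown earlier.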
Now backends can be used for **websites with larger userbases, with flexibility in implementation language, and with *slightly* simpler architecture**.
# Crossing over into a new era #
Remember our fundamental description of a backend: **receive a request, return a response**.
We've come a long way -- let's step back and remember our fundamental description of a backend: **receive a request, return a response**.
We've developed technology that makes it easier to do this in a safer and less error-prone manner than just writing completely custom applications in C:
......@@ -265,17 +313,17 @@ We've developed technology that makes it easier to do this in a safer and less e
# Present #
## How did we solve this problem in the very-recent past? ##
## What do backends look like in the very-recent past? ##
# The Present - VMs #
As more and more computing power, memory, and harddrive space become available, you're probably going to want to run
More computing power, memory, and hard-drive space are available than ever before -- you can run *even more* programs!
How do you stop one (possibly compromised) `apache` process or `php` script from breaking your system?
How do you stop one (possibly compromised) running program from bringing down your entire web server?
Virtual Machines (VMs) are one of the widely adopted solutions to this problem -- allowing some process to run in a completely virtualized machine, offering *isolation* and *security*.
System-level Virtual Machines (VMs) arise as a solution to this problem -- allowing one or more processes to run in a completely virtualized machine, offering increased *isolation* and *security*, **without requiring application-level changes**.
With separate processes running in VMs, one crashed (or malicious) process can now no longer crash others or fatally harm the host system. New hardware with more resources can be safely utilized efficiently.
With separate processes running in VMs, one crashed (or malicious) process can now no longer crash others or fatally harm the host system. New hardware with more resources can be safely utilized more efficiently.
# The Present - VMs (continued) #
......@@ -297,6 +345,8 @@ Now we can relatively performantly run isolated processes as "backends". If we p
- The outer web server processes and passes some request to the VM
- The process (which may succeed or crash) inside the VM processes the request
Now we can write backends that are **more secure and better isolated, leading to safer use of more resources**.
# The Present - PaaS #
Here's an idea: if we have well-known "stacks" like LAMP and Ruby on Rails, well-known databases like MySQL, and VMs as a reasonable way to isolate them, why not automate the experience of deploying software?
......@@ -313,21 +363,20 @@ What if there was an easier way to isolate processes running on a machine *witho
# The Future - More client-side rendering #
TODO
## Why? ##
Why?
- Smaller, simpler backends
- Offloading rendering to more powerful clients means smaller, simpler, more focused backend services
- Weaker client? Don't worry, we've come full circle with "isomorphic" apps and Server Side Rendering ("SSR")
- Specialization of *types* of backends (ones that serve data, others that serve webpages)
- More efficient use of available resources on client devices
- Faster iteration for separate teams
- Faster iteration for separate teams (see: microservices)
# The Future - Container Orchestration #
Now that we have containers everywhere, how can we treat a group of machines as just a blob of resources for running containers?
Why?
## Why? ##
- Containers simplify deployment (basically fat binaries v2.0)
- Containers offer better isolation than raw processes, and no need to manage OS primitives on a machine (users, groups, etc)
- Automation is (generally) good
- Consistent management of a pool of servers
# The Future - Functions as a Service #
......@@ -371,6 +420,56 @@ No one knows what the future will actually hold, but most of it is here, it's ju
\* https://en.wikiquote.org/wiki/William_Gibson
# How does one keep up with all this? #
Things are moving very fast, but remember that fundamentals move slowly (if at all), since everything is built on them.
Personally, when evaluating new technology I start by considering four things:
- What is the thing *supposed* to do and why?
- What did people use before?
- What does this new tool/approach bring to the table?
- What are the alternatives?
After spending some time looking at these things, you can boil the new technology down to one sentence in your head to make it easier to digest, and refine your knowledge the next time it comes up. If the new technology is *really* interesting, take a few hours and read the architecture/documentation/code.
**If a tool does not state its value proposition, or cannot convince you of it quickly, just postpone investigation.** If it's important you'll (probably) see it again.
Also remember, keeping up is *relative* -- you don't have to be a core contributor to know the what and why of a tool.
## Keeping up case study: Prometheus ##
**What is the thing supposed to do and why?**
Collect (application|infrastructure|...) metrics. People need metrics to make decisions (ex. knowing when disks are almost full).
**What did people use before?**
- literal human intervention (Q: "hey is X up?", A: "yep, looks like it")
- manual bash scripts (that email?)
- `cron` + bash scripts (that email?)
- Nagios
- Graphite
**What does this new tool/approach bring to the table?**
- A focus on high label cardinality of metrics
- Labels versus hierarchical metric names
- Avoids aggregations for real data
- Simplicity of deployment, with most batteries included (gathering, displaying, alerting)
- Standardization attempts (this is recent)
**What are some alternatives?**
- Graphite
- Nagios
- InfluxDB (if you add Kapacitor)
- netdata
- Zenoss (slightly more fleet focused)
- Zabbix (slightly more fleet focused)
Inclusion of a comparison page in the documentation of a tool is a *very good* sign (Prometheus has one).
# The End #
Thanks for listening
......@@ -388,4 +487,26 @@ I run a couple very small consultancies to support businesses in Japan and the U
- GAISMA G.K. (https://gaisma.co.jp)
- VADOSWARE LLC (https://vadosware.io)
Need help getting your organization to the future? I can help with that.
Need help getting your organization to the present/future? I can help with that.
Need help getting your organization to the past? I probably can't help with that.
# Blooper Reel: Hot takes on interviewing (YMMV) #
## Some things good companies do: ##
- Layered questions that increase in difficulty
- Questions that are very relevant to daily tasks
- Optional take-home assignments
- Train employees to be good interviewers (ex. giving hints, avoiding adversarial atmosphere)
- Give feedback on interviews when possible
- Ask fizzbuzz
## Some things bad companies do: ##
- Read your resume (clearly for the first time) in the interview room
- Ask how many X will fit in some Y (ex. X=balls Y=school bus)
- Treat interviews like timed tests
- Ask questions completely unrelated to job responsibilities
- Never ask for feedback on their interview process
- Ask fizzbuzz