Introduction#
Before we start our journey in Web development, let's first talk about what the Web is, to give everyone a more intuitive understanding. As an analogy, think of the Web as a complex city traffic network: the client is your home in this city, and the server is a store in the city that you want to visit. We can place some Web terms and components into this context:
- Network connection: Allows you to send and receive data over the internet, which can be imagined as the streets connecting your home and the store, capable of carrying a certain amount of traffic (data flow).
- TCP/IP: Transmission Control Protocol and Internet Protocol are communication protocols that define how data is transmitted. This is like the mode of transportation you use to go shopping, such as a car or bicycle (or any other means you can think of). Relevant traffic rules (protocols) specify what types of vehicles can pass (data formats), how vehicles should operate (how data is transmitted)... Unexpected situations like traffic jams and accidents are also common during network transmission (however, driving in the network world doesn't require you to take all the driving tests, haha).
https://en.wikipedia.org/wiki/Internet_protocol_suite
- DNS: Domain Name System servers are like a directory of websites, recording the correspondence between domain names and IP addresses. When you enter a URL in your browser, the browser checks the corresponding domain name system to translate the domain name into a recognizable IP address before retrieving the webpage. Only then can the browser find the server that hosts the webpage you want and send an HTTP request to the correct place. This is like knowing the specific location of the store after determining which one you want to go to and finding it on the map.
https://developer.mozilla.org/zh-CN/docs/Learn/Common_questions/What_is_a_domain_name
- HTTP: Hypertext Transfer Protocol defines the language of communication between clients and servers. It is an application layer protocol, sitting above the transport layer (TCP) and the network layer (IP) (see the OSI model diagram at the bottom of this article). This is like the form you fill out when placing an order while shopping online, detailing your contact information, delivery address, product information... In this process, you don't need to worry about how the products are actually transported; as long as you reach an agreement with the merchant and successfully place an order, you can wait for the delivery at home (this is a form of abstraction! HTTP, as a higher-level abstraction, does not need to know the implementation of the underlying TCP/IP protocols; as long as everyone agrees on the communication format between different layers, that is, the interface, each layer can do its own part).
- Component files: A webpage consists of many different types of files, just like different categories of products in a store. These files can be roughly divided into two types:
  - Code: A webpage is generally composed of HTML, CSS, and JavaScript, but you will see more technologies in the following content.
  - Resources: A collection of other static files that make up the webpage, such as images, music, videos, Word documents, PDF files... Once published, they rarely change.
https://developer.mozilla.org/zh-CN/docs/Learn/Getting_started_with_the_web/How_the_Web_works
Feel free to explore MDN (https://developer.mozilla.org/), where you can find authoritative interpretations of many technical details/terms.
After a macro description of the Web, let's delve into some professional terms and important technologies in the Web world one by one (to ensure accurate transmission of meaning, the following content retains some English original descriptions, which I believe won't be too difficult for you to read).
World Wide Web#
Web Server#
A web server can refer to hardware or software, or the overall collaboration of both.
- The hardware part (the shelves in a store that hold products): A web server is a computer that stores web server software and the component files of websites (such as HTML documents, images, CSS stylesheets, and JavaScript files). It connects to the internet and supports physical data interaction with other devices connected to the internet.
- The software part (the inventory management system inside the store, which can interface with e-commerce platforms): A web server includes several parts that control how network users access hosted files; at a minimum, it includes an HTTP server. An HTTP server is software that understands URLs and HTTP. It can be accessed via the domain names of the websites it stores (for example, mozilla.org), and it delivers their content to users' devices.
Basically, when a browser needs a file hosted on a web server, it requests the file via HTTP. When this request reaches the correct web server (hardware), the HTTP server (software) receives the request, finds the requested document (if the document does not exist, it will return a 404 response), and sends this document to the browser via HTTP (you can imagine it as a process of ordering and receiving goods online). You will gain a deeper understanding of this process when you learn Spring Boot later.
https://developer.mozilla.org/zh-CN/docs/Learn/Common_questions/What_is_a_web_server
Web Browser (Client)#
A browser is application software that retrieves web resources via URLs and presents them visually on the user's device. In the Web world, there is a concept called the host environment, which simply refers to the specific environment in which our JavaScript code executes (a runtime environment); the browser is one of them. There are also server-side host environments such as Node.js, which interested students can explore on their own.
More about browsers will be introduced in the upcoming section on "The Rendering Process of Pages in Browsers."
Uniform Resource Locators (URL)#
A form of URI (Uniform Resource Identifier), providing a path to, or location of, a resource.
Formats
- General format: Scheme (protocol, e.g., http/https)://DomainName:Port/path (the port comes right after the domain name and may be omitted when the default is used)
- Special format: file://path-to-document (“file” indicates the document resides on the machine running the browser)
The text http:// indicates that HTTP should be used to obtain the resource. Next in the URL is the server’s fully qualified hostname (e.g., www.deitel.com) — the name of the web-server computer on which the resource resides. The hostname www.deitel.com is translated into an IP address — a numerical value that uniquely identifies the server on the Internet. An Internet Domain Name System (DNS) server maintains a database of hostnames and their corresponding IP addresses and performs the translations automatically.
Tips
- Port 80 is the default for HTTP and 443 is the default for HTTPS, so they can be omitted. If a server has been configured to use some other port number, it is necessary to attach that port number to the hostname in the URL (e.g., :8080).
- Embedded spaces cannot appear in a URL, and reserved characters such as ; and & must be percent-encoded when they are meant as literal data (if San Jose is part of a path, it must be typed as San%20Jose; 20 is the hexadecimal ASCII code for a space).
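As a rough illustration of the format and tips above, Python's standard `urllib.parse` module can split a URL into its parts and percent-encode a space. The URL below is made up for the example, and port 8080 is chosen only to show a non-default port.

```python
from urllib.parse import urlsplit, quote

# Hypothetical URL: the path and the port 8080 are invented for illustration.
url = "http://www.deitel.com:8080/books/index.html"
parts = urlsplit(url)

scheme = parts.scheme      # protocol used to fetch the resource
hostname = parts.hostname  # name that DNS translates into an IP address
port = parts.port          # explicit port, or None when the default is used
path = parts.path          # location of the resource on the server

# A space is not allowed in a URL; it is encoded as % plus its hex ASCII code.
encoded = quote("San Jose")
```

When the URL carries no explicit port, `parts.port` is `None` and the client falls back to the default for the scheme.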
Self Learning: The relationship and differences between URI, URL, and URN
Path#
Absolute path: A path that includes all directories along the way.
Relative path: Relative to some base path that is specified in the configuration files of the server.
Tips
- http://www.gumboco.com/departments/ (the ending slash means the specified document is a directory).
- http://www.gumboco.com/ (the server will search for index.html at the top level of the directory).
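The mapping from a URL path to a file on the server could look roughly like the sketch below. The document root `/var/www/gumboco` and the function name `resolve` are hypothetical, and serving `index.html` for every directory is a common default rather than a rule.

```python
from pathlib import PurePosixPath

# Hypothetical base path; a real server reads it from its configuration files.
DOC_ROOT = PurePosixPath("/var/www/gumboco")

def resolve(url_path: str, doc_root: PurePosixPath = DOC_ROOT) -> PurePosixPath:
    """Map the path part of a URL to a file under the document root."""
    # An empty path or a trailing slash names a directory: serve its index.html.
    if url_path == "" or url_path.endswith("/"):
        url_path += "index.html"
    # Strip the leading slash so the path is joined relative to the root.
    # Note: a real server must also reject ".." segments; this sketch does not.
    return doc_root / url_path.lstrip("/")
```

For example, `resolve("/departments/")` points at `departments/index.html` under the document root, and `resolve("/")` points at the top-level `index.html`.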
Multipurpose Internet Mail Extensions (MIME) file types#
Specifies the forms of documents received from a server (attached to the beginning of the document by the server; servers determine the type of a document by using the file name extension as the key into a table of types).
Form: type/subtype (e.g., text/html, text/plain).
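The extension-based lookup described above can be sketched with a small table. The table contents and function names here are illustrative assumptions; real servers ship far larger tables (Python's own `mimetypes` module is one example).

```python
import os

# A tiny, hypothetical extension-to-type table.
MIME_TABLE = {
    ".html": "text/html",
    ".txt": "text/plain",
    ".css": "text/css",
    ".png": "image/png",
}

def content_type(filename: str) -> str:
    """Look up the MIME type using the file name extension as the key."""
    _, ext = os.path.splitext(filename)
    # Unknown extensions fall back to a generic binary type.
    return MIME_TABLE.get(ext.lower(), "application/octet-stream")

def split_mime(mime: str) -> tuple:
    """Split 'type/subtype' into its two parts."""
    type_, subtype = mime.split("/", 1)
    return type_, subtype
```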
Self Learning: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Basics_of_HTTP/MIME_types
I believe you now have a certain understanding of the overall picture of Web technologies, and it's time to delve into HTTP, the underlying protocol of the Web (the essence of Web principles). Understanding the details of this protocol is very important for both front-end and back-end development!
However, due to the length of the course, many derived technical points of HTTP cannot be elaborated on here. Interested students can follow the links below for relevant content.
Hyper Text Transfer Protocol (HTTP)#
Hyper Text Transfer Protocol (HTTP) is an application layer protocol based on the TCP protocol used for transmitting hypermedia documents (such as HTML). As the top-level protocol, it hides many details of the network and transport layers (see the OSI model in the Bonus section). It is designed to constrain/specify the communication between web browsers and web servers but can also be used for other purposes. HTTP follows the classic client-server model, where the client opens a connection to make a request and then waits until it receives a response from the server, just like a turn-based conversation between two people. HTTP is a stateless protocol, meaning that the server does not retain any data (state) between two requests; in general, the server cannot distinguish who initiated multiple requests.
Self Learning: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Overview
Messages are the most important form of data transmitted via HTTP, and they have their own standardized data format. Let's take a look at what parts they consist of. A message is like the language we use in daily conversation, with its own grammar and semantics; understanding it will help you follow the information exchange between clients and servers.
Request Phase#
- Request line: HTTP method, request target (usually the path part of the URL; the domain name travels in the Host header), HTTP version
- Header fields
- Blank line
- Message body
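The four parts listed above can be assembled by hand. This is a minimal sketch: `build_request` is a hypothetical helper and `www.example.com` is a placeholder host. Note that the request line carries the path, while the host name goes into the `Host` header.

```python
def build_request(method, target, host, headers=None, body=""):
    """Assemble an HTTP/1.1 request message as text."""
    lines = [f"{method} {target} HTTP/1.1"]   # request line
    lines.append(f"Host: {host}")             # header fields start here
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; an empty line separates headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + body

message = build_request(
    "GET", "/index.html", "www.example.com", {"Accept": "text/html"}
)
```

A GET request usually has an empty body, so the message above ends right after the blank line.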
Request methods#
Method | Description |
---|---|
GET | Returns the contents of a specified document |
POST | Executes a specified document, using the enclosed data |
HEAD | Returns the header information for a specified document |
PUT | Replaces a specified document with the enclosed data |
DELETE | Deletes a specified document |
Self Learning: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Methods
Header fields#
- General: For general information, such as the date.
- Request: Included in request headers, e.g.:
  - Accept: text/plain (specifies a preference of the browser for the MIME type of the requested document).
  - Host: host name.
  - If-Modified-Since: date (specifies that the requested file should be sent only if it has been modified since the given date).
- Response: For response headers.
- Entity: Used in both request and response headers.
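To make the If-Modified-Since header concrete, here is a sketch of the decision a server might take. The function name `status_for` is hypothetical; the status code 304 (Not Modified) is what HTTP defines for the case where the client's copy is still current.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

def status_for(if_modified_since: Optional[str], last_modified: datetime) -> int:
    """Return 304 if the client's copy is still current, else 200."""
    if if_modified_since is None:
        return 200  # unconditional request: always send the document
    since = parsedate_to_datetime(if_modified_since)
    # Send the document only if it changed after the date the client gave.
    return 200 if last_modified > since else 304

# Hypothetical modification time of the requested file.
last_change = datetime(2024, 1, 1, tzinfo=timezone.utc)
```

With `last_change` as above, a header value of `Mon, 01 Jan 2024 00:00:00 GMT` leads to 304, while an earlier date leads to 200 and a full response body.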
Self Learning: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Headers
Response Phase#
- Status line (e.g., HTTP/1.1 200 OK)
First digits of HTTP status codes:
First Digit | Category |
---|---|
1 | Informational |
2 | Success |
3 | Redirection |
4 | Client error |
5 | Server error |
- Response header fields
- Blank line
- Response body (.html)
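The status line and the first-digit categories can be read programmatically. The helper names below are hypothetical; the category names follow the list above.

```python
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def parse_status_line(line: str):
    """Split 'HTTP/1.1 200 OK' into version, numeric code, and reason phrase."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason

def category(code: int) -> str:
    """The first digit of a status code selects its category."""
    return CATEGORIES[code // 100]
```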
Self Learning: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Status
Let's compare the request and response messages together.
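One way to see that the two message types share a shape is to split both with the same function. The sample messages below are invented for illustration, and `split_message` is a hypothetical helper that ignores edge cases such as repeated headers.

```python
# Invented sample messages; www.example.com is a placeholder host.
REQUEST = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Accept: text/html\r\n"
    "\r\n"
)
RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<p>Hello</p>\n"
)

def split_message(raw: str):
    """Split a message into start line, header dict, and body."""
    # The first blank line separates the head from the body.
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return start_line, headers, body

req_line, req_headers, req_body = split_message(REQUEST)
status_line, res_headers, res_body = split_message(RESPONSE)
```

Both messages have a start line, header fields, a blank line, and a body; only the start line differs (a request line versus a status line).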
HTTPS#
HTTPS = HTTP + SSL/TLS (Transport Layer Security) session layer protocol.
HTTPS serves two purposes: to verify the identity of the target server for the request and to ensure that the transmitted data cannot be eavesdropped or tampered with by intermediate nodes in the network.
HTTPS transmits HTTP content over an encrypted channel: the client first establishes a TLS session with the server. TLS is built on top of TCP. During the handshake, asymmetric (public-key) cryptography is used to authenticate the server and agree on session keys; the transmitted content itself is then encrypted with faster symmetric encryption. From the perspective of the transmitted content, HTTPS is therefore no different from HTTP.
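In Python's standard `ssl` module, the default client context already reflects both purposes of HTTPS. The sketch below only inspects the settings; `open_tls` is a hypothetical helper showing where the TLS handshake sits on top of TCP, and it is not called here.

```python
import socket
import ssl

context = ssl.create_default_context()

# Server identity: a valid certificate is required and must match the host name.
verifies_certificate = context.verify_mode == ssl.CERT_REQUIRED
checks_hostname = context.check_hostname

def open_tls(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection first, then run the TLS handshake on top of it."""
    tcp = socket.create_connection((host, port))
    # After wrapping, everything written to the socket is encrypted.
    return context.wrap_socket(tcp, server_hostname=host)
```

Once the wrapped socket exists, ordinary HTTP messages are written to it unchanged, which is why the content of HTTPS looks the same as HTTP.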
Self Learning: https://zh.wikipedia.org/wiki/Hypertext_Transfer_Protocol_Secure
Caching (HTTP Cache)#
Caching is a technique that saves copies of resources and serves those copies directly on subsequent requests (trading space for time). When a web cache finds that the requested resource has already been stored, it intercepts the request and returns a copy of that resource without re-downloading it from the server. For websites, caching is an important component for achieving high performance. However, caching needs to be configured carefully, because not all resources stay unchanged forever; a cached copy should expire no later than the resource's next change (i.e., stale copies should not be served).
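The idea can be sketched as a toy cache with an expiry time. All names here (`SimpleCache`, `fetch`, `max_age`) are hypothetical, and real HTTP caches decide freshness from response headers rather than from a fixed argument.

```python
import time

class SimpleCache:
    """A toy cache that stores copies together with an expiry time."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch   # function that downloads a resource from the server
        self._clock = clock   # injectable so expiry can be tested without waiting
        self._store = {}      # url -> (expires_at, copy)

    def get(self, url, max_age=60.0):
        entry = self._store.get(url)
        if entry is not None and self._clock() < entry[0]:
            return entry[1]   # fresh copy: the request never reaches the server
        copy = self._fetch(url)  # missing or expired: download again
        self._store[url] = (self._clock() + max_age, copy)
        return copy
```

A second `get` for the same URL within `max_age` seconds returns the stored copy; after that, the cache downloads the resource again.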
Self Learning: https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Caching
The historical evolution of HTTP: HTTP/0.9 → HTTP/1.0 → HTTP/1.1 → HTTP/2 → HTTP/3
https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Basics_of_HTTP/Evolution_of_HTTP
Try it out:
https://developer.chrome.com/docs/devtools/network/ | Chrome browser's network inspection tool, press F12, magic happens!
https://hoppscotch.io/ | A very useful online API request tool, super convenient for sending requests and viewing messages!
Programming the World Wide Web (8th edition)
Internet and World Wide Web How to program (2011)
Understanding the HTTP Protocol | Deleted Frontend Playground (godbasin.github.io)
The previous content may have left you feeling a bit overwhelmed, but don't worry. Next, we will place these knowledge points into the practical scenario of browser rendering, logically connecting them and clarifying their relationships, along with some appropriate expansions (this part will involve some of the three core components of front-end development, and it will serve as a warm-up for the upcoming courses; it's okay if you don't fully understand it).
The Rendering Process of Pages in Browsers#
In fact, when we enter a webpage address in the browser and press Enter, the browser's processing steps are as follows:
- DNS resolution (involving the addressing process of DNS) to find the server hosting the webpage.
- The browser establishes a TCP connection with the server.
- The browser initiates an HTTP request.
- The server responds to the HTTP request, returning the HTML content of the page.
- The browser parses the HTML code and requests resources in the HTML code (such as JavaScript, CSS, images, etc., which may involve HTTP caching).
- The browser renders the page for the user (this involves the rendering principles of the browser).
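The step where the browser parses HTML and discovers further resources to request can be imitated with Python's standard `html.parser`. This is only a sketch: the page content and the class name `ResourceCollector` are invented, and a real browser recognizes many more kinds of references.

```python
from html.parser import HTMLParser

class ResourceCollector(HTMLParser):
    """Collect the URLs of sub-resources referenced by a page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Scripts and images name their file in the src attribute.
        if tag in ("script", "img") and attrs.get("src"):
            self.resources.append(attrs["src"])
        # Stylesheets are linked through href.
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(attrs["href"])

# Invented page used only for this example.
PAGE = """<html><head>
<link rel="stylesheet" href="style.css">
<script src="app.js"></script>
</head><body><img src="logo.png"></body></html>"""

collector = ResourceCollector()
collector.feed(PAGE)
resources = collector.resources
```

Each collected URL would trigger another HTTP request (or a cache lookup) before the page can be fully rendered.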
Further analysis allows us to divide the rendering process of pages in the browser into two main parts (see Bonus):
- Page Navigation: The user inputs a URL, and the browser process makes requests and prepares to handle them.
- Page Rendering: After obtaining the relevant resources, the rendering process is responsible for rendering within the current tab.
After gaining a certain understanding of the overall process of browser rendering, let's delve deeper into the lifecycle of a webpage from 0 to 1~ (let's get to know two important APIs provided by the browser - DOM and BOM, which are essential for any front-end framework! Oh, and also the browser's event mechanism; a qualified front-end engineer should master it!)
The Browser Object Model (BOM) represents all the other objects provided by the browser (host environment) for handling everything outside the document (the representation of the browser in JavaScript).
The Document Object Model (DOM) represents all page content as modifiable objects. The document object is the main "entry point" of the page. We can use it to change or create any content on the page (the representation of HTML tags in JavaScript).
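The DOM interface is not tied to the browser. As a rough sketch, Python's `xml.dom.minidom` implements a similar interface, so changing and creating content can be tried outside a browser; the markup below is invented for the example.

```python
from xml.dom.minidom import parseString

# Invented markup standing in for a page.
doc = parseString("<html><body><h1>Old title</h1></body></html>")

# Look up an element and change its text, much like
# document.getElementsByTagName("h1")[0] in a browser console.
heading = doc.getElementsByTagName("h1")[0]
heading.firstChild.data = "New title"

# Create a new element and attach it to the page.
paragraph = doc.createElement("p")
paragraph.appendChild(doc.createTextNode("Added via the DOM"))
doc.getElementsByTagName("body")[0].appendChild(paragraph)

html = doc.documentElement.toxml()
```

In a browser, the same method names are available on the global `document` object, and every change is reflected on the page immediately.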
Try it out:
Press F12 and type window or document in the console to see what appears.
Self Learning: https://developer.mozilla.org/zh-CN/docs/Web/API/Document_Object_Model/Introduction
Events
- Browser events, such as when a page is finished loading or when it’s to be unloaded.
- Network events, such as responses coming from the server (Ajax events, server-side events).
- User events, such as mouse clicks, mouse moves, and key presses.
- Timer events, such as when a timeout expires or an interval fires (setTimeout, setInterval).
Self Learning:
https://zh.javascript.info/introduction-browser-events
https://zh.javascript.info/event-loop
Secrets of the JavaScript Ninja (Second Edition) - John Resig, Bear Bibeault, and Josip Maras
https://zh.javascript.info/browser-environment | Learning JavaScript, this website is sufficient.
Bonus#
Extra content, consume as needed~
Software Architecture Design Patterns#
MVC (Model View Controller)
MVP (Model View Presenter)
MVVM (Model View View-Model) (A must-know for modern front-end engineers regarding View-Model two-way binding)
https://github.com/livoras/blog/issues/11
World Wide Web Consortium (W3C)#
As the evangelist and standard setter in the Web field, the W3C organization is one that every web programmer should know and understand. Although web programming is relatively free and open, without a unified standard to constrain the creators and users of different technologies, this freedom can turn into a frustrating chaos.
Responsibilities and Mission
- Devoted to developing non-proprietary, interoperable technologies for the World Wide Web.
- Make the web universally accessible—regardless of disability, language, or culture. The W3C home page (www.w3.org) provides extensive resources on Internet and web technologies.
Standard Setting
Web technologies standardized by the W3C are called Recommendations (these standards are not mandatory). Current and forthcoming W3C Recommendations include the Hyper-Text Markup Language 5 (HTML5), Cascading Style Sheets 3 (CSS3), and the Extensible Markup Language (XML). A recommendation is not an actual software product but a document that specifies a technology’s role, syntax rules, and so forth.
Page Navigation Process#
When a user inputs content in the address bar, the browser internally processes as follows:
- First, the UI thread of the browser process handles the input: if it is a URL, it initiates a network request to fetch the website content; if not, it treats the input as a query and sends it to the search engine.
- If a network request needs to be initiated, the request process is completed by the network thread. If the HTTP request response is an HTML file, the data is passed to the rendering process; if it is another file, it means this is a download request, and the data will be passed to the download manager.
- If the request response is HTML content, the browser should navigate to the requested site, and the network thread will notify the UI thread that the data is ready.
- Next, the UI thread will look for a rendering process to render the webpage. When both the data and rendering process are ready, the HTML data is passed from the browser process to the rendering process via IPC (Inter-process Communication).
- After the rendering process receives the HTML data, it will start loading resources and rendering the page.
- After the rendering process completes rendering, it notifies the browser process that the page has loaded via IPC.
Page Rendering Process#
- Parsing: Parses HTML/CSS/JavaScript code.
- Layout: Determines coordinates and sizes, whether to wrap, and calculates various position/overflow/z-index properties.
- Painting: Determines the rendering order of elements.
- Rasterization: Converts the computed information into pixels on the screen.
Self Learning: https://coolshell.cn/articles/9666.html