Matters' architecture and technology stack

Aug 9, 2020

With the launch of the open source program, citizens of Matt City can see all of its mechanisms and logic directly. Once the code repositories are fully open, anyone can make suggestions, contribute features and optimizations, build a platform like Matt City, and take part in the evolution of the Matt City ecosystem.

After two years of continuous iteration, Matt City has gained more and more features and capacity. This has made the whole system increasingly complex, and even professional software developers need to invest considerable effort to use it and contribute to it.

Previously, we introduced the documentation and testing environment of the Matt City API, and we now have a dedicated repository for technical documentation, collaborative documentation, and issue submissions. Going forward, we will write a series of articles introducing different aspects of the Matt City system.

This article is the first in the series and introduces the overall structure and ideas of the system. Some of the code repositories involved have not been made public yet. If you want early access, you can sign up for the Matt City Open Source Program.

Front end

The front end of the Matt City website is a Progressive Web App. It uses responsive design to adapt to different devices and gives users a native-app-like experience once it is added to the home screen. The front end and back end use GraphQL to fetch data and define data structures, and both are written in TypeScript.

Compared with the back end, the front end is easier to get started with, and it is probably where community designers and developers can be most creative. When developing locally, we can point the front end at Matt City's production environment and immediately see the effect of changes on real data. We can also browse the API documentation and test queries directly in Apollo Playground.

Drawing on the JAMstack architecture, the rendering of a Matt City web page is roughly divided into two steps. First, when a user visits a page, the public version of the page is served from the server's cache. Then, based on the user's login status, personal data is requested from the back end and the personalized part of the page is updated.
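The two rendering steps can be pictured as overlaying viewer-specific data on cached public data. The following is a minimal, framework-free sketch; the type names and fields (PublicArticle, PrivateArticleState, and so on) are hypothetical and are not taken from the Matt City schema.

```typescript
// Hypothetical shapes: what every visitor sees vs. what depends on the viewer.
interface PublicArticle {
  id: string;
  title: string;
  appreciationCount: number;
}

interface PrivateArticleState {
  id: string;
  hasAppreciated: boolean;
  bookmarked: boolean;
}

type ArticleView = PublicArticle & Partial<PrivateArticleState>;

// Step 1: render from the public, cacheable data only.
function renderPublic(article: PublicArticle): ArticleView {
  return { ...article };
}

// Step 2: once the viewer is known, overlay the personalized fields.
function applyPrivate(
  view: ArticleView,
  state: PrivateArticleState | null
): ArticleView {
  if (state === null || state.id !== view.id) {
    return view; // anonymous visitor: the public version is the final one
  }
  return { ...view, ...state };
}
```

Because the first step never depends on the viewer, its output can be cached and shared by all visitors; only the second step needs a request per logged-in user.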

Server-side rendering is implemented with Next.js, which also shapes the file structure. The entry point of each page is located in src/pages, and file paths are mapped to the URLs users visit through Dynamic Routes. src/pages calls reusable view logic from src/views, which in turn calls the component library located in src/components.
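To make the path-to-URL mapping concrete, here is a simplified model of how a bracketed page pattern can be matched against a URL. It only illustrates the idea behind Dynamic Routes; it is not Next.js's actual router, and the patterns in the usage example are hypothetical.

```typescript
// Simplified model of Next.js-style dynamic routes: a page pattern such as
// "/[userName]/[articleSlug]" is matched against a concrete URL path.
// This is an illustration, not Next.js's actual implementation.
function matchRoute(
  pattern: string,
  path: string
): Record<string, string> | null {
  const patternParts = pattern.split("/").filter((p) => p.length > 0);
  const pathParts = path.split("/").filter((p) => p.length > 0);
  if (patternParts.length !== pathParts.length) {
    return null;
  }
  const params: Record<string, string> = {};
  for (let i = 0; i < patternParts.length; i++) {
    const part = patternParts[i];
    const value = pathParts[i];
    if (part.startsWith("[") && part.endsWith("]")) {
      // A bracketed segment captures the URL segment as a named parameter.
      params[part.slice(1, -1)] = decodeURIComponent(value);
    } else if (part !== value) {
      return null; // a static segment must match exactly
    }
  }
  return params;
}
```

For example, matchRoute("/tags/[id]", "/tags/42") yields an object whose id is "42", while a path under a different static prefix yields null.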

The front-end component library is written in React and follows the Matt City design system. It includes many shared contexts (such as current user information and global language settings) and useful hooks (such as responsive design and pull-to-refresh). In the future, we will introduce tools such as Storybook to make the component library clearer and easier for developers to modify and use directly.

In our React code, we lean heavily on functional programming and use function components to keep the code structure concise and clear. Components that need data have a fragments field containing the GraphQL fragments that describe their data requirements. The parent component can then include these fragments in its query without knowing the specific data requirements of its children.

Just as React components call each other to form a tree, the layered GraphQL fragments form a tree that mirrors the React tree. At the top of the fragment tree, the assembled queries and mutations are sent through Apollo Client.
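The fragment tree can be sketched with plain template strings. The component names, type names, and fields below are hypothetical, and in a real Apollo project the strings would typically be wrapped in the gql tag rather than left as raw strings.

```typescript
// Hypothetical components, reduced to the part that matters here:
// each one exposes the GraphQL fragment describing the data it needs.
const UserDigest = {
  fragments: {
    user: `
      fragment UserDigestUser on User {
        id
        displayName
      }
    `,
  },
};

const ArticleDigest = {
  fragments: {
    // The parent spreads the child's fragment without knowing its fields.
    article: `
      fragment ArticleDigestArticle on Article {
        id
        title
        author {
          ...UserDigestUser
        }
      }
      ${UserDigest.fragments.user}
    `,
  },
};

// At the top of the tree, the page assembles a single query.
const ARTICLE_FEED_QUERY = `
  query ArticleFeed {
    articles {
      ...ArticleDigestArticle
    }
  }
  ${ArticleDigest.fragments.article}
`;
```

If UserDigest later needs another field, only its own fragment changes; ArticleDigest and the page query pick up the change automatically.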

The Apollo Client configuration is located in src/common/utils/withApollo.ts. It is composed of several Apollo Links that handle the server API address, authentication, persisted queries, and other logic. It also includes client-side GraphQL schemas and resolvers, so that we can read and write local client data through GraphQL as well, such as the feed selected on the home page and drafts of article comments.

The article editor is a separate project, located in matter-editor and based on Quill.js. It is the most complex part of the front-end interaction and also the part most in need of optimization and improvement; we will introduce it in a dedicated article later. At present, the editor has many bugs that are hard to reproduce, and we will invite the citizens of Matt City to help us catch them.

Back end

The back end of Matt City relies on many services, and its structure is relatively complex. We have drawn a simplified architecture diagram on GitHub. When running it locally, it is more convenient to use Docker to install and manage the different services.

The back-end GraphQL API is based on Apollo Server. It provides the entry point for reading and writing data and defines the data structures shared by the front end and back end. The GraphQL schema that determines the API structure is located under src/types, and the comments in it appear as documentation in Apollo Playground.

We implement some common logic at the schema level through GraphQL directives, such as permission management, caching, and rate limiting; these are located under src/types/directives. GraphQL directives are not a widely used feature, but they are very powerful: they control how the schema is resolved in a declarative way and simplify the code structure. We will use them more in the future.
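Conceptually, a schema directive wraps the resolver of the field it annotates. The sketch below shows that idea for rate limiting as a plain higher-order function; the helper name, the directive syntax in the comment, the error code, and the in-memory storage are all assumptions for illustration, not Matt City's implementation or Apollo's API.

```typescript
type Resolver<TArgs, TResult> = (
  args: TArgs,
  context: { viewerId: string }
) => TResult;

// Hypothetical helper: what a directive such as
// `@rateLimit(limit: 2, period: 60)` could do to the field it annotates.
function withRateLimit<TArgs, TResult>(
  resolve: Resolver<TArgs, TResult>,
  options: { limit: number; periodMs: number },
  now: () => number = Date.now
): Resolver<TArgs, TResult> {
  const calls = new Map<string, number[]>(); // viewerId -> call timestamps
  return (args, context) => {
    const current = now();
    // Keep only the calls that are still inside the period.
    const recent = (calls.get(context.viewerId) || []).filter(
      (t) => current - t < options.periodMs
    );
    if (recent.length >= options.limit) {
      throw new Error("ACTION_LIMIT_EXCEEDED");
    }
    recent.push(current);
    calls.set(context.viewerId, recent);
    return resolve(args, context);
  };
}
```

The wrapped resolver stays unaware of the limit, which is what makes the declarative form attractive. A production version would need a store shared across server instances instead of a local Map.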

There is a saying that the two hardest things in computer science are cache invalidation and naming things. Naming really is hard, but we have put a lot of effort into debugging the cache and have pulled the battle-tested logic and code into a separate repository. It contains a plugin and several corresponding directives that implement simple caching and invalidation. Precise invalidation of server-side GraphQL caches has long been a relatively weak area, so we will write a dedicated article about Matt City's solution to make it easier for other projects to use directly.
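One common way to get precise invalidation is to tag each cached response with the nodes it contains, so that a mutation on one node purges exactly the responses that include it. The class below is a minimal in-memory sketch of that general technique; the class name and key format are hypothetical, and it is not the code from Matt City's caching repository.

```typescript
// Minimal sketch of "cache responses, purge by node": each cached response
// is tagged with the nodes it contains, so a mutation on one node can
// invalidate exactly the responses that include it.
class NodeTaggedCache<T> {
  private responses = new Map<string, T>(); // query key -> response
  private index = new Map<string, Set<string>>(); // node key -> query keys

  set(queryKey: string, response: T, nodeKeys: string[]): void {
    this.responses.set(queryKey, response);
    nodeKeys.forEach((nodeKey) => {
      const queries = this.index.get(nodeKey) || new Set<string>();
      queries.add(queryKey);
      this.index.set(nodeKey, queries);
    });
  }

  get(queryKey: string): T | undefined {
    return this.responses.get(queryKey);
  }

  // Called after a mutation touches a node, e.g. purge("Article:1").
  // Returns the number of cached responses that were removed.
  purge(nodeKey: string): number {
    const queries = this.index.get(nodeKey);
    if (!queries) {
      return 0;
    }
    let removed = 0;
    queries.forEach((queryKey) => {
      if (this.responses.delete(queryKey)) {
        removed += 1;
      }
    });
    this.index.delete(nodeKey);
    return removed;
  }
}
```

With this layout, editing one article leaves cached responses about unrelated articles untouched, instead of flushing the whole cache.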

The root of the GraphQL schema is divided into query and mutation: query is used to read data and mutation is used to write data. The execution logic of both is defined by resolvers, located in src/query and src/mutation respectively. When a resolver executes, it obtains data sources from the context, issues specific requests to the database and other services, and performs calculations.

The data sources are defined by files in src/connector, which also contain the interfaces to external services such as S3, Google Translate, and ElasticSearch. Redis-based queue operations are stored under the queue path, including operations that run periodically (such as database updates) and operations whose parallelism must be limited (such as appreciation, support, and withdrawal). As Matt City's capacity expands, more and more operations will be completed asynchronously in queues.
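The idea of limiting parallelism can be shown with a small in-memory queue. The real queues are Redis-based, so this sketch (with a hypothetical LimitedQueue class) only illustrates the "at most N jobs at once" behavior, not the actual implementation.

```typescript
// In-memory sketch of a queue that limits parallelism. The real queues are
// Redis-based; this only illustrates the "at most N jobs at once" idea.
class LimitedQueue {
  private running = 0;
  private waiting: Array<() => void> = [];

  constructor(private readonly concurrency: number) {}

  add<T>(job: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const run = () => {
        this.running += 1;
        Promise.resolve()
          .then(job) // start the job; a synchronous throw becomes a rejection
          .then(resolve, reject)
          .then(() => {
            // The slot is free again: start the next waiting job, if any.
            this.running -= 1;
            const next = this.waiting.shift();
            if (next) {
              next();
            }
          });
      };
      if (this.running < this.concurrency) {
        run();
      } else {
        this.waiting.push(run);
      }
    });
  }
}
```

With a concurrency of 1, operations such as payments are processed strictly one after another, which avoids two jobs modifying the same balance at the same time.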

When it receives a request, Apollo Server injects all data sources into the context so that resolvers can call them. This top-level logic is defined by the files in src/routes, alongside the two endpoints oauth and pay, which handle third-party authentication and payment access respectively.
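The relationship between context, data sources, and resolvers can be sketched without any server library. The service, field, and function names below are hypothetical; the point is only that a resolver talks to the context rather than to the database directly.

```typescript
// Hypothetical data source; the real ones live under src/connector.
interface ArticleService {
  findById(id: string): Promise<{ id: string; title: string } | null>;
}

interface Context {
  viewerId: string | null;
  dataSources: { articleService: ArticleService };
}

// A query resolver only talks to the context, never to the database directly.
const resolvers = {
  Query: {
    article: (_root: unknown, args: { id: string }, context: Context) =>
      context.dataSources.articleService.findById(args.id),
  },
};

// What the server does per request: build the context, then run the resolver.
function makeContext(
  viewerId: string | null,
  articleService: ArticleService
): Context {
  return { viewerId, dataSources: { articleService } };
}
```

Because the data source is injected, a test can pass an in-memory fake instead of a real database connection.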

Sorting Algorithms and Databases

What is presented in Matt City emerges from what users create and how they behave, and the content-sorting logic is the rule that governs this emergence: it directly determines what readers see. This logic both defines "good content" and shapes the quality of the "public space".

The data used for sorting comes from the behavior of each user. For articles, it is being liked, supported, commented on, linked to, read, favorited, and featured; for tags, it is being edited, featured, and followed; for authors, it is being followed, plus all actions on their articles or tags. Each behavior is a time series that can be divided into any number of time windows. Users are also weighted differently, for example by the number of followers, the amount of support received, and the amount of support given, so that different weights can be assigned to their actions.
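A toy version of "weighted actions inside a time window" might look like the following. Everything specific here is a made-up assumption for illustration (the action types, their weights, the logarithmic actor weight, and the window); it is not Matt City's ranking logic.

```typescript
// Illustrative only: the action weights, the window, and the actor-weight
// formula below are made-up assumptions, not Matt City's ranking rules.
type ActionType = "appreciate" | "comment" | "read";

interface Action {
  type: ActionType;
  actorFollowers: number; // used to weight the actor
  at: number; // timestamp in milliseconds
}

const ACTION_WEIGHT: Record<ActionType, number> = {
  appreciate: 3,
  comment: 2,
  read: 1,
};

// Dampen the influence of large accounts with a logarithm.
function actorWeight(followers: number): number {
  return 1 + Math.log10(1 + followers);
}

// Sum the weighted actions that fall inside the time window ending at `now`.
function scoreInWindow(
  actions: Action[],
  now: number,
  windowMs: number
): number {
  return actions
    .filter((a) => a.at <= now && now - a.at < windowMs)
    .reduce(
      (sum, a) => sum + ACTION_WEIGHT[a.type] * actorWeight(a.actorFollowers),
      0
    );
}
```

Computing the same score over several windows (for example the last day and the last week) gives a ranking that reacts to new activity without being dominated by a single burst.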

The sorting algorithm needs to use this multi-dimensional data to surface well-regarded content and authors. It must ensure a certain update frequency while also resisting feed-flooding attacks by malicious users. This makes the sorting algorithm complex and delicate, and it means we need to keep communicating and improving the sorting logic in a concise and understandable way, so that the texture of the public space truly reflects the consensus of the community. We will also write a dedicated article on the ideas behind the sorting algorithm to make it easier for the citizens of Matt City to participate.

These sortings are stored in the database as materialized views and are refreshed regularly by cron jobs. Database migration, configuration, and seed files are stored in the db path, alongside the src path. The code for the sorting methods is in db/migrations, and the corresponding materialized views are exposed in API results through the resolvers that query them.

The database structure is relatively complex and hard to pick up quickly. For this reason, we have produced documentation and a structure diagram for the database. After downloading the documentation, you can browse it as web pages to get an intuitive view of the current database structure.

Development environment and deployment method

Because Matt City has only a small team of engineers, we try our best to standardize local development practices and automate DevOps operations to improve development efficiency.

On both the front end and the back end, TypeScript types are generated from the GraphQL types during development and build to enforce data structure validation. The back end uses the graphql-schema-typescript project and generates them with npm run gen; the front end uses apollo-tooling and generates them with npm run gen:type. During local development, these types are regenerated automatically.

The front end and back end also share roughly the same development tooling: Prettier automatically standardizes the code format, Commitizen standardizes the git commit format, and Jest is used for unit testing. There are also Cucumber feature files under the bdd path of the front-end repository, which can serve both as product feature documentation and as front-end and back-end integration test scripts; however, this part has not yet reached its potential and needs further development.

New versions are deployed through GitHub Actions. On one hand, the git commits of the new version are automatically compiled into release notes; on the other hand, the new code is uploaded to the server. As the services Matt City relies on become more complex, we have started trying Terraform to automate infrastructure changes and scheduling, and we will report on progress later.

The above is the current overall structure of Matt City and the ideas behind it; many details and aspects remain, and they will be introduced in future articles. You are welcome to share your views, whether you have found a problem or there is a part you want to know more about. You are also welcome to sign up for the first phase of the Matt City open source program and be among the first to explore the code repositories.

Happy Hacking ❤️