Stepping on a landmine: URL parameter encoding and URL parsing logic

Benjamin

A few days ago at work, I discussed a problem with a colleague:

He wanted to insert a check step before the page used for adding new data, so the check step also has to be told which link to redirect to afterwards.

The original API function decides whether to update or insert based on whether the incoming data has an ID, so the ID also has to be written into the redirect link.

It looked like this:

 Check step: /chkActiveProcess
 Redirect target: /insertForm?itemId=12345

However, there is also the case where the user wants to add a new record whose content is copied from an existing one, so my colleague made the following change:

 Check step: /chkActiveProcess
 Redirect target: /insertForm?itemId=12345&isNewItem=yes

A new parameter is used to tell this case apart, and that is where the problem appeared.

In the link shown after the redirect, the isNewItem=yes part was missing no matter how many times I tried.


Here is a little background on URL parsing. A URL contains the following information:
the connection method to use, the name or address of the host to connect to, the path, and the parameters.

For example:

 http://example.com/add?a=1&b=2

When you press Enter and the connection is made, the information listed above is parsed out of the URL from left to right.

First, http is the connection method (others include https, ftp, and so on), and it is always followed by the fixed text "://".

Next, example.com is the name of the host to connect to; it starts right after "://" and ends before the next "/".

The path /add works like an instruction to the site, saying which function to execute; an empty path is treated as a request for the root directory. The path runs up to the first "?"; if there is no "?", there are no parameters.

Everything after the first "?" is passed in the form "parameter name=value", with multiple parameters separated by "&", so the parameters here are:

 a -> 1
 b -> 2
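The same breakdown can be checked in code. This is only a small sketch, assuming a runtime with the standard URL class available as a global (current Node.js versions and browsers); the variable names exampleUrl and parts are mine and only for illustration.

```javascript
// Parse the example URL with the built-in URL class.
const exampleUrl = new URL('http://example.com/add?a=1&b=2');

const parts = {
  scheme: exampleUrl.protocol,  // connection method, reported with a trailing colon
  host: exampleUrl.hostname,    // name of the host to connect to
  path: exampleUrl.pathname,    // everything up to the first "?"
  query: exampleUrl.search,     // the raw parameter part, including the "?"
  params: Object.fromEntries(exampleUrl.searchParams), // name -> value pairs
};

console.log(parts);
// scheme 'http:', host 'example.com', path '/add', query '?a=1&b=2', params { a: '1', b: '2' }
```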

Back to the topic: my colleague uses Node.js, and part of the code looks like this:

 let redirectTo = '/insertForm?itemId=12345&isNewItem=yes'

 ...

 endpoint: `/chkActiveProcess?redirectTo=${redirectTo}`

It is plain string concatenation, and the combined result looks like this:

 /chkActiveProcess?redirectTo=/insertForm?itemId=12345&isNewItem=yes

Applying the rules above to this URL, the parameter part is interpreted as follows:

 redirectTo -> /insertForm?itemId=12345
 isNewItem -> yes

You can see that the value you meant to pass along has lost a piece, and that piece is treated as a separate, unrelated parameter.
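To see the misparse for yourself, the concatenated link can be fed to the standard URL class. This is a rough sketch: the base 'http://example.com' is a placeholder I added because a relative link cannot be parsed on its own, and the names rawRedirect, rawEndpoint and misparsed are hypothetical.

```javascript
// Build the link exactly as in the original code: plain string concatenation.
const rawRedirect = '/insertForm?itemId=12345&isNewItem=yes';
const rawEndpoint = `/chkActiveProcess?redirectTo=${rawRedirect}`;

// Placeholder base so the relative link can be parsed.
const misparsed = new URL(rawEndpoint, 'http://example.com');

console.log(misparsed.pathname);                       // '/chkActiveProcess'
console.log(misparsed.searchParams.get('redirectTo')); // '/insertForm?itemId=12345'
console.log(misparsed.searchParams.get('isNewItem'));  // 'yes'
```

The check step receives isNewItem as its own parameter, so the link it redirects to no longer contains it.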

To avoid this, the data carried in the URL must be percent-encoded; anything encoded this way is treated purely as data and will not be mistaken for URL syntax.
For details, please refer to: https://zh.wikipedia.org/wiki/Percent Code


In this example, it is enough to replace "&" with "%26"; the program is modified as follows:

 let redirectTo = '/insertForm?itemId=12345%26isNewItem=yes'

 ...

 endpoint: `/chkActiveProcess?redirectTo=${redirectTo}`
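As a quick check of the manual fix, the sketch below parses the hand-encoded link the same way the receiving side would. It assumes the standard URL class; the base 'http://example.com' is a placeholder and the names manualEndpoint and manualParams are made up for this example.

```javascript
// The manually encoded link: "&" inside the redirect target is written as "%26".
const manualEndpoint = '/chkActiveProcess?redirectTo=/insertForm?itemId=12345%26isNewItem=yes';

// Placeholder base so the relative link can be parsed.
const manualParams = new URL(manualEndpoint, 'http://example.com').searchParams;

console.log(manualParams.get('redirectTo')); // '/insertForm?itemId=12345&isNewItem=yes'
console.log(manualParams.has('isNewItem'));  // false
console.log([...manualParams.keys()]);       // [ 'redirectTo' ]
```

Since "%26" is not a separator, the whole target stays inside redirectTo and is decoded back to "&" when the value is read.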

However, sequences like %26 are not recognizable at a glance, and if you mistype one as %25 you may not notice, so make good use of tools. Most languages provide URL-encoding functions; in JavaScript, for example, you can write:

 let redirectTo = '/insertForm?itemId=12345&isNewItem=yes'

 ...

 endpoint: `/chkActiveProcess?redirectTo=${encodeURIComponent(redirectTo)}`

This way, not only & but also other special characters such as /, ?, and = are encoded (although in this example leaving those unencoded would not cause a misparse).
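Here is a small round-trip sketch of the encodeURIComponent version, to confirm the value survives intact. The base 'http://example.com' is again a placeholder, and the names target, encodedTarget, safeEndpoint, safeParams and builtEndpoint are hypothetical. The last lines show URLSearchParams as an alternative way to build the query; that is my suggestion, not something from the original code.

```javascript
const target = '/insertForm?itemId=12345&isNewItem=yes';

// Encode the whole redirect target so it is carried purely as data.
const encodedTarget = encodeURIComponent(target);
console.log(encodedTarget); // '%2FinsertForm%3FitemId%3D12345%26isNewItem%3Dyes'

const safeEndpoint = `/chkActiveProcess?redirectTo=${encodedTarget}`;

// Parse it back the way the receiving side would (placeholder base).
const safeParams = new URL(safeEndpoint, 'http://example.com').searchParams;
console.log(safeParams.get('redirectTo') === target); // true
console.log(safeParams.has('isNewItem'));             // false

// Alternative: let URLSearchParams do the encoding when building the query.
const builtEndpoint = `/chkActiveProcess?${new URLSearchParams({ redirectTo: target })}`;
console.log(builtEndpoint === safeEndpoint); // true
```

Either way, the encoding is done by a tool rather than by hand, which removes the risk of a mistyped escape sequence.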


This should count as very basic common knowledge; I am writing it down in the hope of not making the same mistake again.

CC BY-NC-ND 2.0
