Let's build a URL shortener in Go - Part III : Short Link Generator

Recall that in the previous part, we were able to set up,  build and test the storage layer of our link shortener. During this part we are going to work solely on the algorithm we will be using to hash and process the initial input or  the long url into a smaller and shorter mapping that corresponds to it.

When doing the choice for the algorithm we do have a number of objectives to keep in mind :

  • The final input should be shorter : Maximum 8 characters
  • Should be easily human readable, avoid confusing characters mix up, characters that often similar in most fonts.
  • The entropy should be fairly large to avoid collision in short link generation.

III. 1. Generator Algorithm

During this implementation we are going to use two main  schemes  : a hash function and a binary to text encoding algorithm.

So let's go ahead and fire up our IDE or whatever editor you are currently using, we are gonna have to create 2 files,  shorturl_generator.go and shorturl_generator_test.go, and put them under the a folder called shortener

After creating those 2 files and putting them under the folder above, our project directory structure should look like the tree below :

├── go.mod
├── go.sum
├── main.go
├── shortener
│   ├── shorturl_generator.go
│   └── shorturl_generator_test.go
└── store
    ├── store_service.go
    └── store_service_test.go

III. 2. Shortener Implementation

Recall in the previous paragraph we talked about using a hash function and a binary to text encoding  for our shortener algorithm, in this section we are going to implement those 2 basic building blocks :

III. 2. 1. SHA256

We will be using  SHA256 to hash the initial inputs during the process.  So let's go and open the shorturl_generator.go file and add the SHA256 code below, here we are going to use the Golang built-in implementation of this hash function.

package shortener

import (
	"crypto/sha256"
)

func sha256Of(input string) []byte {
	algorithm := sha256.New()
	algorithm.Write([]byte(input))
	return algorithm.Sum(nil)
}
Inside shorturl_generator.go file

III. 2. 2. BASE58 .

This binary to text encoding will be used to provide the final output of the process.
People familiar with binary to text encodings might ask why we are not using Base64, which is probably one of the most popular encoding scheme of this class of algorithms.

The reasons are mostly related to the user experience, Base58 reduces confusion in character output.  Satoshi Nakamoto the anonymous early developer of the Bitcoin protocol and creator of the encoding scheme gave some good reasons  on why Base58.

Those reasons are still valid  even today :

  • The characters 0,O, I, l are highly confusing when used in certain fonts and are even quite harder to differentiate for people with visuality issues .
  • Removing ponctuations characters prevent confusion for line breakers.
  • Double-clicking selects the whole number as one word if it's all alphanumeric.

As we now understand why we are using BASE58 encoding, let's go ahead and add its implementation.
First thing first, let's load the base58 dependency library :

go get github.com/itchyny/base58-go/cmd/base58
In the terminal

Now, let's add the implementation  in the code.

package shortener

import (
	"crypto/sha256"
	"fmt"
	"github.com/itchyny/base58-go"
	"os"
)

func sha256Of(input string) []byte {
	algorithm := sha256.New()
	algorithm.Write([]byte(input))
	return algorithm.Sum(nil)
}

func base58Encoded(bytes []byte) string {
	encoding := base58.BitcoinEncoding
	encoded, err := encoding.Encode(bytes)
	if err != nil {
		fmt.Println(err.Error())
		os.Exit(1)
	}
	return string(encoded)
}
Inside shorturl_generator.go file

III. 2. 3. Final Algorithm

The final algorithm will be super straightforward now as we now have our 2 main building blocks already setup, it will go as follow :

  • Hashing  initialUrl + userId url with sha256.  Here userId is added to prevent providing similar shortened urls to separate users in case they want to shorten exact same link, it's a design decision, so some implementations do this differently.
  • Derive a big integer number from the hash bytes generated during the hasing.
  • Finally apply base58  on the derived big integer value and pick the first 8 characters
func GenerateShortLink(initialLink string, userId string) string {
	urlHashBytes := sha256Of(initialLink + userId)
	generatedNumber := new(big.Int).SetBytes(urlHashBytes).Uint64()
	finalString := base58Encoded([]byte(fmt.Sprintf("%d", generatedNumber)))
	return finalString[:8]
}
Final implementation inside Inside the shorturl_generator.go file

At this stage we have now completed the implementation part, next step should be about testing.


III. 3. Shortener Unit Tests

Our algorithm implementation is now fully complete, let's do unit testing as it should, for any custom logic/algorithm implementation.

In the top paragraph above we created our test file shorturl_generator_test.go , we will be now filling it with some pieces of unit tests.

package shortener

import (
	"github.com/stretchr/testify/assert"
	"testing"
)

const UserId = "e0dba740-fc4b-4977-872c-d360239e6b1a"

func TestShortLinkGenerator(t *testing.T) {
	initialLink_1 := "https://www.guru3d.com/news-story/spotted-ryzen-threadripper-pro-3995wx-processor-with-8-channel-ddr4,2.html"
	shortLink_1 := GenerateShortLink(initialLink_1, UserId)

	initialLink_2 := "https://www.eddywm.com/lets-build-a-url-shortener-in-go-with-redis-part-2-storage-layer/"
	shortLink_2 := GenerateShortLink(initialLink_2, UserId)

	initialLink_3 := "https://spectrum.ieee.org/automaton/robotics/home-robots/hello-robots-stretch-mobile-manipulator"
	shortLink_3 := GenerateShortLink(initialLink_3, UserId)


	assert.Equal(t, shortLink_1, "jTa4L57P")
	assert.Equal(t, shortLink_2, "d66yfx7N")
	assert.Equal(t, shortLink_3, "dhZTayYQ")
}
Inside the shorturl_generator_test.go file

Running the tests above and should all pass (✅).


III. 4. Conclusion & Next Steps

During this part III, we were able to cover the implementation of our shortener algorithm and the choices behind the design decisions in hash function and byte to text encoder.
Next part we will covering the section where we will expose an Rest API endpoint for shortening and handle the actual  URL forwarding.


Ciao ! 👋