Dead simple WebRTC explanation

TL;DR

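The following code is a minimal implementation of a WebRTC video connection between two peers, written in browser JavaScript. There are at least 4 major parts in the implementation, and the ones explained in detail below are:

  • Offer/answer exchange between peers
  • ICE candidate exchange
  • Handling incoming/outgoing streams
  • Signaling server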

The Code

class PeerConnection {
	constructor() {
		this.incomingIceBuffer = [];
		this.outgoingIceBuffer = [];
		this.canAcceptIce = false;
		this.pc = new RTCPeerConnection();
		this.pc.onicecandidate = (event) => {
			if(event && event.candidate) {
				this.outgoingIceBuffer.push(event.candidate);
			}
		}
	}

	addIceCandidate(candidate) {
		if(candidate) {
			if(this.canAcceptIce) {
				this.pc.addIceCandidate(candidate);	
			}
			else {
				this.incomingIceBuffer.push(candidate);
			}		
		}
	}

	onIceCandidate(callback) {
		this.pc.onicecandidate = (event) => {
			if(event && event.candidate) {
				callback(event.candidate);	
			}
		}

		// Flush any candidates that were generated before a callback was supplied
		for(const candidate of this.outgoingIceBuffer) {
			callback(candidate);
		}
		this.outgoingIceBuffer = [];
	}

	startAcceptingIce() {
		this.canAcceptIce = true;
		// Apply any candidates that arrived before the remote description was set
		for(const candidate of this.incomingIceBuffer) {
			this.pc.addIceCandidate(candidate);
		}
		this.incomingIceBuffer = [];
	}

	getRemoteStream() {
		let remoteReceivers = this.pc.getReceivers();
		return new MediaStream(remoteReceivers.map(el => el.track));
	}
}

class PeerConnectionSender extends PeerConnection {
	constructor() {
		super();
	}

	async createOffer() {
		let userStream = await navigator.mediaDevices.getUserMedia({
			audio:true,
			video:true
		});
		// Note: addStream is deprecated; newer code calls addTrack for each track
		this.pc.addStream(userStream);
		let offer = await this.pc.createOffer();
		await this.pc.setLocalDescription(offer);
		return offer;
	}

	async setAnswer(answer) {
		await this.pc.setRemoteDescription(answer);
		this.startAcceptingIce();
	}
}

class PeerConnectionReceiver extends PeerConnection {
	constructor() {
		super();
	}

	async setOffer(offer) {	
		let userStream = await navigator.mediaDevices.getUserMedia({
			audio:true,
			video:true
		});
		// Note: addStream is deprecated; newer code calls addTrack for each track
		this.pc.addStream(userStream);
		await this.pc.setRemoteDescription(offer);
		this.startAcceptingIce();
		let answer = await this.pc.createAnswer();
		await this.pc.setLocalDescription(answer);
		return answer;
	}
}

async function makeConnection() {
	let sender = new PeerConnectionSender();
	let receiver = new PeerConnectionReceiver();
	sender.onIceCandidate((candidate) => {
		receiver.addIceCandidate(candidate);
	});
	receiver.onIceCandidate((candidate) => {
		sender.addIceCandidate(candidate);
	});
	let offer = await sender.createOffer();
	let answer = await receiver.setOffer(offer);
	await sender.setAnswer(answer);
	let remoteStream = receiver.getRemoteStream();
	let sinkElement = document.getElementById("sinkElement");	
	sinkElement.srcObject = remoteStream;
}

Detailed explanation

Offer/answer exchange between peers

WebRTC is a peer-to-peer technology that is used for voice and video communication. The process of establishing a WebRTC connection begins with a negotiation between the two peers, in which they exchange their capabilities through an offer/answer exchange. Peer A, who initiates the peer connection, generates an offer and sends it to peer B. This offer contains capabilities such as number of streams, resolution of the video streams, the bitrate and encoding of the audio streams, networking information, and other necessary data.

The offer is accepted by peer B, who then generates an answer and sends it back to peer A. Once peer A accepts the answer, the negotiation is complete. However, further information needs to be exchanged between the peers for the data to actually start flowing between them.

There are two methods that are used to accept the offer or answer. They are called setLocalDescription and setRemoteDescription. The input of these functions is either the offer or answer, and depends on who generated them. In general, you need to call setLocalDescription on the offer or answer that you generated, and setRemoteDescription on the offer or answer sent by the other peer.

Peer A, who generated the initial offer, uses setLocalDescription on the offer that he generated, and setRemoteDescription on the answer he receives. Peer B first calls setRemoteDescription on the offer sent by Peer A, then calls setLocalDescription on the answer that he generated.

The code that implements this functionality is defined in the PeerConnectionSender class, which represents the initiator of the offer, and the PeerConnectionReceiver class, which accepts the offer and generates the answer.

ICE candidate exchange

The peers also need to exchange ICE candidates, which contain networking information needed to establish the peer-to-peer link. The ICE candidates are generated asynchronously as soon as the setLocalDescription method resolves. Therefore, you must assign an event handler to the onicecandidate property to capture the ICE candidates. In this implementation, a helper method onIceCandidate has been added to the PeerConnection class that abstracts some of the details and captures the ICE candidates in an internal buffer, in case you forget to supply a handler before calling setLocalDescription.

The generated ICE candidates should be sent to the remote peer. The remote peer then adds the ICE candidates by calling addIceCandidate. The native addIceCandidate method cannot be called unless setRemoteDescription has been called first, so this implementation adds a helper method, also called addIceCandidate, that stores the incoming ICE candidates in a buffer until the peer connection is ready to accept them.

Once the ICE candidates have been sent and accepted by both parties, the connection is fully established and data can flow between the peers.

Handling incoming/outgoing streams

Of course, to communicate using WebRTC we must have a way to add our streams to the peer connection object and get the remote streams once the connection is established.

Local streams can come from many places, such as files, a canvas element, the microphone and camera, etc. In our example, we want to access the microphone and camera, and to do this we use the getUserMedia method of the navigator.mediaDevices object. Once called, getUserMedia prompts the user to give access to the requested devices, and if access is given, it resolves with a media stream object that is ready to be used. This media stream object is passed to the addStream method of the peer connection, which must be called before setting the local or remote description.

You can in fact add streams to the peer connection at a later point, but that use case is not covered here.

Once the WebRTC connection is established, getting the remote streams is not hard at all. On the peer connection, we get the remote receivers by calling getReceivers, and we pass the tracks of these receivers to the constructor of a new MediaStream object. In our case, there is a helper method called getRemoteStream that does all of this, and returns the MediaStream object.

To display the remote video, simply assign the remote stream to a video element’s srcObject property.

After all of this effort, we finally got a WebRTC video chat working.

Signaling server

Throughout the article, it’s been taken for granted that the clients have some way of communicating and sending the offers, answers, and ICE candidates to each other. I will be explaining some aspects of how this communication should take place, but I will not be supplying any code, just some rough guidelines.

In our example, both peers are running in the same browser window, so they don’t need an external service to coordinate their messaging. In a real app, the two peers would be running in two different browser instances, meaning they would need to coordinate their communication through a centralized server. This server is responsible for registering the peers, sending the offers, answers, and ICE candidates to the proper recipients, and notifying the peers of any changes in the state of the call. The existence of this server makes the claim that WebRTC is a peer-to-peer technology a bit shaky: while it is true that the data flows directly between the two peers, a server is needed to coordinate between them.

Usually, the messaging channel to the server requires a real-time component, so WebSockets is my preferred tech in this case. The way it usually works is that you serialize the offer using JSON.stringify, then send it off to the server over a WebSocket, and the server forwards it to the other peer. The other peer then deserializes the offer, generates the answer, and sends it back through the server in the same way. The same technique can be used for ICE candidates, as well as control messages, such as user joined, user left, text messaging, and more.
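The serialization and forwarding steps described above could look roughly like the following sketch. It is not part of the original example: the message shape (type, to, payload) and the function names encodeSignal, decodeSignal, and relaySignal are assumptions made for illustration, and the WebSocket itself is replaced by a plain send callback so the logic can be read on its own.

```javascript
// Hypothetical message envelope: { type, to, payload }.
// The type names below are assumptions, not part of WebRTC itself.
const SIGNAL_TYPES = ["offer", "answer", "candidate"];

// Client side: serialize an offer, answer, or ICE candidate before sending it.
function encodeSignal(type, to, payload) {
	if (!SIGNAL_TYPES.includes(type)) {
		throw new Error("unknown signal type: " + type);
	}
	return JSON.stringify({ type, to, payload });
}

// Client or server side: parse a received message back into an object.
function decodeSignal(raw) {
	const message = JSON.parse(raw);
	if (!SIGNAL_TYPES.includes(message.type)) {
		throw new Error("unknown signal type: " + message.type);
	}
	return message;
}

// Server side: forward the raw message to its recipient.
// peers maps a peer id to a send(raw) function, e.g. (raw) => socket.send(raw).
function relaySignal(raw, peers) {
	const { to } = decodeSignal(raw);
	const send = peers.get(to);
	if (!send) {
		return false; // recipient is not connected
	}
	send(raw);
	return true;
}
```

On the receiving peer, the payload of a decoded message would then be handed to setOffer, setAnswer, or addIceCandidate, depending on its type.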

Is that it?

Yes, this is a complete and working implementation, but there are quite a few things to consider before it is ready for production, such as STUN and TURN servers, additional events, error handling, and more. Still, it is a good starting point that will get you pointed in the right direction.
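As one example of those production concerns, STUN and TURN servers are passed to the RTCPeerConnection constructor through its iceServers option. The sketch below only builds that configuration object; the server URLs and credentials are placeholders, not real services.

```javascript
// Placeholder servers: replace the URLs and credentials with your own.
const rtcConfig = {
	iceServers: [
		// STUN lets a peer discover its public address.
		{ urls: "stun:stun.example.org:3478" },
		// TURN relays the media when a direct connection cannot be made.
		{
			urls: "turn:turn.example.org:3478",
			username: "exampleUser",
			credential: "examplePassword"
		}
	]
};

// In the browser, the PeerConnection constructor would then use:
// this.pc = new RTCPeerConnection(rtcConfig);
```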

Two recordings of myself playing the accordion

While looking through an old drawer full of VHS tapes and photos, I stumbled upon a CD that contained old recordings of myself playing the accordion. The recordings were made sometime around 2002, when I was 12 years old, and are made up of traditional Albanian songs and some classical tunes.

I got quite nostalgic when I listened to them. To keep this performance from being lost, I ripped the CD and I am making the tracks available for download here. Feel free to download and use them as you wish, they are licensed under Creative Commons Attribution 4.0 (CC BY).

Special thanks to my teacher Remzi Nushi for making this possible, and his son Vagner Nushi for the recording session.

Klasike – Click to Download

 

Popullore – Click to Download

Getting started with Node.js and TypeScript – Part 3: Debugger Settings for Visual Studio Code

TL;DR: Download/clone this GitHub example and modify it to your requirements.

Having a properly working debugger is a massive improvement in productivity in any IDE, and in my case it was the key factor in my decision to switch to Visual Studio Code for working with JavaScript in Node.js. However, with gulp and the directory structure we are using, the debugger does not work by default in TypeScript, so we have to tweak some of the settings to make it work.

First, we have to set up the build system to emit source maps, which are files that map TypeScript code to the corresponding JavaScript code. In this manner, when we set a breakpoint at a particular line in TypeScript, it will break in the correct line in JavaScript. The following is a minimal gulpfile that takes all TypeScript source files from the /src directory and outputs the compiled JavaScript in the /app directory:

var gulp = require('gulp');
var ts = require('gulp-typescript');
var sourcemaps = require('gulp-sourcemaps');

var tsProject = ts.createProject('tsconfig.json');

gulp.task('default', function() {
	return gulp.src("src/**/*.ts")
	.pipe(sourcemaps.init())
	.pipe(tsProject())
	.pipe(sourcemaps.write("."))
	.pipe(gulp.dest("./app"));
});

This code should be self-explanatory. The gulp-typescript plugin works with the gulp-sourcemaps plugin, so we can use the latter to generate the source maps. We include it in line 3, initialize it in line 9, and write the source maps in line 11. The source maps are written alongside the output JS source code.

In lines 5 and 10, we define and use a tsconfig.json file instead of setting our compiler options straight in the gulp code. tsconfig.json is the Microsoft-approved way to set the TypeScript compiler settings, so we will be using it going forward. With this file we gain an improvement in the portability of our TypeScript settings and the ability to use the full range of options, some of which may not be available through the gulp-typescript package. Finally, using the tsconfig.json file allows us to set the options that tell the debugger where to look for the source maps that were generated earlier.

The following tsconfig.json file gives us a minimal example of the settings we need to set up our working debugger:

{
    "compilerOptions": {
        "target": "es5",
        "lib": ["es2016", "es2016.array.include"],
        "noImplicitAny": false,
        "noEmitOnError": true,
        "removeComments": false,
        "sourceMap": true,
        "rootDir": "./src",
        "outDir": "./app",
        "mapRoot": "./app"
    },
    "include": [
        "src/**/*"
    ],
    "exclude": [
        "node_modules",
        "**/*.spec.ts"
    ]
}

Take note of lines 8 through 11. The options on those lines enable the source maps, set the root directory of the TypeScript files, set the output directory of the JavaScript files, and set the root directory of the source maps. Although some of these options are duplicates of the options in the gulpfile, it’s a good idea to have them defined here as well for documentation and portability purposes.

This should be enough to make the Visual Studio Code debugger work with compiled TypeScript files. If you have any trouble, download the example project from the repository at the top of this article for an example of a project with a working debugger.

Simplifying code paths with multiple conditions

It is very common to be in the situation where your app needs to take some action, but only if multiple conditions are met. For example, suppose you have a function that is supposed to process a blog post. To post in the blog, you have to be logged in, to be an admin, and to have non-blank content. The first instinct would be to write something like this:

function postArticle(title,content) {
    if(content !== "" && title !== "" && !isSpam(content) && isAdmin() && isLoggedIn()) {
        publishArticle(title,content);
    }
    else {
        error("there was some kind of error");
    }
}

The problem with this construction is that there is no way to know why this function is failing, and consequently there is no way to handle the error cleanup if necessary. It is possible to put further checks inside, leading to something similar to this:

function postArticle(title,content) {
    if(content !== "" && title !== "" && !isSpam(content) && isAdmin() && isLoggedIn()) {
        publishArticle(title,content);
    }
    else {
        if(content === "") {
            error("content is empty");
        }
        else if(title === "") {
            error("title is empty");
        }
        // rest of checks go here
    }
}

Even with only two levels of nested if statements, this construction is getting harder to understand, follow, and modify. If there are any compound checks or recovery steps, modification becomes difficult, since there are many possible interactions between them.

Below, I present a better way to handle these kinds of situations, where no more than one level of if statement is required.

function postArticle(title,content) {
    if(!isLoggedIn()) {
        error("You cannot post if you are not logged in");
    }
    else if(!isAdmin()) {
        error("Only admins can post articles");
    }
    else if(content === "") {
        error("Content cannot be blank");
    }
    else if(title === "") {
        error("Title cannot be blank");
    }
    else if(isSpam(content)) {
        error("Content is marked as spam");
    }
    else {
        publishArticle(title,content);
    }
}

The idea is that the normal flow checks for errors first, and only once all the checks have passed is the final else statement reached and the action performed. Each if or else if only checks one condition, and the conditions are ordered in such a way that the later checks depend on the earlier checks passing. For example, being an admin is not possible if you are not logged in first, therefore the check to see if the user is an admin is placed below the check to see if the user is logged in.

Similarly, it’s a good idea to perform cheap checks before expensive ones. It is cheaper to check if the content is blank than to take it through the spam filter, which is why the blank check is performed first.

Finally, the else statement is reached when all the checks have passed.

The reason why we used this structure is that there is only one way to succeed, but multiple ways to fail. Therefore, we can handle each failure separately, and only handle the success at the very end.
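To see how the ordering plays out, here is a self-contained sketch of the same chain. The user object, the isSpam stub, and the choice to return the error message instead of calling error are assumptions made so that the example can run on its own.

```javascript
// Hypothetical stub: a real spam filter would be far more expensive.
function isSpam(content) {
	return content.includes("buy now");
}

// Returns an error message, or null when the article may be published.
function checkArticle(user, title, content) {
	if (!user.loggedIn) {
		return "You cannot post if you are not logged in";
	}
	else if (!user.admin) {
		return "Only admins can post articles";
	}
	else if (content === "") {
		return "Content cannot be blank";
	}
	else if (title === "") {
		return "Title cannot be blank";
	}
	else if (isSpam(content)) {
		return "Content is marked as spam";
	}
	else {
		return null;
	}
}

const admin = { loggedIn: true, admin: true };
const guest = { loggedIn: false, admin: false };

// The guest fails on the first check, even though title and content are also blank.
console.log(checkArticle(guest, "", ""));
// Prints "Content cannot be blank"
console.log(checkArticle(admin, "Hello", ""));
// Prints null, since every check passed
console.log(checkArticle(admin, "Hello", "My post"));
```

Only the first failing check is reported, which is exactly why the order of the conditions matters.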

Getting started with Node.js and TypeScript – Part 2: Setting up the tools

This is the second part in a series of tutorials on getting started with TypeScript in a Node.js setting. Part 2 goes through installing these tools and setting up a starting project.

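The following steps are required for a minimal setup:

  • Installing Node.js and npm
  • Initializing the project
  • Installing TypeScript
  • Installing gulp and setting up the build script
  • Setting up our index.ts and index.js files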

I will be going through each of them in turn. Note that installing Node.js and gulp will require administrator privileges, requiring the use of sudo in front of the commands that install them.

Installing Node.js and npm

Node.js can be downloaded at www.nodejs.org. Just download the appropriate version for your OS, install it, and test the installation by running the following command in your terminal:

node -v

If the installation went well, you should get a string such as v8.0.0 indicating the version of Node.js that is installed on your system.

npm is the package manager for Node.js. Among other things, it is used for downloading libraries written for Node.js by others. These libraries are extremely useful, making npm a critical component of Node.js. npm is bundled with Node.js, so if you installed Node.js as described above, you can check the version of npm similarly to how you checked the version of Node.js:

npm -v

Again, you should get a string such as 4.5.0 indicating the version of npm that is installed on your system. However, npm is updated independently of Node.js, and the bundled npm is usually a bit behind on updates. For this reason, it’s a good idea to update npm to the latest version. Luckily, npm itself is used for such updates, so just run this command to update:

npm update -g npm

If you check the version of npm after the update, it should be higher than the one before (in my case it is 5.0.0).

Initializing the project

In Node.js, we can use npm to easily start a skeleton project. Using your terminal, create a directory for your project, navigate to it, and run the following command:

npm init

You will be asked a set of questions about your project, and at the end, a package.json file will be generated.

In this case, the project is called wsmh. The following screenshot demonstrates how npm init was used in this case:

In addition to the package.json file, two more directories need to be created, app and src. src will contain the TypeScript source code, while app will contain the compiled JavaScript code. These are not part of any specification or requirement, but I use this convention since it allows me to organize my code more clearly.

At the end of this section, your project directory should contain the following:

  • app/
  • src/
  • package.json

At this point, the project is not ready, since there are no source files.

Installing TypeScript

Installing TypeScript is very easy using npm. The command to install TypeScript is:

npm install --save-dev typescript

This command downloads and saves TypeScript to your new node_modules directory, and adds an entry for TypeScript to your package.json file under devDependencies.

In general, npm modules are installed using npm install. For example, to install the foo module, just run:

npm install foo

The --save-dev modifier is used if the installed module will be used during the development or build process, such as testing, minification, resource initialization, database maintenance, and so on. Since TypeScript is used by the developer to compile the code, and not by the project itself, TypeScript is installed using --save-dev.

As an aside, because the project dependencies are listed in the package.json file, it’s not necessary to actually carry the files around when deploying the project or checking it into source control. If a project is missing the dependencies, they can be easily installed by simply running the following command:

npm install

This will install all dependencies listed in the package.json file.

Installing gulp and setting up the build script

Gulp is a popular tool for automating tasks related to your project, such as setting up databases, backing up the database to a file and restoring the database from a file, running automated tests, deploying the project to a remote server, minifying CSS, compressing and optimizing images, and many more. In our case, we will be setting up Gulp to compile our TypeScript into JavaScript with a single command.

First, Gulp must be installed globally in your system, by using the following command. It requires admin access, so add sudo in Linux/MacOS if needed:

npm install -g gulp

To check if the installation was successful, run the following command on your terminal:

gulp -v

You should get a result similar to this:

Next, we need to install Gulp locally in the project, as well as gulp-typescript, a Gulp plugin that facilitates the TypeScript compilation workflow in Gulp. We can install them both with this single command:

npm install --save-dev gulp gulp-typescript

Now that the development dependencies have been installed, it’s time to set up the build script. Create a file called gulpfile.js in the root directory of your project. Place this code in that file:

const gulp = require("gulp");
const ts = require("gulp-typescript");

gulp.task('ts', function() {
    return gulp.src("src/**/*.ts")
        .pipe(ts({
            lib: ["es6"],
            noImplicitAny: false,
            noEmitOnError: true,
            removeComments: true,
            sourceMap: false,
            target: "es5"
        }))
        .pipe(gulp.dest("app/"));
});

This code defines a gulp task with the name ts, which will be executed when you run the gulp ts command in your terminal. The purpose of this task is to find all TypeScript files in the src/ directory, compile them to JavaScript one by one, and output the resulting files in the app/ directory.

Setting up our index.ts and index.js files

Finally, we can get started writing the TypeScript code. Create a file called index.ts, and place it in the src/ directory. Put the following code in it to test our setup:

var first:number = 5;
var second:number = 9;
var sum:number = first + second;
console.log(sum);

After you save this file, run the gulp ts command in your terminal. The TypeScript file we just wrote will be compiled into the app/index.js file, and we are finally ready to run it.

Run the node app/index.js command, and you will get the output of 14 on the console. Congratulations! You have just finished the TypeScript compilation process.

Bonus: Setting up the Sublime Text build script

This step is not required if you do not use Sublime Text, but I am adding it here since it is quite useful. As you know, TypeScript code needs to be compiled from scratch each time it is modified. However, it is quite a hassle to switch to your terminal and run the gulp ts command every time you want to compile.

Luckily, Sublime Text offers a way to run external commands with a keyboard shortcut, allowing you to compile while keeping your editor open and not break your flow.

To set up an external build system, on your menu bar, go to Tools->Build System->New Build System…

You will be shown a new tab with an editable text file. Add the following code to that text file:

{
	"shell_cmd": "gulp ts"
}

Save this file as gulp ts.sublime-build, and then select gulp ts as the build system:

Now, you can press Ctrl-B (Command-B on a Mac), and the build system will run, compiling all your TypeScript files into JavaScript. Now that’s efficiency!

Getting started with Node.js and TypeScript – Part 1: Motivation

This is the first part in a series of tutorials on getting started with TypeScript in a Node.js setting. Part 1 explains the motivation for using these tools.

Why Node.js?

Node.js has certainly become a powerhouse in the webdev world in the last few years, and for good reasons. It is simple, fast, easy to pick up, and uses JavaScript, which makes many front-end skills transferable to developing back-end services.

Node.js also comes bundled with npm, a package manager that lets you easily download and use libraries made by other people. npm is the largest repository of open-source software in the world, and libraries exist for any functionality you can think of.

For these reasons (and others I won’t go into), Node.js is a great choice when you are making an API that other applications will connect to, such as providing a back-end service for a mobile app. Node.js also has capabilities for real-time communications through WebSockets, making it very simple to implement real-time functionality for your app, such as chat, games, or any other interactive experience.

Why TypeScript?

JavaScript itself is also simple and easy to pick up. However, because it was originally made for use in a web browser, it has accumulated lots of baggage and issues that can make the unprepared programmer pull their hair out in frustration. One of these major issues is JavaScript’s loose typing, which allows you to dynamically add or remove properties from objects, or mix and match objects of different types without any restrictions. While this gives great flexibility in implementing some features, it is a major source of bugs, which usually are not discovered until the code runs and reaches the affected part. One example is writing obj.propertyname when what you wanted was obj.propertyName, and not finding out about the error until the program has been running for quite some time.
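The property-name mistake mentioned above is easy to reproduce. In this small sketch (the article object is made up for illustration), plain JavaScript raises no error at the misspelled property. It simply yields undefined, and the failure only surfaces later, when the value is used.

```javascript
const article = { propertyName: "Dead simple WebRTC" };

// Typo: lowercase "n". JavaScript does not complain here.
const title = article.propertyname;
console.log(title); // undefined

// The error only appears once the missing value is actually used.
let failedLater = false;
try {
	console.log(title.length);
} catch (err) {
	failedLater = err instanceof TypeError;
}
console.log(failedLater); // true
```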

Because you can’t really know if a piece of code works or not until you execute it, programmers have come up with a huge number of testing methodologies and frameworks to ensure that the code they wrote actually does what they think. However, instead of testing out the code by running it, it is possible to run a program that checks the code for the use of missing properties, mismatched types, and proper arguments and return values of functions. This program can warn the programmer ahead of time of errors in their code, which would allow them to fix the errors without actually having to run the code. In essence, it answers the question “Do the components of my program actually fit together?”. This is where TypeScript comes in.

TypeScript is a programming language which is a superset of JavaScript (all valid JavaScript is also valid TypeScript). TypeScript code is compiled into JavaScript, making it possible to mix and match code from the two languages, and to use TypeScript in any setting where JavaScript is used. The main selling point of TypeScript is the addition of type safety to JavaScript, by the use of optional type annotations to variables or properties. These annotations ensure that the variables are used in ways that are consistent and do not lead to issues down the line. Some examples:

In JavaScript, this piece of code would be valid:

var myStr = "foo";
var myNum = 15;
myStr = myNum;    //JS will happily accept this
console.log(myStr.length);    //will output undefined

/* length gives us the number of characters on a string,
but myStr is now a number! This code is valid JS, and
the length of myStr is now undefined. If we were expecting
the length to exist down the line, an undefined length
would potentially lead to major errors. */

Contrast this with TypeScript:

var myStr:string = "foo";   //myStr is of type string
var myNum:number = 15;    //myNum is of type number
myStr = myNum;
console.log(myStr.length);    //will never get here

/* TS will throw a compile-time error at line 3, 
letting the programmer know that it is not possible
to assign a number to a string. */

As we can see from this example, TypeScript will not allow this code to be compiled and will give the programmer the location of the error. This will allow the programmer to either fix a potential error, or force them to make their meaning unambiguous and clear (for example, if they wanted to assign the string representation of the number 15 to the variable myStr, they must use myNum.toString()). I find this feature greatly useful, since it gives me the peace of mind that my code is properly typed, whereas with plain JS, for the most part, I did not even have that. Also, when the time comes to make changes to my classes or interfaces, any existing code that no longer matches will fail to compile, forcing me to recheck all the uses of my classes and reevaluate the validity of my changes.

What is the cost of using TypeScript?

Of course, we are not talking about monetary cost here, but cost in the sense of what do I have to give up to make TypeScript work for me. For the most part, if you are a JavaScript developer, going to TypeScript is very easy. All valid JavaScript is valid TypeScript, so existing projects can start using TypeScript right away. The syntax of TypeScript matches very well to JavaScript, so the TypeScript learning curve is very low.

However, it is required that you invest some time learning about the tools that enable TypeScript, specifically the compiler. TypeScript cannot be run without being compiled first, and the compiler outputs JavaScript. Setting up the compiler and task-running apps to actually make the compiler usable in your project takes some effort, but once that is done, it can be used throughout the life of the project.

Part 2 will focus on setting up the tools and structure that will enable the creation, maintenance, and expansion of a project all the way to large codebases.

Finding the minimal distance from a point to a line

Calculating distances between elements is a common operation when dealing with graphics. In our case, we want the minimal distance between a point, represented by two coordinates (P_x,P_y), and a line, represented by the line formula y=ax+b. There are many different methods of determining this distance, but we will be using calculus, so this method should hopefully be clear to anyone who understands the basics of calculus.

There are two major steps to solving our problem:

  • Find the distance between our point and all other points in the line
  • Pick out the smallest of these distances

Finding the distance between two points is not difficult; usually a slightly modified Pythagorean formula is used. Given two points (x_1,y_1) and (x_2,y_2), the distance d is equal to:

d=\sqrt{(x_2-x_1)^2+(y_2-y_1)^2}

This formula will always give us a non-negative number, since distances can’t be negative. However, we are not trying to find the distance between two particular points, but between one point (P_x,P_y) and ALL points on the line y=ax+b. So, we will have to modify our formula to accept not two points, but a point and two variables, x and y, so that it gives us the distance between (P_x,P_y) and (x,y). However, our variable y is dependent on x (y is a function of x), so we can rewrite our variables like so: (x,f(x)) where f(x)=ax+b.

d(x)=\sqrt{(P_x-x)^2+(P_y-f(x))^2}

So now we have a function, the distance function, that will give us the distance between our point (P_x,P_y) and any point on the line f(x)=ax+b. So, given a particular x value, we will have the distance between (P_x,P_y) and (x,f(x)).
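Before turning to calculus, the two steps can be tried out directly by brute force: evaluate d(x) at many sample points and keep the smallest result. This is only a numerical sketch, not the method this article develops. The function names, the sample range, the number of steps, and the example point (0,0) with the line y=x+2 are arbitrary choices made for illustration.

```javascript
// Distance between the point (px, py) and the point (x, f(x)) on the line y = a*x + b.
function distanceToLineAt(px, py, a, b, x) {
	const fx = a * x + b;
	return Math.sqrt((px - x) ** 2 + (py - fx) ** 2);
}

// Step 1: compute d(x) for many x values. Step 2: pick the smallest.
function bruteForceMinimum(px, py, a, b, from, to, steps) {
	let best = { x: from, distance: distanceToLineAt(px, py, a, b, from) };
	for (let i = 1; i <= steps; i++) {
		const x = from + (to - from) * (i / steps);
		const distance = distanceToLineAt(px, py, a, b, x);
		if (distance < best.distance) {
			best = { x, distance };
		}
	}
	return best;
}

// Point (0,0) and line y = x + 2, sampled on [-10, 10] in steps of 0.01.
const result = bruteForceMinimum(0, 0, 1, 2, -10, 10, 2000);
console.log(result.x.toFixed(2), result.distance.toFixed(3));
```

For these values the sampled minimum lands near x=-1, with a distance of about 1.414 (the square root of 2). The approach only finds the minimum to within the sampling step and only inside the chosen range, which is why an exact method is worth deriving.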

But how do we find the minimal distance between our line and our point using this new function? This is where the calculus part comes in. A differentiable function can only reach its minimal value where its derivative is equal to 0, so we will have to calculate the derivative of our function d(x). Let’s do it step by step, rewriting some of the terms to make it clearer.

f(x)=ax+b
f'(x)=a

g(x)=(P_x-x)^2+(P_y-f(x))^2
g(x) = {P_x}^2-2{P_x}x+x^2+ {P_y}^2-2{P_y}f(x) + {f(x)}^2
g(x) = {P_x}^2-2{P_x}x+x^2+ {P_y}^2-2{P_y}ax-2{P_y}b + {a^2}{x^2}+2axb+{b^2}

g'(x)=-2{P_x}+2x-2{P_y}f'(x)+2f(x)f'(x)
g'(x)=-2{P_x}+2x-2{P_y}a+2{a^2}x +2ab
Note the use of the chain rule on {f(x)}^2. The constant term -2{P_y}b drops out, since the derivative of a constant is zero.

d(x) = \sqrt{g(x)}

d'(x) = {{g'(x)}\over{2\sqrt{g(x)}}}
We are using the chain rule here as well, on the square root

d'(x)={{-2{P_x}+2x-2{P_y}a+2{a^2}x +2ab}\over{2\sqrt{{P_x}^2-2{P_x}x+x^2+ {P_y}^2-2{P_y}ax-2{P_y}b + {a^2}{x^2}+2axb+{b^2}}}}

To find the minimal distance, we set d'(x)=0. Knowing this, we can greatly simplify the formula above:

d'(x)={{-2{P_x}+2x-2{P_y}a+2{a^2}x +2ab}\over{2\sqrt{{P_x}^2-2{P_x}x+x^2+ {P_y}^2-2{P_y}ax-2{P_y}b + {a^2}{x^2}+2axb+{b^2}}}}=0

We can multiply both sides by the denominator. This simplification is not always valid, because the denominator may be zero. However, in our case, the only time the denominator is zero is when the distance is zero. In that case, the numerator is also zero, and we already have a solution, since zero is the smallest possible distance. So we will accept this simplification, as it always leads to a correct answer.

g'(x)={-2{P_x}+2x-2{P_y}a+2{a^2}x +2ab}=0

So let’s give our new formula a try. Given the point (1,3) and the line y=2x+3, we will plug these numbers into the formula above to find a solution:

g'(x)=-2*1+2x-2*3*2+2*{2^2}*x+2*2*3=0
g'(x)=10x-2=0
x={1\over5}=0.2