
Running a Blog on Azure Data Lake

By Johan Åhlén

Can you host a blog in an Azure Data Lake? Yes, absolutely! In fact, this website is an example of that. In this article, I will describe how you can host your website in Azure Data Lake and handle dynamic things such as a contact page.

First, a couple of words about why you would build your blog on Azure Data Lake:

You will certainly need some technical skill and time, but if you enjoy creating things in Azure you will have great fun!

Static Website Hosting in Azure

What makes this possible is a new feature in Azure Data Lake called Static website.

[Image: Azure Data Lake Static Website]

Actually, you could host a static website in Azure Data Lake even before this, but the static website feature gives you two very important features:

The way that the static website feature works is very simple. It creates a container called $web, which is where you put anything you wish to publish. Note that URLs will be case-sensitive. This is because blob names are always case-sensitive.
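Because of the case sensitivity, it can help to normalize names when you publish. Here is a minimal sketch, assuming you decide to store every blob under a lowercase name and link to it the same way. The helper name toBlobName and the index.html convention are my own illustration, not something Azure requires:

```javascript
// Hypothetical helper: turn a site path into the blob name to use in $web.
// Assumes a self-imposed convention of all-lowercase blob names.
function toBlobName(sitePath) {
  // Blob names inside a container do not start with "/".
  let name = sitePath.replace(/^\/+/, "").toLowerCase();
  // Treat directory-style paths as folders that hold an index document.
  if (name === "" || name.endsWith("/")) {
    name += "index.html";
  }
  return name;
}

console.log(toBlobName("/En/Contact/")); // en/contact/index.html
console.log(toBlobName("/Style.CSS"));   // style.css
```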

There are a couple of things to consider if you want to use it for a real-world scenario:

Custom Domain

You will probably want to use your own domain name for your website, like www.yoursite.com. The automatic address the static website feature gives you isn't very pretty. There is currently no setting for custom domains, so how do you solve this?

The solution is to use Azure CDN (Content Delivery Network).

Azure CDN

The purpose of Azure CDN is to increase speed and reduce the load on your web servers by caching your content on a global network of servers. When users visit your website, they get the content from the nearest cache instead of reading directly from your web server.

Azure CDN is available as different product offerings, with different features. I suggest selecting Premium Verizon, because it has a rules engine. In the rules engine, you can do things such as URL rewriting and URL redirection.

Azure CDN is cheap and easy to set up. It can even save you the cost of an HTTPS certificate for your domain, because it can request and manage your certificate for free.

URL rewriting and URL redirection

If you are not careful, your website could appear in search engines under multiple domains. This is because search engines could find your website on the origin (the autogenerated Azure Data Lake static website address).

To prevent search engines from indexing a website, there is robots.txt. On your custom domain, it should allow crawling. On any domain other than your custom domain, robots.txt should say Disallow.
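As a sketch, the two variants could be produced like this. The function name, and the idea of generating the files from code at all, are only for illustration; the directives themselves are standard robots.txt syntax:

```javascript
// Build the body of a robots.txt file.
// allowIndexing = true  -> crawlers may index everything
// allowIndexing = false -> crawlers are asked to stay away
function robotsTxt(allowIndexing) {
  const lines = [
    "User-agent: *",
    // An empty Disallow permits everything; "Disallow: /" blocks the whole site.
    allowIndexing ? "Disallow:" : "Disallow: /",
  ];
  return lines.join("\n") + "\n";
}

console.log(robotsTxt(true));  // variant for the custom domain
console.log(robotsTxt(false)); // variant for every other domain
```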

But wait... How can you make robots.txt different for different domains?

This is where URL rewriting can do magic for you. Here's an example of how I use it for www.how2code.info:

[Image: Azure CDN Verizon rewrite rule]

In a similar way, you can do URL redirection. For instance, you can redirect any request from http to https:
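The rule itself is configured in the CDN rather than written in code, but the logic of such a redirect can be sketched in JavaScript. The function name is hypothetical, and the 301 status is an assumption about what you would typically choose:

```javascript
// Decide whether a request should be redirected to https.
// Returns { status, location } for a redirect, or null to serve the request as-is.
function httpsRedirect(requestUrl) {
  const url = new URL(requestUrl);
  if (url.protocol !== "http:") {
    return null; // already https (or another scheme): nothing to do
  }
  url.protocol = "https:";
  // Path and query string are preserved; only the scheme changes.
  return { status: 301, location: url.toString() };
}

console.log(httpsRedirect("http://www.example.com/en/contact?x=1"));
console.log(httpsRedirect("https://www.example.com/"));
```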

[Image: Azure CDN Verizon redirect rule]

Finding your "customer origin"

You should replace the "customer origin" /80103ADC/cdn-how2code-www with your own. The easiest way to find your customer origin is to create a new rule (that you later discard), select "Origin" and then "Customer Origin". Your customer origin will then appear in a drop-down.

[Image: Azure CDN Verizon finding the Customer Origin]

Cache-Control

If you use a CDN, caching becomes even more important. This is because the CDN needs to know if it needs to reload the content or not.

Without Cache-Control, the request will go from the CDN to the Data Lake each time. This is unnecessary and will slow down your website.

[Image: Azure Data Lake Cache-Control scenario]

There are two ways to manage the Cache-Control. Either you can manage the caching directly on the blobs in Azure Data Lake, or you can manage it by rules in the CDN.

My recommendation is to manage the caching directly on the blobs. Cache-Control is available as a property on your blobs. Unfortunately it is not shown in the portal, but you can easily manage it through PowerShell, C#, REST API, etc. You can easily check the result from your web browser, where Cache-Control should show up among the response headers.
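Which value to set depends on how often each kind of file changes. Here is a minimal sketch of choosing a value per file type before upload. The function name, the extensions, and the max-age values are all assumptions for illustration, not recommendations from Azure:

```javascript
// Hypothetical policy: pick a Cache-Control value from the file extension.
function cacheControlFor(blobName) {
  // Look only at the last path segment so dots in folder names are ignored.
  const fileName = blobName.split("/").pop();
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();

  // Pages (including extensionless ones) change most often: one hour.
  if (ext === "" || ext === "html") return "public, max-age=3600";
  // Styles, scripts and images rarely change: one week.
  if (["css", "js", "png", "jpg", "svg"].includes(ext)) return "public, max-age=604800";
  // Everything else: one day.
  return "public, max-age=86400";
}

console.log(cacheControlFor("en/contact/thankyou")); // public, max-age=3600
console.log(cacheControlFor("css/Site.CSS"));        // public, max-age=604800
```

If you upload with the Azure Storage SDK for JavaScript, a value like this would go into the blobCacheControl field of the blob's HTTP headers.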

One more thing... In Azure CDN, there is also a setting called Query-String Caching. Azure Data Lake static website will ignore any query strings, so for best cacheability you should choose "standard-cache" mode. This is the default mode.
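To illustrate what the modes mean for the cache, here is a conceptual sketch of how a cache key might be derived under each one. It is a simplified model of the behavior, not the CDN's actual implementation, and the function name is made up:

```javascript
// Simplified model: which cache entry does a request map to?
// Returns null when the request would not be served from the cache.
function cacheKey(requestUrl, mode) {
  const url = new URL(requestUrl);
  const base = url.origin + url.pathname;
  switch (mode) {
    case "standard-cache": // query string ignored: one cached copy per path
      return base;
    case "unique-cache":   // every distinct query string is cached separately
      return base + url.search;
    case "no-cache":       // requests carrying a query string are not cached
      return url.search ? null : base;
    default:
      throw new Error("Unknown query-string caching mode: " + mode);
  }
}

console.log(cacheKey("https://www.example.com/en/?utm=1", "standard-cache"));
// https://www.example.com/en/
```

With "standard-cache", links that only differ in tracking parameters all hit the same cached copy, which is why it gives the best cacheability for a static site.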

Content-Type

The content-type is necessary for web browsers to display/handle content correctly. Usually it is managed by web servers, but with Azure Data Lake static websites you need to manage it yourself.

When uploading files in the portal, Azure will by default assign a content-type/MIME type based on the file extension. For example:

When uploading through other means (such as PowerShell or C#), you will need to set the content-type yourself. It is available as a property on the blobs and you can even see it in the portal.
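If you upload from your own code, a small lookup table is usually enough. The sketch below is one way to do it; the function name and the selection of extensions are my own, and the fallback to application/octet-stream is an assumption about what you want for unknown files:

```javascript
// Extension -> MIME type. Extend the table with whatever your site contains.
const CONTENT_TYPES = new Map([
  ["html", "text/html; charset=utf-8"],
  ["css", "text/css"],
  ["js", "text/javascript"],
  ["json", "application/json"],
  ["xml", "application/xml"],
  ["txt", "text/plain"],
  ["png", "image/png"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["svg", "image/svg+xml"],
]);

// Hypothetical helper: pick the content-type for a blob from its name.
function contentTypeFor(blobName) {
  const fileName = blobName.split("/").pop();
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();
  // Unknown types fall back to a generic binary type.
  return CONTENT_TYPES.get(ext) || "application/octet-stream";
}

console.log(contentTypeFor("images/Logo.PNG")); // image/png
console.log(contentTypeFor("robots.txt"));      // text/plain
```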

Managing it in PowerShell is similar to managing other properties, such as cache-control. This article describes how to manage properties.

Adding a Contact Form

A contact form is a typical example of something that should be handled server side:

Azure Data Lake Storage static websites are... static. They don't come with any server side code support. So you have two options:

Azure Functions (Serverless Compute) are easy to create, and very cheap. They can be created in the portal or with a tool such as Visual Studio Code. Here's an example.

[Image: Azure Function How2Code]
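In JavaScript, a function like SendMail could look roughly as follows. This is a sketch under assumptions, not the code behind this site: it follows the classic Node.js programming model, where the handler receives context and req and responds by setting context.res, and deliverMail is a hypothetical placeholder for whatever mail service you use.

```javascript
// Hypothetical placeholder: replace with a call to your mail provider.
async function deliverMail(message) {
  return true;
}

// Check that the posted JSON has the fields the contact form sends.
function validateMessage(body) {
  const errors = [];
  for (const field of ["email", "name", "subject", "message"]) {
    if (!body || typeof body[field] !== "string" || body[field].trim() === "") {
      errors.push(field + " is required");
    }
  }
  // Very rough address check, only done when all fields are present.
  if (errors.length === 0 && !/^[^@\s]+@[^@\s]+$/.test(body.email)) {
    errors.push("email is not valid");
  }
  return errors;
}

// HTTP-triggered handler: 400 on bad input, 200 when the mail was handed off.
async function sendMail(context, req) {
  const errors = validateMessage(req.body);
  if (errors.length > 0) {
    context.res = { status: 400, body: { errors: errors } };
    return;
  }
  await deliverMail(req.body);
  context.res = { status: 200, body: { ok: true } };
}

// In a function app you would export the handler:
// module.exports = sendMail;
```

Because the function runs on a different domain than the website, the browser will only allow the call if CORS is enabled on the Function App for your site's origin.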

Calling the Azure Function from JavaScript can be done in several ways, depending on browser compatibility. In this example, I use the XMLHttpRequest class for maximum browser compatibility.


// email, name, subject and body are assumed to have been read from the form
// fields, and enableButton() to be defined elsewhere on the page.
var xhr = new XMLHttpRequest();
xhr.open('POST', 'https://func-how2code-web.azurewebsites.net/api/SendMail');
xhr.setRequestHeader('Content-Type', 'application/json;charset=UTF-8');
xhr.onreadystatechange = function () {
	// Only act once the request has completed
	if (this.readyState === XMLHttpRequest.DONE) {
		if (this.status === 200) {
			// Success: show the thank-you page
			window.location.href = '/en/contact/thankyou';
		} else {
			// Failure: let the user try again
			enableButton();
			alert('An error occurred. Please try again later.');
		}
	}
};
xhr.send(JSON.stringify({ email: email, name: name, subject: subject, message: body }));
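If you only need to support modern browsers, the same call can be written with fetch. This is a sketch of that alternative; the function name submitContactForm and the injectable fetchImpl parameter are my own additions, the latter only to make the example easy to test:

```javascript
// Post the contact form as JSON. Resolves to true when the function answered 200.
async function submitContactForm(form, fetchImpl = fetch) {
  const response = await fetchImpl(
    "https://func-how2code-web.azurewebsites.net/api/SendMail",
    {
      method: "POST",
      headers: { "Content-Type": "application/json;charset=UTF-8" },
      body: JSON.stringify(form),
    }
  );
  return response.status === 200;
}

// Usage on the page (same behavior as the XMLHttpRequest version):
// submitContactForm({ email: email, name: name, subject: subject, message: body })
//   .then(function (ok) {
//     if (ok) { window.location.href = "/en/contact/thankyou"; }
//     else { enableButton(); alert("An error occurred. Please try again later."); }
//   });
```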

Azure Functions are a good way to extend your static website with server side code, and they can be written in your preferred language (C#, JavaScript/Node.js, PowerShell, etc).

Editing/updating your website

Azure Data Lake static website doesn't come with any web editing functionality like WordPress. You will have to build and manage the HTML files yourself.

It's not super hard to make a decent editing environment for your website.

I built my own editor by creating an ASP.NET Core C# web application. The advantage of this is that you get all the libraries to easily render HTML pages. Basically I have a few HTML templates that I apply to all my content. The content is stored in a dedicated Azure Data Lake container, and the HTML output is written to the $web container. Any change to the website design is easy because I just need to change the templates. I run the editor locally, but it could easily be deployed as a web app if needed.
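The templating idea itself is independent of the language. To stay with the JavaScript used elsewhere in this article, here is a minimal sketch of it. The {{name}} placeholder syntax and the function names are assumptions for illustration and not necessarily how my templates look:

```javascript
// Escape text so it can be placed safely inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Fill a template in a single pass:
//   {{key}}   is replaced with escaped text
//   {{{key}}} is replaced with raw HTML (for already-rendered content)
// Missing keys become empty strings.
function renderPage(template, values) {
  return template.replace(
    /\{\{\{(\w+)\}\}\}|\{\{(\w+)\}\}/g,
    (match, rawKey, textKey) =>
      rawKey !== undefined
        ? String(values[rawKey] ?? "")
        : escapeHtml(values[textKey] ?? "")
  );
}

const template = "<h1>{{title}}</h1>{{{body}}}";
console.log(renderPage(template, { title: "A & B", body: "<p>Hi</p>" }));
// <h1>A &amp; B</h1><p>Hi</p>
```

Changing the design then means editing the template string only; the stored content stays untouched.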

An even cooler way to design an editor would be to build it in Vue.js, React, Angular or a similar framework. That way both the website and the editor could be hosted as Azure Data Lake static websites.

Azure even has an SDK for JavaScript! Using this SDK you can for example:
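For instance, an editor could publish a rendered page straight into the $web container. The sketch below assumes the @azure/storage-blob package and takes an already authenticated ContainerClient as a parameter, so how you sign in (for example with a SAS token) is left out. The function name publishPage and the Cache-Control value are my own illustration:

```javascript
// Upload one HTML page to a container and set its HTTP headers at the same time.
// containerClient is expected to be a ContainerClient for the $web container.
async function publishPage(containerClient, blobName, html) {
  const blockBlob = containerClient.getBlockBlobClient(blobName);
  // upload() needs the length in bytes, which can differ from html.length
  // when the page contains non-ASCII characters.
  const byteLength = new TextEncoder().encode(html).length;
  await blockBlob.upload(html, byteLength, {
    blobHTTPHeaders: {
      blobContentType: "text/html; charset=utf-8",
      blobCacheControl: "public, max-age=3600", // assumed value, adjust to taste
    },
  });
  return byteLength;
}

// Usage, given a ContainerClient created elsewhere:
// await publishPage(webContainer, "en/contact/index.html", "<h1>Contact</h1>");
```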

Another option is using Blazor WebAssembly. In my experience it is easy to host in a static website and could be used to create an excellent editor. This would probably be my main option if I started from scratch with this blog today.

