Running a Blog on Azure Data Lake

By Johan Åhlén

Can you build a blog on Azure Data Lake/Azure Blob Storage? Yes, absolutely! In fact, this blog is an example of that. In this article, I will describe how to build a website on Azure Data Lake including dynamic content such as blog post and tag indexes.

First, a couple of words about why you would do such a thing:

You will certainly need more technical skills and time than if you used something like WordPress, but if you enjoy creating things in Azure you will have great fun!

Static Website hosting in Azure

You can easily upload a file to an Azure Data Lake, where it becomes a blob and is assigned a URL. This makes it very easy to upload a complete static website. However, there are two features that you usually want from a website:

Coincidentally, these two features are exactly what you get if you enable Static website hosting in Azure Storage. That feature is currently (as of August 2020) in preview for Azure Data Lake, but it is fully available for Azure Storage accounts without a hierarchical namespace.

Be careful with MIME types. By default, Azure assigns a MIME type based on the file extension of each file you upload. If your pages are stored in .html files, that works fine. For an address without an extension, like www.how2code.info/en, you will have to set the MIME type to text/html yourself.
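A deployment script can choose the content type itself instead of relying on the default. The sketch below is only an illustration, not the script used for this blog: content_type_for is a hypothetical helper, and treating every name without an extension as an HTML page is an assumption that happens to fit addresses like www.how2code.info/en.

```python
import mimetypes
import posixpath


def content_type_for(blob_name: str) -> str:
    """Return the Content-Type to set on a blob, based on its name."""
    extension = posixpath.splitext(blob_name)[1]
    if not extension:
        # Assumption: blobs without an extension (e.g. "en") are HTML pages.
        return "text/html"
    guessed, _encoding = mimetypes.guess_type(blob_name)
    # Unknown extensions fall back to a generic binary type.
    return guessed or "application/octet-stream"
```

The returned value would then be supplied as the blob's content type at upload time, for example through the ContentSettings class if you use the azure-storage-blob package.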

Note that URLs will be case-sensitive. This is because blob names are case-sensitive.

To deploy your files to your Azure Data Lake, you could either use AzCopy or develop your own more sophisticated scripts.
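As one idea of what a "more sophisticated" script could do, the sketch below works out which files changed since the last deployment, so that only those need to be uploaded. It is a minimal sketch under stated assumptions: plan_upload, file_digest and the manifest of previously deployed digests are all hypothetical, and the actual upload step (AzCopy or an SDK call) is left out.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_upload(site_dir: Path, deployed: dict[str, str]) -> list[str]:
    """List blob names whose local content differs from the last deployment.

    `deployed` is a hypothetical manifest mapping blob name to the digest
    recorded when that blob was last uploaded.
    """
    changed = []
    for path in sorted(site_dir.rglob("*")):
        if not path.is_file():
            continue
        # Blob names use forward slashes regardless of the local OS.
        blob_name = path.relative_to(site_dir).as_posix()
        if deployed.get(blob_name) != file_digest(path):
            changed.append(blob_name)
    return changed
```

After a successful upload, the script would store the new digests so that the next run can compare against them.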

Structuring your Data Lake

I suggest dividing your Data Lake/Azure Storage content into different containers:

You can build scripts that pre-transform your blog posts from the structured format into HTML files that are stored in "Web". The same scripts can also autogenerate index pages for your blog posts and tags, as well as a sitemap. Another option is to do the transformations dynamically in the end user's web client.
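To give a feel for the autogeneration step, here is a minimal sketch that builds a tag index and a sitemap from a list of posts. The post fields (slug, tags, updated), the function names and the example.com base address are assumptions made for illustration; they are not the structured format used on How2Code.info.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

SITE_URL = "https://www.example.com"  # hypothetical base address
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_tag_index(posts: list[dict]) -> dict[str, list[str]]:
    """Map each tag to the sorted slugs of the posts that carry it."""
    index = defaultdict(list)
    for post in posts:
        for tag in post["tags"]:
            index[tag].append(post["slug"])
    return {tag: sorted(slugs) for tag, slugs in sorted(index.items())}


def build_sitemap(posts: list[dict], site_url: str = SITE_URL) -> str:
    """Return a sitemap document with one <url> entry per post."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for post in posts:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{site_url}/{post['slug']}"
        # Assumption: "updated" is already an ISO date string (YYYY-MM-DD).
        ET.SubElement(url, "lastmod").text = post["updated"]
    return ET.tostring(urlset, encoding="unicode")
```

The tag index could be rendered into one HTML page per tag, and the sitemap string written to a sitemap.xml blob next to the pages.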

Here's an example of the structured format I use for my blog posts on How2Code.info:

[Figure: Azure Data Lake blog post structure]

Using a Custom Domain

The address your website will get is something like https://something.blob.core.windows.net. Surely you want a nicer address?

Azure CDN is the solution. It gives you:

Azure CDN is available as several product offerings with different features. I suggest selecting Premium Verizon, because it has a rules engine. In the rules engine, you can easily set up things such as default pages and redirects.

Caching is very important for CDNs. If you don't enable any caching, the CDN will have to reload the files from storage all the time. The caching can be set up either in the CDN or on the blobs. A separate article, Setup Caching on your Azure Storage Blobs, describes how to set up caching on blobs.
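If you choose to set caching on the blobs, a deployment script needs to decide on a Cache-Control value for each file. The following is a rough sketch of such a rule; the helper name, the extension list and the lifetimes are hypothetical and should be tuned to how often your content changes.

```python
import posixpath

# Hypothetical cache lifetimes in seconds.
ONE_HOUR = 60 * 60
ONE_WEEK = 7 * 24 * 60 * 60

# Assumption: assets with these extensions change rarely.
LONG_LIVED_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".svg", ".woff2"}


def cache_control_for(blob_name: str) -> str:
    """Return a Cache-Control header value for the given blob name."""
    extension = posixpath.splitext(blob_name)[1].lower()
    # Pages (including extensionless ones) get the shorter lifetime.
    max_age = ONE_WEEK if extension in LONG_LIVED_EXTENSIONS else ONE_HOUR
    return f"public, max-age={max_age}"
```

Pages get a short lifetime here so that edits show up reasonably fast, while stylesheets, scripts and images are cached longer.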

Azure CDN is quick and easy to set up. It is useful for almost any website, not only blogs running on Azure Data Lake.

Making your Website dynamic

It is amazing how much you can now do client side on the web. There are frameworks like Angular and Vue.js that can be used to build incredible websites. You could build much more advanced things than this blog. Still, they can be deployed to an Azure Data Lake, since to the web server they are only static files.

You could also easily use JavaScript for things like page headers, footers, dynamic lists of blog posts, etc., without any of these frameworks. It all depends on how much of their functionality you want. For a simple blog, those frameworks can be overkill.

Azure even has an SDK for JavaScript! Using this SDK you can for example:

Still, there are cases when you want to run things server side. For example, security is much easier to handle server side. You can easily run server-side code that is not exposed to the end users. With client-side code, you are basically open sourcing everything.

As a compromise, you could place code in Azure Functions (Serverless Compute). That will protect the code, but the functions have to be carefully written so that they cannot be exploited.

Another thing that can't be done client side is permanent redirects. One possible way to solve it is through the CDN rules engine.

Useful libraries

There are some JavaScript libraries that I would recommend:

Do you have any other favorite libraries? Feel free to let me know.

Finally, I must admit: this website has all its content in an Azure Data Lake, but it uses an Azure App Service (Web App) for some server-side functionality, mainly for security reasons. That will change sometime in the future...