Lambros Petrou RSS Feed

Cloudflare Durable Objects are Virtual Objects

lambros@lambrospetrou.com (Lambros Petrou) — Sun, 27 Apr 2025 00:00:00 GMT

I have been working a lot with Durable Objects as part of my day job at Cloudflare and have been answering questions on the #durable-objects Discord channel on a daily basis for several months now.

One of the most common things people have trouble wrapping their head around is how to manage and handle the lifecycle of Durable Objects.

They want to do something when a Durable Object (DO) is “created”, when it’s “destroyed”, when it’s “hibernating”, when it’s “evicted from memory”, and any variation you can imagine.

The short answer is “you do not do that”.

You do not create a Durable Object. You do not destroy a Durable Object. You just use a Durable Object.

Let’s explore in detail.

Virtual Objects or Actors

For the rest of this article, “Object” and “Actor” is used interchangeably.

Durable Objects nicely fit into the Actor programming model, built natively into the Cloudflare global network infrastructure. There are several Actor programming libraries (Akka, Microsoft Orleans) and languages (Erlang, Elixir).

My most favourite description of Virtual Actors is from the Microsoft Orleans publication “Orleans: Distributed Virtual Actors for Programmability and Scalability” back in 2014.

Actors are the basic building blocks of Orleans applications and are the units of isolation and distribution. Every actor has a unique identity, composed of its type and primary key (a 128-bit GUID). An actor encapsulates behavior and mutable state, like any object. Its state can be stored using a built-in persistence facility. Actors are isolated, that is, they do not share memory. Thus, two actors can interact only by sending messages.

Every sentence from the Orleans Actors description applies to Durable Objects as well, although Durable Object IDs are 32 bytes.

The same section continues to elaborate on the key facets of a Virtual Actor.

1. Perpetual existence

Perpetual existence: actors are purely logical entities that always exist, virtually. An actor cannot be explicitly created or destroyed and its virtual existence is unaffected by the failure of a server that executes it. Since actors always exist, they are always addressable.

Durable Objects are the same. You don’t create them. You don’t destroy them.

You generate a Durable Object ID and address the corresponding Durable Object.

2. Automatic instantiation

Automatic instantiation: Orleans’ runtime automatically creates in-memory instances of an actor called activations. At any point in time an actor may have zero or more activations. An actor will not be instantiated if there are no requests pending for it.

Durable Objects are an infrastructure primitive, so in constrast to Orleans in-memory instances, they are running on a server within Cloudflare’s global infrastructure with attached durable storage for persistence.

Similarly to Orleans though, a single Durable Object can be active and consuming resources on a Cloudflare server, or inactive and consuming zero resources.

As long as there are requests routed to the Durable Object it will be alive and active somewhere. Once requests stop and enough time passes, the Durable Object will be evicted from memory and eventually removed from its host server.

Your user code should not care about this.

When a request routes to a Durable Object, Cloudflare will prepare its durable storage and initialize the Object on a server ready to accept incoming requests. All of this happens transparently.

3. Location transparency

Location transparency: an actor may be instantiated in different locations at different times […].

Durable Objects by default are created close to the location of the incoming request. Optionally, you can provide a regional location hint to influence where the Durable Object will be created.

Having said that, the user code should not care exactly at which location the DO is created. It can be on server-a at time 10:00, and on server-b at time 22:00, where server-a is in a different city or country even from server-b.

Once again, Cloudflare will make sure the Durable Object will be running on a healthy server somewhere close to the user or within the region specified. Where that is exactly, can, and will change over time.

How to think about it

It’s understandable that many folks have trouble internalizing the above properties. The traditional way of doing things in programming languages is to “create and destruct” class objects of sorts, and at the infrastructure level we create and delete server instances.

Fully embracing the “Virtual Object or Actor” paradigm really unlocks the power of these primitives.

My guidelines are:

On “activation time” (Durable Object class constructor) read the storage and accordingly do any in-memory initialization needed.
Use explicit actions to do your business logic, and if not possible, use the Alarms API to schedule work to be done in the future.

Let’s explore some common use-cases.

Access a Durable Object

Create the Durable Object ID (see docs).
Get a stub to the Durable Object (see docs). Keep in mind that creating a stub does not yet attempt to reach out to the addressable Durable Object.
Invoke an operation on the Durable Object stub.

The last step is what will actually start the whole flow of figuring out where the Durable Object should be activated, which server should handle the request, send the request there, run the invoked operation, and then return the response back to the caller.

Example from my Tiddlyflare project:

export async function routeListWikis(env: CfEnv, tenantId: string): Promise {
    let id: DurableObjectId = env.TENANT.idFromName(tenantId);
    let tenantStub = env.TENANT.get(id);
    return tenantStub.list();
}

Notice that all the 3 properties explained in the previous section apply.

We always just use the Durable Object regardless if it’s the first time it’s accessed or not, we don’t create it explicitly, and we don’t care about where it’s going to be instantiated.

We just use it.

Initialize state only once

Say you want to store some information in your Durable Object durable storage only once, and have that in-memory whenever the DO is active for fast access.

Example from my Tiddlyflare project:

export class WikiDO extends DurableObject {
    env: CfEnv;
    sql: SqlStorage;

    wikiId: string = '';
    tenantId: string = '';

    constructor(ctx: DurableObjectState, env: CfEnv) {
        super(ctx, env);
        this.sql = ctx.storage.sql;
        // ...
        const tableExists = this.sql.exec("SELECT name FROM sqlite_master WHERE name = 'wiki_info';").toArray().length > 0;
        if (tableExists) {
            const { tenantId, wikiId } = this.sql
                .exec<{ tenantId: string; wikiId: string }>('SELECT tenantId, wikiId FROM wiki_info LIMIT 1')
                .one();
            this.tenantId = tenantId;
            this.wikiId = wikiId;
        }
    }

In the above snippet, I check if a table exists, and if it exists I read the tenantId and wikiId values and store them in-memory.

This only covers the reading part, though, and the writing is done inside the create(...) action (see code).

In most applications, there is some specific operation (like create(...)) that receives the needed information and writes it to storage.

Delete all storage after inactivity

Another common use-case is “deleting a Durable Object” after its expected usage is over to avoid paying for storage that is not needed anymore.

If the deletion is tied to a specific operation, then you just invoke the storage.deleteAll() API on the Durable Object storage at the end of that operation and all is good.

The nuanced scenario is when the deletion of storage is tied with the “destruction” of the Durable Object, whatever that means. But, as we said Durable Objects are never deleted, so what do we do?

The solution to this is the Alarms API.

While processing requests you can set a date in the future to execute the alarm handler, and do the cleanup in the alarm handler. Every new request postpones the alarm and pushes the cleanup forward (you can use in-memory debouncing to avoid bursts of writes if that’s a concern).

Conclusion

Durable Objects are Virtual Objects.

They are not explicitly created or destroyed.

They are always uniquely globally addressable.

They are activated and deactivated automatically within the Cloudflare global infrastructure on-demand to handle requests.

You just use them.

Large Language Model (LLM) prompting guides

lambros@lambrospetrou.com (Lambros Petrou) — Sat, 26 Apr 2025 00:00:00 GMT

This is a “live” collection of the best Large Language Model (LLM) prompting guides I find online that help me write better prompts and in general get the most value possible out of LLMs.

LLM providers

Anthropic

Models: https://docs.anthropic.com/en/docs/about-claude/models/all-models (e.g. Claude 3.7 Sonnet)

OpenAI

Models: https://platform.openai.com/docs/models (e.g. GPT4.1)

Google

Models: https://ai.google/get-started/our-models/ (e.g. Gemini 2.5 Pro)

ElevenLabs

Prompting guide - Learn how to engineer lifelike, engaging Conversational AI voice agents

Others

Prompt Engineering Guide

DevConnect Series talk - Cloudflare Developer Platform Overview

lambros@lambrospetrou.com (Lambros Petrou) — Fri, 03 Jan 2025 00:00:00 GMT

On 2025-Jan-03, I delivered a talk doing an overview of the Cloudflare Developer Platform in my home-city, Nicosia, Cyprus 🇨🇾.

Thanks to Parsectix, specifically Pavlos, for organising the event (LinkedIn Event), and to all the folks that joined.

It was awesome to see some familiar faces in the audience.😅

The talk covers an overview of the Cloudflare Developer platform, core services, and how to gradually integrate it with your existing infrastructure as you see fit.

📽️ Download presentation slides

Some folks asked about Cloudflare’s free tier, and how cheap the plans are compared to other platforms, raising concerns around rug-pulling in the future (not uncommon with other platforms).

Cloudflare already blogged about why and how we manage to offer the free tier, and how zero egress fee is possible, at the post Reaffirming our commitment to free.

I sent this already to some, and putting it here for future readers of this page with similar concerns.

How to do encryption and envelope encryption with KMS in Go

lambros@lambrospetrou.com (Lambros Petrou) — Sun, 15 Dec 2024 00:00:00 GMT

Table of contents

Symmetric encryption
- Symmetric encryption in Go
- Data encryption keys
Envelope encryption
- Envelope encryption in Go
- Envelope encryption with KMS
Key rotation
Multi-level envelope encryption
Conclusion

In this article, I will go through the basics of symmetric data encryption, then move to envelope encryption using our own key encryption key or using a Key Management Service (KMS) from a cloud provider, and how that can be generalised to multiple levels of envelope encryption.

I will provide examples in Go code taken out from my own applications to make things concrete and understandable.

Symmetric encryption

There are millions of resources about encryption, but in this article I will focus on symmetric encryption of arbitrary data.

The goal of encryption is to convert some piece of data A (namely plaintext) into some different piece of data B (namely ciphertext) using a secret data encryption key DEK. The data B will seem like random garbage to anyone looking at it, and only the ones in possesion of key DEK can reverse the conversion back to A.

Therefore, we use encryption to protect sensitive data and stop worrying if someone inadvertedly gets access to it.

The conversion from A to B is called encryption, and the reverse is called decryption.

Symmetric encryption with a data encryption key.

For example, if we encrypt the plaintext Hello, World! with a specific algorithm and secret key, we will end up with the following ciphertext:

2YjP8xC8owzTkLrvEdHjUY2q6QWicr6n1Te0sAso5oR7KaCufSiebQadhQ82js01wRd135Q

The above seemingly random text can be stored anywhere without worrying about anyone ever figuring out that the actual plaintext data is Hello, World!.

Encryption can be used everywhere and for everything.

I encrypt files containing my passwords and sensitive information before storing them on Google Drive.
NASA uses it to protect files transferred inside their infrastructure.
Every time you visit a website over HTTPS you use encryption for communication between your device and the server handling those requests (see the awesome How HTTPS Works tutorial).
Most cloud providers encrypt customer data in transit and at rest with our own or their keys.
You can encrypt your laptop disk drives (see Windows Bitlocker) so that your data is secure by someone taking off your disk drive and reading it on a separate machine.
more, more, more…

My favourite online resources about security and encryption are the following:

Cryptographic Right Answers by Latacora
Cryptographic Storage Cheat Sheet by OWASP
Reference documentation by Key Management Services I use (like AWS KMS).

As of today, the recommended algorithms to use for symmetric encryption is 256-bit Advanced Encryption Standard (AES) in Galois Counter Mode (GCM) or XSalsa20+Poly1305 with a 256-bit data encryption key.

Symmetric encryption in Go

Let’s get into specifics on how to safely implement symmetric encryption in our applications.

I will be using Go code in this article without any third-party libraries, since the Go standard library has everything we need in the crypto package (and its sub-packages).

The function GenerateKeyBytes below generates 256-bit (32-bytes) data encryption keys that will be used throughout the whole article.

import (
    "crypto/rand"

    "golang.org/x/crypto/chacha20poly1305"
)

const KeySize = chacha20poly1305.KeySize

// GenerateKey returns a 32-byte key as per the chacha20poly1305 algorithm.
func GenerateKeyBytes() []byte {
    b := make([]byte, KeySize)
    _, err := rand.Read(b)
    if err != nil {
        panic("unexpected failure: could not generate random data")
    }
    return b
}

The following functions implement symmetric encrypting/decryption.

func EncryptBytes(key []byte, plainData []byte) ([]byte, error) {
    // We use this algorithm based on the recommendation of https://www.latacora.com/blog/2018/04/03/cryptographic-right-answers/#encrypting-data
    // Alternative could be AES-256 GCM: https://pkg.go.dev/crypto/cipher#NewGCM

    aead, err := chacha20poly1305.NewX(key)
    if err != nil {
        return nil, fmt.Errorf("failed to create encryption AEAD: %w", err)
    }

    // Select a random nonce, and leave capacity for the ciphertext.
    nonce := make([]byte, aead.NonceSize(), aead.NonceSize()+len(plainData)+aead.Overhead())
    if _, err := rand.Read(nonce); err != nil {
        return nil, fmt.Errorf("failed to encrypt: %w", err)
    }
    // Encrypt the message and append the ciphertext to the nonce.
    encryptedMsg := aead.Seal(nonce, nonce, plainData, nil)

    return encryptedMsg, nil
}

func DecryptBytes(key []byte, encryptedData []byte) ([]byte, error) {
    aead, err := chacha20poly1305.NewX(key)
    if err != nil {
        return nil, fmt.Errorf("failed to create encryption AEAD: %w", err)
    }
    if len(encryptedData) < aead.NonceSize() {
        return nil, fmt.Errorf("ciphertext is too short: %d < %d", len(encryptedData), aead.NonceSize())
    }

    // Split nonce and ciphertext.
    nonce, ciphertext := encryptedData[:aead.NonceSize()], encryptedData[aead.NonceSize():]

    // Decrypt the message and check it wasn't tampered with.
    plainData, err := aead.Open(nil, nonce, ciphertext, nil)
    if err != nil {
        return nil, fmt.Errorf("failed to decrypt: %w", err)
    }

    return plainData, nil
}

And to test the above you can use the following test:

func TestEncryptDecrypt(t *testing.T) {
    key := []byte("supersecretkey32byteslong1234567")
    plaintext := "Hello, World!"

    ciphertext, err := encryption.Encrypt(key, plaintext)
    if err != nil {
        t.Fatalf("Encrypt failed: %v", err)
    }
    if ciphertext == plaintext {
        t.Errorf("Encrypt: plaintext same as cipherText, got %s, want %s", ciphertext, plaintext)
    }

    decryptedPlaintext, err := encryption.Decrypt(key, encResult.CipherText)
    if err != nil {
        t.Fatalf("Decrypt failed: %v", err)
    }
    if decryptedPlaintext != plaintext {
        t.Errorf("Decrypt: plaintext mismatch, got %s, want %s", decryptedPlaintext, plainText)
    }
}

Data encryption keys

The data encryption key (DEK) is maybe the most crucial component in encryption (apart from the encryption algorithm itself). Anyone with the DEK can decrypt all data encrypted with that key, therefore securely storing DEKs is top priority.

You should always separate your plaintext data encryption keys from encrypted data.🔐

On one side of the security spectrum, there are applications with a single encryption key used to encrypt everything, and pass this key into the application with approaches like environment variables, something like Hashicorp Vault, or storing the key(s) in Amazon S3-compatible stores and restricting access with IAM permissions.

This is easy to manage, since you only worry about securing a single key, but it’s extremely dangerous in case you lose it or if someone gets access to it when they shouldn’t.

On the other end of the spectrum, you generate a different data encryption key for each piece of data your application wants to encrypt.

This seems much safer, since in the worst case that someone gets ahold of a key they can only decrypt a single piece of data, but management of all these keys sounds like nightmare.

Enter envelope encryption.

Envelope encryption

In the previous section, we saw that a good approach to reduce risk and blast radius in case someone gets access to a data encryption key (DEK) is to have many of them, one per piece of data. Securely storing and managing all these DEKs is a concern though.

Envelope encryption introduces another kind of key, the key encryption key (KEK). The key encryption key (KEK) is used to encrypt the data encryption keys (DEKs). Once a DEK is encrypted, it can be safely stored together with the encrypted data.

Therefore, we are back at having to manage a single key encryption key while each piece of data is encrypted with its own data encryption key.

To envelope encrypt some data:

Generate a data encryption key (DEK) using our GenerateKeyBytes() function from above.
Encrypt our plaintext data with the DEK from step 1, using the EncryptBytes() function.
Encrypt the DEK itself using a secret key encryption key (KEK), using the EncryptBytes() function.
Store the data ciphertext from step 2 and the DEK ciphertext from step 3 into our database/datastore.

To envelope decrypt some data:

Decrypt the data encryption key (DEK) ciphertext first using the key encryption key (KEK), using the DecryptBytes() function from above.
Decrypt the data ciphertext using the decrypted DEK from step 1, using the DecryptBytes() function.

If you have been following along, we solved the problem of managing many data encryption keys (DEKs), but we still need to securely manage the key encryption key.

I have seen many teams storing their key encryption keys in whatever “secret store” their cloud providers provide, since every single platform has a way to store secrets.

This is usually fine, assuming that your provider is properly implementing their “secrets store” with an actual Key Management Service (KMS) under the hood where their employees (or any intruder into their systems) cannot access the actual secret values.

Leaks of “secrets” from different providers is not uncommon though.

We can do better by using a Key Management Service (KMS) to securely store our key encryption keys (KEKs) and handle the encryption/decryption of our data encryption keys (DEKs). That way, the KEKs are never being transmitted in plain text, reducing the risk of them being leaked.

Enter Key Management Services, usually referred to as KMS services.

Envelope encryption with KMS

Envelope encryption allows us to use different data encryption keys (DEKs) for our data, and encrypt those DEKs with one or a few key encryption keys (KEKs).

The best and recommended way to securely store and manage those key encryption keys (KEKs) is with a dedicated key management service (KMS).

These KMS services encrypt and decrypt data encryption keys without ever exposing the actual key encryption keys in plaintext. This significantly reduces the risk of leaking our KEKs, since our application doesn’t even have access to them.

We offload trust to the KMS service provider.

Having said that, there is still risk. But, we offload that risk to the cloud provider of our choice. We trust that they implemented these KMS services correctly, and that if any leak or intrusion happens into their own systems our key encryption keys are still secure.

There are a lot of security measures taken by KMS service providers to ensure the key encryption keys are securely protected, and it usually involves keeping the plaintext version of our keys inside Hardware Security Modules (HSM) volatile memory just for the few milliseconds needed for the operation. See Data protection in AWS Key Management Service for more details.

Most reputable cloud providers offer KMS services, for example AWS KMS and Google Cloud Key Management. Many companies also use Hashicorp Vault that allows integration with various KMS services and provides secrets access across the entire infrastrure.

The main API offerred by KMS services that we care about is:

EncryptKey
DecryptKey
GenerateDataKeyPair

Let’s examine how we can use a KMS service for envelope encryption.

Symmetric envelope encryption with KMS service.

To envelope encrypt some data:

Generate a data encryption key (DEK) using our GenerateKeyBytes() function from above.
Encrypt our plaintext data with the DEK from step 1, using the EncryptBytes() function.
Encrypt the DEK itself passing the DEK to the KMS EncryptKey API.
Store the data ciphertext from step 2 and the DEK ciphertext from step 3 into our database/datastore.

Symmetric envelope decryption with KMS service.

To envelope decrypt some data:

Decrypt the data encryption key (DEK) ciphertext first using the KMS DecryptKey API.
Decrypt the data ciphertext using the decrypted DEK from step 1 and the DecryptBytes() function from above.
Return the data plaintext from step 2.

The GenerateDataKeyPair API (see docs) can replace our own GenerateKeyBytes() function from above. Calling this API will return a data encryption key in both plaintext and ciphertext. We would use the plaintext version to encrypt our data, and store the ciphertext alongside the encrypted data so that we can pass it later to the KMS DecryptKey API when we will want to decrypt the data. The plaintext should be discarded immediately after encrypting the data.

Envelope encryption in Go

The following code is what I use to abstract away any KMS service to allow me to mock them out in local tests, or use any kind of KMS service regardless of the provider.

type KeyEncryptionWrapper interface {
    EncryptKey(ctx context.Context, keyPlain []byte) ([]byte, error)
    DecryptKey(ctx context.Context, keyCipher []byte) ([]byte, error)
}

type Enveloped struct {
    kms KeyEncryptionWrapper
}

func NewEnveloped(kms KeyEncryptionWrapper) *Enveloped {
    return &Enveloped{
        kms: kms,
    }
}

Now let’s see the encrypt/decrypt methods of the Enveloped abstraction.

// Encrypt encrypts the given data and returns Base64 formatted cipher data blob
// that needs to be passed to the `Decrypt()` function for decrypting.
func (e *Enveloped) Encrypt(ctx context.Context, data []byte) (string, error) {
    var err error
    dekKey := GenerateKeyBytes()
    dekCipher, err := e.kms.EncryptKey(ctx, dekKey)
    if err != nil {
        return "", fmt.Errorf("could not encrypt data key: %w", err)
    }

    cipherData, err := EncryptBytes(dekKey, data)
    if err != nil {
        return "", fmt.Errorf("could not encrypt data: %w", err)
    }

    return fmt.Sprintf(
        "%s.%s",
        ToBase64(dekCipher),
        ToBase64(cipherData),
    ), nil
}

// Decrypt extracts the necessary key information from the cipherDataBlob, decrypts the data
// and returns the decrypted plain bytes.
func (e *Enveloped) Decrypt(ctx context.Context, cipherDataBlob string) ([]byte, error) {
    var err error
    partsDot := strings.SplitN(cipherDataBlob, ".", 3)
    if len(partsDot) != 2 {
        return nil, fmt.Errorf("invalid cipher data blob")
    }
    dekCipher, dataCipher := fromBase64(partsDot[0]), fromBase64(partsDot[1])
    if dekCipher == nil || dataCipher == nil {
        return nil, fmt.Errorf("invalid cipher data blob")
    }

    dekKey, err := e.kms.DecryptKey(ctx, dekCipher)
    if err != nil {
        return nil, fmt.Errorf("could not decrypt data encryption key: %w", err)
    }

    plainData, err := DecryptBytes(
        dekKey,
        dataCipher,
    )
    if err != nil {
        return nil, fmt.Errorf("could not decrypt data: %w", err)
    }

    return plainData, nil
}

Note that I use a Base64 encoded string as the encryption result to make it more readable and universally compatible with any storage product, and it also makes the decrypting a bit simpler. That’s an optional step, and alternatively I could just return two byte slices directly, one for the DEK ciphertext and one for the data ciphertext.

Key rotation

We haven’t discussed anything about key rotation so far, but it’s a key component in keeping your data safe.

Key rotation is the re-encryption of data with a different data encryption key (DEK).

Retrieve the data encryption key (DEK) for some data.
Decrypt the data ciphertext using the DEK from step 1.
Generate a new data encryption key (DEK-2).
Encrypt the data using DEK-2.
Store data ciphertext from step 4.

Key rotation is important in order to reduce the risk of a key being leaked or being discovered impacting much of your encrypted data.

Even if an attacker gets ahold of a data encryption key (DEK) plaintext, they will only be able to decrypt data that used that specific DEK. Therefore, rotating the encryption keys makes this attack impossible.

Even though key rotation is nice, it becomes very hard and gets very costly if we had to re-encrypt the entirety of our dataset. Imagine Google wanting to re-encrypt the Google Drive files across all their customers, or Amazon S3, or Cloudflare R2. You get the idea. It’s not trivial or practically feasible to re-encrypt petabytes of data every day.

Envelope encryption helps here again.👌

Since we have separate DEKs for each piece of data, we can do re-encryption only for the DEK itself which is stored alongside the encrypted data. The encryption keys are a few bytes (32-bytes in our code above), therefore much more practical to re-encrypt across the board.

KMS services offer key rotation APIs to simplify the key rotation process, and you can even have multiple key encryption keys eligible for decrypting a data encryption key so that you can carry out key rotation over several days or weeks.

Multi-level envelope encryption

The vast majority of users should use envelope encryption with a KMS service and everything will work out fine.

In some cases, when you need to manage a lot of key encryption keys, you might want to apply envelope encryption multiple times.

For example, AWS uses multiple levels of hierarchical envelope encryption. They generate an account level key encryption key used to encrypt key encryption keys for each service in that account (e.g. S3). Then, that service key encryption key is used to encrypt data encryption keys to encrypt data stored by the service.

As another example, in Skybear.NET I generate one account-level key encryption key (AKEK) that is encrypted/decrypted by AWS KMS, and then that AKEK is used to encrypt/decrypt the individual data encryption keys for each piece of data encrypted for that account.

The main reason to do this approach is to reduce the amount of KMS API calls, either due to cost reasons or due to KMS API rate limits.

Keeping the intermediate key encryption keys in memory for a few seconds or minutes could be very beneficial in some cases. But, unless you really need this optimization, stay with the simple KMS-based envelope encryption and avoid complexity.

Conclusion

If you have read till here, thank you!🙏🏼

To summarize:

Encrypt your data, encrypt your users’ data, encrypt everything.
Follow security guidelines, use secure algorithms and correct implementations of those algorithms. Use Go more😉
Use KMS services to offload encryption key management.

How to detect website text content changes with Skybear.NET

lambros@lambrospetrou.com (Lambros Petrou) — Sun, 08 Dec 2024 00:00:00 GMT

Table of contents

Real world scenario to detect PagerDuty vCard updates
Script to detect vCard updates
Scheduled runs

Hurl is a CLI tool that makes testing and automating HTTP APIs easy and enjoyable. For the past year, I have been building a managed platform to run Hurl scripts for you (Skybear.NET), automatically scaling the underlying infrastructure and providing useful execution reports.

The following article is a copy of the Skybear.NET corresponding How-to Guide, posting it here for my records (canonical URL properly used😉), and its content is as of 2024-Dec-08.

In this article I will use Skybear.NET to continuously check a website’s text content and notify me when a specific text changes.

Real world scenario to detect PagerDuty vCard updates

As part of being oncall at work, we use PagerDuty for alerting the oncall engineers when a team gets paged.

However, the PagerDuty app has several issues depending on the device you use and the OS version regarding things like Do not Disturb mode, leading to missed page calls. What’s the point of being oncall and not getting alerted😅

One easy (and maybe dumb) solution I have been doing for a few years to always guarantee that the phone calls by PagerDuty always “make a sound” is to import the PagerDuty vCard directly into my contacts. Therefore, even if I don’t have the PagerDuty app installed, as long as I allowlist the PagerDuty contact entry to always alert regardless of silent mode, I will always get alerted.

The website PagerDuty vCard Updates has a section listing the latest version of the PagerDuty vCard Update as a date, e.g. 2024-11-13.

In this guide we will periodically fetch the above website, detect changes in the specific version date of the latest vCard update, and notify us in case it changes so that we can download the new vCard.

If you want to play with the final script, run it with the Open Editor. No signup required, and you can play with it for FREE.

Note that as of the time of writing this guide, the latest vCard update date is 2024-11-13.

Script to detect vCard updates

Since we will be doing HTML inspection we will be using the Hurl’s XPATH assertion capabilities. A nice cheatsheet for XPath can be found at https://devhints.io/xpath.

Let’s take a look at the HTML section we care about on the PagerDuty website:



    
    
        
        Latest vCard Update
    
    


    2024-11-13

As you see from the HTML snippet above, we will need to find the

element that has a child element with the ID latest-vcard-update (the first

child above). Once we have the

element, we will find the immediate sibling

Let’s breakdown our XPath query:

Get the
element that has a child with the expected ID:
```
//h3[.//*[@id='latest-vcard-update']]
```
Get the first

The full XPath selector we will use is:

normalize-space(string(//h3[.//*[@id='latest-vcard-update']]/following-sibling::ul[1]))

For comparison, the corresponding JavaScript query selector would be:

document.querySelector("h3:has(#latest-vcard-update) + ul").textContent.trim();

We have done the hard part now🎉 Let’s write our Hurl script to periodically fetch the website, extract the vCard update date, and compare the vCard update date against the last date we have downloaded the vCard.

// detect-pagerduty-vcard-changes.hurl
# PagerDuty vCard updates detection
GET https://support.pagerduty.com/main/docs/notification-phone-numbers#pagerduty-vcard
HTTP 200
[Asserts]
xpath "normalize-space(string(//h3[.//*[@id='latest-vcard-update']]/following-sibling::ul[1]))" == "2024-11-13"

Run this script with the Open Editor (no signup required)

As of the time of writing this guide, the latest vCard update date is 2024-11-13.

The moment that PagerDuty will update their vCard, the assertion above will fail and if you have configured email notifications Skybear.NET will notify you immediately.

Below you can see an example of how the assertion failure would look like if our assertion was expecting 2024-11-10:

error: Assert failure
  --> ./s_nsFlFDlJkX54hqRSFFhGkf7-5srrVV5lz1Pq.hurl:5:0
   |
   | GET https://support.pagerduty.com/main/docs/notification-phone-numbers#pagerduty-vcard
   | ...
 5 | xpath "normalize-space(string(//h3[.//*[@id='latest-vcard-update']]/following-sibling::ul[1]))" == "2024-11-10"
   |   actual:   string <2024-11-13>
   |   expected: string <2024-11-10>
   |

Scheduled runs

Now that we have a script to monitor content changes, we can create a scheduled cron trigger to make sure it runs continuously every day and sends us an email when the content changes.

After you create the Skybear.NET script with the appropriate content, navigate to its Settings tab, and configure a Scheduled Cron trigger with the cron expression 0 1 * * * so that it runs every day at 01:00, forever.

You can configure the trigger to notify you by email when the content changes are detected.

Testing the Cloudflare D1 REST API with Hurl

lambros@lambrospetrou.com (Lambros Petrou) — Sun, 24 Nov 2024 00:00:00 GMT

Table of contents

User journey
Hurl source
Create a new database
Ensure freshness
Database details
Database queries
Conclusion

As I wrote in a previous post, I love using Hurl to test HTTP JSON APIs. Hurl is a command-line interface tool (CLI) that makes testing and automating HTTP APIs easy and enjoyable.

In this article, I will showcase some Hurl features with a concrete example. We will test part of the Cloudflare D1 REST API.

You can find the whole Hurl file mentioned below in this Gist: https://gist.github.com/lambrospetrou/4e07bf79abea9fd82b52d1a6f985405c

User journey

Cloudflare D1 is Cloudflare’s take on a serverless SQL database. It’s built ontop of SQLite and specifically the SQLite in Durable Objects product. Read the blog post Zero-latency SQLite storage in every Durable Object for more details.

D1 is meant to be used from within a Worker using the D1 Worker Binding API for best performance. It provides however a REST API that allows scripts and automation to interact with a database instance from anywhere.

Note: The D1 REST API is not optimized for performance, since all requests go through a central location, but can be useful for adhoc queries.

The user journey we will test is the following:

Create a D1 database named skybear-test-001.
- This can fail if a database already exists with that name.
List the databases of the account and extract the database UUID for the one named skybear-test-001.
- We use a list operation to figure out the database ID since the previous step can fail, hence we might not have the ID of the same-named existing database.
If step 1 failed, it means the database existed already, so delete the database with the ID from step 2.
If step 1 failed, we need to create a new database instance.
Get the database details and assert its details.
Submit an SQL query with one CREATE and one SELECT statements and assert the results.
Submit the same queries as step 6, but requesting “raw” results instead of the default and assert the results.
Cleanup the database.

Steps 1-4 could be simplified into a single step just creating the database, but I decided to do the above extra steps in the sake of showcasing some Hurl features and making our test suite more robust.

Hurl source

The Cloudflare D1 REST API requires an account API token, and the accountId (find your account ID) to be provided in all requests.

We will use Hurl variables to extract both of these values and simplify our scripts to not contain hardcoded secrets. The variables are CLOUDFLARE_TOKEN and CLOUDFLARE_ACC_ID.

Create a new database

POST https://api.cloudflare.com/client/v4/accounts/{{ CLOUDFLARE_ACC_ID }}/d1/database
Authorization: Bearer {{ CLOUDFLARE_TOKEN }}
{
  "name": "skybear-test-001",
  "primary_location_hint": "weur"
}
HTTP *
[Captures]
db_created: jsonpath "$.success"

A simple POST request, already showcasing some Hurl niceties.

We can specify request headers (e.g. Authorization) right below the HTTP method and URL of the request.

The multiline inlined JSON between lines 3-6 will automatically set the Content-Type: application/json header to our request.

The HTTP * line indicates that we expect an HTTP response code of anything, hence the *. The reason we don’t assert the exact code is that as we explained, this create request can fail if a database already exists with the same name.

The last line is a captured variable, and is what we will use in steps 3-4. We create a new variable named db_created that will have the value of the success property of the JSON response.

A value true for db_created denotes that there wasn’t any existing database with the same name and all went well, whereas a value of false denotes that the creation failed, and we need to delete the existing database first.

Hurl supports JSONPath for easy parsing of JSON API responses.

Ensure freshness

As mentioned above, we want our main test requests to query a newly created database, so the steps described in this section delete any existing database with the same name and re-create it.

We want to find the ID of the existing database (step 2 of user journey). The following Hurl source does that by listing all databases and extracting the ID of the one named skybear-test-001.

GET https://api.cloudflare.com/client/v4/accounts/{{ CLOUDFLARE_ACC_ID }}/d1/database
Authorization: Bearer {{ CLOUDFLARE_TOKEN }}
HTTP 200
[Captures]
db_id: jsonpath "$.result[?(@.name == 'skybear-test-001')].uuid" nth 0

Line 3 asserts that we received an HTTP response code 200 indicating success, and line 5 creates a new variable db_id that has the UUID value of the existing database.

This JSONPath documentation is nice for understanding the syntax, but it basically filters the result array for items that have name == 'skybear-test-001' and then returns the uuid field for each item selected.

The final nth 0 ensures that our variable will have a single string value and not an array value, since we will reuse it in subsequent requests.

The following snippet issues a DELETE request to the appropriate URL using the db_id variable (step 3).

DELETE https://api.cloudflare.com/client/v4/accounts/{{ CLOUDFLARE_ACC_ID }}/d1/database/{{ db_id }}
Authorization: Bearer {{ CLOUDFLARE_TOKEN }}
[Options]
skip: {{ db_created }}
HTTP 200

The interesting bit here is the skip: {{ db_created}} request option.

If the skip option value is true the request will not be sent. This ensures we only delete the database if step 1 above failed.

POST https://api.cloudflare.com/client/v4/accounts/{{ CLOUDFLARE_ACC_ID }}/d1/database
Authorization: Bearer {{ CLOUDFLARE_TOKEN }}
[Options]
skip: {{ db_created }}
{
  "name": "skybear-test-001",
  "primary_location_hint": "weur"
}
HTTP 200
[Captures]
db_id: jsonpath "$.result.uuid"

Similarly to the deletion, in this step we create a new database (step 4), only if step 1 failed, and then we assign the newly created database ID to the same variable db_id.

We use the skip: {{ db_created}} request option again to only do this if necessary.

Database details

At this point we have a new database ready to accept our queries.

GET https://api.cloudflare.com/client/v4/accounts/{{ CLOUDFLARE_ACC_ID }}/d1/database/{{ db_id }}
Authorization: Bearer {{ CLOUDFLARE_TOKEN }}
HTTP 200
[Asserts]
jsonpath "$.success" == true
jsonpath "$.result.uuid" == {{ db_id }}
jsonpath "$.result.name" == "skybear-test-001"
jsonpath "$.result.running_in_region" == "WEUR"

We do a straightforward GET request here to assert the basic details of the database.

Using Hurl’s powerful assertions we ensure it has the name and ID we expect, and is placed in the location we provided during creation (step 1 or 4).

Database queries

We now want to execute SQL queries on our database.

The /query endpoint (see docs) accepts an sql string that can contain multiple SQLite statements and responds with an array of results, one for each statement.

In the example below, we issue two statements, one to create a new table, and one to select a few rows from the SQLite built-in table sqlite_master.

POST https://api.cloudflare.com/client/v4/accounts/{{ CLOUDFLARE_ACC_ID }}/d1/database/{{ db_id }}/query
Authorization: Bearer {{ CLOUDFLARE_TOKEN }}
Content-Type: application/json
{"sql": "CREATE TABLE IF NOT EXISTS marvel (name TEXT, power INTEGER); SELECT name, type FROM sqlite_master ORDER BY name ASC;"}
HTTP 200

[Asserts]
jsonpath "$.success" == true
jsonpath "$.result[0].success" == true
jsonpath "$.result[0].results" count == 0
jsonpath "$.result[1].success" == true
jsonpath "$.result[1].results[0].name" == "_cf_KV"
jsonpath "$.result[1].results[0].type" == "table"
jsonpath "$.result[1].results[1].name" == "marvel"
jsonpath "$.result[1].results[1].type" == "table"

# SQL Duration in the Durable Object should be FAST! (less than 2ms)
jsonpath "$.result[0].meta.duration" < 2.0
jsonpath "$.result[1].meta.duration" < 2.0

There shouldn’t be anything new in the above snippet apart from the fact that we use more Hurl filters in our assertions. For example, jsonpath "$.result[0].results" count == 0 asserting that our first result (the INSERT statement) has zero rows returned.

The last two lines assert that the SQLite statement execution duration was less than 2 milliseconds. Yes, SQLite in Durable Objects is fast🚀

The following is an excerpt of the actual response:

{
    "result": [
        {
            "results": [],
            "success": true,
            "meta": {
                "duration": 0.2732 //...
            }
        },
        {
            "results": [
                {
                    "name": "_cf_KV",
                    "type": "table"
                },
                {
                    "name": "marvel",
                    "type": "table"
                }
            ],
            "success": true,
            "meta": {
                "duration": 0.2147 //...
            }
        }
    ],
    "success": true
}

Finally, our last test will be against the /raw endpoint that is identical to the /query above but instead of returning an array of objects as results, it returns the raw rows and separately the column names (see docs).

POST https://api.cloudflare.com/client/v4/accounts/{{ CLOUDFLARE_ACC_ID }}/d1/database/{{ db_id }}/raw
Authorization: Bearer {{ CLOUDFLARE_TOKEN }}
Content-Type: application/json
{"sql": "CREATE TABLE IF NOT EXISTS marvel (name TEXT, power INTEGER); SELECT name, type FROM sqlite_master ORDER BY name ASC;"}
HTTP 200

[Asserts]
jsonpath "$.success" == true
jsonpath "$.result[0].success" == true
jsonpath "$.result[0].results.columns" count == 0
jsonpath "$.result[0].results.rows" count == 0

jsonpath "$.result[1].success" == true
jsonpath "$.result[1].results.columns[0]" == "name"
jsonpath "$.result[1].results.columns[1]" == "type"
jsonpath "$.result[1].results.rows[0][0]" == "_cf_KV"
jsonpath "$.result[1].results.rows[0][1]" == "table"
jsonpath "$.result[1].results.rows[1][0]" == "marvel"
jsonpath "$.result[1].results.rows[1][1]" == "table"

# SQL Duration in the Durable Object should be FAST! (less than 2ms)
jsonpath "$.result[0].meta.duration" < 2.0
jsonpath "$.result[1].meta.duration" < 2.0

Almost identical, with the assertions of the rows and columns being the difference.

At the very end we should cleanup by deleting the database:

DELETE https://api.cloudflare.com/client/v4/accounts/{{ CLOUDFLARE_ACC_ID }}/d1/database/{{ db_id }}
Authorization: Bearer {{ CLOUDFLARE_TOKEN }}
HTTP 200

OK, that’s it.

With a few lines we have tested and asserted almost all of the D1 REST API.

Skybear.NET

Finally, a piece of self-promotion😅

If you use Hurl for HTTP API testing, and have scripts you wish you could run on a schedule or on-demand as part of your CI pipeline I am building Skybear.NET doing exactly that.

The platform provides you comprehensive reports for every single script run execution that you can view at any time. The full HTTP response headers and bodies are automatically persisted for you, for every execution, which makes investigations and troubleshooting of your APIs easy and simple.

The full script we examined so far is running as-is on the Skybear.NET platform as we speak, configured to run every few minutes. Your scripts can execute on the platform without any changes from how you run them locally.

Try Skybear.NET and send me your feature requests.

Conclusion

If you are doing anything with HTTP-based APIs and websites, do yourself a favour and integrate Hurl into your daily workflow.

Hurl is amazing, and comes with a CLI that runs all your test files in parallel by default, generates detailed JSON and HTML reports, and it’s overall a great way to ensure correctness of your APIs.

Control and data plane architectural pattern for Durable Objects - Cloudflare Reference Architecture Diagram

lambros@lambrospetrou.com (Lambros Petrou) — Wed, 20 Nov 2024 00:00:00 GMT

This is a reference architecture diagram entry I originally published on the Cloudflare Documentation website.

This article is a copy of the original doc above for my records 😅 Don’t worry, all credits are given to the documentation page above with a corresponding canonical URL.

Introduction

Durable Objects are built on-top of Cloudflare Workers spanning several locations across our global infrastructure network. Each Durable Object instance has its own durable storage persisted across requests, in-memory state, single-threaded execution, and can be placed in a specific region.

A single Durable Object instance has certain performance and storage capabilities. Therefore, to scale an application without being restricted by the limits of a single instance we need to shard our application data as much as possible, and take advantage of the Cloudflare infrastructure by spreading our Durable Object instances across the world, moving both the data and compute as close to the users as possible.

This document describes a useful architectural pattern to separate the control plane from the data plane of your application to achieve great performance and reliability without compromising on functionality.

The control plane provides the administrative APIs used to manage resource metadata. For example, a user creating and deleting a wiki, or listing all wikis of a user.
The data plane provides the primary function of the application and handles the operations on the resources data directly. For example, fetching and updating the content of a wiki, or updating the content of a collaborative document. Data planes are intentionally less complicated and usually handle a much larger volume of requests.
The management plane is an optional component of a system providing a higher level of interaction than the control plane to simplify configuration and operations. In this document, we will not focus on this as the same principles apply as to the control plane.

Control and data plane separation pattern

In this pattern, our application consists of at least one Durable Object instance per resource type handling all its control plane operations, and as many Durable Object instances as we need for the data plane operations, one for each resource instance created in the application.

You can scale to millions of Durable Object instances, one for each of your resources.

The main advantage of this architectural pattern is that our data plane operations, usually with larger volume of requests than control plane operations, are handled directly by the Durable Object instances holding the resource data without going through the control plane Durable Object instance. Therefore, the application’s performance and availability is not limited by a single Durable Object instance, but is shared across thousands or millions of Durable Objects.

Consider an example for a generic resource type XYZ, where XYZ could in-practice be a wiki, a collaborative document, a database for each user, or any other resource type in your application.

A user in London (LHR) initiates a resource XYZ creation request. The request is routed to the nearest Cloudflare datacenter and received by the Workers fleet which serves the application API.
The Worker code will route the request to the appropriate control plane Durable Object instance managing the resources of type XYZ. We will use the idFromName approach to reference the Durable Object instance by name (control-plane-xyz). This allows immediate access to the control plane Durable Object instances without needing to maintain a mapping.
- The location of the control plane Durable Object will be close to the first request accessing it, or to the explicit region we provide using Location Hints.
The control plane Durable Object instance (control-plane-xyz) receives the request, and immediately creates another Durable Object instance (data-plane-xyz-03) near the user request’s location (using Location Hints) so that the actual Durable Object instance holding the resource’s content is near the user that created it.
- We call a custom init(...) function on the created Durable Object instance (data-plane-xyz-03) passing any required metadata info that will be needed to start handling user requests. The Durable Object instance stores this information in its local storage and performs any necessary initialisation. This step can be skipped if each subsequent request to the created resource contains all the information needed to handle the request. For example, if the request URL contains all the information as path and query parameters.
- We use the idFromName approach to reference the Durable Object (data-plane-xyz-03) which allows the use of name-based resource identifiers.
- Alternatively, we can use the newUniqueId approach to reference the Durable Object which will give us a random resource identifier to use instead of a name-based one. This random identifier will need to be communicated back to the user so that they provide it in their subsequent requests when accessing the resource.
The control plane Durable Object instance (control-plane-xyz) stores the generated identifier (data-plane-xyz-03) to its local storage, in order to be able to list/delete all created resources, and then returns it to the Worker.
The user receives a successful response for the creation of the resource and the corresponding identifier, and (optionally) gets redirected to the resource itself.
The user sends a write request to the API for the resource identifier returned in the previous step, in order to update the content of the resource.
The Worker code uses the resource identifier provided to directly reference the data plane Durable Object instance for that resource (data-plane-xyz-03). The Durable Object instance will handle the request appropriately by writing the content to its local durable persistent storage and return a response accordingly.
Another user from Portland (PDX) is sending a read request to a previously created resource (data-plane-xyz-01).
The Worker code directly references the Durable Object instance holding the data for the given resource identifier (data-plane-xyz-01), and the Durable Object instance will return its content by reading its local storage.

As long as the application data model allows sharding at the resource level, you can scale out as much as you want, while taking advantage of data locality near the user that accesses that resource.

The same pattern can be applied as many times as necessary to achieve the performance required.

For example, depending on our load, we could further shard our control plane Durable Object into several Durable Objects. Instead of having a single Durable Object instance for all resources of type XYZ, we could have one for each region. The name-based approach to reference a Durable Object instance simplifies targeting the appropriate instance accordingly.

In conclusion, as long as you find a way to shard your application’s data model in fine-grained resources that are self-contained, you are able to dedicate at least one Durable Object instance to each resource and scale out.

Durable Objects Namespace documentation
Durable Objects: Easy, Fast, Correct — Choose three
Zero-latency SQLite storage in every Durable Object
Data, Control, Management: Three Planes, Different Altitudes
Examples of this architectural pattern in real-world applications:
- Durable Objects aren’t just durable, they’re fast: a 10x speedup for Cloudflare Queues
- Building a global TiddlyWiki hosting platform with Cloudflare Durable Objects and Workers — Tiddlyflare

Love letter to Hurl

lambros@lambrospetrou.com (Lambros Petrou) — Sun, 03 Nov 2024 00:00:00 GMT

Table of contents

Hurl
Love letter
Conclusion

Hurl is a command-line interface tool (CLI) that makes testing and automating HTTP APIs easy and enjoyable.

Yeap, “testing”, “easy”, and “enjoyable”, in the same sentence. Not common.👌

I realized that I have been posting a lot about the managed platform I am building to run Hurl scripts for you (Skybear.NET), without writing about the underlying star of the show.

This article elaborates on why I use and love Hurl for most of my end-to-end testing and HTTP automation needs. From testing Go server APIs, to JavaScript APIs on Cloudflare Workers, to simply chaining a few API calls together to integrate different services into a pipeline.

Hurl

Hurl is a command line tool that runs HTTP requests defined in a simple plain text format.

It can chain requests, capture values and evaluate queries on headers and body response. Hurl is very versatile: it can be used for both fetching data and testing HTTP sessions.

Hurl makes it easy to work with HTML content, REST / SOAP / GraphQL APIs, or any other XML / JSON based APIs.

— By hurl.dev

The hurl.dev website is really, really good. Detailed documentation with plenty examples and snippets of how to do anything you want with Hurl (see Samples docs).

The above quote is how they introduce Hurl. Notice that they don’t focus on “just testing”.

Hurl’s power is that it’s an HTTP request automation tool. Testing is just one use-case of it.

The fact that you can chain requests, extract data from responses and pass them down to subsequent requests, and target any API content type makes it an amazing Swiss-army knife tool for anything that speaks HTTP.

It’s curl wrapped in a nice package! ❤️

Love letter

I have been writing code for more than 15 years at this point. Professionally (getting paid for it) for more than a decade.

I tried a lot of testing frameworks and tools, some with success, some with frustration, and some that were rejected on first sight. There are hundreds if not thousands of testing tools, from language-specific frameworks, to generic tools, to in-house hand-written tests with code.

I first found out about Hurl 2 years ago, when it was still at version 4.0.0 (see all releases).

Hurl was a love at first sight kind of thing.😍 Not only that, but each of the releases since, make it even more awesome.

GET https://www.skybear.net/_live-demo/secure.json
Authentication: Bearer sample-token-123
HTTP 200
{"ok":true}

Look at the above snippet. Zero unnecessary overhead or boilerplate.

I would bet that you already understand what it does just by looking at it, but let’s go through it.

Line 1 sends a GET request. Line 2 configures the Authentication header of the request.

Line 3 asserts that we get back a 200 response status code. Line 4 asserts the full response body content.

There is nothing to remove to make it simpler. That’s freakin awesome!

There is a more flexible and more powerful [Asserts] block with special helpers to do assertions. For example, the above snippet is equivalent to the following:

GET https://www.skybear.net/_live-demo/secure.json
Authentication: Bearer sample-token-123
HTTP 200

[Asserts]
body == "{\"ok\":true}"
# or
jsonpath "$.ok" == true

Are you not interested in assertions, and just want to fire off some requests? Have at it:

GET https://www.skybear.net/_live-demo/secure.json
Authentication: Bearer sample-token-123

Do you want to repeat a request 5 times with 100ms space between them?

GET https://www.skybear.net/_live-demo/get.json
[Options]
repeat: 5
delay: 100ms

Run the above examples for free with the Skybear.NET Open Editor.

Hurl has a plethora of assertions you can do (see Asserting Response docs).

It also has cool variable capturing from responses (see Capturing Response docs) which is useful for extracting data from a response body or header and reusing it later either in assertions or subsequent requests. This is very useful for example when you are creating a new resource by issuing a POST request and extracting the created resource’s ID from the response in order to query other endpoints for that identifier next.

Specifying the body for the request is as simple as the URL and headers too. You can send JSON, XML, GraphQL, just a multiline string, multipart form-data, and everything else you need (see Request docs).

Another feature I love for easy testing is that you can use injected variables in your scripts (see Injecting Variables docs).

Imagine for example that your API has a staging deployment at staging.example.com and the production deployment at www.example.com. You can adapt your actual Hurl file to use https://{{subdomain}}.example.com and inject the subdomain variable when you run the CLI.

Variables can be injected as CLI arguments (--variable subdomain=staging), as environment variables (HURL_subdomain=staging hurl ), or put all of them in a staging.env file and pass the file as command line argument (--variables-file staging.env).

Personally, I usually start off with the CLI arguments, and then once the scripts are fully fleshed out move them to .env files.

Hurl also comes with other goodies.🥳

Running all the *.hurl files in a directory tree in parallel makes up for a super fast testing experience, limited only by the target API’s latency! It’s written in Rust after all, and taking full advantage of it.

My favourite goody feature is the --report-json option (introduced in 5.0.0 - see announcement) which creates a detailed JSON report of every request made by Hurl, while also dumping every single response body received and referencing it in the report. This is a great way to debug my APIs actually end-to-end, with full introspection to headers, duration timings, and full response bodies!🚀

I only covered the tip of the iceberg. It’s so good!

Conclusion

If you are doing anything with HTTP-based APIs and websites, do yourself a favour and integrate Hurl into your daily workflow.

Make your HTTP tasks easier, simpler, and more enjoyable.

I use Hurl in my personal projects for quick and easy end-to-end tests without wasting time with unit tests and code refactoring during feature development (see Tiddlyflare example). I use it at work for testing and different automation scripts or experimentation I do with different internal and external APIs. I use it for my automation needs integrating a bunch of APIs together into a pipeline.

Finally, a piece of self-promotion 😅, if you use Hurl, and have scripts you wish you could run on a schedule or on-demand as part of your CI pipeline, with comprehensive reports you can view at any time including storing the full responses, have a look into Skybear.NET, and let me know what you like and don’t like.

Deploy your applications on a server with zero downtime

lambros@lambrospetrou.com (Lambros Petrou) — Mon, 28 Oct 2024 00:00:00 GMT

Table of contents

Context
Directory structure
Systemd
Caddy
Deploy script
Conclusion

In this guide I will explain the deployment scripts I have been using for several years now for my own applications deployed on actual servers, either VPS on Hetzner and Linode or cloud instances like AWS EC2.

All the scripts below are also available at https://gist.github.com/lambrospetrou/aaaa13344f0026d810700f1bd2601cfd.

Context

Let’s clarify what we want to achieve.

We want to deploy on servers. Not serverless platforms, not managed container platforms. We have our application artifact, maybe a Go/Rust binary, maybe a zip/tar/jar file, a Python/Node script, and want to run it on the server.

We want zero downtime. This means we can deploy the application without failing in-progress requests or incoming requests while the application is being deployed.

This is important since applications I deploy on my servers are usually SQLite-based and therefore all requests are routed onto that single server.

Lastly, we want simple deploy scripts that can run from a CI like Github Actions or AWS CodeBuild and from laptops/machines locally without changes needed.

ℹ️ Note 1: Even though this guide is not for Docker containers, the Caddy section and the zero downtime section actually do apply to Docker containerized applications too.

ℹ️ Note 2: I love the approach explained in this article, because it doesn’t depend on language-specific libraries to achieve zero downtime, as long as your application supports graceful shutdowns (more in Zero downtime deployments section).

Directory structure

I like structure in my filesystem. Inspired from past teams I worked in, articles I read, and from my own needs, this is what I now use across my servers for my applications.

The directory /opt/apps_workspace/ is the “application root directory”.

Within the application root directory I have a versions/ subdirectory that contains all versions of my application deployed (I maintain the N latest versions only, more on that later).

Depending on the application, the contents of the versions/ directory could just be a list of binaries, or if my application needs multiple files it will be a list of subdirectories, one per version.

Next to the versions/ directory, there is the current/ directory that contains anything that is only relevant for the running application, e.g. .env.local, SQLite database files, and anything else not tied to a specific version but needed at runtime.

There is a symlink current/ pointing to the corresponding binary file (or whatever artifact) under the versions/ subdirectory.

Live example from my Skybear.NET staging server:

$ ll /opt/apps_workspace/monosource-server/**
/opt/apps_workspace/monosource-server/current:
total 12K
drwxr-xr-x 3 appuser appadmins 4.0K Jul 25 23:03 appdata
-rw-r--r-- 1 lambros lambros   1.9K Sep 12 07:51 .env.local
lrwxrwxrwx 1 root    root       224 Sep 12 07:51 monosource-server -> /opt/apps_workspace/monosource-server/versions/monosource-server-20240912_075143-commit_unknown-f6145af482657e71162e0b105b2429fa754a0a5e11cb8d4ebf1ae6220c832a859e8d6d3f9c79fae35fd8ddb929fdb5300e34587bf2035908ef8be17638584fd2

/opt/apps_workspace/monosource-server/versions:
total 690M
-rwxr-xr-x 1 lambros lambros 69M Sep  9 21:57 monosource-server-20240909_215719-commit_unknown-1c0f46816dac2d4c9989b6ae5dbeacadfc95d26fb1d1b3c3476972b73e366a1f70f7857218b1fe8d6cd47a2ca17516193f9311f463e84cfa089e3aa992c3ff0e
-rwxr-xr-x 1 lambros lambros 69M Sep 10 07:23 monosource-server-20240910_072327-commit_unknown-dfe8db0911da484fc9f68c783765752d10b6aa923166829ddc96dc2da89a740134bca2eb30baca2c90b40a11c4ccfc8a1edcd75ca794c19abe27f18640824c42
-rwxr-xr-x 1 lambros lambros 69M Sep 10 07:40 monosource-server-20240910_074005-commit_unknown-3f4165508c7a54c27d43c20cb1417ac276f3e3c882b99bf4ac6f5a24a89283dc0af6185680802516cfb7f00767d68f441be46ff56297ba2296888432eb653fda
-rwxr-xr-x 1 lambros lambros 69M Sep 10 07:49 monosource-server-20240910_074934-commit_unknown-8b987287bfff734ea67d0877d4fe3233bc7f7f02d160437d5e97fc710603c167dafcaf656161f073b6f0cd502031566d9ccf80817a3493ee50f55b6e6617f515
-rwxr-xr-x 1 lambros lambros 69M Sep 10 07:49 monosource-server-20240910_074949-commit_unknown-8b987287bfff734ea67d0877d4fe3233bc7f7f02d160437d5e97fc710603c167dafcaf656161f073b6f0cd502031566d9ccf80817a3493ee50f55b6e6617f515
-rwxr-xr-x 1 lambros lambros 69M Sep 11 00:31 monosource-server-20240911_003106-commit_unknown-4aea354fc94a9137fbffa788892ecfe5dfb54ce2fdbd89b7ca4db15bf2a644aafacb1bfa24f9f858fdcb9108ebede6cbf5b8580afb827985fdc4bb75bae9f4b4
-rwxr-xr-x 1 lambros lambros 70M Sep 11 21:27 monosource-server-20240911_212745-commit_unknown-df30c74a6392ec0d404d77bb0f11512d457d77eba62d26316e1587aee7b1410eed23852cb4b03983f9cc19f6a5f9db8eb464f41dd5d5e5449c3a9913c78c56eb
-rwxr-xr-x 1 lambros lambros 70M Sep 11 22:26 monosource-server-20240911_222551-commit_unknown-496ff2e89f5b13190d2965a7861abdc6924adf3c45e54a5a367a27118703bcff033b402d9e758e3cbe68833e11be60acc7e3efe1fe8e9c27597d020c5b90e52b
-rwxr-xr-x 1 lambros lambros 70M Sep 11 22:38 monosource-server-20240911_223751-commit_unknown-ebd934d5e765b0651a8a045e47d4017365129abded1b02ef8fa8370b909db79fb28e9c2a7591b0adf3eaad6a6cd23f4ba1ea688eccde577ee8375afa6cdef034
-rwxr-xr-x 1 lambros lambros 70M Sep 12 07:51 monosource-server-20240912_075143-commit_unknown-f6145af482657e71162e0b105b2429fa754a0a5e11cb8d4ebf1ae6220c832a859e8d6d3f9c79fae35fd8ddb929fdb5300e34587bf2035908ef8be17638584fd2

Systemd

Systemd (https://systemd.io) is the defacto tool to manage processes, e.g. restarting them when they crash, startup on boot, and a LOT more.

We are going to use systemd to automatically start our applications on server reboots and on application crashes. We also use its Journal feature, essentially a component for capturing logs from the application’s process standard IO (stdout, stderr).

Each application I deploy has the following service file /etc/systemd/system/.service:

[Unit]
Description=Monosource Server
After=network.target

[Service]
ExecStart=/opt/apps_workspace/monosource-server/current/monosource-server
User=appuser
Group=appadmins
WorkingDirectory=/opt/apps_workspace/monosource-server/current/
Restart=always
RestartSec=5
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target

Notice that the main executable to run is our symlink in the current/ subdirectory as explained in the previous section.

Also, the app is running under the user appuser and the group appadmins, just to have some isolation from my own system user.

We specify that we want to always restart the application, and that we want both the standard output and standard error to be logged into systemd’s Journal.

To tail the application logs:

sudo journalctl -u monosource-server -f

We are going to see in the Deploy script section when this file is created, updated, and how we notify systemd to pickup changes.

Caddy

Caddy server (https://caddyserver.com) is an amazing proxy server written in Go.

Very good performance, with a user-friendly configuration syntax. We are going to use its reverse_proxy feature (see docs) for proxying all the applications on the server, and its load balancing feature to achieve zero downtime (see docs).

We are going to use the Caddyfile syntax to configure Caddy.

The main configuration file is in /etc/caddy/Caddyfile, and this is its content:

import sites-enabled/*

Yes, a single line importing all other Caddyfiles from the directory /etc/caddy/sites-enabled/.

Each application will have its own /etc/caddy/sites-enabled/-Caddyfile configuration.

This is the configuration I have for Skybear.NET which is an application serving multiple domains:

:80 {
    reverse_proxy http://127.0.0.1:8080 {
        header_up X-Real-IP {remote}

        # This gives 10s of buffering to allow zero downtime restarts of the service.
        lb_try_duration 10s
    }
    request_body {
        max_size 1M
    }
}

The above config specifies that Caddy server will listen on port :80 and act as a reverse proxy forwarding all requests to a process running locally and listening on http://127.0.0.1:8080 for HTTP requests.

I use Cloudflare as a proxy in front of all my servers, hence I don’t need to serve HTTPS from my origin servers (therefore no need to listen on port :443). If you want to go straight to your servers without any proxy CDN, Caddy supports HTTPS out of the box (see docs), so you just need to add some extra configuration for the domain(s) to certify.

In the forwarded request the headers X-Real-IP will be set appropriately, and the maximum request body we accept is 1MB otherwise the request will be rejected. Read through https://caddyserver.com/docs/caddyfile/directives/reverse_proxy#defaults to see if you really need these headers depending on your CDN used (if any at all).

We are going to see in the Deploy script section when this file is created, updated, and how we notify Caddy to pickup changes.

Let’s now explore the lb_try_duration 10s configuration.

Zero downtime deployments

ℹ️ Assumption: Your application supports graceful shutdowns. In order for zero downtime to work properly your application has to support graceful shutdown. That means you do not abruptly kill your process during restarts, but instead process in-flight requests without accepting new ones, and then exiting the process. How you do this depends on the language and framework you use. For example here is how it’s done in Go servers: https://pkg.go.dev/net/http#example-Server.Shutdown

Without the line lb_try_duration 10s, everything would still work correctly.

The behavior would be that all requests are received by Caddy, forwarded to the local server, and the server writes its response back.

During deployments, our application will restart, hence any request being forwarded to http://127.0.0.1:8080 will fail since no server is listening on that port until the application process starts up again.

My applications are usually written in Go, so downtime is only 1-2 second(s) max. Imagine though that you use something slower (e.g. Python, Ruby) or you have to do a slow initialization in your application. That would lead to downtime of your service, which is not good.

That’s where the magic of lb_try_duration 10s comes into play.

With that line we instruct Caddy to keep retrying to reach the “backend” (our application) up to 10 seconds before failing the request.

This is awesome, since it allows our application to restart, do its initialization, and then go online to start serving the “pending” requests.

One line. Really nice 👌

Deploy script

Now that we explored all individual components of our application, let’s see the connecting tissue. The deployment script.

The following deploy script will do the following in order:

Generate the application version name based on commit (if it exists) and current time (see example of this in the Systemd section).
Copy all the files the application needs into a temporary directory.
- This step ensures that all the files are copied on the target server before trying to restart any component to avoid any partial deployment.
We send a big shell command over SSH (or Tailscale SSH) that will do the following:
1. Move all the application files from the temporary directory into the /opt/apps_workspace//versions/ directory accordingly.
2. Move the Systemd configuration from the temporary directory to /etc/systemd/system/.service, and trigger its daemon to reload its configuration.
3. Move the Caddy configuration from the temporary directory to /etc/caddy/sites-enabled/-Caddyfile, and trigger the Caddy daemon to reload its configuration.
4. Update the symlink /opt/apps_workspace//current/ to point to the new version.
5. Restart the application using systemctl restart .
  - This is the only part that causes downtime, but is mitigated by using Caddy’s load balancing feature to buffer requests.
Delete older versions to retain only the latest 10 on the server, just in case I need to rollback to a previous version.
- Rollback is done manually now by recreating the current version symlink to point to a previous version and restarting the application.

Deploy script:

#!/usr/bin/env bash

set -e

if [ "$#" -ne 1 ]; then
    echo "usage: $0 user@server-address"
    exit 1
fi

SERVER_SSH=$1
SERVER_PATH=/opt/apps_workspace/monosource-server
BINARY_NAME="monosource-server"
SERVER_RESTART_COMMAND="systemctl restart $BINARY_NAME"
SYSTEMD_FILE="monosource-server.service"
SYSTEMD_DAEMONRELOAD_COMMAND="systemctl daemon-reload"

# https://caddyserver.com/docs/running#unit-files
CADDY_RESTART_COMMAND="systemctl reload caddy"
CADDYFILE="monosource-server-Caddyfile"

# Assume the script will be run inside the `src/` directory.
OUTFILE="./build/$BINARY_NAME"
ENVFILENAME=".env.local"
ENVFILE="./build/$ENVFILENAME"
# COMMIT_HASH=$(git rev-parse HEAD)
COMMIT_HASH="commit_unknown"
BUILD_TIMESTAMP=$(TZ=UTC date -u +"%Y%m%d_%H%M%S")
FILE_HASH=$(b2sum $OUTFILE | cut -f1 -d' ')
REMOTE_FILENAME="$BINARY_NAME-$BUILD_TIMESTAMP-$COMMIT_HASH-$FILE_HASH"

echo "Deploying: $REMOTE_FILENAME"

# Copy necessary files from current version.
scp "$OUTFILE" "$SERVER_SSH:/tmp/$REMOTE_FILENAME"
scp "$ENVFILE" "$SERVER_SSH:/tmp/$REMOTE_FILENAME-$ENVFILENAME"
scp "_tools/files/etc/caddy/sites-enabled/$CADDYFILE" "$SERVER_SSH:/tmp/$REMOTE_FILENAME-$CADDYFILE"
scp "_tools/files/etc/systemd/system/$SYSTEMD_FILE" "$SERVER_SSH:/tmp/$REMOTE_FILENAME-$SYSTEMD_FILE"

# Put the latest files in the right directories and restart everything without downtime.
ssh -q -Tt $SERVER_SSH <


Conclusion
I love serverless platforms (you should use Cloudflare Workers and Durable Objects more), but I also love writing small servers in Go and use SQLite 😅
This article describes the deployment script I use to achieve zero downtime deployments across actual servers, VPS, or cloud instances like AWS EC2.
Feel free to copy and modify them to your will, or reach out with questions and/or ways to improve them.



Building a global TiddlyWiki hosting platform with Cloudflare Durable Objects and Workers — Tiddlyflare
lambros@lambrospetrou.com (Lambros Petrou) — Tue, 22 Oct 2024 00:00:00 GMT
Table of contents

Context
TiddlyWiki
Durable Objects


Requirements
High-level architecture
CreateWiki data flow
GetWiki data flow
Location hints
OK. So what.
Mindset shift
Conclusion

Tiddlyflare is an open source TiddlyWiki hosting platform built with Cloudflare’s SQLite Durable Objects and Workers.
It supports multiple users, each with their own collection of TiddlyWikis, and each user’s data is automatically placed in a Cloudflare region close to them.
Each TiddlyWiki hosted by Tiddlyflare can be up to 1GB (limits will be raised to 10GB soon).
This article goes into the architecture behind Tiddlyflare, and showcases the power and flexibility you get from Durable Objects (DO), and the Workers platform overall, without you really doing much more work.
You can find the actual source code for Tiddlyflare implementing everything described in this article at https://github.com/lambrospetrou/tiddlyflare. It works.
Context
Let’s introduce some useful background context.
TiddlyWiki

TiddlyWiki is a unique non-linear notebook for capturing, organising and sharing complex information
— tiddlywiki.com

TiddlyWiki is an amazing open-source tool created more than a decade ago by Jeremy Ruston, and as the quote above mentions can be used for note-taking, information and knowledge organization, and much more.
For the purposes of this article, you only need to know that by default each wiki stores all of its data in a single HTML file. 😅
When you load that HTML file into a browser you get a UI, and the data is embedded into the HTML file itself.
There are a million ways to persist your changes (see Saving wiki section for supported plugins), but in this article we will focus on the PutSaver saving mode.
Automatically saving changes with PutSaver
We will build a platform to host TiddlyWikis. While editing our wikis we want the changes to be automatically saved remotely on the platform.
Conveniently, TiddlyWiki has a very basic but super flexible API interface named PutSaver that we will implement (see PutSaver code).
PutSaver is very simple.
Once the TiddlyWiki HTML file is loaded in a browser, it sends an OPTIONS request to the current URL location, and based on the response it decides if PutSaver is supported.
If the response contains a header named dav with any value, and a success response status code (200 >= status < 300), then we are good to go.
From that point on, any changes you make to your wiki are automatically propagated with a PUT request to the current URL location and the request body is the whole HTML file.
Not only the changes, but the whole file!
We don’t care if this is efficient or not, that’s how PutSaver works, and that’s what we will use.
Durable Objects
I already wrote an article introducing Durable Objects a few weeks ago, so keeping this short.

Durable Objects (DO) are built ontop of Cloudflare Workers (edge compute). Each DO instance has its own durable storage persisted across requests, in-memory state, executes in single-threaded fashion, and you decide its location. You can create millions of them!

I said it before, and saying it again. Durable Objects is the most underrated compute and storage product by Cloudflare.
It’s so different than other platforms that it’s not easy to realize its power initially.
Once you get it though, it’s a game changer! 🤯
Requirements
Let’s get started with the requirements of our platform.

We want to support multiple users, each with their own collection of TiddlyWikis, and the usual CRUD functionality (create/read/update/delete).
Each user’s data should be isolated from each other, i.e. user A only has access to their own TiddlyWikis.
Each TiddlyWiki created on the platform gets its own URL, and visiting that URL responds with the latest version of the wiki HTML file.
The PutSaver saving mechanism should be supported for all wikis by default.
We want all user data to be close to the user’s location. A user in Portland has their data near Portland, and a user in London has their data close to London.
Support scale (intentionally leaving this open-ended to your imagination limits).

Requirement 5 is where things get hairy, really fast, with traditional hosting/cloud/infrastructure platforms.
High-level architecture
There are many approaches to implement this and satisfy the requirements. Some are complex, others are complicated, and others are just a pain in the ass.
The following design exploits and showcases the goodies of Durable Objects, satisfying all the requirements, without the code getting complex or our mental reasoning having trouble.

  
  Tiddlyflare high-level architecture diagram.


Let’s break down the above diagram.

User 1
First user is in Portland, US.


User 2
Second user is in London, UK.


Traffic to Cloudflare network (A)
The traffic from all users is routed to the nearest Cloudflare datacenter using Anycast.


Workers fleet (B)
The Workers are stateless, and each user request goes to any available machine inside a datacenter close to the user request’s location. Not necessarily the same machine each time.


Durable Object with ID TENANT_1 (C)
A Durable Object (DO) instance is created the first time a user attempts to create a wiki, placed in a location near the user’s request, under the Durable Object Namespace TENANT.
TENANT_1 is the specific DO instance created for User 1, and is placed inside a datacenter near the Portland region, close to User 1 location.
The tenant Durable Objects hold data about the user itself, and a metadata record for each wiki created for that user.
The tenant Durable Object does NOT manage any of the wiki data (i.e. the wiki HTML).


Durable Objects with IDs WIKI_1 and WIKI_2 (D)
A Durable Object (DO) instance is created when a wiki is created, placed in a location near the TENANT_1 DO location, under the namespace WIKI.
WIKI_1 and WIKI_2 are the specific DO instances created for User 1’s wikis.
We are going to understand later why the TENANT_1 location is used here instead of the User 1’s location.


Durable Object with ID TENANT_2 (E)
TENANT_2 is the specific DO instance created for User 2 information, and is placed inside a datacenter near the London region, close to User 2 location.


Durable Object with ID WIKI_3 (F)
WIKI_3 is the DO instance created for User 2’s wiki, near the location of TENANT_2 Durable Object instance.


Cloudflare network (N)
Each Worker and Durable Object instance can communicate with other instances across Cloudflare’s network without going to the public internet (most of the time).
This can be used to efficiently communicate between DO instances, or other Cloudflare services.



CreateWiki data flow
Let’s now explore a concrete example of how data flows through the above diagram when User 1 creates their first wiki.

User 1 opens Tiddlyflare (e.g. on tiddly.lambros.dev), logs in, and clicks the button to create a TiddlyWiki.
The POST request triggered flows to the nearest Workers fleet within a datacenter in Portland.
The Worker code attempts to create a Durable Object Stub for User 1’s tenant ID.
Since this is the first time we attempt that, there is no Durable Object instance with that ID, therefore the platform will create one in the closest datacenter with Durable Object support, often in the same region.
The Worker code doesn’t need to check anything to see if a TENANT DO already exists and worry about all that. When you create a reference to the DO you want to use, if it doesn’t already exist it gets created, and then you just get routed to it.
Worker code to get access a DO instance:const doStub = env.TENANT.idFromName(tenantId).get();


The above line will return a Durable Object Stub which allows us to call methods on the DO instance directly.
This stub can reference a DO instance in the same datacenter, on the same machine, or in the other side of the world.


Now that the TENANT_1 DO stub is created, we call doStub.createWiki(...) to create the first TiddlyWiki.
The Durable Object TENANT_1 now receives the request, initiates its local SQLite storage with the appropriate SQL tables for the user information (remember; this is the first time the user did anything), and subsequently attempts to create the Durable Object that will manage the wiki’s data.
Similar to step 3, we now attempt to get a stub on the wiki DO WIKI_1.
Generate a random ID for the wiki (i.e. WIKI_1): const doId = env.WIKI.newUniqueId();
Get the DO stub: const wikiStub = await env.WIKI.get(doId);
Create the wiki: await wikiStub.create(tenantId, wikiId, name, wikiType);
The WIKI_1 DO instance is created near the TENANT_1 location because that’s the origin of the request to the WIKI_1 DO instance.


The WIKI_1 DO instance initializes its own local SQLite database with the right tables to store wiki data, and stores the default TiddlyWiki HTML file.
The TENANT_1 DO instance receives the successful response from WIKI_1, writes in its own SQLite database that WIKI_1 was created successfully, and returns the information about the newly created wiki back to User 1.
The URL of a wiki includes the WIKI_1 ID, therefore just having the URL is enough to be able to reference the WIKI_1 DO instance without having to access TENANT_1 at all.
Alternatively, if you want to expose “names” instead of IDs through URLs, you can use the idFromName(name) function to create your DO ID (see docs).



Are you starting to see the magic? 👁️
GetWiki data flow
We have a wiki created now, so let’s see the much simpler wiki data read flow.

User 1 opens the wiki URL returned by the creation flow.
The GET request flows to the nearest Workers fleet within a datacenter in Portland.
This time the Workers code attempts to create a Durable Object Stub straight to WIKI_1, and bypasses the TENANT_1 DO, since the URL encodes the DO instance ID.
Worker code to the wiki DO instance:const doId = extractWikiId(requestUrl)
const wikiStub = env.WIKI.idFromString(doId).get();


As before, WIKI_1 is probably in the same region as TENANT_1 or even same datacenter.


The worker now calls the DO stub to return the wiki content.
Code: return wikiStub.getFileSrc(wikiId);


The WIKI_1 DO instance will wake up, if not already running, read from its local SQLite database the contents of the wiki HTML file, and stream it back to the Worker.
The Worker simply forwards the stream of the DO response back to User 1, without any intermediate buffering for no added overhead.


In summary, all GET requests for a wiki flow through the nearest worker to the user location, they then get routed to the corresponding WIKI Durable Object instance holding that wiki’s data (could be on the other side of the world), and the content is streamed back to the user.

That’s it.
User 2 accessing User 1 wiki
Can you guess what the read flow looks like for User 2 trying to read WIKI_1 from User 1? For simplicity let’s assume all wikis are publicly accessible to anyone with the URL at hand (in reality we have actual auth).
Go ahead and guess which step of the previous section would be different.
OK.
The differences are steps 2 and 3.
The request from User 2 will go to the Workers fleet nearby User 2’s location, somewhere in London.
The WIKI_1 Durable Object instance doesn’t move after creation, therefore in step 3 the Worker will create a WIKI_1 DO stub that will reach out to the Portland region to access the WIKI_1 DO instance, and then continue with the wiki reading as usual.
Location hints
There was a subtle difference between step 3 and step 6 in the creation flow (User 1).
In step 3, the TENANT_1 Durable Object (DO) instance is created in the nearest datacenter to the user location. Whereas, in step 6, the WIKI_1 Durable Object instance is created in the nearest datacenter to the TENANT_1 DO instance’s location.
In general, the location considered when creating a Durable Object instance is the location of the running code that attempts to create the Durable Object Stub.
In step 3, the stub creator is the Worker code running closest to the user, whereas in step 6 it’s the code running inside the TENANT_1 DO instance.
In this specific example, both Durable Object instances are going to end up in the same region since they are all close by, maybe even the same datacenter/machine, but in other scenarios this might be different. See the example where User 2 attempts to access WIKI_1.
In cases where you want to influence the location of a Durable Object instance, ignoring the location of the stub creator, you can use Location Hints and provide explicitly the region you want to place the Durable Object instance.
let durableObjectStub = OBJECT_NAMESPACE.get(id, { locationHint: "eeur" });

// Supported locations as of 2024-Oct-22
wnam    Western North America
enam    Eastern North America
sam     South America
weur    Western Europe
eeur    Eastern Europe
apac    Asia-Pacific
oc      Oceania
afr     Africa
me      Middle East

👉🏼 Tip: https://where.durableobjects.live is an amazing little website showing all the Cloudflare locations with Durable Object support.
OK. So what.
You might be wondering, apart from the fact that we used Cloudflare proprietary technology, why this, and not use a VPS, or AWS Lambda, or anything else really.
This is why I love Durable Objects and what made the Workers platform really click for me.

You are not tied down to a single location. It’s TRIVIAL to put data in any of the Cloudflare regions supporting Durable Objects. Create a Stub to the Durable Object you want, in the location you want, and get to work.

Imagine having to write a Cloudformation stack to deploy across 10+ regions, and communicate within those regions from your application. All the configuration needed, managing all those regional endpoints. My god! 🤬
For anyone saying skill issue right now, I worked at AWS. I have been using AWS for almost a decade. I have implemented CI/CD pipelines to deploy across all its regions (with AWS SAM, AWS CDK, Terraform).
The difference is night and day. It doesn’t even come close!


Scale to hundreds, thousands, hundreds of millions of Durable Object instances.

A Durable Object is a mini server with local, durable, actual disk storage (10GB).
With just a name or ID, you summon a tiny server instantly at the location you want.
No other platform allows you to horizontally scale so trivially. Not at this scale.
Fly.io is in my opinion the only other platform that has a nice UX doing similar kind of horizontal scaling with their Machines API offering. However, even in that case, managing the exact location of each instance, creating and tearing them down instantly, referencing them by a constant name/ID throughout the whole lifetime of your application (e.g. across blue/green deployments), and doing all of that transparently within my code without a lot of boilerplate, is not as seamless as just creating a Durable Object Stub.


The Cloudflare Developer Platform.

I only used Workers and Durable Objects for the whole Tiddlyflare product.
There is so much you can do within the platform, and everything integrates with Workers through Bindings so nicely (and will get even more seamless).
Workers KV, R2, Workers AI, Cache API, Rate Limiters, upcoming Containers, and so much more.
Example: Adding a global cache to Tiddlyflare is simply env.KV.put(wikiId, src). This will now allow fast reads from any Worker processing requests for that wikiId without having to even reach the DO instance. 1 line!



env.WIKI.get(extractDOID(wikiUrl)).deleteWiki(wikiId);

That’s all it takes. 🚀
Mindset shift
Having said all that, I want to point out an adoption issue with a platform like Workers and Durable Objects.
We are transparent grown up adults, after all.
Durable Objects have been public for several years now, but most developers have no clue what they are, and what they can do.
That’s because it’s a radically different combination of compute and storage infrastructure product compared to existing platforms.
I mentioned before that someone familiar with any Actor programming model (Elrang, Elixir, Akka) will feel right at home with Durable Objects.

It’s the Actor model with infinite scale built natively into the global Cloudflare network itself.

In the end, this boils down to thinking about the main entities of your application as separate “things” that communicate with each other.
Each entity has their own memory, their own local disk storage, and their own lifecycle. In our example above, we have two main entities, the tenant (user information), and the wikis (versions of HTML content).
At the application level we have 1 TENANT instance per user, and unlimited amount of WIKI instances per user.
You could model all the wikis to be managed by a single Durable Object instance, or further merging the two and only having one Durable Object instance for each user with all their information including wiki content. 👎🏼
The problem with that design is that all requests and all operations for a user, including all the operations for all of their wikis, would be handled by a single Durable Object instance. What if a single wiki gets DDoSed and then blocks all others? What if the machine hosting that single Durable Object instance goes down? What if…?
Durable Objects are a very, very powerful programming model. However, a single Durable Object instance is a tiny server of 128MB memory and 10GB disk space. There is only so much it can do for you and requests it can handle (up to 1 thousand per second).
I previously described a bunch of other example cases and how Durable Objects can influence their architecture.
In summary, the main hurdle of adopting the Workers platform is that you need to start designing your applications with this isolation in mind.

What if every core entity in your application had its own mini server?

That’s Durable Objects.
Conclusion
You made it. Awesome. Thank you. 🙏🏻
I hope you now know what Durable Objects are, and understand why they are powerful.
If you want to use them in your applications, and have more questions, please, please reach out.
If you have feedback to improve the platform, please reach out.



Skybear.NET Scripts secret variables, HTTP triggers, and replacing AWS Lambda with Fly.io - Changelog 2024-10-15
lambros@lambrospetrou.com (Lambros Petrou) — Tue, 15 Oct 2024 00:00:00 GMT
Table of contents

HTTP Hook Trigger
Secret Hurl variables
Out of AWS Lambda into Fly.io
Response bodies automatically persisted
Docs
Conclusion and feedback


Skybear.NET is a managed platform automating your HTTP API testing using Hurl.dev plain text scripts.
You can run your scripts on-demand through the API, or periodically based on a cron expression.
It’s been a few months since the last changelog article (see last update in July), but it doesn’t mean there wasn’t any work done.
On the contrary, the features shipped since then are quite the bangers!
Let’s dive into them. 👇🏼
HTTP Hook Trigger
In a past update (see article) I introduced the script triggers, specifically the scheduled cron trigger.
Scheduled triggers allow you to configure a script to run periodically every few minutes based on a cron expression.
However, in order to properly integrate with Continuous Integration (CI) systems running scripts periodically doesn’t work.
We want to be able to run scripts on-demand, wait for the results, and then accordingly progress or abort the deployment.
Back in August (see changelog) I shipped a key feature, the HTTP hook trigger.
In addition to the Scheduled Cron trigger, you can now configure your scripts to be invokable through an HTTP POST request.
The request is blocking, and does not return until your script execution is complete.
This allows you to run your scripts in your CI, and make sure any code changes you shipped are actually correct.

  
  Skybear.NET script HTTP hook trigger settings.


You can see above the new script setting to enable your script’s HTTP hook trigger.
Please note that this URL should be treated as a secret, since anyone with that URL can send an HTTP POST request to it and trigger a run of your script.
Script runs triggered by the new HTTP hook trigger have a new indicator:

  
  Skybear.NET script HTTP hook trigger run result.


Secret Hurl variables
Hurl scripts are very flexible, and they support variables in order to allow dynamic scripting.
For example, the following script defines the variable BASE_HOST that needs to be provided at execution time.
GET https://{{ BASE_HOST }}/some-api
HTTP 200

With the release of the HTTP hook triggers, it was only natural that folks wanted a way to provide Hurl variables to their scripts since now they could run either through the UI, through their CI, or just manually through the HTTP trigger.
Skybear.NET shipped support for Hurl variables as part of the request to the HTTP hook trigger URL (see changelog), enabling fully dynamic scripts.
An example POST request to the HTTP hook trigger of a script:
curl --request POST \
  --header "Content-Type: application/json" \
  --data '{"hurlVariables":{"BASE_HOST": "skybear.net"}}' \
  https://api.skybear.net/v1/integrations/triggers/http/s_nkjfLc27FDb1dRz7rZ95zcc/strig_http_l9qRWlr16M3jm1LnbTzM7XtSNcGKShGtq:sync

A few days later, I fully integrated Hurl variables into the Skybear.NET platform and built a UI to manage these variables through the UI (see changelog).
You can define them in your Skybear.NET account secrets page, and use them across all your scripts.

  
  Skybear.NET account secrets and variables.


All Hurl variables are being treated as sensitive secrets. Never logged or stored in plaintext. 🔐 Script run reports are also encrypted.
I do envelope encryption. Each variable is encrypted with its own Data encryption key (DEK), that itself is encrypted with a different Key Encryption Key (KEK) for each account, and that KEK finally {en,de}crypted by AWS KMS!
I will write up a more detailed article on how envelope encryption is implemented.
With Hurl variables integrated natively, you can literally take your local scripts, put them in Skybear.NET, run them in the UI, configure them to run automatically every few minutes, or even get an HTTP URL that you can invoke in your CI and have them running in the Skybear platform remotely.
Full script compatibility with local Hurl CLI. 👌
Out of AWS Lambda into Fly.io
Back in September, a user reported that their script sending requests to an IPv6 origin was failing.
At the time, the script execution was done inside AWS Lambda.
After a bit of research and testing, it turns out that AWS Lambda does not natively support outgoing IPv6 connections, therefore all scripts attempting to do that were failing.
I immediately created a “known issue” in our documentation (see issue entry) for this so that I can share it with other customers in case they encounter this.
I went through a few solutions, and decided that it was not worth the effort to fiddle with AWS networking shenanigans, having to move my AWS Lambda functions inside a VPC, creating and proxying all traffic through NAT gateways, etc.
After 2 days, all of the script executions were moved out of AWS Lambda into Fly.io (see changelog).
The transition was trivial, uneventful, and has been running smooth since then! 🥳
I have used Fly.io in the past, so I was familiar with it, and since the execution server was already an HTTP API written in Go it was trivial to migrate it from AWS Lambda into Fly.io with a tiny Docker container.
As a fun side-effect, I now use 3 regions on Fly.io instead of just one I previously had with AWS Lambda, therefore there is more resiliency in case any of their datacenters goes down.
And soon, I will enable users to select the location(s) where their scripts will be running. 🥳
Response bodies automatically persisted
Another huge upgrade in September was the upgrade of Hurl to version 5.0.1 (see changelog) that added the --report-json argument.
Before that upgrade, you had to use the output:  option in order to persist responses in your script run reports.
With the new Hurl option --report-json, the platform now automatically persists every* single response body received while executing your script, uploads it to durable object storage, and makes it available to you for download later (see changelog).
The output option continues to work, but you shouldn’t need to use it.
Debugging your scripts and APIs was never easier 🙃
* There is a limit on how much storage each script run can use at the moment, but I plan on extending this based on the active billing plan.
Docs
Finally, last month I released the initial documentation website for Skybear.NET.
Visit https://www.skybear.net/docs/ and let me know what you think.
It’s quite barebones at the moment, but I will be adding a lot of content over the next few weeks.
Tutorials, existing feature descriptions, and plenty of examples for Hurl scripts.
Keep an eye on the Changelog page listing major and highlighted releases.
Conclusion and feedback
Check out Skybear.NET and let me know if you have specific feature requests.
If you have any questions, email me, or reach out at @lambrospetrou. 🙏🏼



Building a feature flags service with Cloudflare Workers and Durable Objects
lambros@lambrospetrou.com (Lambros Petrou) — Fri, 23 Aug 2024 00:00:00 GMT
Table of contents

Problem definition
Conclusion

Feature flags are one of my favourite practices in software engineering.
I wrote a detailed article about their use-cases and how I consider them a superpower!
In this article, we are going to build a feature flag service on-top of the Cloudflare Developer Platform, specifically Cloudflare Workers and Durable Objects.
I recently wrote about Durable Objects, in my opinion the most underrated compute and storage offering by Cloudflare.
If you don’t know what Durable Objects are, go read that article first, otherwise you can simply imagine them as “Same as Workers; with durable storage that is persisted across requests”.
Problem definition
Before we dive into the implementation specifics, let’s define first what we want to build.
A feature flag service usually boils down to a function isEnabled(featureName: string, identifiers: Map): boolean.
The featureName denotes the feature flag we want to check, and identifiers denote the specific resource(s) we want the feature flag for.
The return value is simply a boolean true or false, where true indicates that the resource(s) specified by the collection of identifiers has the given feature flag enabled.
Examples of identifiers most often used in feature flag systems: account ID, user ID, request country/location, server host name, server region/availability-zone/datacenter, and more.
Some feature flag services support more complex use-cases like A/B Testing, but to keep the scope of the article small, we are only going to focus on the above use-case of enabling and disabling features.
Technical requirements
I will just define some requirements for the service, just to have something concrete to model our system around, but these can be as you like in your own systems.

Feature names are strings up to 1,000 alphanumeric characters.
Each identifier is a string up to 100 alphanumeric characters.
The service should be accessible from anywhere in the world over an API.
Latency of the service should be low.
Our Javascript UI code running in users’ browsers anywhere in the world is using this service.
Our server-side code running in our datacenters/cloud is using this service.


We have users spread around the world, US, Europe, Australia, etc.
We read feature flags 100x more often than writing or updating them
We support at least 10,000 requests per second, and a peak of 500 requests per second per feature flag.
We want at most 15 seconds of latency for updates to propagate everywhere. Useful for immediate disabling of features (i.e. kill-switches).

Examples of feature flags
TODO
Traditional system design
Option 1 - Object Storage
In my feature flags article I describe one of the most common, and to be fair, simplest implementations of a feature flag service.
I love this solution because it’s extremely simple, but battle-tested and robust. We used this exact approach for several services in AWS.
You configure all your feature flags in a source code repository, and every time you push a new change the whole repository is stored on a cloud object storage system like Amazon S3 or Cloudflare R2.
Then, all the clients periodically fetch those files locally and update their internal states.
DIAGRAM TODO
Drawbacks

Clients need to download and parse all the feature flags set, or at least everything that could be queried by the local applications on the same machine.
Works only for server-side applications, or other long-running services.
Updates are periodic usually every 1-5 minutes to reduce costs.

Option 2 - Centralized database application
Another very common approach is to have a traditional server application in one location that holds all the feature flags, and exposes an API any client can use.
DIAGRAM TODO
Drawbacks

Latency accessing the service from around the world will be slow, 100ms+ depending on the region of the users.
Everyone is connecting to the same application, so availability and performance of the service will hurt as you scale up usage.

Option 3 - CDN caching
Same as Option 2, but add a global CDN in front of it with 15 seconds of caching.
All the identifiers have to be part of the cache key to avoid incorrect results.
DIAGRAM TODO
Drawbacks

Depending on the identifiers we use caching can be useless for a long tail of feature flags. For example, if most of the identifier combinations are not popular enough to have cache HITs, then all of them will have the same drawbacks as in Option 2.

Option 4 - Application replicas
The next logical evolution of the service is to replicate the feature flag database, or even the whole appication, in more than one locations.
Imagine you have the same application running in US, in Europe, and in Australia.
These applications can use database replication (or something like Amazon DynamoDB Global Tables), or deploy multiple independent instances of the application.
All traffic will be routed to the application closest to it.
We will need a new routing layer at the edge (CDN) so that all requests go to the closest application servers.
This extra layer can be Cloudflare Workers, that runs on every single Cloudflare location at the edge, with zero coldstarts, and can do the routing of the requests based on user’s location.
DIAGRAM TODO
Drawbacks

Consistency across these systems will be complex, or feature flags will need to be sharded in a way that every single query will not need to access another instance’s data.
Better latencies than previous options, but still can be slow for many locations. Otherwise, the cost will be high by deploying many of these server applications.
Management and deployment of all the instances of the application becomes heavier and more complex.
Each indepedent instance of the application needs to scale to handle all the requests of its “region”.

Option 5 - Database replicas
Similar to Option 4, but instead of having multiple application instances, we have a central application and multiple database replicas spread around the world.
There is only one application handling all writes, therefore consistency is trivial, and it replicates its updates across multiple other locations.
To satisfy our requirements, the replication delay cannot be more than 15 seconds.
Assuming that we have a smart edge layer, like Cloudflare Workers, we can directly access the closest database replica.
Therefore, all queries will be able to query the whole dataset.
DIAGRAM TODO
Drawbacks

Better latencies than previous options, but still can be slow for many locations depending on number of replica locations.
Management and deployment of all the datastore replicas is complex and costly.
Each replica datastore needs to scale to handle all the requests of its “region”.

Option 6 - Workers KV
If you search for feature flag implementations on-top of Cloudflare’s Developer Platform, most of the solutions are using Workers KV as their database.
Workers KV is Cloudflare’s low latency key-value datastore with global caching, and completely serverless.
There is nothing to manage, and you just write data into your database.
The internal architecture of Workers KV is very similar (at high level) to Option 5 above (read more in “How KV works”).
There are a few central datastores across the world replicating data between them, and the Workers at the edge access the closest one to the user’s requests.
Workers KV by default will cache the most read items in order to achieve very low latencies.
Drawbacks

Propagation of updates/writes across all locations can take up to 60 seconds.
Caching HIT only works for very popular items, therefore many queries will have to go to one of the central datastores.

Option N
There are infinite number of iterations to do to improve the system’s architecture, but we already covered most of the approaches implemented in practice.
The rest of the article will showcase how you can use the Cloudflare Developer Platform to implement an even better system, with much less overhead and complexity!
What if
Option 5 above is really good, almost great, but its drawbacks are important:

Better latencies than previous options, but still can be slow for many locations depending on number of replica locations.
Management and deployment of all the datastore replicas is complex and costly.

What if we could have as many copies of our datastore replicas without most of the complexity, hence removing the second drawback above.
Now that we removed the complexity of maintaining the datastore replicas, could we add more?
Instead of having 2-3 replicas, what if we had 10?
By having more replicas, we can get the data closer to more users requesting them, hence reducing the latencies and removing the first drawback.
We need to keep in mind though that the more replicas we have, the more updates we need to replicate across all of them, and the more complex it is to maintain them.
This is the main reason most common implementations of Options 4 and 5 above only maintain a small amount of replicas or independent instances of the whole application.
Durable Objects

Durable Objects (DO) are built ontop of Cloudflare Workers. Each DO instance has its own durable (persisted across requests) key-value storage, its own in-memory state, executes in single-threaded fashion, and you can programmatically create unlimited of them, each in its own location.
From my article describing Durable Objects.

Durable Objects make the “what ifs” reality.
You can have unlimited number of Durable Object instances without any maintenance overhead or significantly extra costs (see docs)!
You can place Durable Object instances in specific locations, among all of the supported regions (see docs)!
Each Durable Object supports the same code as any Worker, with the additional superpower that it has locally persistent durable storage.
This means that you can keep state across all requests handled by that object instance, and you can even use in-memory state to significantly speed up certain operations that can be cached in memory.
Finally, combining Workers that run at the edge (every Cloudflare location) with Durable Objects running close to the Workers, you have an amazing platform to spread your data and load across the world.
Unlimited sharding
The fact that you can control the location of a Durable Object instance, and that you can create as many as you want, with exactly zero extra effort or cost, enables some new system design approaches.
In traditional sharding approaches, you are limited in how many shards you create due to the high cost of each additional shard.
Durable Objects allow you to shard infinitely with tiny shards of a few bytes, a few hundred shards of bigger capacity, and everything in-between.
For example, in our feature flag service, we could have each Durable Object (DO) own any of the following aspects of our data.

Each DO instance owns a specific feature flag name (all its identifiers).
Each DO instance owns a specific feature flag name and a subset of the identifiers.
Each DO instance handles all feature flag requests originated from a geographical area (e.g. city level).

All of the above approaches have their own tradeoffs, how you replicate data among DO instances if necessary, how you coordinate reads and writes, how many of them do you need, etc.
The criteria we use to decide the sharding dimensions are:

Geographical proximity of the DO instance handling a request
Throughput we need to support per DO instance (limit of 1000 RPS per DO instance)

Option - DO 1
We have one Durable Object instance for each single feature flag and all of its identifiers.
This is similar to Option 1 above, albeit easier to maintain.
Option - DO 2
We have one primary Durable Object instance for each single feature flag and all of its identifiers that processes all writes and updates of that feature flag.
We also have one replica Durable Object instance placed in every single of the other locations Durable Objects are available.
Write flow

Edge worker receives the write request.
Write request is sent to the nearest replica DO instance.
Replica DO instance sends the write request to the primary DO instance.
Primary DO instance applies the write to its local durable storage, caches in-memory, and returns its latest value.
Replica DO instance applies the latest value locally.
Primary DO instance schedules an alarm (see docs) a few seconds after the write to broadcast the latest value to all replicas.
Primary DO instance wakes from the alarm and sends in parallel a request to all replicas. The replication across all replicas can also be done using a Gossip protocol (link to MARTIN KLEPMANNS youtube talk with this).

The gossip protocol or broadcast is not across thousands of nodes, it’s across a few tens, so it should be relatively fast and efficient.
Also, since we sharded our data a lot, the expectation (or assumption really) is that only a small minority of the features flags will have often updates that would need to run this replication flow.
Read flow

Edge worker receives read request.
Read request is sent to the nearest replica DO instance.
Replica DO instance returns its latest local value.
Edge Worker uses the CDN Cache to cache the value for a few seconds.

Drawbacks
If a region is partitioned from the rest of Cloudflare network, the replication protocol might not reach the replica DO instances in that region, hence it will be returning stale values.
Since all the reads will go to the same DO instance it’s always going to be returning the latest value it knows, and it will also respect read after writes for writes originating from that region.
Option - DO 3
We have one primary Durable Object instance for each single feature flag and the main identifier combination that processes all writes and updates of that combination.
For example, we have one DO instance for each feature flag + account ID combination.
We also have one replica Durable Object instance placed in every single of the other locations Durable Objects are available.
The flows are similar to Option - DO 2 but with extra sharding based on the account ID.
This allows us to scale out to much higher throughput, especially when we have accounts doing much more writes compared to other accounts, which is pretty much the case for every single SaaS out there.
Hot partitions
It’s not uncommon for a whale customer to saturate a system’s throughput limits and causing issues.
With the design of sharding at the feature flag + account ID dimension, we have a limit of 1000 requests per second since a single DO instance will own each combination.
If we have an account that wants to be making more than 1k/s writes, which is extremely weird if you ask me, then we would need to further shard that DO instance.
A common approach used in databases that we can borrow here is to “split” the combination into N instances, and we pick one of them to handle each write (usually using some hashing technique or randomly if possible), but on every read we have to query all N instances to get the full value.
Or we split by another identifier to further scale out, but this won’t always be possible.
Going deep into more complex scaling approaches will be addressed in a future article.
Durable Objects system design
DIAGRAM TODO
Worker
-> Edge Cache
-> DO replica
|
|
-> DO primary




Durable Objects (DO) — Unlimited single-threaded servers spread across the world
lambros@lambrospetrou.com (Lambros Petrou) — Wed, 14 Aug 2024 00:00:00 GMT
Table of contents

Not just for real-time collaboration
Workers intro
Durable Objects intro
Durable Objects and the Actor model
Durable Objects use-cases
Limitations
Pricing Durable Objects vs Workers
Conclusion

In this article I will showcase Durable Objects (DO), probably the most underrated compute and storage offering by Cloudflare.
I am not going to focus on how you use Durable Objects in code at all, since the goal is to explain why Durable Objects should be used more, but you can find code examples in the developer docs.

TL;DR
Durable Objects (DO) are built ontop of Cloudflare Workers (edge compute). Each DO instance has its own durable storage persisted across requests, in-memory state, executes in single-threaded fashion, and you decide its location. You can create millions of them!

If the above sentence is not clear, keep reading. You won’t regret it.
Not just for real-time collaboration
As of the time of writing, the Cloudflare developer documentation (see here) and the Durable Objects landing page (see here) describe DOs as follows:

Real-time, low-latency API coordination and consistent storage
Durable Objects provides a powerful API for coordinating multiple clients and users — helping you build collaborative applications while maintaining strong consistency of state. […]
Durable Objects provide a powerful API for coordinating multiple clients or users, each with private, transactional and strongly consistent storage attached.

In my opinion, even though the above is accurate, it focuses too much on the real-time collaboration use-case and throws off customers that are not actually working on real-time collaboration software.
That is a lot of people, including myself when I initially read about Durable Objects (DO).
I will showcase why Durable Objects are a great fit for a lot more use-cases than real-time collaboration.
Workers intro
Durable Objects are super-powered Cloudflare Workers (see docs [1] [2] [3]).
Workers are Cloudflare’s serverless compute platform that:

Doesn’t have cold starts.
Runs in more than 300 locations across the world.
Supports Javascript and any WASM-compiled language.
Has direct access to all of Cloudflare’s infrastructure (see all Runtime APIs).

You can read more on how Workers are implemented on-top of V8 isolates in the excellent “How Workers works” page.
The things we care about for this article is that Workers:

Are limited at 128MB of memory, therefore hard to do in-memory caching for anything significant.
Concurrently handle many requests on a single Worker instance (V8 isolate).
Are stateless across requests, and each request can potentially be routed to a different machine in Cloudflare’s network.
They can be destroyed and recreated whenever, therefore user code should have no assumptions about affinity (i.e. where it runs) or longevity (i.e. how long each worker instance runs).

Durable Objects intro
Now that we know what Workers are, we can better understand Durable Objects.
In my mind, Durable Objects (DO) are Workers with durable storage, but there are some important differences from Workers.
Firstly, let’s go through some key properties of DOs:

Built on-top of Workers, so they support exactly the same code (Javascript+WASM), with the same memory limits.
Each Durable Object instance stays alive for several seconds before hibernating, hence allowing in-memory caching to boost performance (see docs).
Each Durable Object instance has its own local durable storage that only that specific instance can access (read/write) (see docs).
Each Durable Object instance has an identifier, either randomly-generated or user-generated, allowing you to “select” which Durable Object instance should handle a specific request or action by providing this identifier (see docs).
They are not available in every location like Workers, but are still spread around the world (see the where.durableobjects.live project for live locations), and most importantly you can influence where each instance should be located (see Location hints docs).
They provide an Alarms API that allows you to schedule an execution of your Durable Object instance any time in the future with millisecond-granularity (see docs).
They are effectively single-threaded; when a request execution causes any side-effects (e.g. durable storage reads/writes) other requests to that specific DO instance are blocked until the request completes. Read more details on how this is implemented in the blog post “Durable Objects: Easy, Fast, Correct — Choose three”.
Each Durable Object “type” (or “binding” in Cloudflare terms) maps 1-to-1 with a Javascript class implementing the business logic. You can create unlimited instances of each Durable Object type.

There are more things to know, but I listed the key features I care about.
You probably already spotted some important differences between Workers and Durable Objects:

Fully stateless (Worker) vs In-memory state (DO)
No instance affinity (Worker) vs Affinity by an identifier (DO)
Always near the request (Worker) vs Ability to influence location (DO)
No storage (Worker) vs Durable storage (DO)
Concurrent execution (Worker) vs Single-threaded execution (DO)

Durable Objects and the Actor model
Another way of describing and thinking about Durable Objects is through the lens of the Actor programming model (🧍🏼‍♂️↔🧍🏻).
There are several popular examples of the Actor model supported at the programming language level through runtimes or library frameworks:

Erlang OTP used by Erlang and Elixir, probably among the oldest and most feature-rich Actor runtimes.
Akka for Java, Scala, and later .NET.
Microsoft Orleans for .NET.
more…

Each Durable Object instance can be seen as an Actor instance, receiving messages (incoming HTTP/RPC requests), executing some logic in its own single-threaded context using local durable storage or in-memory state, and finally sending messages to the outside world (outgoing HTTP/RPC requests or responses, even to another Durable Object instance).
The Actor model simplifies a lot of problems in distributed systems because it abstracts away the communication between actors using RPC calls that could be implemented on-top of any transport protocol, and it avoids all of the concurrency pitfalls you get when doing concurrency through shared memory (e.g. race conditions when multiple processes/threads access the same data in-memory).
The most astonishing feature of Durable Objects, and the super differentiator from the above Actor frameworks, is that with Durable Objects you get all of the above, built-in, while having a huge distributed network to spread your actors across. 🤯
When I was at WhatsApp a few years ago, I was really amazed by how the Erlang runtime simplified and solved many distributed systems problems for us. Each chat (e.g. between 2 people) was handled by a single Erlang process (a single actor), and all messages of that chat were routed to that specific process using the Erlang runtime primitives for routing. This meant that all the logic for each chat was running in a single-threaded process with normal sequential code. Easy to reason about. We also had tens of thousands of servers across the US and Europe connected in a mesh so that the Erlang runtime can route messages across them.
With Durable Objects you get all of that 💪🏼 But with orders of magnitude more power and much less effort than if you had to build all this infrastructure yourself!
Your system now comprises of Cloudflare’s whole network. Even though Durable Objects are not in every location of the CDN like Workers, they are in more than 25 cities across the world as of today (see where.durableobjects.live) and they will keep expanding over time.
As mentioned above, you can create unlimited instances of each Durable Object type (or binding), therefore if you design your Actor model well, you can scale across the whole of Cloudflare’s edge network.
Durable Objects use-cases
Now that we know all about Durable Objects, let’s see how they can be used to simplify your architecture.
As a guideline, if you have any use-case where there is a clear boundary between some or all of your resources, you can use Durable Objects to simplify your logic when processing and storing each of those resources.
It can be per user, per chat channel, per factory warehouse, per object storage bucket, per workflow, per tenant in a multi-tenant SaaS, etc.
If all your database queries have a WHERE resourceID = 'abc' in order to restrict your queries to the correct subset of your data with that resource identifier, then Durable Objects could potentially simplify your life. It always depends on the concrete use-case obviously, but this is the starting point to see if they can benefit you.
Real-time collaboration
If it’s not yet clear, the reason DOs are awesome for real-time collaboration, is that you need “total order of events” such that all the “actors” collaborating end up in the same final state.
Chat systems
Each chat channel (e.g. Lambros talking with Ben) is one DO instance, therefore all messages go through the same instance and processed sequentially.
The volume for a single channel is relatively very low, but the volume of all messages across channels is usually humongous.
This is perfect for Durable Objects, since you can have unlimited DO instances spread across the world (closer to each chat channel’s members).
Imagine that the Durable object instance responsible for the chat between Lambros and Ben is in London/UK, but the instance for the chat between Paul and Jasmine is in Portland/US.
Online document live-editing
Each document is one DO instance, therefore all edits of the document are processed sequentially by the same DO instance and stored in the instance’s durable storage.
With unlimited Durable Object instances, you essentially support unlimited number of documents.
Multi-tenant SaaS
Imagine you are building a restaurant booking system, or any other booking system.
You need to process bookings in-order to avoid double-booking tables, and usually you do this with database transactions.
If your volume is low, then a single-server database is fine, and you should go with that. But what if you want to grow and expand your service to thousands of restaurants, or even across the world and you want to be local to each of those regions (latency, jurisdiction regulations, etc).
Durable Objects can help you. Each Durable Object instance can handle all bookings for a single restaurant, and you can have unlimited number of DO instances hence support for unlimited number of restaurants.
CI/CD pipelines
Imagine you are building a Continuous Integration (CI) and Continuous Deployment (CD) service.
You need total ordering not only across pipeline executions, but also durable state persistence for each individual execution in case they can be paused and resumed later or just to keep track of progress and reference artifacts.
If you want to have parallel independent executions of a pipeline, each pipeline execution can be represented by a Durable Object instance that keeps track of each execution, and one instance that just keeps track of all the execution IDs and maybe coordinate them (e.g. restrict how many parallel executions you can have).
Alternatively, if you want to restrict only one execution at a time for each pipeline stage, but want to allow multiple executions to exist in different stages of the pipeline (similar to Amazon’s Pipelines) you can model your pipeline as a Durable Object instance that keeps track of all the pipeline executions of the pipeline.
The actual modelling depends on your requirements, but you can see where this is going.
So many more
I cannot enumerate every single use-case out there…
The gist is that if you can think of a nice boundary across your resources, there is a high chance for Durable Objects to be very useful to you.
In Cloudflare we use Durable Objects for a lot of the internal and external products and more are adopting them every day.
Limitations
You can find all the limits in the Durable Objects documentation, but the main things to keep in mind are:

Throughput of each Durable Object instance (see docs).
Total storage per account (can be raised by contacting support).
Size limits for the storage’s keys and values.

Most Worker limits apply too (see docs).
Pricing Durable Objects vs Workers
Just pointing out that pricing for Durable Objects is different than Workers.
Workers are charged purely on CPU time (see docs), whereas Durable Objects are charged on number of requests and duration (see docs).
As mentioned previously, Durable Objects can stay alive and have in-memory state, hence why the duration dimension is taken into account for pricing.
Almost every serverless compute provider is also charging by duration so it shouldn’t be surprising, but something to keep in mind when comparing with Workers, famous for CPU-only pricing.
Conclusion
Durable Objects (DO) are a super powerful tool.
When they fit a use-case, they can simplify so many things while keeping the developer experience great and operational overhead low.
If you can think of a nice boundary across your resources or users, there is a high chance that Durable Objects will be very useful to you.
Each Durable Object instance is identified by a user-provided key, has its own throughput limits, its own durable storage, its own in-memory state, executes in single-threaded fashion, and you can influence its location if necessary.
Oh, and BTW, there is nothing blocking you from using Durable Objects for the resources that fit that model, and another storage product (e.g. D1 SQL database) for your highly relational resources.
You can find a few select Durable Object examples on the developer docs as well.
If you have feedback about the article, or Durable Objects, feel free to reach out 😉



Brag document and folder — feels good only
lambros@lambrospetrou.com (Lambros Petrou) — Sat, 27 Jul 2024 00:00:00 GMT
Table of contents

Many definitions
Is this… the Worklog?
Brag document and folder
Conclusion

In this article I explain what a “Brag document” is, how it relates to my own “Worklog document”, and how I use them both.
Many definitions
A Brag document has many names, depending on who you ask, and it can also contain a variety of content, again depending who you ask.
Julia Evans was among the first that popularized the term “Brag document”. Her post is really good, go read it too (after this one though 😅).
Others also wrote their own versions, mostly similar to Julia’s post.

“Keeping Track of Your Accomplishments with a Brag Document” by Jeff Humble
“An Essential Tool for Capturing Your Career Accomplishments” by Jessica Ivins
“Why you Should Keep a Brag Document” by Andy Budd

Finally, several folks often discuss about this type of documents on social media, e.g. this Reddit post titled “Keep a Brag Document”.
A Brag document is many things, but the common definition is:

Listing what you do over time at work, tasks, problems solved, etc.
Making it easier to find your achievements for promotion docs, interview dicussions, etc.
And can use it as a source of happiness and feeling of achievement.

Is this… the Worklog?
Several years ago, I wrote an article describing the “Worklog document”, my own kind of document I was keeping throughout my career.
If you read my article, and compare it with the articles referenced above, you will see a huge overlap.
I didn’t know the term “Brag document” back then, but they look awfully similar.
This is why I use the term “Brag document” a bit differently than the above.
Brag document and folder
I personally use a Brag document just for the good stuff, the achievements, and the praises.
It’s not a list of things I do over time, it’s not a list of projects I delivered, and it’s not the number of oncall incidents I managed to resolve.
It is complementary to my Worklog.
It has screenshots of chat messages from colleagues thanking me for something.
It has quotes from others praising my work, in chat, in announcements, in wikis, or given as part of performance review feedback.
It has awards I have won, or significant milestones in my career.
It is a document, and literally a folder of screenshots.
You can see some of these screenshots in my 1:1 Tech Interviews page, with colleagues praising my article about software engineering interviews and asking if they could use it themselves.
A quote from my last performance review feedback can be found also in my article “Ownership - High agency - Manager of One” (go read it and find out which one 😎).
Conclusion

This Brag folder and document is literally the place where you can go and just feel nothing but good about yourself, your work, and your achievements.

Everyone should keep a Brag document, regardless of level or industry, from Juniors to Distinguished Engineers, to doctors, to teachers.
All people forget. Colleagues and managers, and even you, will forget what you did in the past.
This is your way of persisting the great things you do forever.
Trust me, your future self will thank your current self for doing this.
Do good work, and keep updating your Brag document and folder. 🥳



Skybear.NET Scripts landing page and Business plan - Changelog 2024-07-21
lambros@lambrospetrou.com (Lambros Petrou) — Sun, 21 Jul 2024 00:00:00 GMT
Table of contents

Run types
Landing page
Business plan
Continuous Integration use-case
Conclusion and feedback


Skybear.NET is a managed platform to automate your HTTP API synthetics testing using Hurl.dev plain text scripts.
Use it for testing your HTTP APIs on-demand or periodically, or use it as a complex orchestrator for a sequence of HTTP requests that need to be executed in order using data from previous ones at specific times of the day.
The past two months was a lot of work, but also a few weeks of holidays to recharge.
Let’s dive into the changes that happened.
Run types
In the last update (see post) I introduced the script triggers, specifically the scheduled cron trigger.
This type of trigger allows you to configure a script to run periodically every few minutes.
I also introduced the historical script runs listing page where you can see all the invocations of your script, either manually triggerred through the UI, or due to the scheduled cron trigger.
For this changelog, I did a small user-experience improvement to denote the type of each run.

  
  Script run invocation type in list runs page.


A small indicator now exists in the scripts list page that is awesome in quickly checking out which of your scripts are scheduled to run periodically, and how often.

  
  Script cron trigger configuration in scripts list page.


Landing page
I finally launched the product’s landing page. 🥳 Check it out at https://www.skybear.net/.
I spent some time working on the landing page’s copy, trying to emphasize what Skybear.NET is about, what you can use it for, and why it’s not simply yet another uptime checker.
This is only the beginning though, and I have a lot of changes coming up soon that will enrich the landing page even more.
Take a look, and let me know if I can improve something!

  
  Skybear.NET product landing page above-the-fold.


While working on the landing page, I used several resources as guidance, especially a few products I love using or following (e.g. Tailscale, Tailwind CSS, Postmark, and anything from Basecamp).
You can spot some of their influence in my landing pages 😅
The most helpful resource though, which I recommend to anyone working on landing pages to read, is this guide by Julian Shapiro: Resource:
Landing Pages
Go ahead and read it, now! Thank me later 👌
Business plan
After a few weeks of holidays, and several days of work, I released the Business pricing plan for Skybear.NET.
I spent more time than I wanted reading about pricing, checking out how competitors price their plans, how multiple other relevant and irrelevant products do their pricing, and went through multiple versions for my own.
A few months ago, I actually wrote my thoughts on SaaS pricing, so I had that as my guarding rails as well.
I decided for now to go with a usage-based pricing using the total number of step requests across all your scripts, combined with a volume-based bulk discount approach.
Instead of going fully usage-based where you pay only what you use, I decided to use 7 price points that I consider sufficient for companies and teams at different growth levels, and apply them a discount as you go higher.
Right now the prices available start from 15 USD/month and go up to 1100 USD/month, with an annual plan offering a whole of 2 months for free.
There is also a completely FREE tier, albeit much restricted, that someone can use to try out the platform before switching to the recurring monthly or annual plans.

  
  Skybear.NET product pricing plans as of 2024-07-21.


Are these the final and only plans forever? … Probably not…
I would lie to myself, and you, if I said pricing won’t change (higher or lower) once I start getting customer feedback, or experimenting a bit more with it, adding more features, etc.
Please try out the platform, and let me know if you have suggestions on pricing.
Continuous Integration use-case
The core platform features, including the landing page, pricing integration and respecting the plan limits, are now implemented.
In the next couple of months I am going to focus on expanding the platform’s feature set around Continuous Integration (CI).
Many teams and companies use Synthetic API Tests to verify their code changes before shipping them in production.
There is a big userbase using Hurl.dev scripts or similar tools like Bruno, and many of these teams run their tests in CI jobs in addition to their local tests during development.
I would like to dive into this market base and provide features that would entice these teams to use their existing scripts and test files with the Skybear.NET platform in order to get more visibility into their tests, and other value-adding features.
There is already a big list of ideas I am brewing and will start working on over the next couple of months, but if you are using any of these tools and have specific needs for what you need out of a platform to run your tests, please reach out.
Conclusion and feedback
Skybear.NET can already be used for real-world use-cases. 🚀
I use Skybear.NET to test Skybear.NET!
Check it out, try to run your own scripts on-demand or periodically, and let me know if you have specific feature requests.
If you have any questions, email me, or reach out at @lambrospetrou. 🙏🏼



Skybear.NET Scripts response bodies and cron triggers - Changelog 2024-05-28
lambros@lambrospetrou.com (Lambros Petrou) — Tue, 28 May 2024 00:00:00 GMT
Table of contents

Full CRUD
Hurl 4.3.0
Response outputs
Historical script runs
Script triggers - Scheduled Cron
Conclusion and feedback


Skybear.net Scripts is a managed platform to automate your HTTP website and API tests using Hurl.dev scripts. I like to call them HTTP workflows.
Use it for testing your HTTP APIs periodically, use it as a website uptime checker, or use it as a complex orchestrator for a sequence of HTTP requests that need to be executed in order using data from previous ones at specific times of the day.
Let’s dive into the changes of the past few months.
Full CRUD
In the last update (see post) I introduced the management of Skybear.net scripts: creating, updating, and listing.
A few days later, the delete functionality was completed and rolled out as well (see tweet).

  
  Delete script action.


Hurl 4.3.0
I updated the Hurl version used for running the scripts from 4.1.0 to 4.2.0 (see tweet - see changelog), and a few weeks later again to 4.3.0 (see tweet - see changelog).
Staying up-to-date with hurl means more features and bug fixes to all Skybear.net users.
Response outputs
Starting with Hurl 4.3.0, there is a new output:  option for each “entry” in the Hurl script, denoting the filename into which to save the full response body of the request.
You can now have full end-to-end introspection of the request and its response, including status code, headers, and body.
As an example, the following snippet will query the httpbin.org/headers endpoint, and save its response into the headers.json file.
GET https://httpbin.org/headers
[Options]
output:headers.json

The screenshot below shows a more complicated workflow involving 4 HTTP requests, each with its own resource output file.
Clicking any of the four filenames (see Output resources section) opens the corresponding file as-returned in the original response.

  
  Multiple requests saving their responses using the `output` option.


Historical script runs
Skybear.net scripts are now powerful enough to cover complex scenarios and use-cases.
Many times, it’s useful to access past runs of a script and examine its results, either for comparing with a more recent run, or just as a reference during incident investigations.
You are now able to access the full script run reports from the past 30 days.
This retention period will be configurable with upcoming paid plans (get in touch if you have special needs).

  
  View past script run results actions.


Script triggers - Scheduled Cron
All the above improvements led to the new script triggers feature.
Triggers are numerous mechanisms leading to an execution (aka “run”) of your scripts.
Last week I released the first among many, the scheduled cron trigger.
You can now specify how often to run your scripts, with 1-minute granularity.
For now, you cannot configure a script to run more often than every 10-minutes, but upcoming paid plans will allow you to run scripts as often as every few seconds! 🤯
This feature was built ontop of Cloudflare Durable Objects allowing for fine-grained second level granularity which provides plenty of flexibility for future extensions!

  
  Cloudflare Durable Object logs for Skybear.net scripts cron triggers.


Soon, there will be a built-in notification mechanism as well to notify you when the script fails to successfully complete execution.
Check out the video below for a showcase of configuring a scheduled cron trigger.

  
  Video showcasing the Skybear.net Scheduled Cron Triggers feature.


Conclusion and feedback
Skybear.net Scripts can already be used for real-world use-cases. 🚀
I use Skybear.net to test Skybear.net!
If you have any questions, email me, or reach out at @lambrospetrou. 🙏🏼



Ownership - High agency - Manager of One
lambros@lambrospetrou.com (Lambros Petrou) — Sun, 07 Apr 2024 00:00:00 GMT
Table of contents

Definitions
Applicability and importance
Conclusion

Definitions
What is holistic ownership of a project?

Holistic ownership. Person ABC, in the projects we have worked together, has excelled at holistic owning the problem (beyond specific projects or tasks). This is particular important when dealing with large scope problems where stepping back and looking at the overall picture to propose next steps is very important.

How important is ownership in leadership?

On any team, in any organization, all responsibility for success and failure rests with the leader. The leader must own everything in his or her world. There is no one else to blame. The leader must acknowledge mistakes and admit failures, take ownership of them, and develop a plan to win.


Leaders are owners. They think long term and don’t sacrifice long-term value for short-term results. They act on behalf of the entire company, beyond just their own team. They never say “that’s not my job”.

The 37signals folks advocated for hiring managers of one back in 2008.

A manager of one is someone who comes up with their own goals and executes them. They don’t need heavy direction. They don’t need daily check-ins. They do what a manager would do — set the tone, assign items, determine what needs to get done, etc. — but they do it by themselves and for themselves.

Having ownership and being a manager of one is very closely related to “high agency”.

High Agency is a sense that the story given to you by other people about what you can/cannot do is just that - a story. And that you have control over the story. High Agency person looks to bend reality to their will. They either find a way, or they make a way. Low agency person acepts the story that is given to them.


When you’re told that something is impossible, is that the end of the conversation, or does that start a second dialogue in your mind, how to get around whoever it is that’s just told you that you can’t do something?


High Agency is about finding a way to get what you want, without waiting for conditions to be perfect or otherwise blaming the circumstances.

All the above, except one (😉), are definitions given by people for ownership, high agency, and being a manager of one.
There are many other terms referring to the same thing.
Some companies like Amazon put this in their leadership principles and company values.
Others, put this as an explicit dimension in their performance evaluation criteria (most of the tech companies at least).
This particular skill, owning some problem end-to-end, is a key ingredient for success in business, family, and personal development alike.
Applicability and importance
Throughout my career, I have been putting a lot of effort in developing my ownership skills.
I always try to understand a problem holistically, think about problems and edge cases, design a solution, and ultimately make sure the solution rolls out successfully.
Does this apply to tech jobs though, or only software projects? Absolutely not!
Actually, one of the previous quotes is from Jocko Willink’s book “Extreme Ownership: How U.S. Navy Seals Lead and Win”.
He explains how being a leader with ownership within the US Navy SEAL forces is crucial. And he was literally in “life or death” situations.
Check his talk below for a short glimpse of his perspective on extreme ownership.




Being a person with the quality of owning whatever you do, having high agency to being able to navigate ambiguities and the unknowns, is so crucial that everyone can benefit from.
Are you working in tech? Are you a surgeon? Are you an investor? Are you a trader? Are you just a parent struggling with the kids?
You need to be able to identity problems and understand them end-to-end.
You need to be able to find solutions, and ensure they are implemented.
You need to be able to acknowledge mistakes and take responsibility in order to improve things going forward.
Everyone has colleagues that are just waiting to be told what to do all the time, without taking any initiative or having any bias for action.
Many middle managers usually hide within the bureacracy of big organizations, acting as dumb passthough proxies from reports to superiors and vice-versa.
Most times, even doing a horrible job at that, when they filter out the wrong things.
Never owning any of the problems hurting the team and never taking actions to improve things.
There are also folks that can never acknowledge they are wrong.
Everyone makes mistakes. That’s how we learn as a species. 
Having ownership, means you own the mistakes too.
It doesn’t even have to be that you individually made the mistake, but that someone on your team did.
If you are leading a project or a team, and turns out that better guidance or better communication could avoid the mistake, then you share the mistake.
A leader should take responsibility, and find ways to improve a bad situation going forward.
Conclusion
Fun fact: As I hinted earlier, all the quotes at the top are definitions I found from online sources, except one.
One of the quotes is verbatim feedback I copied from one of my performance evaluation reviews. Can you figure out which one? 😉 Shamelessly self-bragging.
Be an owner. Have high agency. Be a Manager of One.

References

High Agency: what is it, why it is important, and how to cultivate it
High Agency twitter thread by George Mack
High Agency mental model
Hire managers of one




SaaS Pricing - What I want vs What I offer
lambros@lambrospetrou.com (Lambros Petrou) — Sun, 18 Feb 2024 00:00:00 GMT
Table of contents

Context
What I want vs What I offer
Usage-based pricing
Per-seat - Recurring pricing
Conclusion
References


Pricing for a SaaS, or online digital products in general, is a hot topic for debate. All the time.
In this article I explain the pricing model I like as a customer of a SaaS product, and then compare that to the model I would (will) offer in my own SaaS product.
Are there any differences? Why?
Context
Last year I spent several months contemplating starting a PaaS business, a managed infrastructure platform. 
I decided not to in the end, but I decided to write about several things I noticed in an article.
One of the aspects I researched was the pricing models different PaaS businesses used (see Pricing section).

Pricing is a huge thing for any company. Pricing defines, and filters, the customers a company will attract. Do you want the Enterprises, charge thousands. Do you want the indie developers, give stuff for free. The pricing model is what will make a company profitable, or kill it.

The above is very true, and drives the pricing model of a company. 👆🏻
I then elaborated on what I prefer as a user too. 👇🏻

[…] for me, as a customer, I like it more when pricing starts from zero ($0) when I don’t consume any resources, and increases as I use the platform more.

Who would have guessed… Someone that wants to pay as little as possible.
Not really though! I want to pay as close to nothing as possible, when my use of a product is zero.
When I use a product, I want to pay proportionally to that usage.
And then wrote the following two-fold model as my ideal pricing model:


I thought about pricing a lot. The model I would choose if I ever built it would be two-fold:
 

Provide usage-based pricing that starts from $0, but with a higher price-per-unit.
Provide per-seat or volume-discounted pricing for bigger companies.

With the above two-fold model, anyone can start using the platform to make sure it works for them, pay for their usage along the way, and once they settle to use it they could switch to the second pricing plan. Some prefer predictability, some prefer usage-based, so why not both.



I read a lot of pricing material over the past few years (see #References section below for some of my favourites).
There are people arguing in favor of anything and everything.
What I want vs What I offer
The above two-fold approach is my personal favourite model, as a consumer, and (I think) as a business.
I have seen folks that complain about fixed high prices for a product, or talk negatively about too many recurring subscriptions, for them as an individual consumer.
But then, their own business only offers those too.
If your own business is not offering to others what you like to get, why do you expect other businesses to do so.
I am just starting a paid product business (see Skybear.net Scripts), and its pricing will be two-fold as above.
I am putting my skin in the game, or if you prefer, I will be walking my talking.
I want to provide a usage-based/credit-based pricing plan for those that don’t want recurring monthly charges.
I also want to offer bundled/tiered plans for those that prefer predictable per-month/per-year charges.
Doing this without confusing users and leading them to comparison math is the interesting bit. To be explored.
Hopefully this article will not age badly 😁
Usage-based pricing
There are several pricing models that I put into this category, including (and not only):

prepaid credits (prepay for X credits, and spend them whenever)
usage metered / pay as you go (pay at the end of the month for how much you used)
percentage commission (pay a % of some revenue to the service provider)
more…

Why do I like usage-based pricing as a customer?

Pay as little as possible when I don’t use the product.
If my use is spiky, I don’t want to pay the peak all the time, or getting throttled when the spike happens because I am paying the lower plan.
I can use the product in the same way, for a toy project, as well as a huge for-profit project.

Drawbacks of usage-based pricing?

Harder to meter and track, for the provider. Need to find the right dimensions to bill for, without pushing customers to weird usage of the product when cost-cutting.
Harder to predict, for the customers.

Examples of usage-based pricing models:

Amazon Web Services (AWS)
Although AWS takes it to the extreme often, and makes it impossible to actually calculate your bill. They don’t have to. They choose to.
AWS CodePipeline charges per active pipeline (v1), or per execution minute(v2).
AWS Lambda charges per execution GB-second.
Amazon S3 charges per GB-stored, and per request.


Upstash
Entirely usage-based on a dimension that makes sense per product (e.g. number of commands for Upstash Redis)


Stripe
Commission on every $ they process for you.



Per-seat - Recurring pricing
Pricing models that I put into this category, including (and not only):

Per-seat price per month (usually $$ per employee/user)
Tiered plans with fixed charges per month
Bundled volume-based plans with fixed charges per month
Recurring subscription-based with fixed charge per month
more…

The common aspect in these models is that you choose among a few predefined plans, and then you pay that much per month or per year regardless of how much you used the product.
These plans often have an additional dimension for overages in order to cover unexpected usage that exceeds the allotment of the chosen plan.
Why do companies like to pay or offer predictable/fixed pricing?

Easier to reason and plan for, considering some providers (e.g. AWS) make it very complex to understand your bills.
Per-seat plans are often more manageable to manage in terms of licensing, control it over time depending on active employees, etc.
MRR is trendy.

Drawbacks of recurring pricing?

See benefits of usage-based pricing above.

Examples of recurring pricing:

Tailscale
Per-seat charge per month according to plan.


Google Workspace
Per-seat charge per month according to plan.


Amazon Prime
Per year fixed fee.



Conclusion
Both pricing approaches have pros and cons.
I like the flexibility of usage-based pricing, and at the same time I understand that the predictability and easier reasoning of fixed recurring pricing is preferred by larger businesses.
The fixed bundled per-month plans with overages is a hybrid actually, but usually its high floor pricing doesn’t bring the benefits of usage-based pricing when your usage is lower than those.
In my opinion, we should do both.
I like for example how Upstash does it (as of now at least), with usage-based pricing starting at $0, and then becoming more cost-effective to switch to bundled/fixed pricing per month once your usage exceeds a certain threshold.
References
This section lists some of my favourite material for product pricing.

SaaS pricing models 101: your options and how to pick the right one by Stripe
SaaS pricing: models, strategies, and examples by Paddle
The SaaS business model by Stripe, Patrick McKenzie (patio11)
Pricing low-touch SaaS by Stripe, Patrick McKenzie (patio11)
What I Would Do If I Ran Tarsnap by Patrick McKenzie (patio11)
Pricing determines your business model by Jason Cohen




Skybear.NET Scripts private user accounts - Changelog 2024-02-18
lambros@lambrospetrou.com (Lambros Petrou) — Sun, 18 Feb 2024 00:00:00 GMT
Table of contents

Rebranding to skybear.net
User accounts
Script management
Coming very soon
Conclusion and feedback


Skybear.net Scripts is a managed platform to execute Hurl.dev scripts.
I like to describe Hurl scripts as simple workflows that orchestrate a sequence of HTTP requests, while doing response assertions and transformations.
I blogged about the silent first release of an earlier version of the tool two months ago (see article), coined as Hurl Webscripts.
Rebranding to skybear.net
First update since that article, is that now the platform is hosted under the Skybear.net brand.
I own several top-level skybear domains, and this is perfect use-case for them 😅
The previously released https://webscripts.lambrospetrou.com now redirects straight to the Skybear.net Open Editor, so no existing shared links broke during this migration.

  
  Skybear.net Scripts


User accounts
This was a beefy release and took several weeks to rollout publicly.
It was in production behind feature flags with me the sole user though for a while now.
The main reason is that it introduces user accounts, and everything from now on revolves around these accounts in terms of ownership, storage, and access control.
Previously you could only work with the Open Editor and sharing your scripts with others putting the whole script source code into the URL as params.
This was fine as an initial release, but in the end it’s not easy to iterate on a script, make improvements, and save those changes if you need to always get a new URL to store in your own records.
Now, with private scripts in your account, you can create as many scripts as you want, edit them and save them as many times as necessary to get them right, and they are persisted on the platform so you don’t have to worry about keeping track of your scripts.
It took a bit of time to fully roll this out cause I wanted to take into account future features too and implement it properly. Few things coming over the next couple of months include SSO authentication, teams and organizations, inviting users to read or author your scripts with you, and lots more.
It’s finally out though! 🎉 New features will now ship in a more regular cadence.

  
  Skybear.net Scripts sign in page.


Script management
The script management story is very simplistic at the moment, but complete, in the spirit of SLC (Simple, Lovable, Complete).
You can create scripts, update their source code, execute them, see the execution results, and finally list all your scripts.
Deleting scripts is the obvious miss in that list, but don’t worry. It’s almost done, just needing a few final touches and it’s out next week.
I have many other plans for script management, but they will be added incrementally. I don’t want to release features that make the product feel incomplete when directly related features are missing, so I am bundling them in batches.
As the SLC article above explains, it’s better to develop a bicycle in v1, even if your v5 will be a car.
You can already get value out of Skybear.net scripts without feeling out-of-place.

  
  Skybear.net Scripts listing page.


Coming very soon
User accounts and script management, were the two main features prepping the ground for everything else to follow.
I am so glad this release is finally out, and I can now gradually add features, much faster!
Some of the short-term features that are either already in-progress or will be soon:

Upgrade the Hurl execution engine to v4.2.0 from v4.1.0.
Support for names and descriptions per script for easier identification, in addition to the existing generated IDs.
Include all the HTTP responses of the script execution in the results, giving you complete visibility.
Add many more examples to make it easy to author Skybear.net scripts. (There are some fancy features I am brewing in this topic long-term…🤐)

Too many nice things coming up! ✨
Conclusion and feedback
Skybear.net Scripts is now ready for more complex Hurl workflows.
I already experimented with some workflows doing more than 300 HTTP requests (😮), asserting the responses, and making sure everything works correctly.
I want to see your own scripts, your own workflows, and your own use-cases.
I am eager to hear your feedback and feature requests.
If you have any questions, email me, or reach out at @lambrospetrou.


  
  Skybear.net Scripts release LinkedIn post on 2024-02-18 (see post).




Hurl Webscripts v0.0.1
lambros@lambrospetrou.com (Lambros Petrou) — Thu, 28 Dec 2023 00:00:00 GMT
Hurl Webscripts is a managed platform to execute Hurl.dev scripts. Hurl scripts do a lot, but to keep it short, they send HTTP/cURL requests, do assertions on the responses, and pipeline consecutive requests using previous responses.

Test it at https://webscripts.lambrospetrou.com

I don’t want this to be a generic compute platform (e.g. AWS Lambda). I do want it though to grow into more use-cases, and if all goes well and according to plans, there are lots of exciting directions.
See a small showcase of the tool as of today. This is mostly for my historical records 😅




v0.0.1
I just released the very first version of the platform a few days ago (see tweet).

  
  Tweet of the initial version v0.0.1 release.


It only allowed you to type a Hurl script, run it, and see the generated reports as-is from the Hurl CLI. Albeit with some modifications and cleanup to make them to render well on my website.
A few improvements quickly followed up.
Shareable links (see tweet) allowing you to share the authored script with anyone so that you can share your masterpieces with more folks, on social media, or in your team’s Slack channels.



A new network requests timeline visual (see tweet) explaining the time each network operation took for every HTTP request of the script. You can clearly see how much time was spent in DNS Lookup, TCP Handshake, SSL handshake, waiting for server response, and transferring data.

  
  Network requests timeline visualization.


All code editors, at some point introduce some kind of split screen mode with multiple panes to allow for more things to be shown at once.
I implemented such side-by-side mode (see tweet), with the Hurl script code being on the left side, and the output on the right side. Size of each pane is configurable, and my underlying implementation supports arbitrary number of panes on each axis, so I will be doing a lot more enhancements in the layout department over time.

  
  Side-by-side mode in mobile landscape.


Going forward
Now that the first initial version is up and running, I am able to ship smaller incremental improvements continuously.
I have lots of ideas about features and different major directions this platform could go.
I will try a few things and see what will stick with the market.
There are some features I will be implementing either way:

Support for more Hurl options (see docs).
AI assistant and plenty of examples to help you write Hurl scripts.
Scheduled execution of scripts (e.g. run this every Tuesday at 09:00 and notify “this” email).
Unique invoking endpoint that will execute the script by just calling that endpoint.
Managing a collection of scripts and running them all together.
Source control versioned scripts (e.g. Github integration).
and many more…

I am not sure yet which major direction I will pick, but some areas I want to explore include:

Webhook integrations
Website and API scraping
API end-to-end testing
Adhoc HTTP testing
more to discover…

I am having lots of fun with this platform, and the more I build, the more fun it gets.
Each feature I ship enables more use-cases, and makes the user-experience better, and this gives me joy.
Looking forward to productizing this soon! 💪😉



How to find listening and open ports in Linux
lambros@lambrospetrou.com (Lambros Petrou) — Tue, 05 Dec 2023 00:00:00 GMT
In many cases I want to check information about the network ports on my system, either to see what is using a specific port, or to see which ports are listening for requests.
Option 1 - lsof
lsof -nP +c 15 | grep LISTEN

The command lsof stands for “list open files”. It provides information about files and processes that are currently open on the system.
The following options to lsof make the command faster by avoiding DNS and service lookups. 

-n prevents the conversion of network numbers to hostnames.
-P prevents the conversion of port numbers to service names.

The option +c 15 specifies the number of seconds to wait before giving up on network-related operations.
The | grep LISTEN pipe command filters the output of lsof for lines containing the word LISTEN, which indicates processes that are listening for incoming network connections.
Option 2 - netstat
netstat -tuln

The command netstat stands for “network statistics”. It provides information about network connections, routing tables, interface statistics, masquerade connections, and multicast memberships.
The command netstat -tuln is used to display information about active network connections and listening sockets.
The options -tuln to netstat mean:

-t specifies that only TCP connections should be displayed. It filters the output to show only information related to TCP protocols.
-u specifies that only UDP connections should be displayed. Similar to the -t option, it filters the output to show only information related to UDP protocols.
-l stands for “listening” and instructs netstat to display only listening sockets, which are endpoints for incoming connections.
-n prevents the conversion of numeric addresses to symbolic hostnames. It speeds up the command execution by avoiding DNS lookups.

Appending the | grep http pipe command filters the output of netstat for lines containing the word http, which indicates processes that are listening for incoming HTTP connections.
References

https://www.redhat.com/sysadmin/netstat




How to find your public IP
lambros@lambrospetrou.com (Lambros Petrou) — Mon, 04 Dec 2023 00:00:00 GMT
Open https://checkip.amazonaws.com/.
Use the following to get it programmatically:
curl https://checkip.amazonaws.com




Template for writing technical RFC docs
lambros@lambrospetrou.com (Lambros Petrou) — Sat, 18 Nov 2023 00:00:00 GMT
Table of contents

A. Get the template
B. Sections overview
B. Template preview

A. Get the template
Download or copy the full template at this Google Document. Use it as you wish 😉
Once you open the template:

To use it with Google Docs: File (top left) > Make a copy
To use it with Microsoft Word: File (top left) > Download > Microsoft Word (.docx)

B. Sections overview
1. Title and reviewers
At the top of the template, you should have a clear title of the proposal.
I usually prefix it with RFC - </code> to make it easier to search later in Google Drive.</p>
<p>After the title there is information about the state of the RFC, followed by required and optional reviewers status.</p>
<p>The state of the RFC can be one of: <code>Draft</code>, <code>Under-review</code>, <code>Approved</code>, <code>In-progress</code>, <code>Completed</code></p>
<p>The first three states are used during the review of the RFC, and once approved we proceed to the next two states which refer to its implementation and rollout.</p>
<p>The list of reviewers should include:</p>
<ul>
<li>Name or email of reviewer</li>
<li>Team name of reviewer</li>
<li>Review status: <code>not-reviewed</code>, <code>in-progress</code>, <code>approved</code>, <code>declined</code></li>
<li>Review date (date when the final decision was taken)</li>
</ul>
<figure>
  
  <figcaption>RFC document title and reviewers section.</figcaption>
</figure>

<h3 id="2-table-of-contents"><a href="#2-table-of-contents">2. Table of Contents</a></h3><p>I always include a table of contents to make it easy to jump to sections, and give an overview of what’s in the document.</p>
<p><strong>Note:</strong> In Google Docs, make sure you “refresh” the table of contents widget after adding/removing sections in the document because it doesn’t auto-update. You can refresh it by clicking anywhere inside the table of contents, and clicking the “refresh” icon at its left side that will appear.</p>
<h3 id="3-overview-and-context"><a href="#3-overview-and-context">3. Overview and context</a></h3><p>In this section you should describe the problem the RFC is addressing, and provide necessary context so that people unfamiliar with it can understand the benefits of implementing the RFC.
There can be sub-sections to give more detailed information.</p>
<h4 id="glossary-and-terms-optional"><a href="#glossary-and-terms-optional">Glossary and terms (optional)</a></h4><p>Provide some explanation of terms that will be used later in the doc.</p>
<h4 id="customer-business-impact-optional"><a href="#customer-business-impact-optional">Customer/Business impact (optional)</a></h4><p>Provide evidence of customer impact or business impact that justify working on this RFC.</p>
<h3 id="4-goals-and-requirements"><a href="#4-goals-and-requirements">4. Goals and Requirements</a></h3><p>Explicitly mention what is in-scope and out-of-scope for this RFC.</p>
<h3 id="5-timeline-and-milestones"><a href="#5-timeline-and-milestones">5. Timeline and Milestones</a></h3><p>Provide rough estimates and track key milestones for the RFC and its implementation, including its rollout. This section is to be updated over time as the RFC progresses from review, to approval, to implementation, and to its rollout.</p>
<h3 id="6-proposal-solution"><a href="#6-proposal-solution">6. Proposal solution</a></h3><h4 id="high-level-overview"><a href="#high-level-overview">High-level overview</a></h4><p>This section describes the proposed solution. It should be enough for everyone to understand what the RFC is proposing, and how it’s solving the problem while satisfying the goals and the requirements mentioned above.</p>
<p>This section should not go into all the technical details (see relevant section below), since some stakeholders (e.g. product managers, directors, VPs) might not be technical.
It should however provide enough details to be complete on its own, so it’s not just a vague description of a solution.</p>
<p>Put 1-2 high-level diagrams explaining the solution, but not too many, and describe the key components and the core process/flow of the solution.</p>
<p>Most of the non-technical folks can stop reading here. Maybe they can also read the questions section at the end, but they shouldn’t need to read the technical details section below.</p>
<h4 id="technical-details"><a href="#technical-details">Technical details</a></h4><p>This is the meat of the RFC for the technical folks. There should be sub-sections for all key aspects of the proposed solution.</p>
<p>Each sub-section should cover one key component of the solution and give details to fully understand how it will be implemented. Include diagrams, code snippets, schema definitions, and decisions taken with their reasoning tradeoffs.</p>
<h4 id="open-questions-optional"><a href="#open-questions-optional">Open Questions (optional)</a></h4><p>This section contains open questions about the technical implementation that are yet to be investigated or decided, but won’t materially change the proposed solution itself, so it doesn’t block its review and approval.</p>
<h3 id="7-alternative-options-optional"><a href="#7-alternative-options-optional">7. Alternative options (optional)</a></h3><p>This section should give some details about alternative options that were researched and rejected.
You might not need this section, if the proposed solution is straightforward or it doesn’t really have many different alternatives worth mentioning.</p>
<p>There can always be discussion and changes in the technical implementation in the previous section, but this section is for significantly different approaches to solving the original problem.</p>
<h3 id="8-frequently-asked-questions"><a href="#8-frequently-asked-questions">8. Frequently Asked Questions</a></h3><p>This section should contain questions that you expect folks to ask, or are being asked a few times after publishing the RFC so that you don’t repeat yourself in comments, and can just point folks to this section.</p>
<p>I usually prewrite some questions I expect that colleagues will have, and populate it with more as people review the document.</p>
<h3 id="9-appendix-optional"><a href="#9-appendix-optional">9. Appendix (optional)</a></h3><p>I rarely put an appendix into my RFCs because usually I see other folks abusing it by putting too many unnecessary details.</p>
<p>The appendix can be used for more detailed diagrams, screenshots, tables of data, and in general secondary information that supports the rest of the RFC.</p>
<p>Depending on how many implementation details you need in the RFC you usually don’t need to even have an appendix, since it leads to annoyingly long documents.</p>
<p>When you need it though, include it at the end of the document and keep it tidy with sub-sections.</p>
<h2 id="c-template-preview"><a href="#c-template-preview">C. Template preview</a></h2><p>Download or copy the full template at <a href="https://docs.google.com/document/d/1W5VkHlFxqwZ0b80IDayO1D73ECAxGM-DzCwvoZNygzk/">this Google Document</a>. Use it as you wish 😉</p>
<br/>


</article>
<article>
<h1>Amazon Profiler and AWS CodeGuru Profiler — CI/CD</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 22 Oct 2023 00:00:00 GMT</p>
<p><strong>Table of contents</strong></p>
<ul>
<li><a href="#amazon-profiler">Amazon Profiler</a><ul>
<li><a href="#backend">Backend</a></li>
<li><a href="#website">Website</a></li>
</ul>
</li>
<li><a href="#aws---amazon-codeguru-profiler">AWS - Amazon CodeGuru Profiler</a><ul>
<li><a href="#waves">Waves</a></li>
<li><a href="#environments">Environments</a></li>
<li><a href="#pipeline-promotions">Pipeline promotions</a></li>
<li><a href="#automatic-rollbacks">Automatic rollbacks</a></li>
<li><a href="#multiple-in-flight-versions">Multiple in-flight versions</a></li>
</ul>
</li>
<li><a href="#conclusion">Conclusion</a></li>
</ul>
<p>In this article I will go into details of the CI/CD pipelines my team implemented while at Amazon.
Read my previous article for <a href="/articles/cicd-flywheel/">“The CI/CD Flywheel”</a> to understand how each of the pipeline segments we will explore maps to the software release lifecycle.</p>
<hr/>
<p><strong>Important notice</strong></p>
<p>Amazon is a huge company, with hundreds of engineering teams.</p>
<p>The examples showcased below are based on my own experience.
However, you should keep in mind that each team, organisation, department, could follow different practices, sometimes in significant ways.</p>
<p>Anything described in this article should not be extrapolated to be how the whole company worked.
As you will see, even within my teams, we had different processes for several things.</p>
<p>You can find a more generic description of Amazon/AWS deployment pipelines in these amazing articles in the Amazon Builders’ Library, both written by <a href="https://twitter.com/clare_liguori">Clare Liguori</a>:</p>
<ul>
<li><a href="https://aws.amazon.com/builders-library/cicd-pipeline/?ref=lambrospetrou_com">My CI/CD pipeline is my release captain</a></li>
<li><a href="https://aws.amazon.com/builders-library/automating-safe-hands-off-deployments/?ref=lambrospetrou_com">Automating safe, hands-off deployments</a></li>
</ul>
<hr/>
<h2 id="amazon-profiler"><a href="#amazon-profiler">Amazon Profiler</a></h2><p>I was in the Amazon Profiler team for about three years before leaving Amazon in 2020.
Our product was continuous CPU profiling <a href="https://www.cncf.io/blog/2022/05/31/what-is-continuous-profiling/">[1]</a> <a href="https://granulate.io/blog/introduction-to-continuous-profiling/">[2]</a> for the internal backend services, mostly targeting JVM services.</p>
<p>In this team we were doing <a href="https://trunkbaseddevelopment.com/">trunk-based development</a> for all our services, and all Pull Requests were merged into a single <code>mainline</code> branch.</p>
<p>Let’s explore the CI/CD setup for one of our backend services, and our website.</p>
<h3 id="backend"><a href="#backend">Backend</a></h3><p>One of our backend services was the <strong>Aggregator</strong>, processing all ingested profiling data submitted by all the services using our product, and generating aggregated data that we would then display on the website.</p>
<figure>
  
  <figcaption>CI/CD pipeline for Aggregator service.</figcaption>
</figure>

<p>We had one pipeline with 4 stages, including 3 deployment environments (<code>beta</code>, <code>gamma</code>, <code>production</code>). All the environments were in the US.</p>
<p>The <code>source packages</code> column is a collection of Git repositories that will trigger a pipeline execution on every push on the specified branches.
Amazon doesn’t use monorepos, therefore we could have multiple repositories (we call them packages) per service.</p>
<p>The first stage, <code>build</code>, compiles the source code, runs a few static code analysis jobs, and builds the release artefacts.
In our case, the artefacts were the Java <code>.jar</code> files, along with some configuration files.</p>
<p>Proceeding to the <code>beta</code> stage, these artefacts are deployed to the <code>beta</code> environment.
This is a development environment with limited resources (single host), meant to be used only by our team. 
The goal of this environment is to guarantee functional correctness of our product, and of our automation.
Stress testing, and high availability is not a goal for <code>beta</code> environments.</p>
<p>Once the deployment completes, we run approval workflows (e.g. integration tests) against the deployed environment, targeting its API directly.</p>
<p>To promote the release artefacts from the <code>beta</code> stage to <code>gamma</code>, we had a <strong>manual approval</strong> gate.
The manual promotion is not always the default.
We alternated between manual/automatic promotion depending on if we wanted to pause changes from being deployed.
This was useful when we did manual testing in <code>gamma</code> for specific features or to troubleshoot issues.</p>
<p>The <code>gamma</code> stage is usually considered the last pre-production stage at Amazon.
This environment is closer to <code>production</code>, configuration-wise, region-wise, etc.
In some aspects though, <code>gamma</code> and <code>production</code> are not identical. Especially size-wise.
It would be inefficient for all our <code>gamma</code> environments to have the same size as our <code>production</code> environments.</p>
<p>The idea is that <code>gamma</code> is where you do all the tests that need a production-like environment, without affecting actual customers.
A common use-case for <code>gamma</code> stages was to run our load-testing tool, <a href="https://medium.com/@carloarg02/how-i-scaled-amazons-load-generator-to-run-on-1000s-of-machines-4ca8f53812cf">TPSGenerator</a>.</p>
<p>In the case of our Aggregator service, we would mirror the traffic from production in this environment so that we have similar traffic patterns, and similar ingestion data, in order to run experiments on production-identical data and catch issues before impacting our customers.</p>
<p>The approval workflows in <code>gamma</code> are more rigorous than <code>beta</code>, and they included (not exhaustive list):</p>
<ul>
<li>load testing</li>
<li>integration tests</li>
<li>long-running aggregation tests (e.g. multi hour aggregations)</li>
<li>security oriented tests</li>
</ul>
<p>Once all the approval workflows passed, we were manually promoting the release artefacts to the <code>production</code> stage.
A final round of approval workflows were running against the <code>production</code> environment, and if any workflow failed, an automatic rollback would trigger the deployment of the previous artefacts.</p>
<p>One common approval workflow we had across our services was the <strong>monitor bake time</strong>. For a specified amount of time, the pipeline would continuously evaluate a collection of PMET monitors (internal <a href="https://aws.amazon.com/cloudwatch/">CloudWatch</a> predecessor) and if any of them transitioned into <strong>Alert</strong> state within the specified period, the approval workflow would fail, automatically triggering a rollback to the previous release artefacts.</p>
<blockquote>
<p><strong>Note:</strong> We didn’t do rollbacks in <code>beta</code>, since it didn’t matter if it was broken for some time.
It would only affect our team, and it would give us time to investigate and fix the offending commits.
But, we did care for <code>gamma</code> and <code>production</code>.</p>
</blockquote>
<h3 id="website"><a href="#website">Website</a></h3><p>Our website was a Single-Page-Application using <a href="https://reactjs.org/">React</a>. We had a simpler setup than the backend, using AWS services: <a href="https://aws.amazon.com/cloudfront/">CloudFront</a>, <a href="https://aws.amazon.com/s3/">S3</a>, and <a href="https://aws.amazon.com/lambda/edge/">Lambda@Edge</a>.</p>
<figure>
  
  <figcaption>CI/CD pipeline for Amazon Profiler website.</figcaption>
</figure>

<p>We had one pipeline with 3 stages, including 2 deployment environments (<code>beta</code> and <code>production</code>).</p>
<p>This is as simple as it can get, but incredibly sufficient for many teams.</p>
<blockquote>
<p>This is what I call <strong>The Core pipeline</strong>, and everyone should try to have this as starting point.</p>
</blockquote>
<p>The <code>source packages</code> column is a collection of Git repositories that will trigger a pipeline execution on every push on the specified branches.</p>
<p>The first stage compiles the source code, runs static code analysis jobs, and builds the release artefacts.
For the website, the release artefacts were the JavaScript/HTML/CSS files we had to deploy in Amazon S3, along with some <a href="https://aws.amazon.com/cloudformation/">AWS CloudFormation</a> templates for the deployment on AWS.</p>
<p>Proceeding to the <code>beta</code> stage, the artefacts are deployed to the <code>beta</code> environment, in AWS region <code>us-east-1</code>.
The approval workflows run once the CloudFormation deployment is done, and verify that the website loads and functions properly using <a href="https://www.cypress.io/">Cypress</a> UI end-to-end tests.</p>
<p>We once again have a <strong>manual approval</strong> gate before promoting the artefacts to the <code>production</code> stage, same region.</p>
<h4 id="manual-testing-against-backend-api"><a href="#manual-testing-against-backend-api">Manual testing against backend API</a></h4><p>The <code>beta</code> environment is identical to the <code>production</code> one in this case, since we were using AWS in <strong>exactly</strong> the same way. The only difference was the domain name.</p>
<p>We could point the website to use any of the backend environments with an in-app toggle.
Within the <code>beta</code> website you could switch between the backend APIs of <code>beta</code>, <code>gamma</code>, <code>production</code>.
We were using the <code>beta</code> environment of the website during development until we felt good to release in production.</p>
<p>Even though we had the manual promotion, we were releasing regularly to <code>production</code>, several times per week to avoid big-bang releases.</p>
<h4 id="preview-environments"><a href="#preview-environments">Preview environments</a></h4><p>Oh the hype… All the modern Platform-as-a-Service products (Vercel, Netlify, Qovery, etc) brag about their automatic Preview Environments per PR.</p>
<p>Even though Amazon tooling didn’t provide this out of the box, it was easy to do it ourselves.</p>
<ol>
<li>We made sure that even when running locally on our laptops, we could target any of our backend APIs, in addition to the mocked responses locally.</li>
<li>We setup one of our development boxes (remote <a href="https://aws.amazon.com/ec2/">EC2</a> machine we sometimes used for development), and before publishing a Pull-Request a convenient command would ship the artefacts onto that host and give it a unique URL that we put in the PR description.</li>
</ol>
<p>We could have used separate AWS accounts/regions and deploy the full website per PR, but at the time we found it easier to just use a host for all the PRs.</p>
<h2 id="aws-amazon-codeguru-profiler"><a href="#aws-amazon-codeguru-profiler">AWS - Amazon CodeGuru Profiler</a></h2><p><a href="https://docs.aws.amazon.com/codeguru/latest/profiler-ug/what-is-codeguru-profiler.html">Amazon CodeGuru Profiler</a> is the AWS service my team <a href="https://aws.amazon.com/about-aws/whats-new/2019/12/aws-announces-amazon-codeguru-for-automated-code-reviews-and-application-performance-recommendations/">launched at re:Invent 2019</a>.</p>
<p>At the time, the public product was a small subset of what the internal Amazon Profiler service was capable, and the underlying infrastructure was different.
As explained above, <a href="#amazon-profiler">Amazon Profiler</a> was using our internal deployment platform (Apollo), and was deployed in only one US region.</p>
<p>CodeGuru Profiler was built from the ground-up on-top of AWS services, using <a href="https://docs.aws.amazon.com/whitepapers/latest/introduction-devops-aws/infrastructure-as-code.html">Infrastructure as Code</a> to automate as much as possible, and was deployed across many AWS regions.</p>
<p>There are a lot to cover about launching an AWS service, but in this section we focus only the CI/CD pipelines.</p>
<p>As I mentioned above, we were doing <a href="https://trunkbaseddevelopment.com/">trunk-based development</a> for all our services, and all Pull Requests were merged into a single <code>mainline</code> branch.</p>
<p>All our backend services, including the website, followed the same CI/CD architecture, therefore I will use a generic service in the example below for simplicity.</p>
<figure>
  
  <figcaption>CI/CD pipeline for Amazon CodeGuru Profiler services.<br/>(<em>right-click and open in new tab for full preview</em>)</figcaption>
</figure>

<p>Don’t get frighten by the big diagram 😅 We will go through it piece by piece.</p>
<p>The first half part of the pipeline, up to <code>beta</code> stage, is pretty identical to the showcase of <a href="#amazon-profiler">Amazon Profiler</a>, so I will just skip it, and focus on the new parts.</p>
<h3 id="waves"><a href="#waves">Waves</a></h3><p>Every AWS service has to deploy in <a href="https://aws.amazon.com/about-aws/global-infrastructure/regions_az/">tens of regions across the world</a>. A region is a physical location of several data centers.</p>
<p>There is a big variance in customer adoption among the regions, which is the dimension of “region size” I will use below.</p>
<p>This is exacerbated by the fact that some services are only available in a specific region.
One example is <a href="https://aws.amazon.com/lambda/edge/">AWS Lambda@Edge</a> functions.
Even though it deploys your code across many regions to run close to your customers, <a href="https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/edge-functions-restrictions.html#lambda-at-edge-restrictions-region">their own control plane (create and manage your functions) is exclusively in <code>us-east-1</code></a>. Unfortunately, this means that if you want to use such services you need to deploy part of your infrastructure in those regions.</p>
<p>It is also important to note that deploying one region after the other would take too long for a change to reach all regions.</p>
<p>AWS teams use a wave-based deployment approach where you group multiple regions and deploy to them in parallel, whilst the waves are deployed sequentially.
This significantly reduces the blast radius of faulty releases, from automation and configuration issues, to buggy features.</p>
<p>Remember, practicing CI/CD for releasing your software is a way to make sure you ship fast, but also reliably, without breaking many customers.</p>
<p>There are a few different approaches in how you should group your regions, and in what order, but the usual wave-based setup is:</p>
<ol>
<li>A small region</li>
<li>A medium region</li>
<li>3+ regions</li>
<li>5+ regions</li>
<li>N+ regions</li>
</ol>
<p>The exact size of each wave is up to each team, but the idea is that you start slow, and then with each wave you go faster, with more and bigger regions.</p>
<p>The initial slow part gives confidence that your changes are safe, there are no obvious configuration issues, and the business metrics are not dropping alarmingly.
Then, you accelerate the deployment in order to rollout the changes to every customer.</p>
<h3 id="environments"><a href="#environments">Environments</a></h3><p>It is common practice in AWS services to have the following deployment environments per region:</p>
<ul>
<li><code>gamma</code>: The last pre-production environment for automated, and manual, testing before shipping changes to customers.</li>
<li><code>onebox</code>: Often a single host, but could be a small percentage of the overall production cluster of servers, or even a small percentage of AWS Lambda traffic.</li>
<li><code>production</code>: The full production environment.</li>
</ul>
<p>To fully benefit from the <code>onebox</code> deployment environment, we need to have proper segmentation of key metrics that we use for alarms and monitors between <code>onebox</code> and rest of production.
Otherwise, it would be impossible to monitor the impact of a change on the <code>onebox</code> environment if the metrics are covering the whole production environment.</p>
<p>Onebox environments are optional, depending on the underlying infrastructure used.
For example, when using <a href="https://aws.amazon.com/lambda/">AWS Lambda</a> in combination with <a href="https://aws.amazon.com/codedeploy/">AWS CodeDeploy</a>, you can achieve the same result without having a dedicated environment, since CodeDeploy allows you to do rolling deployments by percentages while monitoring alarms at the same time.</p>
<p>This is what we did in our API services, we used AWS Lambda, and we didn’t have <code>onebox</code> environments.</p>
<h3 id="pipeline-promotions"><a href="#pipeline-promotions">Pipeline promotions</a></h3><p>With the above context in mind, let’s dive into the pipeline itself.</p>
<p>Once the release artefacts are verified in the approval workflows for the <code>beta</code> stage, they are then automatically promoted to <code>gamma</code>.</p>
<p>In <code>gamma</code> we do <strong>parallel deployment</strong> across all regions.
These environments are not exposed in customers, and the reason we deploy to all regions is to make sure that our configuration, especially infrastructure automation, does not have hardcoded values, or making assumptions about region-specific properties.</p>
<p>After a deployment for a region completes, the corresponding approval workflows start running, even if other regions are still being deployed.
Hopefully, all the approval workflows will pass, and the release artefacts will be ready for promotion in the first production wave, <code>wave-1</code>.</p>
<p>In all the promotions targeting a production environment, there are two usual conditions:</p>
<ul>
<li>All approval workflows pass.</li>
<li>The time window blocker is disabled.</li>
</ul>
<p>The <strong>time window blocker</strong> is a feature where we block automatic promotion of changes to the next stage during a specific time window.
We use this for important holidays like Christmas and Black Friday, or for restricting deployments only during specific times to avoid breaking production outside the business hours of the on-call, e.g. deploy only during EU/London office hours.</p>
<p>Proceeding to the production wave deployments, we first deploy to the <code>onebox</code> environment, if there is one.
The approval workflow for onebox is often lightweight, mostly monitoring specific metrics, and making sure our alarm monitors do not trigger for a specific amount of time.</p>
<p>Then, the changes are automatically promoted to the full production environment of each region of the wave.</p>
<p>Reminder, that the deployments within a wave are done in parallel.</p>
<p>The approval workflows for each region of the wave start as soon as the corresponding deployment completes.
Approvals here include business metrics monitoring, UI end-to-end tests (for certain APIs, and the website), and making sure our alarm monitors do not trigger for a specific amount of time.</p>
<p>In some cases, there is a also a <strong>fixed time bake period</strong>, where the approval workflow for the wave is essentially paused before proceeding with promoting the release artefacts to the next wave.
This is useful when we want to make sure enough time has passed, and enough customers were exposed to the changes released, which could surface issues not covered by the automated or manual tests, and no metric regressed right after the deployment.
For example, some issues like memory leaks could become prevalent only after several hours.</p>
<p>The fixed time bake period is not a rule, and I have seen many teams not having them at all, or having them and skipping them unless there is a specific reason.</p>
<p>Once again, when the approval workflow passes and there is no time window blocker enabled, the release artefacts are promoted to the next wave, until all the waves are deployed.</p>
<h3 id="automatic-rollbacks"><a href="#automatic-rollbacks">Automatic rollbacks</a></h3><p>In case any approval workflow fails in <code>wave N</code>, then the promotion to <code>wave N+1</code> is blocked, and the offending regional environment rolls back to the previous good version.</p>
<p>This means that some regions within the same wave, could be running different versions of your software than others.</p>
<p>The next version that will flow through the pipeline will at some point arrive at <code>wave N</code>, redeploy to the included regions, and the approval workflows will rerun.
If the approvals pass, then the new version will be promoted as usual to <code>wave N+1</code>. Therefore, all the regions within the wave will now run the same version.</p>
<p>Read more about automatic rollbacks in the article <a href="https://aws.amazon.com/builders-library/ensuring-rollback-safety-during-deployments/">“Ensuring rollback safety during deployments”</a>.</p>
<h4 id="multiple-in-flight-versions"><a href="#multiple-in-flight-versions">Multiple in-flight versions</a></h4><p>My absolute favourite feature of the internal Amazon Pipelines service (also available in <a href="https://aws.amazon.com/codepipeline/">AWS CodePipeline</a>) is that each stage of the pipeline runs independently from others. ❤️</p>
<p>This seems subtle initially, but in practice this allows parallel environment deployments of different versions!</p>
<p>For example, you could have <strong>v10</strong> being currently deployed in the <code>production</code> environment, <strong>v11</strong> in the <code>gamma</code> environment, and <strong>v15</strong> in the <code>beta</code> environment, <strong>all at the same time in parallel</strong>.</p>
<blockquote>
<p>This allows for amazing productivity, and velocity, shipping features often and reliably.
You don’t worry about coordinating releases, or babysitting a single version throughout the whole pipeline before deploying the next one.</p>
</blockquote>
<p>You merge your change, and then it will automatically (with the optional manual promotions occasionally) travel through the pipeline stages as fast as possible, depending on the approval workflows.</p>
<p>If your automated tests are sufficiently good, you can merge multiple times a day, and the pipeline will continuously ship those changes across all your environments automatically.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>This was quite a long article, and there are still many things we didn’t cover around CI/CD techniques in the real-world.</p>
<p>I hope this gave a glipmse into how some of the principles and theory of Continuous Delivery (CI) and Continuous Deployment (CD) are applied in practice, and hopefully it gives you confidence to implement some of these techniques in your own team’s release process.</p>

</article>
<article>
<h1>The CI/CD Flywheel</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 15 Oct 2023 00:00:00 GMT</p>
<p><strong>Table of contents</strong></p>
<ul>
<li><a href="#introduction-to-the-cicd-flywheel">Introduction to the CI/CD flywheel</a></li>
<li><a href="#the-complete-cicd-flywheel">The complete CI/CD flywheel</a></li>
<li><a href="#why-do-we-need-cicd">Why do we need CI/CD</a><ul>
<li><a href="#business-benefits">Business benefits</a></li>
<li><a href="#technical-benefits">Technical benefits</a></li>
</ul>
</li>
<li><a href="#real-world-cicd">Real world CI/CD</a></li>
<li><a href="#conclusion">Conclusion</a></li>
</ul>
<p>In this article you will learn what CI/CD is, why you should use it, understand why it can be a competitive advantage for any team, and in follow-up articles I am going to describe how CI/CD is implemented in real-world teams I have been working with.</p>
<h2 id="introduction-to-the-ci-cd-flywheel"><a href="#introduction-to-the-ci-cd-flywheel">Introduction to the CI/CD flywheel</a></h2><p>Let’s step through the full development and release lifecycle of releasing a traditional web application. We start by writing code, testing it, building it, shipping it to production, and making sure it works as we expect.</p>
<p>The full flow is what I call <strong>The CI/CD flywheel</strong>, and as we will see at the end, all the steps of the flywheel are what a CI/CD pipeline implements.</p>
<h3 id="product"><a href="#product">Product</a></h3><p>It all starts by our customers, or potential customers, needing a feature.</p>
<p>This process involves discussing and coming up with a well-defined feature that we want to implement that will solve the customers’ issue.</p>
<p>Different teams follow very different methodologies for product decision making, so I am not going to get into that aspect here, but at the end of this step we end up with a bunch of engineering tasks that we want to implement.</p>
<p></p>
<h3 id="local-write-the-code"><a href="#local-write-the-code">Local - Write the code</a></h3><p>Once the feature requirements are defined, we write the code. </p>
<p>We do either locally on our laptop or PC, or in a remote development environment.
We use an IDE like Visual Studio Code, or a terminal based editor like <code>vim</code>.</p>
<p>At the end of the day we end up with some source code supposedly implementing the agreed-upon feature in our product.</p>
<p></p>
<h3 id="local-build-and-test"><a href="#local-build-and-test">Local - Build and Test</a></h3><p>Once we have some code written, we want to test it to make sure it does what we expect.</p>
<p>Usually, we write some unit tests to test at the function or class level, maybe some integration tests to test a bunch of things together, and if we are feeling adventurous we add some end-to-end (E2E) tests.
Most probably, if possible, you run the application locally for manual testing to make sure it does what you expect.</p>
<p>We repeat the process of writing and testing the code until the feature is ready.</p>
<p></p>
<h3 id="remote-review-and-test"><a href="#remote-review-and-test">Remote - Review and Test</a></h3><p>Once we are confident and satisfied with our code, we submit it for review by opening a <a href="https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/about-pull-request-reviews">Pull Request (PR)</a>.</p>
<p>Don’t worry, I know some of you just push directly without peer-reviews, but we are not all adrenaline-junkies…</p>
<p>Most (good) code review systems (e.g. Github) allow you to run several jobs on each Pull-Request (PR) to build and test the changes.
In these jobs we run any kind of tests we have, and even do more elaborate tests that take more time to complete, and therefore we don’t want to be running locally every time we change something in our code that would hinder our productivity.</p>
<p>Ideally, these jobs run in isolated environments that do not interfere with any production system, and can provide reproducibility in every run.</p>
<p>In some other code reviews systems, we are not able to do much other than building the code, and running the basic tests (unit, integration), so we just do that. Either way, we want to take advantage of this functionality to do testing of our code changes before merging them to the main branches used by everyone else in the team/company.</p>
<p></p>
<h3 id="remote-merged-integrated-built"><a href="#remote-merged-integrated-built">Remote - Merged, Integrated, Built</a></h3><p>After the code review is done, we merge our changes into the main branch(es) we use for production releases.</p>
<p>Once again, we want to run tests to make sure the code is correct and properly “integrated”.
Keep in mind that at this point potentially several other engineers have also merged their changes into the same branch.</p>
<p>The term “integrated” here means that we bring changes from multiple engineers and integrate them all together into one source, that needs to be built, tested, and then released as one.
Even if there are no code-related conflicts, there could be logic/business oriented conflicts and issues because of this merging.</p>
<p>At this step, we run extra analysis jobs like static code analysis, or security vulnerability checks, or end-to-end tests that could take several minutes or hours to complete, and more advanced tests.</p>
<p>This is often the last step we will run most of our suite of automated tests without direct customer or manual interaction.</p>
<p></p>
<h3 id="remote-prepare-the-release-artefacts"><a href="#remote-prepare-the-release-artefacts">Remote - Prepare the release artefacts</a></h3><p>Once everything passes, we need to build the release artefacts that will get deployed into our actual running systems.</p>
<p>Depending on the application this could be platform-specific artefacts (e.g. <code>.apk</code>) that we will upload to the mobile app stores (Apple Store, Google Play Store).
It could be Docker containers that we publish to one of the container registries (e.g. Dockerhub, AWS ECR), or the binaries for our CLIs, or the static files for our website, or any other artefact that our build process generates.</p>
<p>Regardless of the type of our product, this is the step where we generate the artefacts to be delivered to the production systems.</p>
<p></p>
<h3 id="deploy-staging-preproduction"><a href="#deploy-staging-preproduction">Deploy - staging / preproduction</a></h3><p>We wrote our code, we built it, we tested it, and we generated the release artefacts. It’s time now to deploy it.</p>
<p>At this step we take the release artefacts from the previous step and ship them to wherever our product is released.</p>
<p>If it’s the app stores, we will follow their corresponding process.
If it’s a website or a backend server we will deploy the artefacts to the servers.
If it’s a firmware update we will deploy it to some device lab with actual devices and automatically install it on them.</p>
<p>Finally, there are some rare cases where the deployment will happen in a manual way, by an actual person, to a medium that we cannot reach with automation and requires physical presence.</p>
<p>A key aspect for this step is that we should do the deployment on an environment that is not the one used by our customers, or at least not all of them at once by using private beta programs with few testers only (e.g. in the case of a mobile app).
This is what we call staging or pre-production environment(s).</p>
<p>This is not always possible, and some teams prefer to just ship to production (if you are one of those - <a href="/articles/cicd-feature-flags/">at least use feature flags!</a>).
Personally, I always prefer having at least one staging environment.</p>
<p></p>
<h3 id="deploy-staging-preproduction-verification"><a href="#deploy-staging-preproduction-verification">Deploy - staging / preproduction verification</a></h3><p>We deployed our changes to our staging/preproduction environment, and it’s time to verify that no alerts or alarms are going off, and that all the features we implemented work as expected, old and new.</p>
<p>This verification could be done in various ways, automated, manual, in a few seconds, or across several hours. It all depends on the actual product.</p>
<p></p>
<h3 id="deploy-production"><a href="#deploy-production">Deploy - production</a></h3><p>Finally, we deploy our artefacts to the production systems used by our actual customers.</p>
<p>This process should be similar to deploying to our staging environment(s), albeit in a different scale and ideally in a gradual way across all of your infrastructure.</p>
<p></p>
<h3 id="deploy-production-verification"><a href="#deploy-production-verification">Deploy - production verification</a></h3><p>For one last time, we need to verify that everything works fine, no customer is negatively impacted by the newly shipped changes, and that we didn’t break anything that was previously working.</p>
<p>To succeed in this step we need to have proper monitoring and observability of our applications.
There is a whole bunch of things around observability but at the end of the day, you should have alerts that will notify you when something is negatively impacting your users.</p>
<p></p>
<h3 id="customer-feedback-new-feature-requests"><a href="#customer-feedback-new-feature-requests">Customer feedback / New feature requests</a></h3><p>Our changes are in production, used and loved by our customers, so now we repeat the whole process again. Customers are asking for more features, or they complain about something and we need to fix.</p>
<p>We go back to our editor, write some more code, and go through the whole cycle again.</p>
<h2 id="the-complete-ci-cd-flywheel"><a href="#the-complete-ci-cd-flywheel">The complete CI/CD flywheel</a></h2><p></p>
<p>The whole process we went through is what I call <strong>The CI/CD flywheel</strong>.</p>
<p>In traditional publications, it is also called Software Development Life Cycle (SDLC), but I like flywheel better because it emphasizes the <strong>looping</strong> aspect.</p>
<p>We can see that the whole development cycle, the idea/customer request, writing code, delivering an artefact to customers, getting feedback, and going back to ideas, is a simple loop.
<strong>The faster you transition from one step to the next, the faster you deliver value to customers.</strong></p>
<p>One way to cycle through the flywheel faster, is to remove certain steps, for example the verification steps, or even the whole preproduction environment.
Indeed, this will shorten the circle.
However, there is a hidden trap in this thinking.</p>
<p>Removing validation steps will cause you to actually go through extra iterations.
Inevitably, broken things will be shipped to customers, and we will need to repeat the process more times just to fix bugs that could be prevented in the first place.</p>
<p>Catching issues later in the process is much more expensive than catching them in an earlier step.
The latter steps in the flywheel (deploying, verifying and getting user feedback) are often more time-consuming than executing the earlier steps (writing and testing code), therefore doing the early steps more times is more beneficial than doing the latter steps more times.</p>
<p>In the diagram below we can see how the steps of the flywheel map to the CI/CD phases: Continuous Integration (CI), Continuous Delivery (CD), and Continuous Deployment (CD). </p>
<p></p>
<p>Most of the flywheel belongs into at least one phase of a CI/CD pipeline.</p>
<p>Therefore, having great CI/CD systems is crucial in order to allow us to move faster, and delivering value to our customers safely, reliably, and continuously.</p>
<!-- 
## The CI/CD flywheel as a CI/CD pipeline

Describe in the flywheel how each step falls into the CI/CD elements!

- From ChatGPT

```
A good CI/CD (Continuous Integration/Continuous Deployment) pipeline typically consists of several important elements, including:

Automated testing: A CI/CD pipeline should include automated tests that run on every code commit to ensure that the code is functional and meets the necessary quality standards.

Continuous integration: This involves regularly integrating new code changes into the main codebase to avoid merge conflicts and ensure that the codebase remains stable.

Continuous delivery: The pipeline should be designed to automatically build and package the application for deployment to a staging environment, where it can be tested and verified by stakeholders.

Continuous deployment: Once the application has been tested and verified in the staging environment, the pipeline should be able to automatically deploy the application to production, without manual intervention.

Version control: Version control is critical for managing code changes, and a good CI/CD pipeline should integrate with a version control system such as Git to track changes to the codebase.

Infrastructure as code: The pipeline should use infrastructure as code (IaC) to define the application's infrastructure and configuration, making it easier to manage and reproduce the infrastructure in different environments.

Monitoring and feedback: The pipeline should provide feedback on the application's performance and metrics, allowing developers to identify and address any issues that arise in production.
``` -->

<h2 id="why-do-we-need-ci-cd"><a href="#why-do-we-need-ci-cd">Why do we need CI/CD</a></h2><p>We have explored the CI/CD Flywheel that gives an overview of the CI/CD process. 
Let’s explore some of the concrete benefits we can get from implementing CI/CD in practice.</p>
<h3 id="business-benefits"><a href="#business-benefits">Business benefits</a></h3><h4 id="ship-value-to-customers-more-often"><a href="#ship-value-to-customers-more-often">Ship value to customers more often</a></h4><p>Implementing new features, fixing bugs and issues detected in early stages of the CI/CD flywheel, and automated delivery of the release artefacts to the different environments, all ultimately contribute to <strong>shipping business value to customers faster and more often</strong>.</p>
<h4 id="earn-and-retain-customer-trust"><a href="#earn-and-retain-customer-trust">Earn and retain customer trust</a></h4><p>With a variety of automated tests, early detection and fixing of broken features, gradual rollout of new features, and finally, automatic detection of faulty releases, we significantly reduce the risk of breaking features that customers depend on. Or, when we break features, we can quickly release their fixes.</p>
<p>In this way, we <strong>earn the trust from customers</strong>, and retain it forever reducing unsatisfied customer churn.</p>
<h4 id="fast-experimentation-feedback-loop"><a href="#fast-experimentation-feedback-loop">Fast experimentation feedback loop</a></h4><p>Practicing CI/CD efficiently means that <strong>we can experiment and test new innovative features faster and more often</strong>.</p>
<p>Engineers are not being slowed down in cumbersome manual release procedures, and most importantly they are not risking in breaking real customers with unpolished or unfinished features unexpectedly.</p>
<p>CI/CD enables the development of features that would otherwise take a lot of back-and-forth in discussions among the different stakeholders until its decided that it’s worth doing.</p>
<p>Within a few days or even hours, a feature can be tested in a subset of users, or even in a staging environment without impacting customers, and take a more informative decision based on real data.</p>
<h4 id="attract-the-right-talent"><a href="#attract-the-right-talent">Attract the right talent</a></h4><p>Anyone in a creative role, including software engineers, designers, product owners, and others, <strong>likes to see their ideas and creation used by customers as soon as possible</strong>.</p>
<p>The rapid development feedback loop that CI/CD provides is a great selling point to attract people into working for a company.</p>
<p>Who wouldn’t want to work on a team that allows them to work and ship new ideas every day, straight to customers, in minutes or hours, instead of months!</p>
<h4 id="cost-reduction"><a href="#cost-reduction">Cost reduction</a></h4><p>The earlier an issue is detected, and fixed, the less costly it is to the business.
Let’s examine some scenarios.</p>
<p><strong>Scenario A - reckless</strong></p>
<ol>
<li>We implement a new feature, and ship it directly to production so customers start using it.</li>
<li>Customers start complaining in social media, and file support tickets, because something broke.</li>
<li>The support team and on-call engineers are scrambling to see what’s broken, implement a fix, and then ship it straight to production again.</li>
</ol>
<p><strong>Scenario B - half-assed</strong></p>
<ol>
<li>We implement a new feature, and ship it to an environment where a team of Quality Acceptance engineers (QAs) will need to do manual testing.</li>
<li>After a few hours, or days, the QA team managed to test our changes but detected something is broken.</li>
<li>They send it back to the engineering team, they implement a fix, and they send it back to QA again.</li>
<li>After a few hours, or days, the QAs approve, and we now ship the change to production.</li>
</ol>
<p><strong>Scenario C - better</strong></p>
<ol>
<li>We implement a new feature, and merge it in the main codebase.</li>
<li>The change goes into the CI/CD pipeline where the first step is to run some automated tests.</li>
<li>One of the automated tests fails and stops the pipeline.</li>
<li>We see that the automated test suite failed, check which test failed, implement a fix, and merge again.</li>
<li>The tests pass, and the change is delivered to our staging environments for further validation and then goes to production.</li>
</ol>
<p>There are thousands of other scenarios that could happen in practice, but I chose these because they showcase well the difference in mentality, and more importantly the cost of fixing an issue.</p>
<blockquote>
<p>In scenario A, the issue went straight into production, and customers noticed immediately.</p>
</blockquote>
<p>This not only hurts the business reputation, but it also costs a lot more in person-hours.</p>
<p>We will spend a lot of time debugging the customer complaints until we figure out the root cause, then implementing the fix, and shipping to production.
If we are lucky, the change we did for the fix won’t break something else…</p>
<blockquote>
<p>In scenario B, we are not as reckless as in scenario A, and we have a team of dedicated QAs that will do manual validation of changes.</p>
</blockquote>
<p>This sounds very inefficient (and it is), but it is actually used by thousands of companies, even today.
The QAs will often catch issues before reaching production.</p>
<p>However, the issue is the overhead of communication between the engineers and the QAs, in some cases with multiple back-and-forth iterations.
In addition, a code change just sits in a queue of changes that the QAs gradually go through and verify, which also introduces extra delays.</p>
<p>Finally, imagine all this happening twice. This is a lot of man-hours just spent on verifying that a change works as expected.</p>
<blockquote>
<p>In scenario C, we spent a few minutes and automated (most of) the tests that the QA team would do manually in scenario B, and added them in the automated test suite that runs in our CI/CD pipeline.</p>
</blockquote>
<p>The tests ran and failed just a few minutes after merging our change, and within a few minutes we implement a fix and merge again.</p>
<p>So, let’s recap what each scenario cost the business:</p>
<ul>
<li>Scenario A: X hours of debugging and going through customer complaints & business reputation risk.</li>
<li>Scenario B: Y hours/days of developers and QA communication, but less risk of the issue reaching customers.</li>
<li>Scenario C: Z minutes of automated run on a remote machine, and less risk of the issue reaching customers.</li>
</ul>
<p>In majority of cases, it holds that <code>X ~= Y >> Z</code>, which means that even though initially it seems that having a full CI/CD pipeline is more work and more steps, in practice it will save us a lot of time.</p>
<p>I hope it’s clear how big of a cost reduction it is to detect issues early on and to automate their verification.</p>
<p>Fun fact 👉🏻 CI/CD is similar to <a href="https://mag.toyota.co.uk/toyota-manufacturing-25-objects-andon-cord/">Anton Cord, introduced by Toyota</a>. The farthest the issue was detected in the pipeline the costlier it was to fix and it would cause more defects. The same approach was <a href="https://thinkinsights.net/strategy/andon-cord/">adopted by tech companies like Amazon and Netflix</a>, and then more.</p>
<h3 id="technical-benefits"><a href="#technical-benefits">Technical benefits</a></h3><h4 id="code-quality"><a href="#code-quality">Code quality</a></h4><p>Due to integrating changes to the main source code repository multiple times per day, we guarantee that the changes always build, and pass the relevant tests.</p>
<p>Gone are the days that engineers spent weeks implementing features in isolation, and then spending days trying to merge their changes together, solving major conflicts, or even rewriting big chunks of code to make it work.</p>
<h4 id="comprehensive-tests"><a href="#comprehensive-tests">Comprehensive tests</a></h4><p>Because our CI/CD pipeline will run automated tests for each code change in remote environments, we are encouraged to add more tests that will cover more scenarios we want to check.</p>
<p>We can have tests that need elaborate setups, and that would be very time consuming to setup manually, like privacy and security checks, or performance regression detection, or testing our change on multiple versions of a browser, or multiple devices, etc.</p>
<p>Overall, due to having automated test suites running on isolated environments we can do things we cannot afford to manually do locally on every code change.</p>
<h4 id="automate-repetitive-tasks-for-reproducibility"><a href="#automate-repetitive-tasks-for-reproducibility">Automate repetitive tasks for reproducibility</a></h4><p>Doing manual steps over and over again inevitably leads to mistakes.</p>
<p>QA testing sessions usually include manual actions.
Publishing release artefacts by manually running commands locally on your laptop is problematic.</p>
<p>I faced such issues early on in my career.
For the release of our mobile app we were using our own laptops to build the corresponding artefacts for each app store (iOS, Android, Amazon Fire).
Due to differences in our workspaces, once or twice, we shipped faulty releases.
However, when a different engineer was trying to reproduce the issue on their laptop, it wasn’t failing, due to local workspace differences.</p>
<p>This was a nightmare to debug.</p>
<p>A good CI/CD process will always guarantee that everything runs from a clean state, so that every time you execute the same steps, you get the same result.</p>
<p>Automating as much as possible is a gift that will never stop giving! 🎉</p>
<h4 id="build-once-use-everywhere"><a href="#build-once-use-everywhere">Build once, use everywhere</a></h4><p>When using CI/CD to build and deploy our software, we get the nice benefit that the same artefacts we build in the first steps of the pipeline are the ones being propagated to all the environments thereafter.</p>
<p>This means that we can attribute anything that happens in the pipeline, like test failures, or deployment issues, or even issues caught in production, to a single artefact, which ultimately will lead us to the offending code changes easier and faster.</p>
<p>Another benefit from releasing the same artefact everywhere, is that we can reason more easily about what’s running in each environment/stage without worrying about inconsistencies among the environments.</p>
<h4 id="controlled-rollouts-and-faster-mean-time-to-resolution-mttr"><a href="#controlled-rollouts-and-faster-mean-time-to-resolution-mttr">Controlled rollouts and faster Mean Time To Resolution (MTTR)</a></h4><p>Automatic deployments across environments enables the use of controlled, gradual, rollouts.</p>
<p>We could be doing gradual rollouts on a small percentage of customers each time, or even rolling out changes continuously but hiding the feature behind <a href="/articles/cicd-feature-flags/">feature flags</a> and only enabling it for certain customers.</p>
<p>This allows monitoring to detect failures impacting customers, and rolling back the newly released changes automatically.
This is easy and efficient to do in a CI/CD managed environment since each change that gets released is usually small and its rollback won’t cause havoc to customers.</p>
<p>Doing this manually, even if doable in some cases, would be very error-prone with elaborate manual steps.</p>
<h2 id="real-world-ci-cd"><a href="#real-world-ci-cd">Real world CI/CD</a></h2><p>In follow-up articles I will provide a bird’s eye view of some real-world CI/CD systems and procedures that I personally experienced throughout my career.</p>
<p>I want to showcase some examples of how companies use CI/CD in different ways, and at the same time how they follow the same principles that make up the CI/CD flywheel!</p>
<p>In all these real-world system the fundamentals are the same. The goals are the same. The benefits are similar. The implementation is different.</p>
<p>CI/CD systems we will explore:</p>
<ul>
<li>Amazon Retail / LOVEFiLM <ul>
<li>Gitflow development (2 branches), 2-3 stages (beta, gamma, prod) and 2 regions</li>
</ul>
</li>
<li>Amazon Profiler <ul>
<li>Trunk based development, 2 stages (beta, prod) and 1 region</li>
</ul>
</li>
<li>AWS/Amazon CodeGuru Profiler<ul>
<li>Trunk based development, multi stage (beta, gamma, prod), multi region</li>
</ul>
</li>
<li>Meta WWW<ul>
<li>Trunk based development, tiered production deployment (C1, C2, C3, … C9)</li>
</ul>
</li>
<li>Others…</li>
</ul>
<p>If you want to see something specific please reach out <a href="https://twitter.com/LambrosPetrou">@lambrospetrou</a> 😅</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>This article introduces the CI/CD flywheel, and maps its steps to a CI/CD pipeline for building, testing, deploying, and verifying our software products.</p>
<p>I truly believe that embracing and implementing fast automated CI/CD pipelines is truly a competitive advantage for any team, and any company, of any size! 💪🏼 🚀</p>

</article>
<article>
<h1>Writing docs</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 24 Sep 2023 00:00:00 GMT</p>
<p><strong>Table of contents</strong></p>
<ul>
<li><a href="#context">Context</a></li>
<li><a href="#future-me-will-thank-me">Future me will thank me</a></li>
<li><a href="#onboarding-new-joiners">Onboarding new joiners</a></li>
<li><a href="#asked-twice---asked-often">Asked twice - asked often</a></li>
<li><a href="#conclusion">Conclusion</a></li>
</ul>
<h2 id="context"><a href="#context">Context</a></h2><blockquote>
<p>what’s your motivation to create such good docs?</p>
</blockquote>
<p>A couple of days ago, a colleague asked me the above question.
That was the 4th person commenting positively within the same week about that specific wiki I wrote, so I decided to write down an article about my thinking.</p>
<p>A bit of background, to set the stage right.</p>
<p>I personally <strong>love writing detailed documentation</strong> about the projects I do.</p>
<p>By “writing docs” I refer to:</p>
<ul>
<li>Commenting code with explanation and reasoning for things that are not obvious.</li>
<li>Putting plenty of information in issue tracking tools (e.g. Jira, Github Issues), and pull-request descriptions, such that there is clear trail of what I researched, what I did (even if it didn’t work out), and what’s remaining.</li>
<li>Writing comprehensive wikis and runbooks for the bigger type of projects I work on.</li>
</ul>
<p>This article is not an exchaustive list of benefits for having good documentation.
Plenty of longer posts do that, and thousands of debates took place regarding source code comments, in favor and against.</p>
<p>The rest of the article enumerates the key reasons why I like writing docs, and hopefully they will convince you to do so as well.</p>
<h2 id="future-me-will-thank-me"><a href="#future-me-will-thank-me">Future me will thank me</a></h2><p>Reason #1 is myself.
How selfish am I, right?</p>
<p>I first and foremost, write these docs for myself.
After a few weeks of not working on a project, I always forget things. </p>
<p>Such things include:</p>
<ul>
<li>System architecture of the service, which services it interacts with upstream/downstream, and what are the key properties of the service.</li>
<li>Key project/service references to documents, issues/tasks, dashboards, oncall handles.</li>
<li>How do certain intricacies of the system work? Wiki sub-pages for important topics that I will almost definitely forget in 6 months, and I will need to remember when debugging a service at 03:00 in the morning after getting paged.</li>
<li>Common operations like building and deploying the service artifacts. I know we all wish that all services had a single command, and that command was the same across all codebases. Unfortunately in the real world things are hairy. Jumping from project to project, in different languages, and using different tooling makes it harder to remember everything. Give me copy-pasteable commands any day of the week!</li>
</ul>
<p>There are many more things that I document, but these should already make it clear that there are plenty of things you need as an engineer that you will inevitably forget after a few months.</p>
<p>Having detailed and clear step-by-step docs is the easiest way for me to ramp up again on the project’s main aspects when coming back to it after a while.</p>
<h2 id="onboarding-new-joiners"><a href="#onboarding-new-joiners">Onboarding new joiners</a></h2><p>No matter how junior or senior someone is, when you join a new team or an existing long-running project you have to spend a lot of time building up context.</p>
<p>There is a lot of tribal knowledge that someone working on a project builds up, and someone new joining the project doesn’t have.
It’s awesome when you have nice colleagues that can answer all your questions promptly, but we are not always so lucky.</p>
<p>Many projects are single-person projects, so even though a team has several folks, only that person knows most of the project’s specifics. If that person is unavailable, good luck to you 😅</p>
<p>I witnessed a project where the main engineer was on vacation for a few weeks, and nobody else knew much about the project, so the project was put on-hold until they returned. To me, this is a huge no-no, and a red-flag!</p>
<p>This is common, and works fine in scrappy small startups where each engineer is essentially a department on their own, but as you grow, and as more folks join a team, it’s important to increase the <a href="https://en.wikipedia.org/wiki/Bus_factor">bus factor of the team</a>.</p>
<p>I personally treat documentation as my always-available assistant to explain things and provide handy tips.
Thus, I always put effort in keeping docs updated.</p>
<p>Even basic operations like compiling code, running tests, and which dashboards to explore, are a great initial boost.</p>
<p>Having nice runbooks makes the onboarding process smoother, and you can focus more on the core parts of your onboarding, eg. understanding the business aspect of the product.</p>
<h2 id="asked-twice-asked-often"><a href="#asked-twice-asked-often">Asked twice - asked often</a></h2><p>When I work on a project where more than 2-3 people are involved on a daily basis, it’s almost certain that every few days someone will ask the same question as someone else did before.</p>
<p>Some will even ask the same thing multiple times spread out across days.
People forget easily.</p>
<p>My strategy is that if someone asks me the same thing twice, or I get the same question from two different people, I quickly write it down in our wiki so that the next time someone asks, I can just deeplink them to the wiki entry.</p>
<p>This not only wastes less time in the future by not typing the answer over and over again, but it also pushes more people into the wiki.</p>
<p>My grand-plan is that gradually folks will put the wiki into their own daily workflow as well.
They will use it first to search for what they want, and ideally they will also start contributing to it their own tribal knowledge.</p>
<p>Even if I turn 1 person per month to improve our wiki, I consider it a win.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>The main drawback of writing docs is that someone spends time doing it.
But, in my experience, once you make wiki-writing just another item in your daily workflow, it becomes invisible.
It’s not a chore that you do separately, you do it continuously in small amounts.</p>
<p><strong>Taking notes and writing detailed docs is my superpower.</strong></p>
<p>I use my personal notes tens of times per day, and I use internal wikis as my entrypoints to most things.
You should start too.</p>
<p>If you want to read more about my personal notes, and the format I use, checkout the following articles I wrote in the past:</p>
<ul>
<li><a href="https://www.lambrospetrou.com/articles/best-tip-the-worklog/">Best tip I received — The worklog</a></li>
<li><a href="https://www.lambrospetrou.com/articles/the-worklog-format-1/">The worklog format 1.0</a></li>
</ul>

</article>
<article>
<h1>Feature Flags — CI/CD</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 17 Sep 2023 00:00:00 GMT</p>
<p>I originally sent part of this article to the mailing list of my <a href="https://www.elementsofcicd.com?ref=lambrospetrou.com">Elements of CI/CD</a> course, and now sharing it with the rest of you with improvements from the received feedback.</p>
<hr/>
<p><strong>Table of contents</strong></p>
<ul>
<li><a href="#overview">Overview</a></li>
<li><a href="#dynamic-configuration">Dynamic configuration</a></li>
<li><a href="#not-just-for-the-server">Not just for the server</a></li>
<li><a href="#beta-testing-and-allowlisting">Beta testing and allowlisting</a></li>
<li><a href="#rule-conditions">Rule conditions</a></li>
<li><a href="#feature-rollout">Feature rollout</a></li>
<li><a href="#ab-testing-and-experiments">AB testing and experiments</a></li>
<li><a href="#kill-switch">Kill-switch</a></li>
<li><a href="#testing-feature-flags">Testing feature flags</a></li>
<li><a href="#managing-feature-flags">Managing feature flags</a></li>
<li><a href="#conclusion">Conclusion</a></li>
</ul>
<h2 id="overview"><a href="#overview">Overview</a></h2><blockquote>
<p><strong>You should adopt and use feature flags as much as possible!</strong></p>
</blockquote>
<p>Feature flags allow you to deploy changes to your software rapidly, and continuously, as many times a day as you want, without sacrificing the reliability of your product.</p>
<p>When you use feature flags, deploying changes is only half the story, since no customer will be exposed to those changes by default.
The second half of the story, is that you selectively enable features to a subset of your customers.
This sounds simple, but it really is a superpower! 💪🏼</p>
<p>Imagine that your product is an Android mobile application where users can search for whiskies, and ultimately buy bottles directly from the website.
You are developing two new features that each needs a few weeks of development and iteration.</p>
<ul>
<li>Feature A is getting whisky recommendations based on whiskies you mark as liked.</li>
<li>Feature B is the addition of new payment providers that would allow customers to buy with less fees.</li>
</ul>
<p>Two different features, developed in parallel by different (or same) engineers.</p>
<p>In some teams, you would work on these features on separate git branches, and only merge on the production branch that gets deployed when the feature is ready to go live.
This means that you will have two separate branches diverging more and more over time, and you also don’t know if something else breaks until you merge back to the production branch.</p>
<p>Too late.</p>
<p>With feature flags, you can continuously merge your code changes to the production branch, and more importantly deploy them to production, while at the same time ensuring that nobody will get exposed to them before they are ready.</p>
<p>We do this by wrapping these in-progress/unfinished features with special conditions that check if the feature is enabled, and if yes, for which users, and only proceed if the current user is allowlisted for the feature.</p>
<p>In our whisky application above, let’s assume that feature A will not be enabled at all, and feature B will only be enabled for customers in England since the payment providers we implemented so far and are ready to be tested are for English customers.</p>
<p>Your codebase would call the following Java snippet when displaying the checkout screen:</p>
<pre><code class="language-java">record User(String id, String country) {};

List<PaymentProvider> getPaymentProviders(
  User user, 
  FeatureFlags ff) { 
  List<PaymentProvider> providers = new ArrayList();
  // ... some code that adds payment providers already supported

  if (ff.isEnabled(FeatureName.PayProviderXYZEnabled, user)) {
    providers.add(new PaymentProviderXYZ());
  }

  return providers;
}
</code></pre>
<p>The above snippet ensures that the <code>PayProviderXYZForEngland</code> is included in the checkout only when the feature flag <code>FeatureName.PayProviderXYZEnabled</code> is enabled for the <code>user</code> being handled.</p>
<p>A partial implementation of the <code>FeatureFlags</code> class for our whisky application can be the snippet below:</p>
<pre><code class="language-java">record FeatureFlag(
  Set<String> countries, 
  Set<String> userIds, 
  boolean enabledForAll, 
  boolean disabledForAll) {};

class FeatureFlags {
  enum FeatureName {
    PayProviderXYZEnabled,
    WhiskyRecommendationsEnabled
  }

  Map<FeatureName, FeatureFlag> rules = ImmutableMap.of(
    FeatureName.PayProviderXYZEnabled, new FeatureFlag(
      ImmutableSet.of("England"), 
      Collections.emptySet(), 
      false, 
      false
    ),
    FeatureName.WhiskyRecommendationsEnabled, new FeatureFlag(
      Collections.emptySet(), 
      Collections.emptySet(), 
      false, 
      false
    )
  );

  boolean isEnabled(FeatureName featureName, User user) { 
    FeatureFlag ff = this.rules.get(featureName);
    if (ff.disabledForAll()) {
      return false;
    }
    return ff.enabledForAll() || 
           ff.countries().contains(user.country()) || 
           ff.userIds().contains(user.id());
  }
}
</code></pre>
<p>In the above implementation of <code>FeatureFlags</code> our features can be selectively enabled for users based on their country, and their ID.</p>
<p>I have seen teams and companies implementing feature flags in vastly different ways.
There are implementations as simple as the above, and there are implementations that have complex rules as we will explore in following sections.</p>
<p>You can see that in a few lines we have a functioning feature flags system that allows you to selectively execute code based on the user, or any other condition you need.
This means that you can safely merge your code changes even if the features you work on are unfinished, or even incorrect, as long as you wrap their entry point execution call with a feature flag condition check.</p>
<h3 id="feature-flags-are-a-superpower"><a href="#feature-flags-are-a-superpower">Feature flags are a superpower</a></h3><p>Feature flags are a superpower, and by using them you:</p>
<ul>
<li>Can continuously merge and deploy changes, thus increasing your velocity.</li>
<li>Can make sure that all the in-progress changes are integrated into the production branch and do not cause other features to break.</li>
<li>Avoid maintaining separate diverging git feature branches that go out of date and lead to annoying time-wasting merge conflicts.</li>
<li>Can test features and get early feedback with as few, and as many, customers as you want.</li>
<li>Can quickly enable, and more importantly, disable, a feature by just changing a boolean value. Quickly rolling back broken features is now trivial!</li>
</ul>
<p>The rest of the article will focus on popular use-cases for using feature flags.
Going through these use-cases will hopefully make the benefits and power of feature flags clear.</p>
<h3 id="also-known-as"><a href="#also-known-as">Also known as</a></h3><p>Feature flags also go by the following terms (non-exhaustive):</p>
<ul>
<li>Feature switches</li>
<li>Feature toggles</li>
<li>Experiments</li>
<li>A/B testing (feature flags can be used for A/B testing as well)</li>
</ul>
<h2 id="dynamic-configuration"><a href="#dynamic-configuration">Dynamic configuration</a></h2><p>Feature flags are one use-case of dynamic configuration in our software applications.</p>
<p>In the code snippets above implementing the <code>FeatureFlags</code> class for our imaginary Android application, we used hardcoded rules for the features we wanted to conditionally enable.</p>
<p>This means that in order to update these rules, e.g. adding new user IDs to existing flags or adding new flags, we would need to go through the application CI/CD pipeline, and deploy the application itself to use the updated rules.
Depending on the nature of our product this might not allow us to iterate on the feature flags themselves easily, and we wouldn’t be able to rollout many changes per minute/hour.</p>
<p>To solve this issue, we need to move the definition of rules outside our main applications, into their own artefacts, with their own CI/CD pipeline that can be executed independently of the application’s pipelines.</p>
<p>A common approach I have seen in practice is to put the feature rules in text files (e.g. JSON, YAML, TOML) and have a CD/CD pipeline deploy them in Amazon S3-compatible buckets in different regions.</p>
<p>Our applications will have to be adapted to periodically (every 1 minute) fetch these configuration files, and recreate the <code>rules</code> inside the <code>FeatureFlags</code> class based on the latest configuration files.
This exact flow was how one of our feature flag systems worked a few years ago, while I was working on the AWS Console for <a href="https://docs.aws.amazon.com/codeguru/latest/profiler-ug/what-is-codeguru-profiler.html">Amazon CodeGuru Profiler</a>.</p>
<p>An alternative would be to use SaaS services like <a href="https://launchdarkly.com/">LaunchDarkly</a> or <a href="https://posthog.com/feature-flags">PostHog feature flags</a> to configure your feature rules, and then in your application code you would call their APIs to get a decision.</p>
<p>There are myriad ways you can make your feature flags dynamically configured, but in all cases you want to have:</p>
<ul>
<li>Separate deployment pipeline for the feature rules, outside the application’s deployment process.</li>
<li>Quick rollout of feature rules updates (forward updates, and rollbacks).</li>
<li>Periodic update of the active feature rules inside the applications.</li>
<li>Safe rollout of feature rules (if latest version is invalid, use the previous one to avoid breaking the applications).</li>
</ul>
<figure>
  
  <figcaption>Example of data flow for updating, storing, and fetching feature flag rules from Amazon S3.</figcaption>
</figure>

<h2 id="not-just-for-the-server"><a href="#not-just-for-the-server">Not just for the server</a></h2><p>Feature flag systems are not only for servers and backends.</p>
<p>Even though the actual feature rules will indeed have to be served by some server API, at some point, they can be used for multiple application types.</p>
<p><strong>You can use them on websites.</strong></p>
<p>When the static assets are served, you can inject the feature rules inside the HTML document.
Or, you can provide a dedicated API that the website will call once loaded to fetch all enabled features for the user session, and update periodically.
Or, you can retrieve the feature rules every time the user logins or refreshes their session token.</p>
<p><strong>You can use them on mobile applications.</strong></p>
<p>Hardcoded feature rules in the application itself can be used, but will only be able to be updated from inside the application itself (e.g. user enabling an experimental feature), or when the application itself is upgraded.
Usually, you provide a dedicated API that the application will call once loaded to fetch all enabled features for the user, and repeat periodically.</p>
<p><strong>Offline applications.</strong></p>
<p>In cases where there is no network connectivity or calling remote APIs is not possible, we can still use feature flags in the hardcoded fashion we explored previously.</p>
<p>The user of the application will need to do specific actions to enable or disable the features.
For example, in several CLI applications you need to provide specific commands to enable experimental features (e.g. <a href="https://nodesource.com/blog/experimental-features-in-node.js/">Node.js</a>).
In Android systems, you can <a href="https://developer.android.com/studio/debug/dev-options#enable">enable advanced Developer Mode</a> by clicking a specific menu item X number of times.</p>
<p>In general, with a bit of imagination you can use feature flags in any kind of software application we build. And you should.</p>
<h2 id="beta-testing-and-allowlisting"><a href="#beta-testing-and-allowlisting">Beta testing and allowlisting</a></h2><p>One important aspect of agile product development is to get feedback from customers as early as possible, and iterate over the application by fixing issues and implementing new features.</p>
<p>Feature flags allow us to expose incomplete and in-progress features to a subset of our customers.</p>
<p>This makes it easy to iterate on our applications without worrying that we are going to negatively impact the rest of our customers.
Not only that, but we can do it straight in our production environments, using real data, real dependencies, and real customers.
The feedback and confidence we can get by using production directly is great.</p>
<p>Feature flags are a great way to implement early access programs for your products, and even paid tester programs where you allow organisations and individuals to test your product, and give you feedback, before releasing it to the public.</p>
<h2 id="rule-conditions"><a href="#rule-conditions">Rule conditions</a></h2><p>In the example implementation for <code>FeatureFlags</code> we used the user’s ID and originating country as the rule conditions.
There are hundreds of different conditions that we can use. </p>
<p>Simple user conditions include:</p>
<ul>
<li>ID</li>
<li>originating country</li>
<li>pricing plan</li>
<li>device type</li>
<li>company/organization of the user (in B2B cases)</li>
</ul>
<p>Apart from user specific conditions, we can have global conditions:</p>
<ul>
<li>deployment environment region (e.g. only in AWS <code>us-east-2</code>)</li>
<li>percentage of our hosts (e.g. only 10% of our hosts should use a feature)</li>
<li>percentage of users (e.g. only 10% of the users should use a feature)</li>
</ul>
<p>These are just few conditions I have used in the different teams I worked with, and there are way more.
You can even combine multiple conditions together to get fine-grained control of your features.</p>
<h2 id="feature-rollout"><a href="#feature-rollout">Feature rollout</a></h2><p>Probably the most popular use-case of feature flags is to gradually, and safely, rollout a new feature.</p>
<ul>
<li>While the feature is implemented the feature is enabled only for certain internal users (or nobody), and only in our staging environments.</li>
<li>Once it’s ready for beta testing we enable it for a few customers, and for specific production environments.</li>
<li>Once it’s ready for full release, we enable it for each of our production environments, ideally not all-at-once to avoid breaking all customers in case a bug slipped through.</li>
<li>If the rollout completed successfully, we remove the code that does the condition check and always use the newly launched feature.</li>
<li>If the rollout of the feature caused some regression, we can update the rules to disable the feature, and deploy that in order to quickly disable the feature and revert to the working version of the application.</li>
</ul>
<p>This is the most basic, most common, and arguably the most important use of feature flags.</p>
<p><strong>Deliver value to your customers, safely, reliably, continuously!</strong></p>
<h2 id="ab-testing-and-experiments"><a href="#ab-testing-and-experiments">AB testing and experiments</a></h2><p>A more complex use-case for feature flags is A/B testing (experiments).</p>
<p>A/B testing is when we want to experiment with different variations of the same feature, for example choosing the color of the checkout button between yellow, green, blue.
A/B testing systems usually provide extra features on-top of what we already explored so far, but the underlying technology is often the same.</p>
<p>For example, at Amazon, for the retail website we had our own internal service for feature flags and experiments called <strong>Weblab</strong> <a href="https://medium.com/fact-of-the-day-1/experimentation-at-amazon-51b35490d805">[1]</a> <a href="https://www.awa-digital.com/blog/truth-about-amazon-booking-experimentation-culture/#:~:text=much%20copied%20feature.-,Amazon%20created%20its%20own%20experimentation%20platform,-It%20wasn%E2%80%99t%20until">[2]</a> <a href="https://www.smartinsights.com/digital-marketing-strategy/online-business-revenue-models/amazon-case-study/#:~:text=Amazon%20marketing%20strategy%20experiments">[3]</a>.</p>
<p>In Weblab you could create a new feature, or experiment, where you didn’t just specify the rules that enabled a feature.
You could specify multiple <strong>treatments</strong> of the feature, and the rules per treatment.</p>
<p>For example, for the checkout button color example, you would have the <code>Control (C)</code> treatment, which is the default/existing case when the feature is disabled.
Treatment <code>T1</code> was the first option of the feature, e.g. yellow button.
Treatment <code>T2</code> was the second option of the feature, e.g. green button.
Treatment <code>T3</code> was the third option of the feature, e.g. blue button.</p>
<p>If you just wanted the feature flag functionality you were done, and in the code you would have something like below (not real Amazon-code, just as example):</p>
<pre><code class="language-java">String treatment = flags.getTreatment(FeatureName.ButtonColor, user);
if (treatment == "C") { 
  this.buttonColor = this.colorDefault; 
} else if (treatment == "T1") {
  this.buttonColor = this.colorYellow;
} else if (treatment == "T2") {
  this.buttonColor = this.colorGreen;
} else if (treatment == "T3") { 
  this.buttonColor = this.colorBlue;
}
</code></pre>
<p>If you wanted the A/B testing (experimentation) functionality, Weblab would also track key metrics that you specified for each of the treatments.</p>
<p>In the example above, we could track the number of button clicks for each treatment, and therefore we would be able to get concrete data on which button color performed better.</p>
<p>The Weblab implementation showcases that when there is a robust feature flag implemented, there is a lot of interesting functionality that is now made possible.</p>
<h2 id="kill-switch"><a href="#kill-switch">Kill-switch</a></h2><p>Another common use of feature flags is the kill-switch.</p>
<p>In many important feature rollouts, and big events (e.g. Black Friday, Superbowl, Christmas), you might want to have an easy way to enable/disable specific functionality in your application quickly.
In the kill-switch use-case, we want to disable a feature, or immediately revert to a different implementation of a feature.</p>
<p>Having a kill-switch is very similar to the rollout of a feature that went wrong and we disable it until fixed.
But, instead of the feature flag being temporary, it’s permanent.</p>
<p>One real-world example of a kill-switch feature flag I encountered was in Amazon Video.
When we released super popular shows (e.g. The Grand Tour) the demand was very high.
We had several kill-switches all over the codebase in order to quickly disable certain functionality that would allow the services to scale better if we had unexpectedly high traffic.</p>
<p>For example, one of those kill-switches would switch from using server-rendering for the TV Show details page on the Amazon Video website, to using a static file from our CDN that would only show static information and allow you to stream the show.
Even though some features wouldn’t be provided (e.g. customer reviews), this emergency measure would allow customers to watch the show, which was the main goal.</p>
<p>In general, most feature flags are meant to be temporary to control the rollout of new features.
There are certain cases where we want the ability to dynamically change our application’s behavior, and that’s where permanent feature flags come in place, the kill-switches. </p>
<p>The deactivation/killing of certain features to allow others to perform better, falls into the <strong>graceful degradation technique</strong> (<a href="https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/rel_mitigate_interaction_failure_graceful_degradation.html">see AWS reliability pillar</a>, <a href="https://www.usenix.org/conference/osdi23/presentation/meza">see META’s Defcon</a>) we can employ in our complex systems in order to cope with scale.</p>
<h2 id="testing-feature-flags"><a href="#testing-feature-flags">Testing feature flags</a></h2><p>A common question I get from folks when discussing about feature flags is how to approach testing.</p>
<p>For every treatment of a feature flag, you essentially create another branch of logic that needs to be tested.
Do your unit/integrations tests need to to cover all possible values of the feature flag, or not?</p>
<p>Personally, I do the following thinking process:</p>
<ul>
<li>If some code is irrelevant to the feature flag, and is not affected by what the feature flag value will be, then I don’t amend the tests related to that code.</li>
<li>If some code is directly by the value of the feature flag, then I will try to cover it with tests.<ol>
<li>If the tests can become parametric, with the feature flag value as input parameter, the best would be to run all those tests for all the possible feature flag values.</li>
<li>If the tests cannot become parametric, then I usually test the happy path for all the feature flag values, and then some edge cases for the final feature flag value.</li>
</ol>
<ul>
<li>For example, if we introduce a new feature flag for some new functionality, all the existing tests should still pass!</li>
<li>Once I confirm that they pass, I mock the feature flag value for those tests to be what the final value will be after the full rollout. This will make sure that existing functionality will work fine once the feature flag is fully rolled out.</li>
<li>Finally, I will duplicate some important tests (or somehow make them parametric) with the feature flag off to make sure that some new logic will not inadvertedly depend on the feature flag, and in case we roll back its release we end up with issues.</li>
</ul>
</li>
</ul>
<p>Testing on its own is a hot-topic.
Some engineers are neutral about it, others like it, others hate it, and you have the extreme camps on either side.</p>
<p>I personally like to have some tests.
I definitely don’t push for arbitrary 90%+ coverage, but I also don’t like having zero tests for core business logic because it makes development much much slower.</p>
<p>If you like tests, then make them cover the feature flags too.
If you don’t like tests, then you shouldn’t even be reading this section.</p>
<h2 id="managing-feature-flags"><a href="#managing-feature-flags">Managing feature flags</a></h2><p>Managing feature flags includes all the actions and procedures you need to have in-place in order to boost your productivity while doing safe rollouts of new features.</p>
<ol>
<li>Have one defined process everyone will follow when creating, updating, and deleting feature flags. This process can be automated with simple CLIs, any low-code workflow tools that can interact with source control, or using any of the SaaS services dedicated to feature flags.</li>
<li>A feature flag should almost always have the following lifecycle:<ol>
<li>Creation and code changes introducing branching behavior.</li>
<li>Ship code changes with feature flag OFF in staging/production.</li>
<li>Gradually enable the feature flag ON.</li>
<li>Once the feature flag is fully enabled, monitor key metrics for N days/weeks.</li>
<li>Once the verification is done, and the feature flag has been fully enabled for sufficient time, you should now delete the code changes using the feature flag, and make the enabled code branch the new default!</li>
<li>Delete the feature flag.</li>
</ol>
</li>
<li>Do code reviews for any feature flag change. Enabling or disabling a feature flag, changes the running code, so treat it as any other code change.</li>
<li>Implement some kind of monitoring to notify you when a feature flag is fully enabled for N weeks, and not yet cleaned up. Or even worse, when a feature flag is partially rolled out for N weeks without new changes.</li>
</ol>
<p>As I said above, a feature flag introduces branching in the code behavior, so you should try to clean them up as soon as possible.</p>
<p>Leftover feature flags, especially partially rolled out ones, are a nightmare to maintain.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p><strong>Use feature flags everywhere!</strong> Backend servers, frontend websites, mobile apps.</p>
<p>Iterating on new features development without the fear of breaking production and impacting customers is a <strong>productivity booster.</strong></p>
<p>You can start by a simple mechanism of hardcoded feature rules in a file, periodically fetched by your servers, and expand into more complex solutions later.</p>
<p><strong>Deliver value to your customers, safely, reliably, continuously!</strong> 🚀</p>
<h3 id="changelog"><a href="#changelog">Changelog</a></h3><ul>
<li>2023-09-24<ul>
<li>Added sections “Testing feature flags” and “Managing feature flags”.</li>
<li>Added table of contents.</li>
</ul>
</li>
</ul>

</article>
<article>
<h1>How to pass the interview for software engineering roles in Big Tech</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 10 Sep 2023 00:00:00 GMT</p>
<p>Over the past decade, I have gone through tens of interviews myself as an interviewee at small companies, startups, and Big Tech companies, from junior to Principal level roles. I also completed more than a hundred interviews as an interviewer during my time at Amazon/AWS, Meta, and Datadog.</p>
<p>Recently, I had discussions with several friends and colleagues about interviewing for software engineering roles, from entry level to senior levels. Some of them were going through interviews right now, and others asking about how to get better in interviewing in general.</p>
<p>So, I decided that instead of copy pasting resources every time, and having the same discussions repeatedly, I should write an article putting down all the generic advice I would give them. This way, our 1:1 discussions could focus on the specifics of their role, company, and skill-set instead.</p>
<p>This is NOT an exchaustive reference of interview preparation material.
This is the advice I give to friends and colleagues, and how I personally prepare for interviews.</p>
<p>It works for me, it worked for some of my friends, so it might work for you too.</p>
<p><strong>Table of contents</strong></p>
<ul>
<li><a href="#proper-preparation-is-worth-it">Proper preparation is worth it</a></li>
<li><a href="#interview-process">Interview process</a></li>
<li><a href="#coding">Coding</a></li>
<li><a href="#system-design">System Design</a></li>
<li><a href="#behavorial">Behavorial</a></li>
<li><a href="#conclusion">Conclusion</a></li>
</ul>
<div class="upsell-section">
    <p>Prefer a personalized 1:1 session for tips, or a mock interview?</p>
    <div class="consulting-cta-container">
        <!-- <a class="cta-interview" href="https://cal.com/lambrospetrou/interview-preparation-1h" target="_blank" rel="noopener noreferrer"> -->
        <a class="cta-interview" href="https://go.lambros.dev/book-interview-prep" target="_blank" rel="noopener">
        Book interview preparation session
        <small>1:1 mock interview (coding or system design)</small>
        </a>
    </div>
</div>

<h2 id="proper-preparation-is-worth-it"><a href="#proper-preparation-is-worth-it">Proper preparation is worth it</a></h2><p>There are myriad of angry engineers online crying out loud that the current state of software engineering interviews is a mess. Some excuses they usually use against the coding interviews (which I do not agree with - I am pro-interviews!):</p>
<ul>
<li>Why would you ever put a tenured engineer, writing code for years, sometimes decades, through a 40-minute process writing code in a collaboration doc or a whiteboard…</li>
<li>Why would you ask interviewees to explain and describe systems that they would probably never built in their day to day job…</li>
<li>Why would you not just believe that if someone says they are ninja coders, they can actually write code…</li>
<li>Why would someone even spend 1-2 months preparing for a stupid interview just to get into a big tech company, if they never use any of that knowledge day to day…</li>
</ul>
<p>I am <strong>not</strong> going to debate the pros and cons of the interview process in this article, although I should write one in the future.</p>
<p>🙏🏼 For now, get over it! Big Tech companies, and even smaller ones, follow a pretty much standardised hiring process.</p>
<p><strong>And that’s a good thing! As long as you can devote some time to prepare.</strong></p>
<p>Preparing for the interviews, and actually <a href="https://blog.pragmaticengineer.com/software-engineering-salaries-in-the-netherlands-and-europe/">getting a job in a big tech company often means 3-10x times higher total compensation versus a local small company</a> 💰💶 </p>
<p>For me, and I am sure most people out there, this is effing important.</p>
<p>Do you not care for money, or getting a job in Big Tech, fine, stop reading now, close the tab, and/or <a href="https://twitter.com/LambrosPetrou">send me an angry tweet how much you disagree</a>.</p>
<p>Note: Big Tech means <a href="https://www.investopedia.com/terms/f/faang-stocks.asp">FAANG</a> in some circles, but in this article I refer to most multi-billion-dollar-revenue tech companies (e.g. Datadog, Cloudflare, Meta, Google, Stripe, Shopify).</p>
<h2 id="interview-process"><a href="#interview-process">Interview process</a></h2><p>The interview process in these companies is (mostly) the same.
There are small (or big) differences at the behavorial aspect of the interviews, and the difficulty of questions asked. Other than that, preparing for the top-tier companies will make your interviews with smaller companies a walk in the park. </p>
<p>Usually the whole interview pipeline consists of:</p>
<ol>
<li>Coding (2-3x)</li>
<li>System Design - Architecture (1-2x) — for Senior+ levels</li>
<li>Behavorial (1-2x)</li>
</ol>
<p>I am going to focus on the above parts of the interview process.
Each interview usually lasts 45 minutes, with some companies extending it to an hour.</p>
<p>Some companies might have interviews that are more conversational, or oriented around specific technologies (e.g. Java and JVM internals, machine learning algorithms and frameworks).
Those need dedicated preparation and it really depends on the company you are interviewing with, so I am not going to get into those.</p>
<h2 id="coding"><a href="#coding">Coding</a></h2><p>Coding interviews are the ones getting the most backlash by interviewees. Unless you are working in some deeply technical core Computer Science role, you are probably not exercising your coding problem solving skills at the level required for these interviews.</p>
<p>Therefore, you need to dedicate time preparing for this interview.</p>
<p>How much, depends solely on your skill, experience, and how often you practice.</p>
<p>As an example, after 5 years of working at Amazon/AWS, even though I worked at a technical product (lots of tree traversals, data transformations), I had to spend about 1 month practicing problem solving (almost) daily.
This is because for 5 years, I had <strong>zero practice</strong> in coding interviews.</p>
<p>Next time I switched jobs was 2.5 years later, and that time I only needed about 2 weeks of preparation, mostly reading past material to refresh my memory.
That’s because I had been doing 1-2 interviews every year just for practice, and was solving a few problems every couple months.</p>
<h3 id="coding-during-the-interview"><a href="#coding-during-the-interview">Coding - during the interview</a></h3><p>During this interview, you will usually use a coding collaboration tool like <a href="https://coderpad.io/">Coderpad</a> or <a href="https://www.hackerrank.com/">HackerRank</a> where both you, and your interviewer can collaborate on the code.</p>
<p>Initially, the interviewer will paste (or just verbally explain) the problem statement.</p>
<p>Once they give you the problem, this is where the “dance” starts, and it’s your turn to shine 😉</p>
<h4 id="1-ask-clarification-questions"><a href="#1-ask-clarification-questions">1. Ask clarification questions</a></h4><p>As straightforward as it sounds, I interviewed many candidates that completely skipped this step and jumped into the code.
Needless to say that most of those candidates failed spectacularly.</p>
<p><strong>You have to ask clarification questions.</strong>
It’s literally an evaluation checkbox in the interviewer’s handbook for your feedback.</p>
<p>Ask questions.</p>
<ul>
<li>edge cases</li>
<li>size of the data</li>
<li>invalid inputs</li>
<li>input and output formats expected</li>
</ul>
<p>Keep asking questions until the problem is 100% clear in your mind!</p>
<h4 id="2-explain-your-solution-with-examples-before-code"><a href="#2-explain-your-solution-with-examples-before-code">2. Explain your solution with examples before code</a></h4><p>Another mistake candidates often do is not explaining their full solution before coding it.</p>
<p>There are several benefits in explaining your solution verbally, and actually testing it with at least one example input.</p>
<ul>
<li>It’s important that you show how you think. How you approach a problem and walk through different solutions until you find the correct one.</li>
<li>If you missed something in your solution, explaining it might reveal it.</li>
<li>If you run out of time coding the solution, your interviewer will still know that you actually did think of a correct solution.</li>
<li>If you run out of time and cannot test your code at the end, you still need to show your interviewer that you can test an input with your solution and prove that it works. Do it early to get it out of the way.</li>
</ul>
<p>This shouldn’t be a 10-minute discussion.
It has to be a very short illustration, max 2-3 minutes, of how the solution works with an example input.</p>
<p>If you cannot find a solution within the first 2-3 minutes, it’s crucial that you keep talking.</p>
<p>Keep talking your thoughts, what do you have trouble with, which part of the solution works, and at which part are you stuck.
You need to keep the interviewer engaged and in-sync with your thinking.</p>
<p>This not only gives you points in working-together and collaboration, but more importantly makes it trivial for the interviewer to steer you in the right direction in case your partial solution is completely the wrong one, or to give you the hint missing to fully solve the problem.</p>
<p>Don’t take the above point to the extreme either.
If you need 1-2 minutes to think for yourself, that’s totally fine.
Tell the interviewer that you want 2 minutes to think of it, or write some notes down.
You just need to <strong>keep the interviewer in the loop</strong>.</p>
<p>Before moving on, I also ask the candidates to tell me the complexity of their approach in terms of time and space.
After writing the code, I ask them again to reason about its complexity and compare it with the original answer.</p>
<h4 id="3-write-the-code"><a href="#3-write-the-code">3. Write the code</a></h4><p>At this point you should have a clear idea of the solution you want to code.
You should never start coding before you have a crystal clear understanding of the problem, and its potential solution.</p>
<p>Throughout the session, remember that you need to keep the interviewer in-sync with what you are thinking and doing.</p>
<p>While you write the code, say the highlights of what you are doing.
For example, if you are going to iterate a list and do some transformation, say that and then write the code for it.</p>
<p>Many candidates have trouble knowing what to say and when while coding.
One easy way for me to think about this, is that if I would write a comment in the code normally, that’s what I say out loud.</p>
<p>Random tips:</p>
<ul>
<li>In these 10-15 minutes you usually just write 10-30 lines of code. You don’t need fancy classes, and hierarchies of inheritance. However, you do need to write simple and clean code that is easy to understand.</li>
<li>Write small functions to abstract away complexity from the function that solves the main problem. For example, if you need to iterate a string and parse something as part of a bigger solution, move the iteration and parsing in its own function to keep the main business logic simple.</li>
<li>If you forget about a specific language method name, or what the arguments are, don’t panic. Tell the interviewer something like <em>“I know that there exists a method YYY, but cannot remember its definition. Can I write XXX to represent that method and come back to it later?”</em>. Almost always the answer will be yes. If the tool supports running and evaluating the code, then this problem goes away since you can try and find the right method. Also, some interviewers will allow you to even google the method name (make sure to ask if you can do that though!).</li>
<li>If you are able to run the code written, then put <code>print/console.log</code> statements in your code and run it judiciously to make sure it works.</li>
<li>Don’t spend more than 2-3 minutes at the same line of code, typing and deleting without progress. Stop and think if needed. Ask the interviewer a question. Try to get out of your blackout.</li>
</ul>
<p>Typing the code of the solution is usually the easiest part.
You understand the problem.
You thought of the solution.
You just translate words into code at this point.</p>
<p>Remember to be friendly, engaging, and talk to the interviewer.
If you do this, they will consiously, and unconsiously, help you.
Either giving you a hint when they see you stuck, or just being there acting as your <a href="https://en.wikipedia.org/wiki/Rubber_duck_debugging">rubber duck</a>.</p>
<h4 id="4-walk-through-the-code-and-test-it"><a href="#4-walk-through-the-code-and-test-it">4. Walk through the code and test it</a></h4><p>Hopefully, you still have time after writing the code.</p>
<p>It’s important that you now do a quick walk through the code written, verbally explaining what each step does, and making sure it implements the previously discussed solution.</p>
<p>If you want to get all your points, you should also do a run-through with an example input.
Show how the input will be processed at each line, similar to how a debugger in an IDE would work in a step-by-step execution.</p>
<p>Before moving on to the next problem, assuming the tool supports it, make sure to run your code and confirm that it does the right thing.</p>
<p>This can be done trivially by just calling your function with sample inputs and printing out the returned values.
No need for fancy test frameworks to remember. For example, in Javascript this is how I do it:</p>
<pre><code class="language-javascript">function solve_problem_xxx(input) {
    // ...
}

console.log(solve_problem_xxx(/* input 1 */))
console.log(solve_problem_xxx(/* input 2 */))
console.log(solve_problem_xxx(/* input 3 */))
</code></pre>
<h4 id="5-the-interviewer-interrupts-you-to-move-on"><a href="#5-the-interviewer-interrupts-you-to-move-on">5. The interviewer interrupts you to move on</a></h4><p>There are cases where the interviewer will interrupt you after 15-20 minutes to move on to a different problem.</p>
<p>This usually happens when they have more questions to ask you, and in the interest of time they want to move on to cover more topics.
I usually tell the candidates how many questions we will do right at the beginning, so if your interviewer doesn’t say anything, it might be good to ask them yourself so that you can plan your time.</p>
<p>If you did all the above steps, you might not lose any points even if the code is not fully finished.
You clarified the problem, explained the solution, reasoned about its complexity, showcased that you can do a dryrun, and you proved that you can write simple and clean code.</p>
<p>Having said that, if you only wrote very few lines of code, you will lose points.</p>
<p>This is a coding and problem solving skills interview, so you need to prove that you can solve problems, and write the code for their solution.</p>
<p>So, move with urgency, be fast, and be methodical.
Follow the above steps, and practice.</p>
<h3 id="coding-preparation-before-the-interview"><a href="#coding-preparation-before-the-interview">Coding - preparation before the interview</a></h3><p>There are myriads of coding interview books, like <a href="https://www.amazon.co.uk/Elements-Programming-Interviews-Python-Insiders/dp/1537713949/">Elements of Programming Interviews in Python</a> and <a href="https://www.amazon.co.uk/Cracking-Coding-Interview-6th-Programming/dp/0984782850">Cracking the coding interview</a>. There are also hundreds of online coding platforms dedicated to coding interviews like <a href="https://www.hackerrank.com/">HackerRank</a>, <a href="https://leetcode.com/">Leetcode</a> (probably most famous). </p>
<p>I never liked Leetcode, and only use HackerRank from time to time to get some practice using an online evaluation tool instead of coding locally on my laptop.</p>
<p>After reading tons of books and trying these platforms, I personally wholeheartedly recommend to everyone that asks me how to prepare, to buy the <a href="https://www.amazon.co.uk/Elements-Programming-Interviews-Python-Insiders/dp/1537713949/">Elements of Programming Interviews in Python</a> book.</p>
<p>It has versions in other languages (e.g. <a href="https://www.amazon.co.uk/Elements-Programming-Interviews-Insiders-Guide/dp/1479274836">C++</a>, <a href="https://www.amazon.co.uk/Elements-Programming-Interviews-Java-Insiders/dp/1517671272">Java</a>), but I always recommend the Python version since the answers are very simple Python code that most programmers should understand and be able to translate in the language they use.</p>
<p>This book contains a lot of problems for every category of questions used in the interviews.
The problem difficutly ranges from easy to super hard.</p>
<p>There is even a sample guide at the front pages that suggests problems to solve from each category depending on how much time you can spend on preparation.
This is super useful to give you an idea of which problems are really core, and which ones can be left for later.</p>
<p>What I like the most about this book, is that there is an explanation of the solution, and the solutions are often small and clean. 
This is in stark contrast to the “Cracking the coding inteview” book that used to have tens of lines of Java classes that just take away focus from the actual problem being solved.</p>
<p>Disclosure: I am not affiliated with the authors of this book, nor do I get any commission promoting it. I honestly just love it, and use it as my sole preparation material for coding interviews for the past 4 years, with success.</p>
<h4 id="need-to-practice"><a href="#need-to-practice">Need to practice</a></h4><p>No matter the book, or the platform, or any other resource you use preparing for the coding interview, the only constant is that <strong>you need to practice</strong>.</p>
<p>Unfortunately, these problems are not something you do on a daily basis (in most roles), and therefore you need to spend time practicing in order to get good at them.</p>
<p>You need to solve a few problems of each category, so that later you can pattern match any given problem to something you previously did.
Even if you don’t get identical questions in the interview, you will most likely use techniques that you came across while preparing, which makes a huge difference.</p>
<p><strong>Put the time. Practice daily, or multiple times per week, for 2-5 weeks depending on your skills.</strong></p>
<h2 id="system-design"><a href="#system-design">System Design</a></h2><p>System design interviews are usually done only for the Senior level and beyond.</p>
<p>These are discussion oriented interviews where you are given a vague problem statement, and you need to come up with a design of a system architecture that solves that problem, and go deep into the technical details of the system.</p>
<p>For example, a common question is asking you to design <a href="https://www.youtube.com/">YouTube</a>.</p>
<p>Of course, you cannot just implement YouTube or come up with the absolute best architecture for it in 40-50 minutes.
However, you need to show your skills in thinking about systems, limitations, constraints, and making reasonable assumptions.</p>
<p>During this interview you will use an online whiteboarding tool like <a href="https://excalidraw.com/">Excalidraw</a>, or if the interview is onsite a real-life whiteboard ✍🏼</p>
<p>There are different variations of this interview.
There is the traditional backend/distributed systems interview, the mobile app design interview, the machine learning system design interview, and others.</p>
<p>Even though I am focusing mostly on the backend system design interview below, everything applies to the rest as well.
The steps, the area of focus, and the tips apply to all the variations.</p>
<h3 id="system-design-during-the-interview"><a href="#system-design-during-the-interview">System Design - during the interview</a></h3><p>Similarly to the coding interview, the system design interview can be tackled systematically, making it easier for you to prepare and handle the interview without depending too much on the interviewer steering of the conversation.</p>
<h4 id="1-explore-the-problem-and-ask-questions"><a href="#1-explore-the-problem-and-ask-questions">1. Explore the problem and ask questions</a></h4><p>Once you get the problem statement, you have to spend the next 5-10 minutes asking questions.</p>
<p>You need to explore the problem as deep and as broad as possible. There are two kinds of questions you should do:</p>
<ol>
<li>Business requirements (also known as functional requirements)</li>
<li>Technical requirements (also known as non-functional requirements)</li>
</ol>
<p><strong>Business requirements</strong></p>
<p>This is where you will define the exact problem you will solve, what use-cases to support, what functionality to provide.
Example questions (assuming the YouTube scenario):</p>
<ul>
<li>Who is the user of the product?<ul>
<li>viewers, video editors, ad publishers, …</li>
</ul>
</li>
<li>How often and when do they use it?<ul>
<li>24/7 vs business hours, timezone based, global, …</li>
</ul>
</li>
<li>What can they do with it?<ul>
<li>upload/view/edit video, comments, download video, like/dislike, playlists, …</li>
</ul>
</li>
</ul>
<p><strong>Technical requirements</strong></p>
<p>This is where you will understand the scale and constraints of the system you should design.
Example questions (assuming the YouTube scenario):</p>
<ul>
<li>How many users per second?</li>
<li>How many videos “actioned” by each user?</li>
<li>Size limit per video?</li>
<li>Acceptable latency per operation?</li>
<li>Eventual consistency vs synchronous actions</li>
</ul>
<p>Overall, after this series of questions you should know exactly what the product should do, and the constraints.</p>
<p>The interviewer might tell you to make assumptions instead of answering with a concrete value in some of your questions. 
In that case, try to give a reasonable guess based on products you know in real life (e.g. Facebook users around 2 billion).</p>
<p>The functional requirements are usually a much more limited set of the products you know and use, you would never propose a design for the whole of YouTube.
But you can design a system for uploading, and viewing videos.</p>
<p>The non-functional requirements almost always revolve around the following dimensions:</p>
<ul>
<li>Data size (ingestion, storage, processing)</li>
<li>Throughput (requests per second, number of users, read vs write ratio)</li>
<li>Latency and consistency (eventual vs synchronous consistency, asynchronous vs synchronous operations)</li>
<li>Cost (efficiency of the design)</li>
</ul>
<p>It might be necessary, and almost always suggested, to do some back of the napkin math to estimate number of requests per second, storage needed, and other numbers throughout the session.</p>
<p>For latency-related estimations, use the handy comparison table in “<a href="https://gist.github.com/jboner/2841832">Latency Numbers Every Programmer Should Know</a>“, and during your calculations use rounded numbers to simplify.</p>
<h4 id="2-provide-a-high-level-end-to-end-design"><a href="#2-provide-a-high-level-end-to-end-design">2. Provide a high-level end-to-end design</a></h4><p>Once you know what the system should do, for the following 5-10 minutes, the goal is to put some high-level design down.</p>
<p>This is important, and I have seen many candidates skip this step, and failing the interview in the end because they ran out of time without having an end-to-end system in-place due to spending too much time in a few components.</p>
<p>Here, you start talking about the main parts of the system.
You are not going into technical details now.
You describe the inputs of the system, main components of the system, and then the output of the system.</p>
<p>For example, for the YouTube scenario, assuming we only need an upload video and a view video page, this could be an initial high-level diagram.</p>
<p></p>
<p>You should not spend more than 10 minutes in this step.</p>
<p>The goal is not to cover every nitty gritty detail of the system, but to show that you understood the problem, you know the main components of the system, and the flow of data from input to output is clear.</p>
<h4 id="3-flesh-out-details-for-each-component"><a href="#3-flesh-out-details-for-each-component">3. Flesh out details for each component</a></h4><p>This step should take roughly 1/3-1/2 of the interview duration, 20-25 minutes.</p>
<p>You now have to take each component of the high-level design and go one step deeper, fleshing out enough technical details, such that if someone took your diagram and notes after this step they would have a good idea how to start implementing your system.</p>
<p>I recommend you start from the input of the system, and walking through to the outputs, so that you stay focused.
Do not jump from one component to another without being methodical, otherwise you will get confused and leave important things out.</p>
<p>For example, start by introducing load balancers in front of users.</p>
<p>Discuss the routing technique you use if there is any specific requirement.
Do you need any stateful load balancing, e.g. sticky sessions, or is it purely stateless?
Talk about these things as you draw.</p>
<p>Then, you move to the next component in our diagram above, the upload service.</p>
<p>How does the video uploading work?
Probably you need be able to handle GBs of data being uploaded.
Do you have a way to do it in parallel by splitting the video client-side, or is it all uploaded at once, or is it a multi-part upload?</p>
<p>Then, how does the video move to the transcoding service.
Does the upload service store it temporarily somewhere else like <a href="https://aws.amazon.com/s3/">Amazon S3</a>, and only pass the object key to the transcording service?
What’s the output of the transcoding service?</p>
<p>I hope it’s clear that in this section you go much deeper.
You discuss many technical details as you progress through the system.</p>
<p>General tips</p>
<ul>
<li>You should drive the interview. Don’t stop talking unless the interviewer interrupts you or asks you something. Show that you can control the interview and know how to describe a system. Do not just wait for them to ask questions and steer you towards a specific path.</li>
<li>Mention anything that comes to mind, but only draw and focus on main technical details and say that you will revisit the extra specific details in a follow-up round of deep-dives.<ul>
<li>For example, you shouldn’t spend 10 minutes discussing load balancing algorithms if that’s not the main problem being solved.</li>
</ul>
</li>
<li>You again need to cover the system end-to-end. The steps after this one will give you extra points, but this step is the meat of the interview.</li>
<li>Remember to justify your decisions as you go.<ul>
<li>For example, if you say that you use Amazon S3 for storing the video, explain the properties it provides and why it suits your needs.</li>
</ul>
</li>
<li>There are things that you probably won’t have expertise. You should still cover what you know, and explicitly mention what you don’t know.<ul>
<li>For example, one time I told the interviewer I didn’t know about exact streaming algorithms for videos, but I know they exist, so I would use one of those. I explained that I knew that videos are delivered in chunks, and that there are some manifest lists for the chunks of a video and the player requests the right chunks, etc.</li>
</ul>
</li>
</ul>
<p>Important aspects to flesh out:</p>
<ul>
<li>Load balancers and stateful/stateless scaling.</li>
<li>Databases used and why, e.g. NoSQL vs SQL RDBMS, data schemas.</li>
<li>Caching and database sharding.</li>
<li>Data flow from one component to the next one.</li>
<li>Point out explicitly if there are message queues (e.g. <a href="https://aws.amazon.com/sqs/">Amazon SQS</a>), event streaming like <a href="https://kafka.apache.org/">Kafka</a>, or synchronous gRPC calls.</li>
</ul>
<h4 id="4-discuss-about-constraints-limitations-special-considerations"><a href="#4-discuss-about-constraints-limitations-special-considerations">4. Discuss about constraints - limitations - special considerations</a></h4><p>At this point, the system should be well-defined with enough detail, and all the components are fleshed out.</p>
<p>For the next 5-10 minutes, you should start discussing about constraints of the system, its limitations, and single points of failure.</p>
<p>Examples:</p>
<ul>
<li>If you use a cache what happens if it crashes?</li>
<li>Can the database selected handle the expected load?</li>
<li>How is the system impacted if each component fails, i.e. which parts are single points of failure?</li>
<li>What does recovery look like when a server crashes during video transcoding?</li>
<li>Discuss optimizations needed at scale, e.g. using a Content Delivery Network (CDN) to offload the delivery of the video parts from your servers.</li>
<li>What if we want to support 2x the load, or 10x the customers?</li>
</ul>
<h4 id="5-deep-dive-into-specific-components"><a href="#5-deep-dive-into-specific-components">5. Deep dive into specific components</a></h4><p>At this point, I usually have about 5 minutes remaining, and I ask the interviewer if they need me to go deeper in a specific component or if they have specific questions.</p>
<p>If they say no, don’t finish it here.
Show your expertise in building systems, and pick a component and go deeper.</p>
<p>Focus on some of the aspects you brought up in the previous section and explain how you would tackle them.</p>
<p>If they say yes, then focus on that component and open up the discussion to them at this point, making it a dialogue.</p>
<h4 id="guidelines"><a href="#guidelines">Guidelines</a></h4><p>The following table provides a summary of how you should approach the system design interview, based on the previous sections.</p>
<p>You can of course deviate depending on the company-specific details, but you should still apply the same structured thinking, methodical end-to-end designing, and deep dives into key components to showcase your technical depth.</p>
<br/>

<table>
<thead>
<tr>
<th><span style="white-space:nowrap;">Time spent</span></th>
<th>Notes</th>
</tr>
</thead>
<tbody><tr>
<td><span style="white-space:nowrap;">5-10 minutes</span></td>
<td>Explore the problem and ask questions. Focus on business requirements (functional) and technical requirements (non-functional).</td>
</tr>
<tr>
<td><span style="white-space:nowrap;">5 minutes</span></td>
<td>Provide a high-level end-to-end design. The flow of data and actions should be clear end-to-end without too many technical details.</td>
</tr>
<tr>
<td><span style="white-space:nowrap;">25 minutes</span></td>
<td>Flesh out all technical details for all the components. Start from the input of the system, all the way to the outputs. If someone took your diagram and notes after this step they would have a good idea how to start implementing your system.</td>
</tr>
<tr>
<td><span style="white-space:nowrap;">5-10 minutes</span></td>
<td>Discuss about constraints, limitations, crash recovery and fault-tolerance, special considerations depending on the problem.</td>
</tr>
<tr>
<td><span style="white-space:nowrap;">5-10 minutes</span></td>
<td>Deep dive into specific components if there is enough time. This section can be skipped if the previous one justifies taking more time.</td>
</tr>
</tbody></table>
<p>Overall, keep in mind you will be evaluated for the following criteria:</p>
<ul>
<li>Problem exploration and navigating ambiguity.</li>
<li>Understanding requirements and providing a high-level solution.</li>
<li>Showcase technical depth and broad knowledge. How well do you know the technologies you choose and how well do you justify using them.</li>
<li>Ability to communicate clearly when describing technical solutions.</li>
</ul>
<h3 id="system-design-preparation-before-the-interview"><a href="#system-design-preparation-before-the-interview">System Design - preparation before the interview</a></h3><p>Preparing for the system design interview doesn’t have a single approach.
It’s not as simple as the coding interview, which boils down to practicing more questions.</p>
<p>Practicing system design questions helps, but if you don’t know much, then you won’t know what you don’t know 🤯</p>
<p>What I recommend, and what I personally do is the following:</p>
<ul>
<li>Look into how your existing company implements a lot of their complex systems. You probably have access to all the internal implementation details, even the people working on them, so you can ask questions. Looking at real systems and how they are implemented is tremendously useful. I personally learnt a lot by researching how several AWS systems are implemented.</li>
<li>Read engineering blogs from well-known tech companies. This is very vague, I know, and I also have trouble following blogs outside a couple. The following is what I study religiously and then search around in more companies depending on topic.<ul>
<li><a href="https://aws.amazon.com/builders-library/">The Amazon Builders’ Library</a>: This is one of my favourite resources for learning about distributed systems. These are technical articles taken out from actual Amazon/AWS systems. I actually saw many of the techniques described in these articles in real-life during my time at AWS, and that’s why I love them. It’s not just marketing bullshit.</li>
<li><a href="https://engineering.fb.com/">Meta engineering blog</a>: Meta’s engineering blog is among my favourites. It spans things from AI, to developer tooling, to core infrastructure platforms, to web-scale metric systems. Some of them are high-level and not very technical, but some are super nice.</li>
<li><a href="https://blog.cloudflare.com/">Cloudflare engineering blog</a>: Cloudflare writes amazing technical blog posts about lots of their infrastructure and products. They range from super deeply technical network solutions to high-level architecture designs.</li>
<li>Other company blogs: <a href="https://www.datadoghq.com/blog/engineering/">Datadog</a>, <a href="https://stripe.com/blog/engineering">Stripe</a>, pick your favourite tech company.</li>
</ul>
</li>
<li>Watch the <a href="https://www.youtube.com/playlist?list=PLeNDQKdre0oEzLXh8Ksl2Ocoeltx0gD8-">Systems Architecture Interview</a> videos by Jackson Gabbard.</li>
<li><a href="https://www.amazon.co.uk/gp/product/1838430210">Understanding Distributed Systems, Second Edition: What every developer should know about large distributed applications</a><ul>
<li>I love this book. It does NOT go deep into the topics discussed, but it gives you a very broad coverage of many aspects around distributed systems.</li>
<li>This can be a great starting book that will expose you to the many topics you should be aware when designing systems, and then get other resources to go deeper in the topics you feel you have a gap.</li>
</ul>
</li>
<li>Watch the <a href="https://www.youtube.com/playlist?list=PLeKd45zvjcDFUEv_ohr_HdUFe97RItdiB">Distributed Systems lecture series</a> by Martin Kleppmann.<ul>
<li>Amazing playlist by the author of “Designing Data-Intensive Applications” (DDIA).</li>
<li>This series covers core distributed systems concepts (e.g. logical clocks, consensus, replication, quorums) with crystal clear explanations.</li>
</ul>
</li>
<li><a href="https://www.amazon.co.uk/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/B08VKMNDBN/">Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems</a><ul>
<li>At this point, this is the bible of distributed systems.</li>
<li>This book is not for beginners, or for spending a few days to quickly go over topics. This is an in-depth technical book, focused on the data aspects of applications and database concepts in general.</li>
<li>I recommend you leave this last, unless it’s really matching the role you are interviewing for. But, you should definitely read it if you have the time.</li>
</ul>
</li>
<li><a href="https://github.com/donnemartin/system-design-primer">The System Design Primer</a><ul>
<li>Has a lot of information, examples of questions, and links to lots of other content to help with the system design interview.</li>
</ul>
</li>
<li>After you have read the above (or even during), it’s time to practice more. Just pick any product you use on a daily basis, pick a specific subset of its functionality, and start brainstorming how you would design it.<ul>
<li>Doing this a few times will help you develop intuition in common solutions and techniques as most system designs have similar components.</li>
<li>Try to first think a solution on your own before googling to find out information about the actual implementation.</li>
</ul>
</li>
</ul>
<h4 id="other-resources"><a href="#other-resources">Other resources</a></h4><p>The above material should be more than enough to prepare for the System Design interview, but if you want more material, the following are some resources I used a bit (definitely not exchaustively).</p>
<ul>
<li>Articles related to distributed systems by Murat Demirbas: <a href="http://muratbuffalo.blogspot.com/">http://muratbuffalo.blogspot.com/</a></li>
<li>Articles by Marc Brooker: <a href="https://brooker.co.za/blog/">https://brooker.co.za/blog/</a></li>
<li>The morning paper: <a href="https://blog.acolyer.org/">https://blog.acolyer.org/</a></li>
<li>The <a href="https://blog.pragmaticengineer.com/preparing-for-the-systems-design-and-coding-interviews/">Preparing for the Systems Design and Coding Interview</a> article by Gergely Orosz has a lot of references on books, courses, and material to study.</li>
<li>Free online course: <a href="https://www.hiredintech.com/system-design/">https://www.hiredintech.com/system-design/</a></li>
<li>Preparation links and resources for system design questions: <a href="https://github.com/shashank88/system_design">https://github.com/shashank88/system_design</a></li>
<li>System design interview for IT companies: <a href="https://github.com/checkcheckzz/system-design-interview">https://github.com/checkcheckzz/system-design-interview</a></li>
</ul>
<h2 id="behavorial"><a href="#behavorial">Behavorial</a></h2><p>This interview varies a lot company to company.</p>
<p>Some companies focus on project work, digging into specific work you did, extracting information about your contribution, project complexity, etc.</p>
<p>Other companies focus on the people aspect of things, extracting information about your decision making, conflict resolution, and collaboration skills.</p>
<p>Most companies focus on both 😅</p>
<h3 id="behavorial-during-the-interview"><a href="#behavorial-during-the-interview">Behavorial - during the interview</a></h3><p>The behavorial interviews are not as straight-cut as the technical interviews explored above.
Therefore, I will just focus on things that apply in general, and things you should be prepared for anyway.</p>
<ul>
<li>Be honest. Do not lie that you did things you didn’t do. If they actually use your provided references, they might reveal your lies. In most cases though, if the interviewer is experienced, they will pick up your lying and will ask follow-up questions that either you will have to lie even more, or you won’t know the answer to. In either case, you lose.</li>
<li>Be clear and use simple language. Big companies have employees from countries all around the world, and even though English is probably the language used daily, most people have accents. Speak in simple terms so that you are always understood, and keep an ear out if the interviewer asks you the same thing multiple times throughout the interview, since it might be an indication that communication is not clear.</li>
<li>Don’t be an arrogant jerk. Some folks think they are gods of engineering. Even if you are, don’t put it in your interviewer’s face and show-off. Showcase your skills with concrete data, examples, deep technical explanations, without insulting your interviewer.</li>
<li>Be specific in your contributions. There are very few products or projects that are delivered end-to-end by a single engineer. When describing some work you did, make sure to explain the overall situation with the product/project, the team, but emphasize on what you did as well. As an interviewer I have to know what your contribution is to the project. I don’t care about the project you worked on per se, but about the work you did, or you didn’t.</li>
<li>Use the <a href="https://capd.mit.edu/resources/the-star-method-for-behavioral-interviews/">STAR method</a> when talking about past projects. Explain the <strong>Situation</strong> of the project, the <strong>Task</strong> to complete, the <strong>Actions</strong> you took, and finally the <strong>Result</strong>.</li>
<li>If you get a question that you have no idea what to answer, say it. Don’t stay there hanging or saying something completely irrelevant. Help yourself by letting the interviewer know. If they insist, then try to ask questions to get more specific, until it’s something you can answer.</li>
</ul>
<p>Overall, these interviews have the following goals:</p>
<ul>
<li>Did the candidate contribute in a project in a significant way, and knows how to quantify that, and describe their work to someone?</li>
<li>Is the candidate someone that can collaborate in a team, and be a good colleague to the rest of the company?</li>
</ul>
<p><strong>Be honest, be clear and specific, and showcase your skills without arrogance.</strong></p>
<h3 id="behavorial-preparation-before-the-interview"><a href="#behavorial-preparation-before-the-interview">Behavorial - preparation before the interview</a></h3><ul>
<li>Go down memory lane and find at least two projects you are proud of. Be able to answer any question around them. You should be able to use the STAR method to describe the project, what problem it solved and what was your contribution. You should be able to answer technical questions about the project too, so spend some time reminding yourself about specifics of the project.</li>
<li>Prepare answers for the most common behavorial interview questions. Some of these questions are horrible, I hate them myself too, but many companies ask them. So, prepare before hand. These are the known “Tell me about a time” questions (<a href="https://www.themuse.com/advice/behavioral-interview-questions-answers-examples">1</a>, <a href="https://hbr.org/2023/01/how-to-answer-tell-me-about-a-time-you-failed-in-a-job-interview">2</a>). Some examples:<ul>
<li>Tell me about a time you had a conflict with a colleague and how you resolved it.</li>
<li>Tell me about a time you had to solve a complex problem.</li>
<li>Tell me about a time your actions led to a negative outcome, and how did you recover.</li>
<li>Tell me what your colleagues would say as your best quality.</li>
<li>Which soft or hard skills would you like to improve on.</li>
</ul>
</li>
<li>Study the company you are interviewing with, and try to sell yourself in a way that makes sense for them. For example, if the company is an analytics company, try to talk about a project or something you did and how it impacted the analytics of your product, or talk about work you did to improve the analytics you gathered to improve decision making.</li>
<li>Watch this <a href="https://www.youtube.com/watch?v=PJKYqLP6MRE">Intro to Behavioural Interviews</a> video by Jackson Gabbard.</li>
</ul>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>I hope the above information helps even a tiny bit your preparation for the interviews.</p>
<p>This is my approach, and how I prepare for interviews.
It has worked well for me so far, so I am confident you can get something valuable out of it.</p>
<p>Good luck 💪🏼 and <a href="https://twitter.com/LambrosPetrou">let me know</a> if you found this useful or if it helped you get that job!</p>
<div class="upsell-section">
    <p>Prefer a personalized 1:1 session for tips, or a mock interview?</p>
    <div class="consulting-cta-container">
        <!-- <a class="cta-interview" href="https://cal.com/lambrospetrou/interview-preparation-1h" target="_blank" rel="noopener noreferrer"> -->
        <a class="cta-interview" href="https://go.lambros.dev/book-interview-prep" target="_blank" rel="noopener">
        Book interview preparation session
        <small>1:1 session — coding or system design interviews.</small>
        </a>
    </div>
</div>

</article>
<article>
<h1>Merge two directories recursively</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 13 Aug 2023 00:00:00 GMT</p>
<h2 id="problem"><a href="#problem">Problem</a></h2><p>I want to copy all the contents of a directory (or folder) named <code>source-folder</code> into another directory named <code>destination-folder</code>.</p>
<p>Let’s walk-through an example below.</p>
<p>Assume <code>source-folder</code> contains:</p>
<pre><code>file1.txt
file2.txt
directory1/file3.txt
directory2/file4.txt
</code></pre>
<p>And <code>destination-folder</code> contains:</p>
<pre><code>file5.txt
directory1/filexyz.txt
directory2/file4.txt
</code></pre>
<p>The final structure I want in <code>destination-folder</code> should be:</p>
<pre><code>file1.txt
file2.txt
file5.txt
directory1/file3.txt
directory1/filexyz.txt
directory2/file4.txt
</code></pre>
<ul>
<li>The <code>directory2/file4.txt</code> will be the original that was already in <code>destination-folder</code>, hence ignoring the one from the <code>source-folder</code>.</li>
</ul>
<h2 id="solution"><a href="#solution">Solution</a></h2><p>An easy solution is to use the <a href="https://linux.die.net/man/1/rsync"><code>rsync</code></a> command line tool, available in most Linux/Unix systems.</p>
<pre><code class="language-sh">rsync -av --ignore-existing source-folder/* destination-folder/
</code></pre>
<p>Without the <code>/*</code> it just copies the source folder itself into the destination!
So, you would end up with <code>source-folder</code> under the <code>destination-folder/</code>.</p>
<p><strong>Note:</strong> According to <a href="https://unix.stackexchange.com/a/149986">https://unix.stackexchange.com/a/149986</a> the wildcard is not needed, but the trailing slash is.</p>
<h2 id="detailed-explanation"><a href="#detailed-explanation">Detailed explanation</a></h2><ul>
<li><code>rsync</code>: This is the command-line utility used for synchronizing files and directories between different locations, either on the same system or between different systems.</li>
<li><code>-av</code>: Combines the following two options:<ul>
<li><code>-a</code> (or <code>--archive</code>): This option is used to achieve archive mode, which is essentially a combination of several other options like <code>-r</code> (recursive), <code>-l</code> (copy symlinks as symlinks), <code>-p</code> (preserve permissions), <code>-t</code> (preserve modification times), and more. It’s a convenient way to ensure that files are copied with their metadata and properties preserved.</li>
<li><code>-v</code> (or <code>--verbose</code>): This option enables verbose mode, meaning that rsync will display detailed information about the files being transferred and the progress of the synchronization.</li>
</ul>
</li>
<li><code>--ignore-existing</code>: This option tells rsync to skip copying files that already exist in the destination folder. If a file with the same name exists in the destination, it won’t be overwritten or updated.</li>
</ul>

</article>
<article>
<h1>Fast Feedback Loop and Delayed Gratification</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Mon, 29 May 2023 00:00:00 GMT</p>
<p>For the past few months I have been working on some side-projects that took way longer than I expected, or initially planned.</p>
<p>This was partly due to my mistake of keep expanding their scope, but also because I underestimated the time investment needed.
Shocking! Planning estimates gone wrong…</p>
<p>These delays sometimes demotivate me, or even trigger second-thoughts if I should even be doing these projects in the first place. Should I spend my time somewhere else instead?</p>
<p>My projects being delayed is not the purpose of this article though.</p>
<h2 id="delayed-gratification"><a href="#delayed-gratification">Delayed Gratification</a></h2><p>I was always the kind of person that delays instant short-term gratification for a better outcome in the long-term.</p>
<p>In my younger days, I was studying hard to learn things well, get good grades to get into a good university years after, and building the skills to get a nice job.</p>
<p>In sports (volleyball), I trained for weeks, months, or years, in order to feel the excitement, satisfaction, and that amazing dopamine rush when we won matches or tournaments.</p>
<p>In other words, I am used to working hard now, for a better outcome later.</p>
<p>This is what <strong>delayed gratification</strong> means, and is explained nicely in James’ Clear article <a href="https://jamesclear.com/delayed-gratification">“40 Years of Stanford Research Found That People With This One Quality Are More Likely to Succeed”</a>.</p>
<h2 id="fast-feedback-loop"><a href="#fast-feedback-loop">Fast Feedback Loop</a></h2><p>A <strong>fast feedback loop</strong> is most often mentioned as part of Agile methodologies for project management/execution, modern developer tooling and platforms that promise faster iterations, and more.
In general, it’s a popular pitch line for “something” increasing your productivity and output.</p>
<p>To me, fast feedback loop means that whenever I do something, anything, I am able to observe some kind of response back soon thereafter.
Some feedback that indicates if what I did was good, bad, or if it had any effect at all.</p>
<p>When I am writing software code, I want to have a very fast feedback loop between the time I write the actual code to the moment when I know it’s functionally correct.
Having good tooling that allows me to run unit tests as fast as possible (or a live-updating UI) the moment I save my code changes is absolutely crucial to my daily workflow.</p>
<p>I also want a tight feedback loop from when I merge my changes to the time I get feedback from customers actually using them.
I want to have a continuous, fast, reliable way of shipping changes to production, and then also a way to get feedback on how it’s doing.
That comes through metrics we define, complaints reported from customers, or even social media.</p>
<h2 id="the-realization-of-the-conflict"><a href="#the-realization-of-the-conflict">The Realization of the conflict</a></h2><p>Let’s get back to the realization I had. 😅</p>
<p>Having a <strong>fast feedback loop</strong> seems to be in conflict with <strong>delayed gratification</strong>.</p>
<ul>
<li>How can you build something that requires months or years, when you need a little amount feedback, a little gratification, all the time?</li>
<li>What can keep you going through hard times, when there is nothing to show for it yet?</li>
<li>How can you justify spending so much time on something, without getting something in return?</li>
</ul>
<p>Even though I have embraced delayed gratification for most of my life, I now seem to <em>need</em> a fast feedback loop for most things I do.
I can confidently argue that this is a side-effect of my day job as a software engineer, and it’s now influencing my attitude towards life too.</p>
<p>I realized though, that these two powerful behaviors are actually not in conflict with each other.</p>
<p>I have always been getting fast feedback, even when working towards very long-term big goals.</p>
<p>With midterm and final exams at school, you get feedback every few weeks.
At sports, I play local tournaments every few weeks/months, and my teams always played in league competitions spanning months with matches at least once a month.
In my day job, I get feedback during local development with good tooling. I get customer feedback through fast iteration on features and observability of the system behavior through metrics every time we release something. </p>
<p>The rookie mistake I did with my side-projects is that I had set these humongous goals at the beginning, without any small iterative milestones that would be completed every few weeks. </p>
<p>I was missing that dopamine boost.</p>
<p>I call this a “rookie” mistake because I already knew about this principle. Splitting big chunks of work into smaller pieces is essential for my day job.
I read about agile and lean development from myriad books. And I still did the mistake…</p>
<p>One of my main projects is to create an advanced online course for Continuous Integration and Continuous Deployment (CI/CD) (<a href="https://www.elementsofcicd.com/">elementsofcicd.com</a>). Building the content for several months now, felt like writing a book, where you cannot get customer feedback until the whole book is finished, printed, then sold, and then getting your gratification.</p>
<p>I made a mistake. I should have taken small sections of the course and publishing them as articles.
Maybe making a few smaller courses out of the sections that stand on their own, and releasing them even if the main course is not yet ready.</p>
<p>This would give me feedback, satisfaction, and the gratification that would keep me going to build the bigger, more complete, course I initially planned.</p>
<h2 id="life-rule"><a href="#life-rule">Life rule</a></h2><p>This is now a rule for me, at work, and at life.</p>
<p>If I want to get things done, I need to adjust my environment, and my tasks, such that I maintain a fast feedback loop whilst at the same time delaying my gratification for the ultimate goal.</p>
<p>Finding a way to get <strong>continuous little-gratifications</strong>, in order to be able to get that <strong>delayed huge-gratification</strong>, is crucial.</p>
<blockquote>
<p>“Most people optimize for the day ahead. A few people optimize for 1-2 years ahead. Almost nobody optimizes for 3-4 years ahead (or longer).</p>
<p>The person who is willing to delay gratification longer than most reduces competition and gains a decisive advantage.</p>
<p>Patience is power.”</p>
<p>— By James Clear in <a href="https://jamesclear.com/3-2-1/november-4-2021">“3-2-1: The value of nature, controlling your attention, and designing your environment”</a></p>
</blockquote>

</article>
<article>
<h1>The perfect PaaS: Does it exist? Or impossible to build?</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Thu, 02 Feb 2023 00:00:00 GMT</p>
<p><em>Warning: This article is a combination of deep multi-month research, hands-on experience over almost a decade, and ideas, all combined into a nice elaborated ranting. Proceed at your own peril!</em></p>
<p><strong>Table of contents</strong></p>
<ul>
<li><a href="#intro">Intro</a></li>
<li><a href="#this-question-haunts-me">This question haunts me</a></li>
<li><a href="#what-do-i-want-from-a-platform">What do I want from a platform?</a></li>
<li><a href="#pricing">Pricing</a></li>
<li><a href="#compute">Compute</a></li>
<li><a href="#continuous-integration--continuous-deployment">Continuous Integration & Continuous Deployment (CICD)</a></li>
<li><a href="#flexibility-and-extensibility">Flexibility and extensibility</a></li>
<li><a href="#good-citizen">Good citizen</a></li>
<li><a href="#developer-experience-dx-or-devx">Developer experience (DX or DevX)</a></li>
<li><a href="#my-complaint-to-aws">My complaint to AWS</a></li>
<li><a href="#conclusion">Conclusion</a></li>
<li><a href="#the-question-remains">The question remains</a></li>
</ul>
<h2 id="intro"><a href="#intro">Intro</a></h2><p>Over the past 5 months, I have spent my entire free time outside working hours, researching, trying, and reading documentation about every cloud platform product I could get my hands on. From Platform-as-a-Service (PaaS), to self-hosted PaaS, to Backend-as-a-Service (BaaS), to traditional server management tools, to Infrastructure as Code (IaC) tools, to the newer and trendier Infrastructure from Code (IfC) tools, and everything in between.</p>
<p>I was still left disappointed with the existing offerings.</p>
<p>At some point, I even started being serious about building the platform I want myself.
And then, once my own needs are satisfied, making it a product for others to use as well. Because, what’s a better business than one that solves your own problem, right? 😁</p>
<p>In a nutshell, I was (and still am) looking for a platform to use in order to deploy my stupid no-revenue nobody-uses toy projects, but also my more serious need-it-to-always-run might-bring-me-money projects.</p>
<p>I am a software engineer by trade, and over the years I have used a variety of platforms.
During my time at Amazon, I started with our pre-AWS internal system (Apollo), but then (thank God) we migrated on AWS where I used and built several abstractions with raw <a href="https://aws.amazon.com/cloudformation/">AWS Cloudformation</a> (and sprinkles of <a href="https://jinja.palletsprojects.com/en/3.1.x/">Jinja</a>), then <a href="https://aws.amazon.com/serverless/sam/">AWS SAM</a>, and then <a href="https://aws.amazon.com/cdk/">AWS CDK</a>.
Similarly, at Meta we had our own internal deployment platform(s), and now at Datadog, we have yet another internal platform built on-top of Kubernetes (K8s).</p>
<p>Even within Amazon, every single team was building directly on-top of AWS, so you can imagine every team building their own mini platform and abstractions. Some were doing it really well, and some were doing it horribly bad… The funny part was that the final result of each team’s infrastructure and their CICD pipelines looked similar (at least the good ones).
It was then, during my time working with AWS, that I started imagining what the ideal platform would look like, having seen the things that worked well, and the mistakes that nobody should ever repeat.</p>
<p>After all this time, I still haven’t found the platform that gives me everything I want. There are a few that come really close, but then fail me in some way.</p>
<h2 id="this-question-haunts-me"><a href="#this-question-haunts-me">This question haunts me</a></h2><blockquote>
<p><strong>Does the perfect PaaS exist, or is it impossible to build?</strong></p>
</blockquote>
<p>There are tens of startups launched in this domain every year (<a href="https://twitter.com/amasad/status/1620679080382464000">joke tweet 1</a>, <a href="https://twitter.com/monkchips/status/1368924845740810249">joke tweet 2</a>), but they either shut down after a few months, or they pivot into something entirely different, or they are just not good enough.</p>
<p><em>Is the perfect platform just a mirage? Why doesn’t it exist? Or does it, and I just haven’t found it yet?</em></p>
<p>Almost everyone I know (including myself) uses a combination of at least 2-3 platforms, integrating them with Infrastructure-as-Code templates, and then managing and running those templates in yet another Continuous Integration (CI) platform. </p>
<p>This is OK if you work on one project, but I don’t want to do it for every single project I work on.</p>
<p><em>Am I wasting my time on something that cannot be built? Is every person, and every team, really all that different to justify each building their own platform?</em></p>
<p>The front-end folks managed to converge in beautiful platforms, with <a href="https://pages.cloudflare.com/">Cloudflare Pages</a>, <a href="https://vercel.com/">Vercel</a>, and <a href="https://netlify.com/">Netlify</a>, providing a very compelling package. I like them, and I use them. But sometimes they are too restricted and get super pricy when you want to lift those restrictions (<a href="https://www.lambrospetrou.com/articles/serverless-platforms-2022/">read my serverless platforms overview back in 2022</a>), which is when I fallback to a full-stack platform again.</p>
<p>I still believe that the platform I imagine can satisfy the needs of many people, and many teams.</p>
<h2 id="what-do-i-want-from-a-platform"><a href="#what-do-i-want-from-a-platform">What do I want from a platform?</a></h2><p>I want the <strong>Heroku developer experience (DX) on-top of AWS</strong>.
That’s it, that’s the buy pitch.</p>
<p>AWS is super powerful, reliable, with pay-as-you-go pricing. I love it. However, the developer experience using AWS directly is abysmal. I experienced AWS from the inside, and I know they won’t fix this. At least not any time soon.</p>
<p>Some folks joke that Heroku <strong>is</strong> the Heroku experience on-top of AWS. But I disagree. Heroku being built on-top of AWS is an implementation detail I don’t care about. I want the products AWS offers (or similar), the EC2s, the ECSs, the Lambdas, the DynamoDBs, and the S3s. But, I want them packaged in a Heroku experience.</p>
<h2 id="pricing"><a href="#pricing">Pricing</a></h2><p>Pricing is a huge thing for any company. Pricing defines, and filters, the customers a company will attract. Do you want the Enterprises, charge thousands. Do you want the indie developers, give stuff for free.
The pricing model is what will make a company profitable, or kill it.</p>
<p>Having said that, for me, as a customer, I like it more when pricing starts from zero (<code>$0</code>) when I don’t consume any resources, and increases as I use the platform more.</p>
<p>This pricing model has many names and variations: pay as you go, metered, usage-based, and nowadays a core component of any <a href="https://www.gomomento.com/blog/fighting-off-fake-serverless-bandits-with-the-true-definition-of-serverless#:~:text=Putting%20it%20all%20together%3A%20the%20Litmus%20Test%20for%20Serverless">serverless</a> product.</p>
<p>If you don’t use something you don’t pay, and the more you use something the more you pay.</p>
<p>I want to be able to host my toy projects that have literally 0 requests-per-second for cents, but I am happy to pay more for the projects I care about being available.</p>
<p>While looking through products, I tend to compare their compute pricing with AWS EC2, and anything that is within 2x of EC2 costs feels OK to me.</p>
<p>For comparison, Heroku costs 5-10x (or higher) more than AWS once we go past the entry-level dynos (what they call their instances): <code>Performance M</code> for <code>$250/dyno/month</code> vs <code>EC2 m6g.medium</code> for <code>$33.58/instance/month</code>.</p>
<p>Most new platforms also fail in the pricing dimension, and are way more expensive than what I would pay for my workload.
Some platforms take a cut of the AWS bill for the resources they manage, which to me is nuts. The value I get from their platform doesn’t depend on the instance types I use (ten instances of <code>t4g.nano</code> vs <code>c6g.2xlarge</code>) so why should my bill. Others charge per seat (<code>$99/seat</code> is getting quite popular) in addition to actual compute charges, which again seems off for me. </p>
<p>Pricing is tricky, and I totally understand that for a company that pays <code>$50K+</code> a year for an engineer, it doesn’t matter if the pay <code>$1000</code> more for the tools they use. They need to make a profit at the end of the day. Personally, I prefer usage-based pricing models, even if the unit-price is a bit higher in low volumes.</p>
<p>I thought about pricing a lot. The model I would choose if I ever built it would be two-fold:</p>
<ul>
<li>Provide usage-based pricing that starts from <code>$0</code>, but with a higher price-per-unit.</li>
<li>Provide per-seat or volume-discounted pricing for bigger companies.</li>
</ul>
<p>With the above two-fold model, anyone can start using the platform to make sure it works for them, pay for their usage along the way, and once they settle to use it they could switch to the second pricing plan. Some prefer predictability, some prefer usage-based, so why not both. </p>
<p><a href="https://basecamp.com/pricing">Basecamp</a> actually just updated their pricing to a two-fold model, just this month. As of this writing, they offer a per-user <code>$15/month</code> plan, but also a flat <code>$299/month</code> plan for unlimited use.
That’s what ideal pricing looks to me! In this case it’s not stricly usage-based, but their low per-seat price is pretty close to the above model.</p>
<p>I love how easy it is to start using and ramping up with managed pay-as-you-go products like AWS DynamoDB, S3, and even non-AWS products like the new <a href="https://www.gomomento.com/pricing">Momento Serverless Cache</a> that charges <code>$0.15/GB</code> and that’s it. One price to think; it can’t get simpler than that.</p>
<h2 id="compute"><a href="#compute">Compute</a></h2><p>This is the dimension where most of the platforms have the most restrictions, or limitations.</p>
<p>I am not going to focus on serverless functions in this article (although similar things apply), since a few things are more different when you go with a full-on serverless architecture (cold-starts, event-driven services, queues, frameworks, etc).</p>
<p>Coincidentally, exactly one year ago I wrote a <a href="https://www.lambrospetrou.com/articles/elastic-beanstalk-al2-go/">deep dive article for AWS Elastic Beanstalk</a>, explaining why I liked it, and elaborated on its feature set along with some of its nuances. That article was actually triggerred by this tweet I made voicing once again my frustration in finding my ideal platform (<a href="https://twitter.com/LambrosPetrou/status/1487493396566528007">@LambrosPetrou/status/1487493396566528007</a>).</p>
<p>There are three types of services that I want to run:</p>
<ol>
<li>The toy project that is fine to have a bit of cold-start, and doesn’t receive much traffic.</li>
<li>An application that needs to be available, to have low latency (<100ms), and potentially auto-scale (1-5 nodes).</li>
<li>An application (e.g. <a href="https://gitea.io/">Gitea</a>) that needs persistent disk access, hence a single-instance server, and which ideally shouldn’t have (long) downtime during deployments.</li>
</ol>
<p>For the first type, I always use serverless functions with AWS Lambda, or Cloudflare Workers.</p>
<p>The second and third types is where restrictions come in play.</p>
<h3 id="compute-requirements"><a href="#compute-requirements">Compute / Requirements</a></h3><ul>
<li>Managed platform: OS updates, host patching, and optionally language runtime updates</li>
<li>Single server instance should be possible (because not everything is web-scale)</li>
<li>Persistent disk volumes (i.e. Amazon EBS)</li>
<li>Deployments in-place on same instance with only 1-2s of downtime (maximum)<ul>
<li>Restriction due to the disk volume only being accessible from one host at a time</li>
</ul>
</li>
<li>Variety of instance types, from the low-cost <a href="https://aws.amazon.com/ec2/instance-types/t3/">T3</a> to the powerful ARM-based Graviton2 <a href="https://aws.amazon.com/ec2/instance-types/c6g/">C6g</a></li>
<li>Autoscaled cluster possible when needing more than one instance</li>
</ul>
<h3 id="compute-what-is-missing"><a href="#compute-what-is-missing">Compute / What is missing?</a></h3><p>My evaluation of a platform most often looks like this:</p>
<ol>
<li>I need persisted disk volumes (e.g. for SQLite), and many platforms automatically fail (e.g. Google Cloud Run).</li>
<li>I prefer a managed platform taking care of OS updates, patches, etc. In most cases, this automatically means using containers. I couldn’t find any platform outside AWS Elastic Beanstalk that does this without containers. I could also consider the various server management platforms (e.g. Linode) but even if they offer patches and updates, it’s not as convenient as a truly managed runtime, and they lack everything else too.</li>
<li>Most platforms that survive up to this step will fail the in-place deployments requirement. In-place deployments are a must when you have a persistent disk volume attached, otherwise you end up with long downtimes (stop using instance, detach volume, attach volume to new instance, start using new instance). AWS Elastic Beanstalk does this with only 1-2s downtime. Alternatively, I could use my own custom setup with AWS CodeDeploy & EC2.</li>
<li>We only reach this step when considering the autoscaled cluster application, and probably we are using containers since not many platforms support executables natively. This is OK.</li>
</ol>
<p>Summarising, when I have a Go application executable to deploy, my choices are:</p>
<ul>
<li>Single-instance: Use AWS Elastic Beanstalk, or do my own thing with AWS CodeDeploy & EC2</li>
<li>Autoscaled cluster: Use AWS Elastic Beanstalk, AWS ECS, Fly.io, or other container platform.</li>
</ul>
<h2 id="continuous-integration-amp-continuous-deployment"><a href="#continuous-integration-amp-continuous-deployment">Continuous Integration & Continuous Deployment</a></h2><p>This is a huge topic (<a href="https://www.elementsofcicd.com/">I am building a course about it</a>), and products vary from a very basic Continuous Integration (CI) feature, to complex Continuous Deployment (CD) pipelines.</p>
<h3 id="cicd-requirements"><a href="#cicd-requirements">CICD / Requirements</a></h3><ul>
<li>Support for multiple AWS accounts (if the platform deploys in my own accounts)</li>
<li>Support for multiple regions, and multiple stages (staging, production, waves)</li>
<li>Support for custom build steps, and potentially per region when necessary</li>
<li>Support for different deployment targets (see the <a href="#compute">#Compute section</a> above)</li>
<li>Support for automatic approval workflows (e.g. acceptance tests)</li>
<li>Support for manual approval between stages (because not everything can be automated)</li>
<li>Support for automatic rollback using some monitor/alert/metric/workflow</li>
<li>Track commits progressing across the pipeline</li>
<li>Logs and metrics per build/deployment/job</li>
</ul>
<h3 id="cicd-what-is-missing"><a href="#cicd-what-is-missing">CICD / What is missing?</a></h3><p>Everything? 😅 Jokes aside, I believe that working with the internal Amazon Pipelines tool for almost 5 years spoiled me.</p>
<p>There is not much public information on the tool, and no, <a href="https://aws.amazon.com/codepipeline/">AWS CodePipeline</a> is not similar, it doesn’t even come close. However, the following two articles (personal favourites) from the Amazon Builders’ Library give a very detailed explanation for the core aspects of how deployments happen inside Amazon/AWS.</p>
<ul>
<li><a href="https://aws.amazon.com/builders-library/automating-safe-hands-off-deployments/">Automating safe, hands-off deployments</a></li>
<li><a href="https://aws.amazon.com/builders-library/cicd-pipeline/">My CI/CD pipeline is my release captain</a></li>
</ul>
<p>At a first glance, the list of requirements above might seem a lot. I would argue that all the above should be part of every CICD pipeline, and in my experience, all the teams end up implementing something close to that on their own after a few months of working on a project. So, why not have it built-in to the platform. Almost all good pipelines at Amazon “looked the same”, albeit with their own steps, approval workflows, and monitors to track when to roll back.</p>
<p>Most cloud platforms do not support such complex pipelines at all. Usually, they only provide single stage build/deploy workflows off a branch, and they offload the CICD needs to external dedicated products.</p>
<p>The majority of projects I see default in using Github Actions. It’s indeed a really nice product, but it has many limitations, and requires you to buy into the more expensive plans to enable some core features. I believe that a major part of its success is mostly due to the network effects of being free and readily available for anyone on Github.</p>
<p><a href="https://docs.gitlab.com/ee/ci/">Gitlab CI</a> and <a href="https://buildkite.com">Buildkite</a> are among the best products I found, and I explored more than 20 of them. They are also the only ones supporting dynamically generated pipelines (<a href="https://docs.gitlab.com/ee/ci/pipelines/downstream_pipelines.html">in Gitlab</a>, <a href="https://buildkite.com/docs/pipelines/defining-steps#dynamic-pipelines">in Buildkite</a>). This is very useful when you have a fixed initial pipeline, but then some configuration in your code, or the output of a step, determines the next steps in the pipeline.
This is not a common feature, but it’s really useful for a project I am working on that automatically generates the infrastructure and deployment stages based on some configuration during the workflow runtime.</p>
<p>However, both have their own issues, high pricing for Gitlab, and managing the server-fleet for Buildkite. The pricing for CICD products also varies a lot, from per seat pricing, to usage-based pricing, from affordable, to crazy expensive.</p>
<h2 id="flexibility-and-extensibility"><a href="#flexibility-and-extensibility">Flexibility and extensibility</a></h2><p>This is very generic and I don’t have a list of requirements. However, it’s necessary for a platform to offer extensibility hooks to customise the feature set provided.</p>
<p>On the one hand, platforms like AWS Elastic Beanstalk <a href="https://www.lambrospetrou.com/articles/elastic-beanstalk-al2-go/#platform-hooks">provide very powerful hooks to override almost everything</a>, and on the other hand, platforms like <a href="https://cloud.google.com/appengine/docs/the-appengine-environments#compare_high-level_features">Google App Engine standard environments</a> are very restricted and opinionated for anything you do.</p>
<p>Some things I want to able to customize that not all platforms support:</p>
<ul>
<li>Compute resources (CPU, RAM)</li>
<li>Deployment strategy (in-place, green/blue, canary)</li>
<li>Autoscaling strategy (request-based, utilisation-based)</li>
<li>Filesystem and storage options (disks, nfs)</li>
</ul>
<p>In a future ideal world, we shouldn’t need to specify CPU/RAM requirements either. Just run the application, and as requests come in it will get the resources it needs, and the platform will be able to absorb and distribute the load beautifully.</p>
<h2 id="good-citizen"><a href="#good-citizen">Good citizen</a></h2><p>Another generic, and even more vague aspect I am looking for when evaluating a platform is if the platform is a good citizen within the ecosystem.</p>
<p>For example, if the platform comes with its own CDN, great. But can I proxy my deployed application with my own CDN if needed? Some Anycast platforms have issues in some setups.</p>
<p>Can I plug my own monitoring and observability product (e.g. Datadog) easily, or am I stuck with its own monitoring stack, if there is any at all.</p>
<p>Are there any hooks I can run generic commands/scripts for adhoc jobs I need during deployments, like provisioning external resources through Terraform? This would be a solved problem assuming that there is some kind of CICD workflow, but it’s something to watch for.</p>
<p>As I see it, the platform should provide good enough defaults to start with, whilst giving me hooks to customize my experience.</p>
<h2 id="developer-experience-dx-or-devx"><a href="#developer-experience-dx-or-devx">Developer experience (DX or DevX)</a></h2><p>Developer experience is essentially how a platform feels like when using it, after considering every aforementioned dimension.</p>
<p>Is the platform intrusive to my application code? For example, do I need to change my business logic to be able to run on that platform? Examples here include App Engine, and most serverless function providers in some degree.</p>
<p>Can I accomplish my day-to-day tasks without wasting time reading badly written documentation, deciphering cryptic error messages, or fighting with broken CLIs and websites? Is it easy to troubleshoot and investigate issues with the infrastructure if and when they happen?</p>
<p>What are the abstractions of the platform? Do I need to configure every single piece of infrastructure similar to using AWS directly (e.g. VPC, subnets, autoscaling groups, scaling policies, IAM roles, IAM policies), or does it provide useful high-level abstractions with sane defaults. How much effort do I need to put to get my code deployed?</p>
<p>Do I really care that the platform is built on-top of Kubernetes (K8s)? <a href="https://twitter.com/LambrosPetrou/status/1614458082482065408">No, I do not</a>! </p>
<p>As a positive example, <a href="https://fly.io/docs/reference/private-networking/#fly-internal-addresses">Fly.io with its built-in private networking</a> which acts as advanced service discovery is really great.</p>
<p>Preview environments per pull-request, or easy to create environments per engineer, are crucial to have for faster feedback loop during development. This trend was commodized by the frontend platforms in the past few years, so now more platforms are supporting them.</p>
<p>This is just the tip of the iceberg of the things that make up a nice developer experience.</p>
<p>I have faith that the newer serverless products, alongside the managed AI platforms, will drive us towards a better future for traditional backend stacks too.</p>
<p>One of the most promising platforms I found was <a href="http://modal.com/">Modal</a>, even though I usually dislike Infrastructure from Code tools. Modal combines amazing DX, great usage-based pricing, and they seemelessly disappear in your code. Unfortunately, they only support Python right now, but <a href="https://twitter.com/LambrosPetrou/status/1614228730154450944">I am hoping they expand to more languages soon</a>.</p>
<h2 id="my-complaint-to-aws"><a href="#my-complaint-to-aws">My complaint to AWS</a></h2><p>Amazon and AWS have many issues at the company level, I can talk for months about them, but somehow AWS offers some of the most reliable and highly-available infrastructure in the industry.</p>
<p>If we focus on core products like EC2, S3, Lambda, DynamoDB, Route53, SQS, and a few others, you can pretty much build anything for insane scale. If only they fixed the developer experience of the platform.</p>
<blockquote>
<p>They provide all the little pieces of the puzzle, without a preview of the final puzzle!</p>
</blockquote>
<p>AWS Elastic Beanstalk was my favourite deployment platform a few years ago.
Unfortunately, AWS abandoned it, and it shows. It was late to get Spot instances support, late to get Arm Graviton instances support, and there was hardly any major improvement of the product since its upgrade to the Amazon Linux 2 platform (back in 2020).</p>
<p>It combined everything, single instance, containers, managed autoscaled clusters, good default monitoring dashboard, simple CICD, and an amazingly extensible platform.</p>
<p>My guess is that the original internal team gradually moved on to other projects, or companies, and then the whole product entered maintenance mode. It’s not the first time this would have happened…</p>
<p>AWS wasted millions of dollars in numerous attempts to bring some of that Heroku experience in AWS, by building new services, instead of fixing, improving, and promoting Elastic Beanstalk.
Services like AWS CodeStar (2017), AWS CodeCatalyst (2022), and even the promising AWS App Runner (2021), are all attempts to provide a better PaaS natively by AWS.</p>
<p>They are all going to fail. It’s obvious, to me at least, that most attempts after Beanstalk are mostly CV and promotion oriented projects. Who wants to get credit for improving an existing service, when they can advertise on LinkedIn that they launched a whole new service, and parade with announcement blog posts. Right? A new director can lead the launch of a new service, and grow a team from 3-4 people to 30 just within a year. That’s how directors get promotions. Been there, saw that, left after that!</p>
<p>Sometimes, starting with a clean sheet is the right choice, but in the case of Elastic Beanstalk, I believe that pushing it to the side until it rots, was the wrong decision.</p>
<p>Oh well, what do I know, I am neither Jeff Bezos, nor Andy Jassy 🤐</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>Okay, my ranting is almost over.</p>
<p>I probably looked into more than 50 products (<a href="https://gist.github.com/lambrospetrou/88ea9592e44ca6decb3f3fea04859eca">see list of products/tools/platforms</a>), open source self hosted platforms (e.g. <a href="https://dokku.com/">Dokku</a>, <a href="https://caprover.com/">CapRover</a>), platforms that deploy into my own AWS accounts (e.g. <a href="https://www.qovery.com/pricing">Qovery</a>, <a href="https://www.flightcontrol.dev/pricing">Flightcontrol</a>, <a href="https://withcoherence.com/pricing">WithCoherence</a>, <a href="https://stacktape.com/#:~:text=Open%20source-,Pricing,-For%20individuals">Stacktape</a>), managed platforms (e.g. <a href="https://fly.io/pricing">Fly.io</a>, <a href="https://render.com/pricing">Render</a>, <a href="https://railway.app/pricing">Railway</a>), and even the legendary <a href="https://www.heroku.com/dx">Heroku</a>.</p>
<p>All of them fail in some of the above core requirements. If it’s not pricing, it’s features, and if it’s not features, it’s developer experience.</p>
<p>I currently settled on the following mix:</p>
<ul>
<li>Cloudflare Pages for websites.</li>
<li>AWS Lambda for anything that can tolerate cold-starts and <code>>100ms</code> latencies.</li>
<li>Fly.io for my autoscaled applications that work without in-place deployments.</li>
<li>AWS Elastic Beanstalk or custom EC2 setup for my single-instance use-cases.</li>
<li>AWS SAM and Terraform for Infrastructure as Code.</li>
<li>Combination of AWS CodeBuild, Github Actions, and experimenting with Buildkite, for CICD.</li>
</ul>
<p>A simple decision tree for the platform I use depending on my needs:</p>
<p></p>
<p>I basically have to setup the same things for every project I am working on, no matter how small or big. Infrastructure as Code templates, a CICD pipeline in a separate tool (when I need more than one stage) that needs access to the repo and my cloud account credentials, moving shared stuff into separate packages or copy-pasting them every time across repositories, etc.</p>
<p>Why not move all this boilerplate into a platform. Is it only my boilerplate?</p>
<h3 id="the-question-remains"><a href="#the-question-remains">The question remains</a></h3><blockquote>
<p><strong><em>The Perfect PaaS: Does it exist? Can it be built? Or is it impossible?</em></strong></p>
</blockquote>
<p>The platform I envision. The platform I want.</p>

</article>
<article>
<h1>Hell Yeah or No - what's worth doing | Highlights</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sat, 12 Nov 2022 00:00:00 GMT</p>
<p>This article contains my favourite highlights from the <a href="https://sive.rs/n"><strong>Hell Yeah or No</strong></a> book from Derek Sivers.</p>
<p>I was tweeting these highlights while reading the book, but I wanted to persist them on my blog as well. Check the original Twitter thread at <a href="https://twitter.com/LambrosPetrou/status/1585311232940445701">@lambrospetrou/status/1585311232940445701</a>.</p>
<p>I strongly recommend reading the whole book. It’s very short, with each chapter just being a couple of paragraphs, but immensely satisfying.</p>
<hr/>
<h2 id="4-actions-not-words-reveal-our-real-values"><a href="#4-actions-not-words-reveal-our-real-values">4. Actions, not words, reveal our real values</a></h2><blockquote>
<p>No matter what you tell the world or tell yourself, your actions reveal your real values. <strong>Your actions show you what you actually want.</strong></p>
</blockquote>
<ul>
<li><a href="https://sive.rs/arv">https://sive.rs/arv</a></li>
</ul>
<h2 id="6-why-are-you-doing"><a href="#6-why-are-you-doing">6. Why are you doing?</a></h2><blockquote>
<p>It’s crucial to know why you’re doing what you’re doing. Most people don’t know. They just go with the flow.</p>
<p><strong>Whatever you decide, you need to optimize for that goal, and be willing to let go of the others.</strong></p>
</blockquote>
<ul>
<li><a href="https://sive.rs/why">https://sive.rs/why</a></li>
</ul>
<h2 id="8-imitate-we-are-imperfect-mirrors"><a href="#8-imitate-we-are-imperfect-mirrors">8. Imitate. We are imperfect mirrors.</a></h2><blockquote>
<p><strong>Like a funhouse mirror that distorts what it reflects, your imitation will turn out much different from the original.</strong> Maybe even better.</p>
<p>So look around at those existing ideas in the world.</p>
</blockquote>
<ul>
<li><a href="https://sive.rs/mirror">https://sive.rs/mirror</a></li>
</ul>
<h2 id="11-character-predicts-your-future"><a href="#11-character-predicts-your-future">11. Character predicts your future</a></h2><blockquote>
<p><strong>How you do anything is how you do everything. It all matters.</strong></p>
<p>Your actions are completely under your control, and seem to be the best indicator of future success.</p>
</blockquote>
<ul>
<li><a href="https://sive.rs/character">https://sive.rs/character</a></li>
</ul>
<h2 id="14-small-actions-change-your-self-identity"><a href="#14-small-actions-change-your-self-identity">14. Small actions change your self-identity</a></h2><blockquote>
<p><strong>Your actions show the world who you are.</strong> You won’t act differently until you think of yourself differently. So start by taking one small action that will change your self-identity.</p>
</blockquote>
<ul>
<li><a href="https://sive.rs/actid">https://sive.rs/actid</a></li>
</ul>
<h2 id="15-if-you-re-not-feeling-hell-yeah-then-say-no"><a href="#15-if-you-re-not-feeling-hell-yeah-then-say-no">15. If you’re not feeling “hell yeah!” then say no</a></h2><blockquote>
<p><strong>Say no to almost everything.</strong> This starts to free your time and mind.</p>
<p>Then, when you find something you’re actually excited about, you’ll have the space in your life to give it your full attention. You’ll be able to take massive action, in a way that most people can’t, because you cleared away your clutter in advance. <strong>Saying no makes your yes more powerful.</strong></p>
<p>Refuse almost everything. Do almost nothing. But the things you do, do them all the way.</p>
</blockquote>
<ul>
<li><a href="https://sive.rs/hyn">https://sive.rs/hyn</a></li>
</ul>
<p><em>This is really one the top highlights!</em></p>
<h2 id="19-tilting-my-mirror-motivation-is-delicate"><a href="#19-tilting-my-mirror-motivation-is-delicate">19. Tilting my mirror (motivation is delicate)</a></h2><blockquote>
<p>When you notice your motivation fading, you have to seek out the subtle cause. <strong>A simple tweak can make all the difference between achieving something or not.</strong></p>
</blockquote>
<ul>
<li><a href="https://sive.rs/tilt">https://sive.rs/tilt</a></li>
</ul>
<h2 id="24-there-s-no-speed-limit"><a href="#24-there-s-no-speed-limit">24. There’s no speed limit</a></h2><blockquote>
<p>[…] <strong>“the standard pace is for chumps”</strong> — that the system is designed so anyone can keep up. If you’re more driven than most people, you can do way more than anyone expects.</p>
</blockquote>
<ul>
<li><a href="https://sive.rs/kimo">https://sive.rs/kimo</a></li>
</ul>
<h2 id="34-switch-strategies"><a href="#34-switch-strategies">34. Switch strategies</a></h2><blockquote>
<p><strong>Early in your career, the best strategy is to say yes to everything.</strong> The more things you try, and the more people you meet, the better. Each one might lead to your lucky break.</p>
<p>Then when something is extra-rewarding, it’s time to switch strategies. <strong>Focus all of your energy on this one thing.</strong> Don’t be leisurely. Strike while it’s hot. Be a freak. <strong>Give it everything you’ve got.</strong></p>
</blockquote>
<ul>
<li><a href="https://sive.rs/switch">https://sive.rs/switch</a></li>
</ul>
<h2 id="35-don-t-be-a-donkey"><a href="#35-don-t-be-a-donkey">35. Don’t be a donkey</a></h2><blockquote>
<p><strong>The solution is to think long term.</strong> Do just one thing for a few years, then another for a few years, then another.</p>
<p><strong>Most people overestimate what they can do in one year, and underestimate what they can do in ten years.</strong></p>
<p>Think long term. Use the future. Don’t be short sighted. Don’t be a donkey.</p>
</blockquote>
<ul>
<li><a href="https://sive.rs/donkey">https://sive.rs/donkey</a></li>
</ul>
<h3 id="50-don-t-start-a-business-until-people-are-asking-you-to"><a href="#50-don-t-start-a-business-until-people-are-asking-you-to">50. Don’t start a business until people are asking you to</a></h3><blockquote>
<p>Most have an idea but no customers. For them I always say, “Don’t start a business until people are asking you to.”</p>
<p>Don’t announce anything. Don’t choose a name. Don’t make a website or an app. Don’t build a system. <strong>You need to be free to completely change or ditch your idea.</strong></p>
<p>Then you get your first paying customer. Provide a one-on-one personal service. Then you get another paying customer. <strong>Prove a real demand.</strong></p>
<p>Then, as late as possible, you officially start your business.</p>
</blockquote>
<ul>
<li><a href="https://sive.rs/asking">https://sive.rs/asking</a></li>
</ul>
<h2 id="66-if-you-think-you-haven-t-found-your-passion"><a href="#66-if-you-think-you-haven-t-found-your-passion">66. If you think you haven’t found your passion…</a></h2><blockquote>
<p><strong>It’s dangerous to think in terms of “passion” and “purpose” because they sound like such huge overwhelming things.</strong></p>
<p>[…] <strong>just notice what excites you and what scares you on a small moment-to-moment level.</strong></p>
<p>If you keep thinking about doing something big, and you find that the idea both terrifies and intrigues you, it’s probably a worthy endeavor for you.</p>
<p>You grow by doing what excites you and what scares you.</p>
</blockquote>
<ul>
<li><a href="https://sive.rs/passion">https://sive.rs/passion</a></li>
</ul>
<h2 id="67-whatever-scares-you-go-do-it"><a href="#67-whatever-scares-you-go-do-it">67. Whatever scares you, go do it</a></h2><blockquote>
<p>Fear is just a form of excitement, and you know you should do what excites you.</p>
<p>Best of all, <strong>once you do something that scared you, you’re not scared of it anymore!</strong> As you go through life, doing everything that scares you, you fear less and less in the world.</p>
<p>Life is an ongoing process of choosing between safety (out of fear and need for defense) and risk (for the sake of progress and growth). Make the growth choice a dozen times a day.</p>
</blockquote>
<ul>
<li><a href="https://sive.rs/scares">https://sive.rs/scares</a></li>
</ul>

</article>
<article>
<h1>AWS Lambda + Upstash Redis + Go = 🚀❤️</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 16 Oct 2022 00:00:00 GMT</p>
<p>This article was originally posted as a <a href="https://upstash.com/blog/aws-lambda-go-redis">guest post on Upstash Redis blog</a>. I am cross-posting it here for my records.</p>
<hr/>
<h2 id="intro"><a href="#intro">Intro</a></h2><p>Serverless compute platforms are awesome, but without serverless databases they are too limited.</p>
<p>While I was building the platform for my upcoming course, <a href="https://www.elementsofcicd.com?utm_source=personal_blog">The Elements of CI/CD</a>, I wanted a serverless database since I decided to use <a href="https://aws.amazon.com/lambda/">AWS Lambda</a> as my server for certain things. The requirements I had were:</p>
<ol>
<li><strong>Pay as you go pricing</strong>. I don’t want to pay per hour or per node, but by usage (requests, storage, etc). It should be very cheap to start using it and as usage ramps up the cost would grow proportionally.</li>
<li><strong>Low latency</strong>. Nobody likes slow responses, so querying the database should be fast from inside AWS regions (e.g. <code>eu-west-1</code>).</li>
<li><strong>Great developer experience (DevX)</strong>. Having a nice interface for the database without having to learn yet another niche DSL, or wasting hours fiddling with a website is preferred.</li>
</ol>
<p><a href="https://upstash.com/redis">Upstash Redis</a> satisfies all of the above requirements, and it does a great job at it.</p>
<ul>
<li>Pay as you go?<ul>
<li>✅ Really affordable to start, and also at scale!</li>
</ul>
</li>
<li>Low latency?<ul>
<li>✅ <1ms latency from querying from inside AWS Lambda!</li>
</ul>
</li>
<li>Great DevX?<ul>
<li>✅ It’s standard Redis. So, yeap.</li>
</ul>
</li>
</ul>
<p>In this article we are going to see how to use Upstash Redis from inside AWS Lambda, ensure it is fast enough for our needs, whilst at the same time keeping our code maintainable in order to be able to test locally or deploy to a different platform if needed.</p>
<h2 id="what-are-we-implementing"><a href="#what-are-we-implementing">What are we implementing?</a></h2><p>For simplicity, we are going to implement just 3 API endpoints:</p>
<ol>
<li>The <code>GET|POST /login</code> endpoint which accepts a <code>userId</code> as a query parameter in a <code>GET</code>, or inside a form value submitted with a <code>POST</code> request. This endpoint will generate a session ID, store it in Redis, and also set a cookie for subsequent visits. The <code>GET</code> just makes it easier to test.🙃</li>
<li>The <code>GET /lessons/completed</code> endpoint requires logged in users (i.e. having the cookie with the session ID) and returns a JSON response with all the lessons the user completed and when.</li>
<li>The <code>POST /lessons/{lessonSlug}/mark-complete</code> endpoint requires logged in users (i.e. having the cookie with the session ID) and marks the lesson denoted by <code>lessonSlug</code> as completed with the current time.</li>
</ol>
<p><em>Note: In the code below there are a few things missing, thus this is not production-ready copy-pasteable code. For example, we should check that the given <code>lessonSlug</code> exists before updating it. The login endpoint should also accept a password and do proper salted/hashed verification before creating session IDs, etc.</em></p>
<h2 id="1-setup"><a href="#1-setup">1. Setup</a></h2><ul>
<li>The complete code detailed below also exists in my <a href="https://github.com/lambrospetrou/aws-playground/tree/master/aws-lambda-upstash-redis-article"><code>aws-playground</code> repository</a> if you want to see how everything fits together.</li>
</ul>
<p>As you will see below, we are creating two entrypoints, i.e. two executable commands. One will be for a normal local server, and one will be for AWS Lambda. This way, we will be able to test our whole logic locally, and with standard unit/integration tests if we wanted.</p>
<p>The only differences between them is shown in sections 1.2 and 1.3 below.</p>
<h3 id="1-1-workspace"><a href="#1-1-workspace">1.1 Workspace</a></h3><p>Before we dive into the code, let’s setup our working directory for Go.</p>
<ol>
<li><a href="https://go.dev/doc/install">Download Go</a>.</li>
<li><a href="https://docs.upstash.com/redis#create-account">Create an Upstash Redis account and database</a> in a region of your choosing. Ideally it should be the same AWS region you will deploy your Lambda. I will be using <code>eu-west-1</code> (Europe, Ireland) in this article.</li>
</ol>
<p></p>
<p>After completing the above, we can now create our workspace. For the rest of the article, assume our code is under <code>~/dev/aws-lambda-upstash-redis</code>.</p>
<pre><code class="language-bash">mkdir -p ~/dev/aws-lambda-upstash-redis
cd ~/dev/aws-lambda-upstash-redis
</code></pre>
<p>Then, create a Go package:</p>
<pre><code class="language-bash">go mod init com.upstash/example/aws-lambda-upstash-redis
</code></pre>
<h3 id="1-2-local-server-entrypoint"><a href="#1-2-local-server-entrypoint">1.2 Local server entrypoint</a></h3><ul>
<li>Paste the following code in <code>~/dev/aws-lambda-upstash-redis/cmd/server/main.go</code>.</li>
</ul>
<pre><code class="language-go">package main

import (
    "log"
    "net/http"
    "os"

    "com.upstash/example/aws-lambda-upstash-redis/core"
)

func main() {
    mux := core.NewMux()
    port := os.Getenv("PORT")
    if len(port) == 0 {
        port = "5000"
    }
    if err := http.ListenAndServe(":"+port, mux); err != nil {
        log.Fatal(err)
    }
}
</code></pre>
<h3 id="1-3-aws-lambda-entrypoint"><a href="#1-3-aws-lambda-entrypoint">1.3 AWS Lambda entrypoint</a></h3><ul>
<li>Paste the following code in <code>~/dev/aws-lambda-upstash-redis/cmd/lambda/main.go</code>.</li>
</ul>
<pre><code class="language-go">package main

import (
    "com.upstash/example/aws-lambda-upstash-redis/core"
    "github.com/aws/aws-lambda-go/lambda"
    "github.com/awslabs/aws-lambda-go-api-proxy/httpadapter"
)

func main() {
    mux := core.NewMux()
    lambda.Start(httpadapter.NewV2(mux).ProxyWithContext)
}
</code></pre>
<h3 id="1-4-core-logic"><a href="#1-4-core-logic">1.4 Core logic</a></h3><p>Our main core logic will go into the <code>core</code> package to be shared by both entry points above.</p>
<ul>
<li>Paste the following code in <code>~/dev/aws-lambda-upstash-redis/core/lib.go</code>.</li>
</ul>
<pre><code class="language-go">package core

import (
    "github.com/go-chi/chi/v5"
)

func NewMux() *chi.Mux {
    r := chi.NewRouter()
    return r
}
</code></pre>
<h3 id="1-5-building-compiling"><a href="#1-5-building-compiling">1.5 Building / Compiling</a></h3><p>I usually write a small <code>makefile</code> to avoid typing long commands every time I want to compile so copy the following into <code>~/dev/aws-lambda-upstash-redis/makefile</code>:</p>
<pre><code class="language-makefile">default: build

clean:
    rm -rf build/

build: build-lambda build-server

build-lambda: clean
    GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o build/handler cmd/lambda/main.go
    cd build/ && zip handler.zip ./handler

build-server: clean
    CGO_ENABLED=0 go build -o build/server cmd/server/main.go
</code></pre>
<p>Don’t worry too much for now about the details but this allows us to do:</p>
<ul>
<li><code>make build-server</code>: Builds the binary for running the server locally (executable <code>./build/server</code>).</li>
<li><code>make build-lambda</code>: Builds the binary for running the server on AWS Lambda (executable <code>./build/handler</code> and <code>./build/handler.zip</code>).</li>
<li><code>make</code> or <code>make build</code> does both.</li>
</ul>
<p>The <code>CGO_ENABLED=0</code> option makes sure our executable binaries are self-contained (i.e. statically compiled). The <code>GOOS=linux GOARCH=amd64</code> options are needed to cross-compile and match the linux environment of AWS Lambda in case you are using a Mac or Windows system locally.</p>
<p>Next, run <code>go mod tidy</code> to fetch all the code dependencies. Remember to run this every time you add or remove Go dependencies.</p>
<p>Finally, run <code>make</code> once, in order to build everything and make sure your workspace is setup, before we go deeper into the code.</p>
<h2 id="2-api-implementation"><a href="#2-api-implementation">2. API implementation</a></h2><p><em>For this section we will always be working inside the <code>~/dev/aws-lambda-upstash-redis/core/lib.go</code> file.</em></p>
<p>The following few lines define the API endpoints we discussed earlier, using the amazing <a href="https://github.com/go-chi/chi/"><code>go-chi</code></a> library.</p>
<pre><code class="language-go">import (
    //...
    "github.com/go-chi/chi/v5"
    "github.com/go-chi/chi/v5/middleware"
)

func NewMux() *chi.Mux {
    r := chi.NewRouter()

    r.Use(middleware.RequestID)
    r.Use(middleware.Logger)
    r.Use(middleware.Recoverer)

    r.Get("/login", login)
    r.Post("/login", login)
    r.Group(func(r chi.Router) {
        r.Use(UsersWithSessionOnly)
        r.Get("/lessons/completed", listLessonsCompleted)
        r.Post("/lessons/{lessonSlug}/mark-complete", markLessonComplete)
    })

    return r
}
</code></pre>
<p>In the snippet above, <code>r.Group(...)</code> creates a shared layer where we can apply common middleware for any route defined inside of it. In this case, we add our own middleware <code>UsersWithSessionOnly</code>, which as we will see later guarantees that the request contains the cookie with an active session ID.</p>
<h3 id="2-1-userswithsessiononly-middleware"><a href="#2-1-userswithsessiononly-middleware">2.1 UsersWithSessionOnly middleware</a></h3><p>In this middleware we want to implement the following:</p>
<ol>
<li>Extract the cookie that contains the session ID, and fail otherwise.</li>
<li>Query Redis to fetch the user details based on the session ID, and fail if the session ID provided is not active.</li>
<li>Store the user ID in the request’s <code>context.Context</code> in order to make it available to downstream middleware or handlers.</li>
</ol>
<p>First, we need some boilerplate code for some imports and definitions that are used everywhere.</p>
<pre><code class="language-go">import (
    //...
    "log"
    "os"
    "strings"

    "github.com/go-redis/redis/v8"
)

type contextKey struct {
    name string
}
const (
    COOKIE_AUTH_NAME = "xxx_session_id"
)
var (
    CTX_USER_ID = &contextKey{"LoggedInUserId"}
    redisDb     = NewClient()
)

func NewClient() *redis.Client {
    redisUrl := strings.TrimSpace(os.Getenv("UPSTASH_REDIS_URL"))
    if redisUrl == "" {
        log.Fatalln("Required env UPSTASH_REDIS_URL not set!")
    }
    opt, _ := redis.ParseURL(redisUrl)
    redisDb := redis.NewClient(opt)

    return redisDb
}
</code></pre>
<p>And now the main logic for the authentication middleware.</p>
<pre><code class="language-go">// UsersWithSessionOnly middleware restricts access to just logged-in users.
// If validation passes, then the context will contain the user id (CTX_USER_ID).
func UsersWithSessionOnly(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        c, err := r.Cookie(COOKIE_AUTH_NAME)
        if err != nil {
            render.Status(r, http.StatusForbidden)
            render.JSON(w, r, struct{}{})
            return
        }

        ctx := r.Context()
        userId, err := redisDb.Get(ctx, "session:"+c.Value).Result()
        if err == redis.Nil {
            // If session is not found then user is forbidden from accessing the API!
            render.Status(r, http.StatusForbidden)
            render.JSON(w, r, struct{}{})
            return
        } else if err != nil {
            // Something went wrong querying Redis!
            render.Status(r, http.StatusInternalServerError)
            render.JSON(w, r, struct{ Message string }{Message: "We could not validate the provided session ID"})
            return
        }
        // Set it for downstream middleware and handlers.
        next.ServeHTTP(w, r.WithContext(context.WithValue(ctx, CTX_USER_ID, userId)))
    })
}
</code></pre>
<h3 id="2-2-marklessoncomplete"><a href="#2-2-marklessoncomplete">2.2 markLessonComplete(…)</a></h3><p>This is a straightforward operation where we want to store in Redis that the lesson denoted by the <code>lessonSlug</code> path parameter is completed at the current time of the request.</p>
<p>In Redis we want to keep a map for each user where each key-value pair in the map will be the lesson as key, and the completion date as value. Therefore, we use the <a href="https://redis.io/commands/hset/"><code>HSET</code> Redis command</a>. We could also store a separate key per lesson but this makes it easier to fetch all the lessons for a user at once later.</p>
<pre><code class="language-go">func markLessonComplete(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    lessonSlug := chi.URLParam(r, "lessonSlug")
    userId := r.Context().Value(CTX_USER_ID).(string)
    timeNow := time.Now().Format(time.RFC3339)

    err := redisDb.HSet(ctx, "lessons:"+userId, lessonSlug, timeNow).Err()
    if err != nil {
        render.Status(r, http.StatusInternalServerError)
        render.JSON(w, r, struct{ Message string }{Message: "We could not save your progression..."})
        return
    }

    render.JSON(w, r, struct {
        LessonSlug    string
        LastCompleted string
    }{
        lessonSlug,
        timeNow,
    })
}
</code></pre>
<h3 id="2-3-listlessonscompleted"><a href="#2-3-listlessonscompleted">2.3 listLessonsCompleted(…)</a></h3><p>In similar fashion as the previous section, here we just want to return the whole map of lessons completion and return it to the user in a JSON response. We use the <a href="https://redis.io/commands/hgetall/"><code>HGETALL</code> command</a> for this.</p>
<pre><code class="language-go">func listLessonsCompleted(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    userId := r.Context().Value(CTX_USER_ID).(string)

    lessons, err := redisDb.HGetAll(ctx, "lessons:"+userId).Result()
    if err == redis.Nil {
        lessons = map[string]string{}
    } else if err != nil {
        render.Status(r, http.StatusInternalServerError)
        render.JSON(w, r, struct{ Message string }{Message: "We could not load your lessons..."})
        return
    }

    render.JSON(w, r, struct {
        Lessons map[string]string
    }{
        lessons,
    })
}
</code></pre>
<h3 id="2-4-login"><a href="#2-4-login">2.4 login(…)</a></h3><p>Finally, the login endpoint. Once again, please do not copy the following code into production, since it’s not doing any kind of validation. For the purposes of this article we are only interested in how it queries Redis and how it sets the cookie for the session ID.</p>
<p>The session ID is generated by the <a href="https://github.com/segmentio/ksuid"><code>ksuid</code></a> library, which has a few advantages over normal UUIDs, and we only consider it active for 1 hour. We use the time-to-live functionality of the <a href="https://redis.io/commands/set/">Redis <code>SET</code> command</a> for automatic removal from the database after one hour.</p>
<pre><code class="language-go">import (
    // ...
  "github.com/segmentio/ksuid"
)

func login(w http.ResponseWriter, r *http.Request) {
    // Check credentials and update redis session and return Set-Cookie
    // WARNING: You should do an actual validation in production for credentials!
    // ...
    // For now we always assume correctness and automatically create a session token
    // by saving it to Redis, and also setting it as a cookie.
    userId := strings.TrimSpace(r.FormValue("userId"))
    if userId == "" {
        render.Status(r, http.StatusBadRequest)
        render.JSON(w, r, struct{ Message string }{Message: "Missing required userId"})
        return
    }

    sessionId := ksuid.New()
    redisDb.Set(r.Context(), "session:"+sessionId.String(), userId, time.Hour*1)
    http.SetCookie(w, &http.Cookie{
        Name: COOKIE_AUTH_NAME, Value: sessionId.String(),
        Path: "/", MaxAge: int((time.Hour * 1).Seconds()),
        // This should be true when deploying in production (https), but locally we need it false (http).
        Secure: false,
    })

    http.Redirect(w, r, "/lessons/completed", http.StatusTemporaryRedirect)
}
</code></pre>
<h2 id="3-demo-locally"><a href="#3-demo-locally">3. Demo - Locally</a></h2><p>Phew, that was a lot of code.😅</p>
<p>Let’s do a quick demo to make sure everything works as expected.</p>
<ul>
<li>First, set the <code>UPSTASH_REDIS_URL</code> environment variable to the URL of the database you created in section 1 above. You can find it in the <em>details</em> tab of your database’s page (see section 1.1 above).</li>
</ul>
<pre><code class="language-bash">export UPSTASH_REDIS_URL="<your-url-here>"
</code></pre>
<ul>
<li>Then, build and run the local server:</li>
</ul>
<pre><code class="language-bash">make build-server && ./build/server
</code></pre>
<h3 id="browser-testing"><a href="#browser-testing">Browser testing</a></h3><p>Now let’s do some testing in the browser by visiting <a href="http://localhost:5000/lessons/completed">http://localhost:5000/lessons/completed</a>.</p>
<p></p>
<p>We get <code>403 - Forbidden</code>, so let’s login, by visiting <a href="http://localhost:5000/login?userId=lambros">http://localhost:5000/login?userId=lambros</a>.</p>
<p></p>
<p>We are logged in now, and we automatically got redirected to <code>/lessons/completed</code>, but they are empty. So, let’s mark a lesson as completed. In the <code>console</code> tab inside the devtools of your browser run the following:</p>
<pre><code class="language-javascript">await (
  await fetch("http://localhost:5000/lessons/123/mark-complete", {
    method: "POST",
    credentials: "same-origin",
  })
).json();

// Should output something like:
// {LessonSlug: '123', LastCompleted: '2022-10-12T02:01:14+03:00'}
</code></pre>
<p>Visiting <a href="http://localhost:5000/lessons/completed">http://localhost:5000/lessons/completed</a> should show this lesson as marked now:</p>
<pre><code class="language-json">{ "Lessons": { "123": "2022-10-12T02:01:14+03:00" } }
</code></pre>
<p>Et voila. Everything works fine!</p>
<p>Looking into the Redis database itself using the recently launched online Data Browser also proves that the expected data is there.</p>
<p></p>
<h2 id="4-aws-lambda"><a href="#4-aws-lambda">4. AWS Lambda</a></h2><p>In order to test and deploy to AWS Lambda we are going to use the <code>sam</code> cli.</p>
<ul>
<li><p>First, setup the <a href="https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html">SAM cli</a> and make sure your user/role has the <a href="https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-permissions.html">right permissions</a>.</p>
</li>
<li><p>The <code>sam</code> cli needs a Cloudformation template to work, so copy the following into <code>aws-iac/sam-template.yml</code>:</p>
</li>
</ul>
<pre><code class="language-yaml">AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31
Description: Defines all the AWS resources we need for our Upstash Redis API.

Resources:
  # https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-resource-function.html
  GoUpstashRedis:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: ../build/handler.zip
      Handler: handler
      Runtime: go1.x
      MemorySize: 512
      FunctionUrlConfig:
        AuthType: NONE
        Cors:
          AllowCredentials: false
          AllowMethods: ["*"]
          AllowOrigins: ["*"]

Outputs:
  GoUpstashRedisApi:
    Description: "Endpoint URL"
    Value: !GetAtt GoUpstashRedisUrl.FunctionUrl
  GoUpstashRedis:
    Description: "Lambda Function ARN"
    Value: !GetAtt GoUpstashRedis.Arn
  GoUpstashRedisIamRole:
    Description: "Implicit IAM Role created for GoUpstashRedis"
    Value: !GetAtt GoUpstashRedisRole.Arn
</code></pre>
<ul>
<li>Build the handler bundle for AWS Lambda:</li>
</ul>
<pre><code class="language-bash">make build-lambda
</code></pre>
<ul>
<li>Add the following to <code>makefile</code> to make it easy to deploy after we do code changes:</li>
</ul>
<pre><code>sam-deploy: build-lambda
    sam deploy -t aws-iac/sam-template.yml --stack-name "UpstashRedisGoArticleStackDemo" --region eu-west-1 --resolve-s3 --no-confirm-changeset --no-fail-on-empty-changeset --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND
</code></pre>
<ul>
<li>Deploy to the specified region (see previous command).</li>
</ul>
<pre><code class="language-bash">make sam-deploy
</code></pre>
<ul>
<li>You should get some output like the below:</li>
</ul>
<pre><code class="language-bash">CloudFormation outputs from deployed stack
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Outputs
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Key                 GoUpstashRedis
Description         Lambda Function ARN
Value               arn:aws:lambda:eu-west-1:<redacted>:function:UpstashRedisGoArticleStackDem-GoUpstashRedis-baB8dQPkTfg0

Key                 GoUpstashRedisIamRole
Description         Implicit IAM Role created for GoUpstashRedis
Value               arn:aws:iam::<redacted>:role/UpstashRedisGoArticleStac-GoUpstashRedisRole-16UWC7HR6KII8

Key                 GoUpstashRedisApi
Description         Endpoint URL
Value               https://6pmmwqmg5vec3bcsldabckaf5i0nlgje.lambda-url.eu-west-1.on.aws/
-----------------------------------------------------------------------------------------------------------------------------------------------------------

Successfully created/updated stack - UpstashRedisGoArticleStackDemo in eu-west-1
</code></pre>
<ul>
<li>The URL of the deployed AWS Lambda is shown in the output printed, in this case <code>https://6pmmwqmg5vec3bcsldabckaf5i0nlgje.lambda-url.eu-west-1.on.aws/</code>. So, feel free to repeat the demo steps we did in our browser before using <code>localhost</code> with the actual domain now.<ul>
<li>Alternatively, you can also find the URL of the newly created function in the outputs of the CloudFormation stack <code>UpstashRedisGoArticleStackDemo</code> in the <a href="https://eu-west-1.console.aws.amazon.com/cloudformation/home?region=eu-west-1">Cloudformation console</a>.</li>
<li><strong>Note:</strong> Make sure to set the <code>UPSTASH_REDIS_URL</code> environment variable on your AWS Lambda configuration as well, otherwise it will just crash. Visit the <a href="https://eu-west-1.console.aws.amazon.com/lambda/home?region=eu-west-1">AWS Lambda console</a>, then click on your newly deployed Lambda, click on the <strong>Configuration</strong> tab, and then on the left side menu click <strong>Environment variables</strong>. Type <code>UPSTASH_REDIS_URL</code> as key, and your Upstash Redis URL as value. Click <strong>Save</strong>, and now your Lambda is ready.</li>
</ul>
</li>
</ul>
<h3 id="4-1-sam-local-test"><a href="#4-1-sam-local-test">4.1 SAM local test</a></h3><p>We can test our Lambda locally by providing a <code>sample-event.json</code> with the right path/cookie/query parameters/etc. Example of such JSON can be found in <a href="https://github.com/lambrospetrou/aws-playground/blob/master/aws-lambda-upstash-redis-article/sample-event.json"><code>aws-lambda-upstash-redis-article/sample-event.json</code></a>.</p>
<ul>
<li>Then, once you have a valid JSON event file, run the following to invoke the server logic as it would run on AWS Lambda:</li>
</ul>
<pre><code class="language-bash">sam local invoke -t aws-iac/sam-template.yml -e sample-event.json
</code></pre>
<h3 id="4-2-security-of-upstash-redis-url"><a href="#4-2-security-of-upstash-redis-url">4.2 Security of Upstash Redis URL</a></h3><p>In this article, for simplicity we provided the Upstash Redis URL, containing the password, through environment variables. We don’t want to hardcode this into the SAM Cloudformation template which is versioned along with our code, hence why we had to manually configure it through the AWS Lambda console.</p>
<p>There is a better way to do this automatically without modifying the Lambda configuration every time and to avoid having the Redis credentials/URL in plain sight for anyone with console access.</p>
<p>We can use <a href="https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html">AWS Systems Manager Parameter Store</a> and the corresponding Cloudformation resource <a href="https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ssm-parameter.html#aws-resource-ssm-parameter--examples"><code>AWS::SSM::Parameter</code></a> to hold the URL (we can set it once and retain during deployments), and change our Lambda code to fetch the parameter’s value at runtime. We could also automatically inject it as an env variable inside <code>sam-template.yml</code>, although this would still have it in plain text in the console.</p>
<p>Changing the code to fetch it from SSM Parameter Store is easy due to our separation of entrypoints, so we could fetch the parameter only when running inside AWS Lambda (<code>~/dev/aws-lambda-upstash-redis/cmd/lambda/main.go</code>) and passing it through to the <code>NewClient()</code> function that creates the Redis client.</p>
<h2 id="5-how-fast-is-it"><a href="#5-how-fast-is-it">5. How fast is it?</a></h2><p>Apart from the first <a href="https://aws.amazon.com/blogs/compute/operating-lambda-performance-optimization-part-1/">cold start invocation</a> which takes roughly <code>100-120 ms</code>, every invocation thereafter is lightning fast and always under <code>4 ms</code>.</p>
<p>Below is an example of a hot invocation for the <code>/login?userId=lambros</code> endpoint as implemented above:</p>
<p></p>
<p>As you can see the total duration of our request was <code>2.06 ms</code>. Yes, that’s <strong>two milliseconds</strong>, to generate a session ID, write it to Upstash Redis remotely, and return the redirection response.</p>
<p>Looking more carefully in the <strong>Log output</strong> section, we can see that the request lasted <code>789.435μs</code>, at least from our code’s perspective. This means our logic completed well under <code>1 ms</code> (roughly <code>0.790 ms</code>). Mind-blowing considering we are using a remote database.🤯</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>I am really amazed by how well Upstash Redis performs, especially since it’s hard to find such performance for serverless databases suited for platforms like AWS Lambda.</p>
<p>The Redis API is really convenient, Upstash Redis has top-notch pricing model and awesome developer experience, and it’s fast. I love the combination!</p>
<p>AWS Lambda + Upstash Redis + Go = 🚀❤️</p>

</article>
<article>
<h1>Twitter 📝 Post-Commit Reviews by Cindy Sridharan</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Mon, 22 Aug 2022 00:00:00 GMT</p>
<p>Check the original Twitter thread at <a href="https://twitter.com/LambrosPetrou/status/1561752140909002757">@lambrospetrou/status/1561752140909002757</a>.</p>
<hr/>
<p><strong>Post-Commit Reviews</strong> - <a href="https://copyconstruct.medium.com/post-commit-reviews-b4cc2163ac7a">https://copyconstruct.medium.com/post-commit-reviews-b4cc2163ac7a</a></p>
<p>A great, and IMO controversial, article by <a href="https://twitter.com/copyconstruct">@copyconstruct</a> on why it’s better to code review after a change is merged into trunk. I have my opinions as well, so let’s dive in.</p>
<p>I spend a lot of time reviewing my team’s diffs, and I am a huge advocate for its benefits. So, this article caught me off-guard. After reading it, though, I see its perspective, but I still think that this can rarely work in practice.</p>
<p>I think the article should first start with the <em>challenges of post-commit reviews</em>. Right now, it paints a very nice picture at the beginning, but reading it to the end shows how hard it would be to actually implement.</p>
<h2 id="challenge-1-high-functioning-high-trust-environments"><a href="#challenge-1-high-functioning-high-trust-environments">👎 Challenge 1: High-Functioning, High-Trust Environments</a></h2><p>Not only do the team members need to fully trust each other in shipping unreviewed code, but they also need to accommodate for onboarding new hires to this way of working.</p>
<h2 id="challenge-2-investment-in-automation"><a href="#challenge-2-investment-in-automation">👎 Challenge 2: Investment in Automation</a></h2><p>Without great tooling and automation, post-commit reviews cannot work. It has to be trivial to revert commits, especially multiple ones. Doing this in monorepos is even more complicated.</p>
<blockquote>
<p>managing commits becomes a lot easier in non-monorepo environments, since the automation required to revert commits doesn’t require research paper levels of complexity.</p>
</blockquote>
<p>+1. This has been my experience as well.</p>
<h2 id="challenge-3-strong-cultural-scaffolding"><a href="#challenge-3-strong-cultural-scaffolding">👎 Challenge 3: Strong Cultural Scaffolding</a></h2><p>This challenge is covered in a pro-point. Quoting the author:</p>
<p>“Getting to a point where post-commit reviews are a reality requires a strong cultural scaffolding. At the very least, it requires:</p>
<ul>
<li>a culture of collaborating […] prior to code implementation via a design document.</li>
<li>consensus around aspects like style-guide, coding idioms, concurrency primitives etc.</li>
<li>investment in better automation practices and tooling.”</li>
</ul>
<h2 id="benefit-1-focus"><a href="#benefit-1-focus">👍 Benefit 1: Focus</a></h2><blockquote>
<p>Being able to merge pull requests in quick succession ensures that the developer can iterate faster on the feature they are developing</p>
</blockquote>
<p>Can’t argue with this.</p>
<blockquote>
<p>Another benefit […] is reviewer focus. […] being pinged every 10 or 15 minutes for a review can hinder the productivity</p>
</blockquote>
<p>This can be fixed. Open a PR, and continue working on the next PR. Or use a stacked-diffs tool like Phabricator.</p>
<h2 id="benefit-2-encourages-better-development-practices"><a href="#benefit-2-encourages-better-development-practices">👍 Benefit 2: Encourages Better Development Practices</a></h2><p>This is the other side of the same coin as Challenge 3 above. Having a super disciplined team is required for post-commit reviews, so you get that as a benefit too.</p>
<h2 id="benefit-3-detect-more-bugs-before-code-review"><a href="#benefit-3-detect-more-bugs-before-code-review">👍 Benefit 3: Detect More Bugs Before Code Review</a></h2><p>Undecided 🤔 I see how post-commit/pre-deploy testing can expose more issues, but I guesstimate them to be the minority. You should anyway test your code before sending it for review/merging.</p>
<p>A concern, fitting the trust point. With post-commit reviews, you need to trust your colleagues to actually go back and change big chunks of code if needed. I worked with a few people that would NEVER do that, and such cases are blockers.</p>
<hr/>
<p>In conclusion, I can see how post-commit reviews can speed up a team. But it requires <strong>all</strong> of:</p>
<ul>
<li>great tooling</li>
<li>trust within the team</li>
<li>pre-commit decision-making processes (e.g. RFCs, design docs)</li>
</ul>
<p>If you have these, go for it 🚀</p>

</article>
<article>
<h1>Twitter 📝 Observability for emerging infra: Charity Majors</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 14 Aug 2022 00:00:00 GMT</p>
<p>Check the original Twitter thread at <a href="https://twitter.com/LambrosPetrou/status/1558819676439945216">@lambrospetrou/status/1558819676439945216</a>.</p>
<hr/>


<hr/>
<p>Just watched the “Observability for emerging infra” talk by @mipsytipsy. Even though I am a big proponent for testing before production, I do agree with all the points raised!</p>
<p>My favourite points raised 👇</p>
<p>1/ Testing is not exclusive to either staging or production. You can, and should, test in both.</p>
<p>2/ Shipping to production is not a flip switch that goes from 0 to 100 once the code lands. There can be commit reverts, hotfixes/patches, multiple versions running, automatic rollbacks. Our job is not done just because the code is shipped.</p>
<p>3/ If you can solve your problem with a LAMP-stack (or equivalent), do yourself a favor and go with it, ignoring all the K8s etc complexity.</p>
<p>4/ The shift from monitoring to observability is the shift from “known unknowns” to “unknown unknowns”. Monitoring is like unit tests, we know what to monitor. Distributed systems have an infinite list of failures that make staging worthless.</p>
<p>5/ We should be spending more time and money in tools for looking into production systems. We cannot test everything in staging environments, so we should admit that we need testing in production.</p>
<p>6/ Why do people invest so much in staging testing tooling, when they cannot tell if the system is healthy in the first place, staging or production. Without observability, it’s just chaos.</p>
<p>7/ Engineers should build muscle memory. When you ship code, you should go and look at it in production. We need to watch our code run with:</p>
<ul>
<li>real data</li>
<li>real users</li>
<li>real traffic</li>
<li>real scale</li>
<li>real concurrency</li>
<li>real network</li>
</ul>
<p>8/ What to test before prod, the known unknowns:</p>
<ul>
<li>does it work</li>
<li>does my code run</li>
<li>does it fail in ways I can predict</li>
<li>does it fail in ways it has previously failed</li>
</ul>
<p>9/ What to test in prod, the unknown unknowns:</p>
<ul>
<li>complex behavorial tests</li>
<li>experiments (A/B testing)</li>
<li>load tests</li>
<li>edge cases/weird bugs</li>
<li>canary, canary, canary</li>
<li>rolling deploys</li>
<li>multi-region</li>
</ul>
<p>10/ Risks of testing in prod</p>
<ul>
<li>expose security vulnerabilities</li>
<li>data loss or contamination of hosts</li>
<li>sudden app crashes</li>
<li>resource saturation (due to load testing)</li>
<li>impossible rollback depending on error</li>
<li>bad experience to users</li>
</ul>
<p>11/ We need to start using:</p>
<ul>
<li>feature flags for continuous releases</li>
<li>high cardinality tooling (Honeycomb, FB’s Scuba)</li>
<li>canary</li>
<li>shadow systems</li>
<li>capture/replay for databases</li>
<li>Be less afraid, by using the right tooling</li>
</ul>
<p>12/ Every engineer should know:</p>
<ul>
<li>what normal production looks like</li>
<li>how to deploy and rollback to a known state</li>
<li>how to debug in production</li>
</ul>
<p>13/ SSH-ing into production is a sign something is wrong with instrumentation!</p>
<p>14/ What is observability?</p>
<p>Control theory: It’s how much you can understand about the state of the system by looking at its external outputs.</p>
<p>Us: How can we ask new questions, new queries, without having to ship new code each time.</p>
<p>15/ Events, not metrics? Log strings vs structured events?
We need high cardinality -> context, using structured data. Metrics and aggregations strip away the details.</p>
<p>16/ We need raw requests for new investigation queries, so use sampling instead. Aggregations are the devil, a once-way trip. Intelligent dynamic sampling can save space, but help us investigate issues.</p>
<p>17/ We need:</p>
<ul>
<li>observability driven development</li>
<li>comfortably looking at prod</li>
<li>observability oriented tooling</li>
</ul>
<p>Zero users care what the “system” health is, all users care about is their experience. So, watch it run in production.</p>
<p>That’s it! Great talk👌</p>

</article>
<article>
<h1>Fly.io cloud development environment with Visual Studio Code Remote-SSH</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Mon, 01 Aug 2022 00:00:00 GMT</p>
<p>For many years now I have been using an <a href="https://aws.amazon.com/ec2/">AWS EC2</a> <code>T3.nano</code> (<code>T2.nano</code> previously) instance as my personal VPS server. I mainly host a <a href="https://gitea.io/">Gitea</a> server, along with some other toy projects, and I sometimes use it as my remote development machine when for some reason I cannot use the local laptop I have at hand. It’s been quite reliable honestly, checking its current <code>uptime</code> it says it has been up for 850 days now without any issues. And the reason I restarted it back then was to (unsuccessfully) upgrade from Amazon Linux to Amazon Linux 2.</p>
<p>Fast-forward to present, I find myself playing a lot with <a href="https://fly.io/">Fly.io</a>, a new-ish cloud compute provider for server applications (e.g. <a href="https://fly.io/docs/getting-started/dockerfile/">Dockerfile</a>). I covered it in a <a href="https://www.lambrospetrou.com/articles/serverless-platforms-2022/">past article about serverless platforms in 2022</a>, but in summary I really love the developer experience and the simplicity it provides through its CLI.</p>
<p>In this article I will describe how I now use Fly.io as my development environment in the cloud instead of EC2, without having to remember to update the underlying OS or even worse upgrading to a new major version and having to do annoying file migrations.</p>
<h2 id="it-s-all-containers"><a href="#it-s-all-containers">It’s all containers</a></h2><p>Fly.io is built on top of <a href="https://fly.io/docs/reference/architecture/#microvms">Firecracker microVMs</a> and supports a <a href="https://fly.io/docs/reference/builders/">few types of builders</a> that ultimately assemble a container to deploy. In this article I am going to use the <a href="https://fly.io/docs/getting-started/dockerfile/">Dockerfile support</a> since that’s how I prefer to model my development environment.</p>
<p>The benefits of modelling my environment in a Dockerfile:</p>
<ul>
<li>Upgrading the operating system (OS), or even changing it, is a single line change and a redeploy (e.g. from <code>FROM ubuntu:18.04</code> to <code>FROM ubuntu:20.04</code>).</li>
<li>Installing or uninstalling software and packages from the OS is again trivial using the <a href="https://docs.docker.com/engine/reference/builder/#run"><code>RUN</code></a> command.</li>
<li>Define exactly what processes should my server run and never have to worry about manually (re)starting them.</li>
<li>Have all of the above versioned in Git, so doing any change to it is easy while keeping a historical record in case I need to check how something was setup in the past.</li>
<li>Have the whole environment recreated in seconds if something goes wrong (Fly.io provides <a href="https://fly.io/docs/reference/volumes/">persistent disk volumes</a>).</li>
</ul>
<h2 id="cloud-development-environment"><a href="#cloud-development-environment">Cloud Development Environment</a></h2><p>For a basic remote development environment I want to be able to do the following:</p>
<ul>
<li>SSH into it in case I need to test linux commands.</li>
<li>Checkout and work with Git repositories.</li>
<li>Install the several programming languages I work with.</li>
<li>Have a nice development experience using Visual Studio Code comparable to local development.</li>
</ul>
<p>The full source code referenced in the following sections is available in <a href="https://github.com/lambrospetrou/code-playground/tree/master/flyio-cloud-dev-env">this repository</a>, and you can <a href="https://downgit.github.io/#/home?url=https://github.com/lambrospetrou/code-playground/tree/master/flyio-cloud-dev-env">download it as .zip file here</a>.</p>
<h3 id="dockerfile"><a href="#dockerfile">Dockerfile</a></h3><p>This is a simplistic version of the <code>Dockerfile</code> I use:</p>
<pre><code class="language-dockerfile">FROM ubuntu:bionic

RUN apt-get update && apt-get install --no-install-recommends -y \
    ca-certificates curl sudo openssh-server bash git \
    iproute2 apt-transport-https gnupg-agent software-properties-common \
    # Install extra packages you need for your dev environment
    htop make vim && \
    apt autoremove -y

ARG USER="clouddevuser"
RUN test -n "$USER"
# Create the user
RUN adduser --disabled-password --gecos '' --home /data/home ${USER}
# passwordless sudo for your user's group
RUN echo "%${USER} ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers

ENV USER_ARG=${USER}

# Setup your SSH server daemon, copy pre-generated keys
RUN rm -rf /etc/ssh/ssh_host_*_key*
COPY etc/ssh/sshd_config /etc/ssh/sshd_config

COPY ./entrypoint.sh ./entrypoint.sh
COPY ./docker-entrypoint.d/* ./docker-entrypoint.d/

ENTRYPOINT ["./entrypoint.sh"]
CMD ["/usr/sbin/sshd", "-D"]
</code></pre>
<p>Let’s do a simple breakdown of each Dockerfile instruction.</p>
<p>First, we define the Operating System (OS) to use, in this case <a href="https://hub.docker.com/_/ubuntu/">Ubuntu</a> version <code>bionic</code> (codenamed <code>18.04</code>).</p>
<pre><code class="language-dockerfile">FROM ubuntu:bionic
</code></pre>
<p>Then we encounter the <code>RUN</code> instruction which basically installs additional software to our OS. We can update this list (see relevant line comment) to include more packages as necessary.</p>
<p>The next 8 lines create a new user <code>clouddevuser</code> (<a href="https://docs.docker.com/engine/reference/builder/#arg">customisable argument using <code>ARG</code></a>), sets the necessary permissions, and creates the user’s home under <code>/data/home</code> (we can change this to <code>/home/clouddevuser</code> or anything else). The thing to remember from these lines is that we export an environment variable <code>USER_ARG</code> which holds the user name since it will be used by the <code>entrypoint.sh</code> script (detailed in the next section).</p>
<p>Then, we have 2 lines essentially making sure the our local SSH configuration is copied into the container and is the one used.</p>
<p>The last lines need some explanation.</p>
<pre><code class="language-dockerfile">COPY ./entrypoint.sh ./entrypoint.sh
COPY ./docker-entrypoint.d/* ./docker-entrypoint.d/

ENTRYPOINT ["./entrypoint.sh"]
CMD ["/usr/sbin/sshd", "-D"]
</code></pre>
<p>The two <code>COPY</code> instructions copy the local <code>entrypoint.sh</code> script and the local directory’s <code>./docker-entrypoint.d/</code> content to the container’s root directory.
The last pair of instructions specify that we want our container to execute <code>./entrypoint.sh /usr/sbin/sshd -D</code> using the <a href="https://docs.docker.com/engine/reference/builder/#entrypoint"><code>ENTRYPOINT</code></a> and <a href="https://docs.docker.com/engine/reference/builder/#cmd"><code>CMD</code></a> Dockerfile instructions.</p>
<p>In summary, this Dockerfile selects the OS we want, installs extra software packages, sets up our user, and then runs the <code>entrypoint.sh</code> script. Easy.</p>
<h3 id="entrypoint-sh"><a href="#entrypoint-sh">entrypoint.sh</a></h3><pre><code class="language-bash">#!/bin/bash

set -euxo pipefail

echo "Creating /run/sshd"
mkdir -p /run/sshd

HOME_DIR=/data/home
SSH_DIR=/data/etc/ssh

echo "Ensure home directory"
mkdir -p $HOME_DIR

echo "Ensure SSH host keys"
mkdir -p $SSH_DIR
ssh-keygen -A -f /data

echo "Setup SSH access for user $USER_ARG"
mkdir -p $HOME_DIR/.ssh
# Append the given keys to the authorised keys and only keep the uniques!
# The `# empty comment` is to avoid an empty file which causes grep to fail.
echo -e "# empty comment\n$HOME_SSH_AUTHORIZED_KEYS" >> $HOME_DIR/.ssh/authorized_keys
cat $HOME_DIR/.ssh/authorized_keys | sort | uniq | grep -v "^$" > /tmp/authorized_keys
mv /tmp/authorized_keys $HOME_DIR/.ssh/authorized_keys

if [ -f "$HOME_DIR/.ssh/id_ed25519" ]; then
    echo "$HOME_DIR/.ssh/id_ed25519 exists, skipping."
    echo ""
    echo "Make sure you add this public key to your Github / Gitlab / other vcs:"
else
    echo "$HOME_DIR/.ssh/id_ed25519 does not exist, generating."
    ssh-keygen -t ed25519 -f $HOME_DIR/.ssh/id_ed25519 -C "$USER_ARG@fly-vscode" -N ""
    echo ""
    echo "Add this public key to your Github / Gitlab / other vcs:"
fi
cat $HOME_DIR/.ssh/id_ed25519.pub

echo "chowning your home to you"
chown -R $USER_ARG:$USER_ARG $HOME_DIR

if [[ -d "docker-entrypoint.d" ]]
then
    echo "Running docker-entrypoint.d files"
    /bin/run-parts --verbose docker-entrypoint.d
fi

echo "Running $@"
exec "$@"
</code></pre>
<p>Briefly, the entrypoint script:</p>
<ol>
<li>creates the user’s home directory (needs to match the one we used in <code>Dockerfile</code>)</li>
<li>adds some SSH authorized keys to our configuration to allow remote SSH-ing</li>
<li>optionally generates an SSH key that we will use to authenticate this server with services like Github when pulling/pushing Git repositories</li>
<li>it runs all the scripts under the <code>/docker-entrypoint.d/</code> directory</li>
<li>and finally runs the command passed as argument to the script, which as we explained in the previous section it’s going to be <code>/usr/sbin/sshd -D</code></li>
</ol>
<p><strong>Notes</strong></p>
<ul>
<li>Anything that needs the user uses the environment variable <code>USER_ARG</code> that we defined in the <code>Dockerfile</code>.</li>
<li>The content of the environment variable <code>HOME_SSH_AUTHORIZED_KEYS</code> is appended to the <code>$HOME_DIR/.ssh/authorized_keys</code> file, and then we do some shell shenanigans to only allow unique lines inside that file (to avoid appending the same keys each time our container is started). The value of <code>HOME_SSH_AUTHORIZED_KEYS</code> is provided by the <code>fly.toml</code> file as you will see below.</li>
<li>The generated <code>$HOME_DIR/.ssh/id_ed25519.pub</code> file is the public key we need to upload to Github or any other service that needs to authenticate its user using SSH keys.</li>
<li>The <code>/bin/run-parts --verbose docker-entrypoint.d</code> conveniently runs any script we put into the <code>docker-entrypoint.d/</code> directory, which makes it easy to do any initialisation to the OS at runtime.</li>
</ul>
<p>After the above script runs, now the container is stuck in the <code>/usr/sbin/sshd -D</code> command. This is the SSH daemon that accepts connections from anyone attempting to SSH into our Fly.io server instance, our container. As long as the only thing we want to expose from the server is the SSH port, this is the only thing we need to run as the last command in our <code>entrypoint.sh</code>.</p>
<p>if we didn’t run this at the end, and the script exited, then the container would complete and exit as well, which would then cause our Fly.io server instance to be marked as failed/unhealthy and would be restarted, leading to a crash-loop (by default it attempts to deploy up to 3 times).</p>
<h3 id="fly-toml"><a href="#fly-toml">fly.toml</a></h3><pre><code class="language-toml">app = "<WILL_BE_REPLACE_WITH_GENERATED_NAME>"

[env]
  HOME_SSH_AUTHORIZED_KEYS = '''
'''

[[mounts]]
  # This is the persistent volume mount location, so if you change this
  # you need to also change the Dockerfile and entrypoint.sh wherever "/data" is used.
  destination = "/data"
  source = "clouddevdata"

[[services]]
  internal_port = 22
  protocol = "tcp"

  [[services.ports]]
    port = 10022
</code></pre>
<p>Fly.io uses the <a href="https://fly.io/docs/reference/configuration/"><code>fly.toml</code> configuration file</a> to configure your application when using the <a href="https://fly.io/docs/flyctl/"><code>flyctl</code> CLI</a>.</p>
<p>What happens here?</p>
<ul>
<li>The <code>app</code> key specifies the name of the Fly.io application after it’s created (see below section) and is used by the <code>flyctl</code> CLI when issuing commands.</li>
<li>The <code>HOME_SSH_AUTHORIZED_KEYS</code> key can be updated to contain the SSH keys to put into the authorised keys file for our container at runtime. I usually update this with a new SSH key, deploy the application (see section below) which will append it to the <code>authorized_keys</code> file, and then remove it from the <code>fly.toml</code> file to avoid versioning it in the Git repository.</li>
<li>The <code>clouddevdata</code> persistent volume is mounted on <code>/data</code> inside the container.</li>
<li>We expose port <code>10022</code> publicly and route that to port <code>22</code> in the container, which is where the SSH daemon we started in our entrypoint script listens.</li>
</ul>
<h2 id="enough-give-me-my-cloud-environment"><a href="#enough-give-me-my-cloud-environment">Enough! Give me my cloud environment.</a></h2><p>OK, after we explained the key parts of the setup, let’s see how trivial it is to run this with Fly.io.</p>
<h3 id="0-get-the-code"><a href="#0-get-the-code">0. Get the code</a></h3><p>The full source code is available in <a href="https://github.com/lambrospetrou/code-playground/tree/master/flyio-cloud-dev-env">this repository</a>, and you can <a href="https://downgit.github.io/#/home?url=https://github.com/lambrospetrou/code-playground/tree/master/flyio-cloud-dev-env">download it as .zip file here</a>.</p>
<p>All <code>flyctl</code> commands shown below need to run inside the source code directory.</p>
<h3 id="1-create-a-fly-io-application-once"><a href="#1-create-a-fly-io-application-once">1. Create a Fly.io application (once)</a></h3><p>This step only needs to be run once in order to create your Fly.io application.
After [installing the <code>flyctl</code> CLI](<a href="https://fly.io/docs/flyctl/installing/">https://fly.io/docs/flyctl/installing/</a>, run the following:</p>
<pre><code class="language-bash">flyctl launch --generate-name --no-deploy --copy-config
</code></pre>
<p>Running the above will give you a prompt to select the region you want to deploy your cloud environment. After selecting the region, the <code>fly.toml</code> will also be updated with the autogenerated application name (as specified by <code>--generate-name</code>).</p>
<p>If you want to use a specific application name (since it’s part of the DNS name you will need to use), then you can use the <code>--name</code> argument instead (e.g. <code>--name lambros-application-1</code>), but keep in mind that this should be unique across all Fly.io applications globally since it’s part of the DNS subdomain you get, e.g. <code>lambros-application-1.fly.dev</code>, so there is a high chance your wanted name is taken.</p>
<h3 id="2-create-the-persistent-volume-once"><a href="#2-create-the-persistent-volume-once">2. Create the persistent volume (once)</a></h3><p>The main enabler for the cloud development environment is that we can use <a href="https://fly.io/docs/reference/volumes/">persistent disk volumes</a> to hold our data files while keeping all the OS/packages controlled by the <code>Dockerfile</code>. So let’s create our volume:</p>
<pre><code class="language-bash">flyctl volumes create clouddevdata --region lhr --size 10
</code></pre>
<p>This will create a <code>10GB</code> volume in London (<code>lhr</code>). You have to create the volume in the same region you selected in the previous step for the application itself! You can also <a href="https://fly.io/docs/reference/regions/">check the available regions list</a> to find the right value.</p>
<h3 id="3-deploy"><a href="#3-deploy">3. Deploy</a></h3><p>This is the only step we need to do every time we change something in our application.</p>
<pre><code class="language-bash">flyctl deploy
</code></pre>
<p>This will pick up the <code>Dockerfile</code>, check if there are changes and build a new image if necessary, and then trigger a deployment.
Once the deployment is finished you can use the cloud development environment, i.e. SSH into it.</p>
<h3 id="4a-generate-ssh-keys"><a href="#4a-generate-ssh-keys">4a. Generate SSH keys</a></h3><p><strong><em>If you already have your SSH keys you can skip this section and go to <a href="#4b-ssh">Section 4b</a>.</em></strong></p>
<p>To generate your key run the following (replace the <code>KEY_FILENAME</code> and the email as necessary):</p>
<pre><code class="language-bash">ssh-keygen -t ed25519 -f ~/.ssh/<KEY_FILENAME> -C "your_email@example.com"
</code></pre>
<p>If your system does not support the Ed25519 algorithm, you can use RSA keys.</p>
<pre><code class="language-bash">ssh-keygen -t rsa -b 4096 -f ~/.ssh/<KEY_FILENAME> -C "your_email@example.com"
</code></pre>
<p>The above command will generate two files:</p>
<ol>
<li>The private key, at <code>~/.ssh/<KEY_FILENAME></code>, which should <strong>never be shared with anyone</strong>.</li>
<li>The public key, at <code>~/.ssh/<KEY_FILENAME>.pub</code>, which is the one to upload in Github or in our case paste in the <code>HOME_SSH_AUTHORIZED_KEYS</code> section as described above.</li>
</ol>
<p>Common key filenames are <code>id_<algorithm></code>, e.g. <code>id_rsa</code>, or <code>id_ed25519</code>. In some cases, I generate keys for different purposes so having specific names for the keys is very useful.</p>
<p>You should add the key to the <code>ssh-agent</code> for easier use, <a href="https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent#adding-your-ssh-key-to-the-ssh-agent">following the Github instructions</a>.</p>
<h3 id="4b-ssh"><a href="#4b-ssh">4b. SSH</a></h3><p>The default configuration only allows SSH-ing with authorized keys, and password-based authentication is disabled (see <code>etc/ssh/sshd_config</code> in the source code).
Therefore, you need to update the <code>HOME_SSH_AUTHORIZED_KEYS</code> value in <code>fly.toml</code> with your laptop’s SSH key (usually <code>~/.ssh/id_rsa.pub</code>), and then deploy once with <code>flyctl deploy</code>. Then, you can remove the SSH key from the <code>HOME_SSH_AUTHORIZED_KEYS</code> again to keep it safe.</p>
<p>Test that you can SSH into the cloud development environment (assuming application name <code>lp1111</code>, and user <code>clouddevuser</code>):</p>
<pre><code class="language-bash">ssh clouddevuser@lp1111.fly.dev -p 10022
</code></pre>
<p>Now you should be hopefully logged into the remote container, so go nuts, and explore what you can do, do some file changes, and confirm that your changes persist across deployments 🤩</p>
<h3 id="5-visual-studio-code-remote-ssh"><a href="#5-visual-studio-code-remote-ssh">5. Visual Studio Code - Remote SSH</a></h3><p>VSCode has amazing <a href="https://code.visualstudio.com/docs/remote/ssh">remote development capabilities</a> which enable full-use of the editor with any remote server accessible over SSH connection.</p>
<ol>
<li>Install the <a href="https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-ssh">Remote - SSH extension</a> from the marketplace.</li>
<li>The default configuration only allows SSH-ing with authorized keys, and password-based authentication is disabled (see <code>etc/ssh/sshd_config</code> in the source code).
Therefore, you need to update the <code>HOME_SSH_AUTHORIZED_KEYS</code> value in <code>fly.toml</code> with your laptop’s SSH key (usually <code>~/.ssh/id_rsa.pub</code>), and then deploy once with <code>flyctl deploy</code>. Then, you can remove the SSH key from the <code>HOME_SSH_AUTHORIZED_KEYS</code> again to keep it safe.</li>
<li>Open the command palette (<code>CMD+SHIFT+p</code> on Mac, <code>CTRL+SHIFT+p</code> on Windows).</li>
<li>Search for the <code>Remote-SSH: Connect to Host</code> command and select it.</li>
<li>Type your environment details, e.g. <code>clouddevuser@lp1111.fly.dev:10022</code>, where <code>clouddevuser</code> is the user used in the <code>Dockerfile</code>, <code>lp1111</code> is the Fly.io application name, and <code>10022</code> is the port exported by the <code>Dockerfile</code>.</li>
</ol>
<p>You should then be able to mount any directory on the remote server and use VSCode as if it’s working with the local filesystem 🥳</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>Fly.io + VSCode Remote-SSH = ❤️ 🚀</p>
<h3 id="references"><a href="#references">References</a></h3><ul>
<li><a href="https://code.visualstudio.com/docs/remote/ssh">https://code.visualstudio.com/docs/remote/ssh</a></li>
<li><a href="https://fly.io/docs/app-guides/vscode-remote/">https://fly.io/docs/app-guides/vscode-remote/</a></li>
<li><a href="https://community.fly.io/t/a-vscode-example-for-fly/460">https://community.fly.io/t/a-vscode-example-for-fly/460</a></li>
</ul>

</article>
<article>
<h1>AWS Elastic Beanstalk Go platform (AL2) — Deep Dive</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 30 Jan 2022 00:00:00 GMT</p>
<p>Even though my default and preferred way of deploying applications is <a href="/articles/serverless-platforms-2022/">#serverless platforms</a>, sometimes I need an actual long-running host. For example, some of the applications I run (e.g. <a href="https://gitea.io/en-us/">Gitea</a>) are written in <a href="https://go.dev/">Go</a> and use <a href="https://www.sqlite.org/">SQLite</a> as their database so in that case serverless does not work.</p>
<p>In this article I will explore the <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-environment.html">Elastic Beanstalk Go platform based on Amazon Linux 2 (AL2)</a> for deploying a simple application. As a comparison, a couple of years ago I <a href="/articles/multiple-services-elastic-beanstalk/">wrote about the older Go platform</a> based on Amazon Linux 1.</p>
<p>Even though I focus on the Go platform, <strong>almost everything applies to <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux.html#platforms-linux.list">all Amazon Linux 2 based platforms</a> (e.g. <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/java-se-platform.html">Java</a>)</strong>. You can also use the Go platform to deploy any application that is compiled to a binary that can run on Amazon Linux 2 (e.g. <a href="https://www.rust-lang.org/">Rust</a>).</p>
<p><em>Disclaimer: All information shown is accurate as of Jan 30/2022.</em></p>
<h2 id="why-use-aws-elastic-beanstalk"><a href="#why-use-aws-elastic-beanstalk">Why use AWS Elastic Beanstalk?</a></h2><blockquote>
<p>Easy to begin, Impossible to outgrow.</p>
</blockquote>
<p><a href="https://aws.amazon.com/elasticbeanstalk/">Elastic Beanstalk</a> is a Platform-as-a-Service (PaaS) service offering by AWS.
It takes care of managing the platform updates, patching the underlying hosts, and abstracting away the AWS resources used while at the same time providing the necessary hooks and capabilities to customise and extend them when needed.</p>
<p>AWS Elastic Beanstalk is similar in “principle” to <a href="https://azure.microsoft.com/en-gb/services/app-service/">Azure App Service</a>, or <a href="https://cloud.google.com/appengine">Google App Engine</a>.</p>
<p>I personally like Elastic Beanstalk and is my first choice of deploying to actual hosts/VM instances. I usually use the <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features-managing-env-types.html#single-instance-environ">single instance</a> environment to get the benefits without the costs, but there is also the <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features-managing-env-types.html#autoscale-environ">load-balanced environment</a> which allows you to scale as much as you want.</p>
<p>Features the Go platform provides which I cannot find in another platform as of time of writing (<a href="https://twitter.com/LambrosPetrou/status/1487493396566528007">see related tweet</a>):</p>
<ul>
<li>single server instance only (because not everything is web-scale)</li>
<li>persistent disk volumes (i.e. <a href="https://aws.amazon.com/ebs/">Amazon EBS</a>)</li>
<li>deployments in-place on same instance with only 1-2s of downtime<ul>
<li>restriction due to the persistent disk volume only being accessible from one host at a time</li>
</ul>
</li>
<li>managed platform: OS updates, host patching, language runtime updates</li>
<li>variety of instance types, from the low-cost <a href="https://aws.amazon.com/ec2/instance-types/t3/">T3</a>, to the powerful ARM-based Graviton2 <a href="https://aws.amazon.com/ec2/instance-types/c6g/">C6g</a></li>
<li><5$/month</li>
</ul>
<h2 id="overview"><a href="#overview">Overview</a></h2><p>The Amazon Linux 2 based platforms were added back in <a href="https://aws.amazon.com/blogs/compute/introducing-a-new-generation-of-aws-elastic-beanstalk-platforms/">2020</a>, and with that upgrade all the platforms now follow (almost) the same conventions and setup (<a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux.html#platforms-linux.versions">see supported Linux platforms</a>), which is a nice improvement over the previous platforms based on Amazon Linux 1 where each language had its own idiosyncracies.</p>
<p>The AWS documentation contains most of the things we need to know about the platforms but unfortunately as usual it’s like they intentionally try to make it hard on their users, and they spread the information across multiple pages. In this article I will try to cover the main features of the platform and link to the corresponding documentation pages to make it easier to find what we need. This is basically a reference for future me 😅</p>
<p>All AL2 platforms offer the following features (<a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html">see documentation</a>):</p>
<ul>
<li><strong>Buildfile</strong>: one-off commands to run from inside the uploaded bundle, e.g. building the source code to create the binary to run.</li>
<li><strong>Procfile</strong>: long-running monitored commands to run, e.g. the web server.</li>
<li><strong>Platform hooks</strong>: one-off commands to run during the deployment lifecycle hooks (e.g. predeploy, postdeploy). New to AL2 based platforms.</li>
<li><strong>Reverse proxy configuration</strong>: <code>nginx</code> is used as the reverse proxy for most platforms and there is a way to override its configuration. Some platforms also provide Apache HTTPD as proxy (e.g. Tomcat, Node.js, PHP, and Python).</li>
<li><strong>Configuration files (.ebextensions)</strong>: configuration to extend the Beanstalk environment. For example, add new CloudFormation resources (e.g. DynamoDB tables, EBS volumes), customise existing CloudFormation resources (e.g. the EC2 instance), override files on the filesystem, and many more.</li>
</ul>
<p>Let’s go into details for each one of these for the <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-environment.html">Go platform</a> specifically.</p>
<h2 id="buildfile"><a href="#buildfile">Buildfile</a></h2><ul>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-buildfile.html">https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-buildfile.html</a></li>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html">https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html</a></li>
</ul>
<p>The <code>Buildfile</code> file should be placed at the root of the uploaded bundle <code>.zip</code> file and contains one-off commands that we want to run as part of the <strong>build</strong> phase of the deployment. The working directory of each command is the root of the uploaded bundle. If your uploaded bundle contains your source code then this is where you would put your application compile command. If you have a <a href="https://aws.amazon.com/devops/continuous-integration/">Continuous Integration (CI)</a> system to build your source code before-hand then you don’t need a <code>Buildfile</code>.</p>
<p>For example, the following <code>Buildfile</code> will compile the Go application and create a binary file named <code>app</code> into the <code>bin/</code> directory:</p>
<pre><code>cmd1: go build -o bin/app
</code></pre>
<p>The <code>cmd1:</code> part is nothing but just a name and does not have any meaning other than differentiating the commands to run. The following is also valid:</p>
<pre><code>build: go build -o bin/app
check: ./validate-app-binary.sh
</code></pre>
<p><strong>Personal opinion:</strong> I never use the <code>Buildfile</code> to build my application and I always have a CI system to build the source code (or even build it locally). Even the documentation itself suggests to use the <code>predeploy</code> platform hook (more later) instead for the one-off commands we want to run, so this is not really needed. This means that my uploaded bundle <code>.zip</code> file only contains the compiled artifacts, e.g. the <code>bin/app</code> binary from the above example. This makes deployments a bit faster as well since the build phase will not happen during the Beanstalk lifecycle and I don’t worry about my build process being constrained by the machine running the actual application. For example, if you have a lot of dependencies or your tests need a lot of resources then doing it before triggering a Beanstalk deployment will allow you to use smaller lightweight instances for the actual application in production (e.g. <code>T3.nano</code>, <code>T4G.nano</code>).</p>
<h2 id="procfile"><a href="#procfile">Procfile</a></h2><ul>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-procfile.html">https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-procfile.html</a></li>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html">https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html</a></li>
</ul>
<p>The Go platform also supports conventions to <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-environment.html">automatically build and run your application</a>, but I personally avoid them since they are less flexible than using <code>Procfile</code> and there is zero benefit in using them.</p>
<p>The <code>Procfile</code> file should be placed at the root of the uploaded bundle <code>.zip</code> file and contains the long-running commands that we want to run as part of the application. Each process started by these commands will be monitored by the environment and automatically restarted when they crash. This is the place where we specify the main command(s) to start the application.</p>
<p>For example, the following starts the <code>bin/app</code> and <code>bin/bgapp</code> apps with the right flags:</p>
<pre><code>web: ./bin/app -name webapp
bgapp: ./bin/bgapp -name bgapp -port 5001
</code></pre>
<p>By default, Elastic Beanstalk listens to requests from the internet on HTTP port <code>80</code> and forwards them to port <code>5000</code> inside the host (configurable if needed). Therefore, the application should listen on port <code>5000</code>. Elastic Beanstalk looks for the command named <code>web</code> from <code>Procfile</code>, and sets the <code>PORT=5000</code> environment variable when running it. The application code will need to read that env variable when setting up its HTTP listener. We could also ignore the <code>PORT</code> env variable and hardcode port <code>5000</code> but it makes it harder to start your application on different ports (e.g. locally), so do the nice thing and read the env variable on startup (<a href="#example-application">see example application code below</a>).</p>
<p><strong>NOTE1:</strong> The default port can be changed from <code>5000</code> to something else by using <code>.ebextensions</code> and setting the <code>aws:elasticbeanstalk:application:environment.PORT</code> property. Read more about this in the <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html#platforms-linux-extend.proxy">reverse proxy configuration docs</a> (make sure to expand the “Reverse proxy configuration” section).</p>
<p><strong>NOTE2:</strong> In the previous version of the platform (AL1) all processes in the <code>Procfile</code> were getting the <code>PORT</code> set to a value starting from <code>5000</code> and then growing in increments of <code>+100</code>. This <strong>does not apply anymore in AL2 platforms</strong> and <strong>only</strong> the explicitly named <code>web</code> process has the <code>PORT</code> env set, which is why I explicitly set the port for <code>bgapp</code> in the above example.</p>
<p><strong>NOTE3:</strong> There are some nuances around logging and how the logs are streamed to CloudWatch Logs when <code>Procfile</code> contains more than the <code>web</code> process but we will <a href="#logging">explore those later</a>.</p>
<h2 id="platform-hooks"><a href="#platform-hooks">Platform hooks</a></h2><ul>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html">https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html</a></li>
</ul>
<p>Platform hooks are a new addition to the AL2 based platforms. The only way to extend and customise Elastic Beanstalk before was using the <code>.ebextensions</code> configuration files but that was/is complicated and very error-prone (lots of trial-and-error needed 😓). Platform hooks cover 90% of all use-cases (personal guesstimate), and they make our lives much easier.</p>
<p>The <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html#platforms-linux-extend.hooks">Platform hooks</a> (expand section) are basically script files placed inside the <code>.platform/hooks</code> directory, under one of the following sub-directories: <code>prebuild</code>, <code>predeploy</code>, <code>postdeploy</code>, depending when they need to run.</p>
<p>The scripts inside each of those directories run in order of their filename. For example, let’s examine the following directory structure:</p>
<pre><code class="language-bash">$ ls .platform/hooks/**
.platform/hooks/postdeploy:
01-validate-http.sh  02-update-route53.sh

.platform/hooks/predeploy:
01-setup-ebs.sh
</code></pre>
<p>With the above files the script <code>01-setup-ebs.sh</code> will be executed before the new version of the app is activated (<code>predeploy</code>), and the <code>01-validate-http.sh</code> and <code>02-update-route53.sh</code> scripts will run after the switch (<code>postdeploy</code>) in this specific order.</p>
<h3 id="config-hooks"><a href="#config-hooks">Config hooks</a></h3><p>Apart from the <code>.platform/hooks</code> directory, there is also a <code>.platform/confighooks</code> directory supported. At first glance, the difference between the two is confusing. Even though the available hooks and execution rules are the same, the type of changes in a deployment will decide which hooks to trigger.</p>
<blockquote>
<p>A configuration deployment occurs when you make configuration changes that only update environment instances without recreating them.</p>
</blockquote>
<p>According to the <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html#:~:text=A%20configuration%20deployment%20occurs%20when%20you%20make%20configuration%20changes%20that%20only%20update%20environment%20instances%20without%20recreating%20them.%20The%20following%20option%20updates%20cause%20a%20configuration%20update.">docs</a> the following changes trigger <code>.platform/confighooks</code> only:</p>
<ul>
<li>Environment properties and platform-specific settings</li>
<li>Static files</li>
<li>AWS X-Ray daemon</li>
<li>Log storage and streaming</li>
<li>Application port</li>
</ul>
<p><strong>NOTE:</strong> In order to make sure a script runs regardless of the type of changes it has to be placed in both <code>.platform/confighooks</code> and <code>.platform/hooks</code>.</p>
<h3 id="reverse-proxy-configuration"><a href="#reverse-proxy-configuration">Reverse Proxy configuration</a></h3><p>All AL2 based platforms now are similar in the way their <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html#platforms-linux-extend.proxy">reverse proxy is configured</a>, using <code>nginx</code>.
In order to customise the reverse proxy with our own <code>nginx</code> configuration we can use the <code>.platform/nginx</code> directory.
Any <code>.conf</code> configuration file inside the directory <code>.platform/nginx/conf.d/</code> of the uploaded application source bundle will be included automatically by <code>nginx</code> during service startup.</p>
<p>I rarely have to modify the proxy configuration, and when I do it’s to adapt some of the default limits, e.g. concurrency connections, timeouts.
See <a href="https://www.nginx.com/resources/wiki/start/topics/examples/full/"><code>nginx</code> examples</a> for what configuration is available, and the <a href="#multiple-procfile-processes">Multiple Procfile processes</a> section below for how this can be used to proxy two servers running on the same host.</p>
<h2 id="advanced-configuration-files-ebextensions"><a href="#advanced-configuration-files-ebextensions">Advanced configuration files (.ebextensions)</a></h2><ul>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/ebextensions.html">https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/ebextensions.html</a></li>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/command-options.html">https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/command-options.html</a></li>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html">https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html</a></li>
</ul>
<p>I am not going to dive deep into this one because <strong>it supports a massive amount of configuration</strong>, and should be the last option only if nothing else works.
Configuration files are useful when all the above features we discussed (<code>Buildfile</code>, <code>Procfile</code>, <code>.platform/{confighooks/hooks}</code>) do not support what we want. In this case they can be handy since they support almost anything.</p>
<ul>
<li>A configuration file needs to be placed inside the <code>.ebextensions</code> directory at the root of the uploaded <code>.zip</code> file bundle, and should have the <code>.config</code> file extension.</li>
<li>The configuration file structure is <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html">documented here</a> and supports anything from installing <code>yum</code> packages, to putting files on the filesystem, to adding/overriding CloudFormation resources.</li>
<li>Read the above documentation pages in order to properly understand how <code>.ebextensions</code> work to avoid a lot of unnecessary head wall-hitting 🤕</li>
</ul>
<p>Example of a configuration file (<code>.ebextensions/awslogs.config</code>) that configures the CloudWatch Logs agent to stream the <code>/var/log/bgapp.stdout.log</code> log file to CloudWatch Logs: <a href="https://gist.github.com/lambrospetrou/758a312a1317532eb6bb0960985df83e">https://gist.github.com/lambrospetrou/758a312a1317532eb6bb0960985df83e</a></p>
<h2 id="instance-deployment-workflow"><a href="#instance-deployment-workflow">Instance deployment workflow</a></h2><p>The documentation section I visit most often is the <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html#platforms-linux-extend.workflow">instance deployment workflow</a> diagram.</p>
<p>Unfortunately, it’s not 100% complete since it’s not showing in detail how shutdown and termination of the current app is happening, but it covers everything else.</p>
<p>Credits for the following image belong to AWS but I am putting it here for completeness as well:</p>
<p></p>
<h2 id="example-application"><a href="#example-application">Example application</a></h2><p>In this application we have a simple HTTP API written in Go. We don’t use any of the advanced configuration features of Elastic Beanstalk here just to show that things can be easy and simple.</p>
<ul>
<li>Full code available: <a href="https://github.com/lambrospetrou/aws-playground/tree/master/elastic-beanstalk-al2-go/single-process">https://github.com/lambrospetrou/aws-playground/tree/master/elastic-beanstalk-al2-go/single-process</a></li>
</ul>
<p>I didn’t put a <code>Buildfile</code> in this application, so when we want to deploy we need to run <code>make</code> either locally or on our CI system, and then upload the bundle file located in <code>build/bundle.zip</code>.</p>
<p>This is how the directory structure looks like after running <code>make</code>:</p>
<pre><code class="language-bash">➜  single-process git:(master) find . -type f
./app/go.mod
./app/main.go
./Makefile
./build/bin/app
./build/bundle.zip
./build/Procfile
./build-tools/Procfile
</code></pre>
<p><strong>Procfile</strong></p>
<pre><code>web: ./bin/app -name web
</code></pre>
<p><strong>Application code (main.go)</strong></p>
<pre><code class="language-go">package main

import (
    "flag"
    "fmt"
    "html"
    "log"
    "net/http"
    "os"
    "strings"
)

func main() {
    name := flag.String("name", "app", "The name of the service running, e.g. web2")
    port := flag.String("port", os.Getenv("PORT"), "The port for the server to listen.")
    flag.Parse()
    if strings.TrimSpace(*port) == "" {
        *port = "5000"
    }

    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        msg := fmt.Sprintf("Service %s Path, %q", *name, html.EscapeString(r.URL.Path))
        log.Println(msg)
        fmt.Fprintf(w, msg)
    })
    log.Printf("App %s starts listening at :%s\n", *name, *port)
    log.Fatal(http.ListenAndServe(":"+*port, nil))
}
</code></pre>
<p>Notice how we read the <code>PORT</code> environment variable but also support command line arguments and then start our HTTP listener on the given port.</p>
<p>After deploying the above on Elastic Beanstalk and enabling <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/AWSHowTo.cloudwatchlogs.html">CloudWatch Logs streaming</a> we can see that automatically we get the following Log groups created and populated:</p>
<p></p>
<p>The <code>/aws/elasticbeanstalk/Al2go-env-1/var/log/web.stdout.log</code> log group contains the log lines from our application. The <code>web</code> prefix of the file corresponds to the <code>web</code> process name inside our <code>Procfile</code>.</p>
<p>The <code>/aws/elasticbeanstalk/Al2go-env-1/var/log/eb-engine.log</code> log group contains all logs from the Beanstalk processes that run during a deployment, which is useful while troubleshooting failed deployments or trying to understand how deployments work.</p>
<p>Elastic Beanstalk provides an endpoint for our application which corresponds to the HTTP port <code>80</code> mentioned in <a href="#procfile">Procfile</a> section above, and looks like <code>http://al2go-env-1.eba-f2qnm2t6.eu-west-1.elasticbeanstalk.com/</code> where <code>al2go-env-1</code> is the name of my environment. Unfortunately this endpoint is not HTTPS…</p>
<h2 id="multiple-procfile-processes"><a href="#multiple-procfile-processes">Multiple Procfile processes</a></h2><p>In this section we have the same application as above, but this time I am starting it twice listening on different ports in order to show how we can override <code>nginx</code> to proxy two applications on the same host.</p>
<ul>
<li>Full code available: <a href="https://github.com/lambrospetrou/aws-playground/tree/master/elastic-beanstalk-al2-go/multi-process">https://github.com/lambrospetrou/aws-playground/tree/master/elastic-beanstalk-al2-go/multi-process</a></li>
</ul>
<p>This is how the directory structure looks like after running <code>make</code>:</p>
<pre><code class="language-bash">➜  multi-process git:(master) find . -type f
./app/go.mod
./app/main.go
./Makefile
./.ebextensions/awslogs.config
./build/bin/app
./build/bundle.zip
./build/.ebextensions/awslogs.config
./build/Procfile
./build/.platform/nginx/conf.d/01_proxy.conf
./build-tools/Procfile
./.platform/nginx/conf.d/01_proxy.conf
</code></pre>
<p><strong>Procfile</strong></p>
<pre><code>web: bin/app -name web
bgapp: bin/app -name bgapp -port 5001
</code></pre>
<p><strong>Application code (main.go)</strong></p>
<p>Exactly the same as before.</p>
<p><strong>What happens?</strong></p>
<p>We have a process named <code>web</code> which is the same as before starting the server at given <code>PORT</code> (<code>5000</code>), and also a second process named <code>bgapp</code> which starts the same server at port <code>5001</code> explicitly. As I mentioned in the <a href="#procfile">Procfile</a> section only the server at port <code>5000</code> will actually receive requests by default, so we need to override the <code>nginx</code> configuration to route traffic accordingly.</p>
<p>We do this by providing <code>.platform/nginx/conf.d/01_proxy.conf</code>:</p>
<pre><code>server {
    server_name .elasticbeanstalk.com;
    listen 80;

    location /web2 {
        proxy_pass http://127.0.0.1:5001;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location / {
        proxy_pass http://127.0.0.1:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
</code></pre>
<p>When Beanstalk starts the <code>nginx</code> service it automatically includes all <code>.conf</code> files in <code>.platform/nginx/conf.d/</code> and therefore our configuration above will be used. What it does is basically route any request under the <code>/web2</code> path to our server named <code>bgapp</code>, and anything else goes to the main server named <code>web</code> (remember our <code>Procfile</code>).</p>
<h3 id="logging"><a href="#logging">Logging</a></h3><p>Unfortunately Elastic Beanstalk has some rough edges, and logging is one of those. When we only have a single <code>web</code> process everything is fine and as we saw above all the logs we need are automatically streamed to CloudWatch Logs (if enabled). The issue is when we have more than one process, e.g. in our current scenario where we also have the <code>bgapp</code> process.</p>
<p>After reading the following resources I realised I need to provide a custom configuration file to install AWS CloudWatch Logs agent and set it up to stream the additional <code>bgapp.stdout.log</code> file to CloudWatch Logs.</p>
<ul>
<li><a href="https://aws.amazon.com/premiumsupport/knowledge-center/elastic-beanstalk-customized-log-files/">https://aws.amazon.com/premiumsupport/knowledge-center/elastic-beanstalk-customized-log-files/</a></li>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/AWSHowTo.cloudwatchlogs.html#AWSHowTo.cloudwatchlogs.streaming.custom">https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/AWSHowTo.cloudwatchlogs.html#AWSHowTo.cloudwatchlogs.streaming.custom</a></li>
</ul>
<p>You can find the <code>.ebextensions/awslogs.config</code> file I used at <a href="https://github.com/lambrospetrou/aws-playground/blob/master/elastic-beanstalk-al2-go/multi-process/.ebextensions/awslogs.config">https://github.com/lambrospetrou/aws-playground/blob/master/elastic-beanstalk-al2-go/multi-process/.ebextensions/awslogs.config</a>.</p>
<p><strong>NOTE1:</strong> The default IAM role created by Elastic Beanstalk for web services, <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/concepts-roles-instance.html"><code>AWSElasticBeanstalkWebTier</code></a>, does not include the <code>logs:CreateLogGroup</code> permission which is needed for the creation of the <code>/var/log/bgapp.stdout.log</code> log group so make sure to update the IAM role used by the instance to include that. The default log groups (e.g. for <code>/var/log/web.stdout.log</code>) are created by the Elastic Beanstalk service automatically hence why the role used by the EC2 instance does not (need to) have this permission. But since we are now adding custom logs to be streamed to CloudWatch Logs we need to attach explicit permissions to allow the creation of the log group as well.</p>
<p><strong>NOTE2:</strong> I <a href="https://github.com/aws/elastic-beanstalk-roadmap/issues/225">opened a roadmap feature request</a> to the Beanstalk team to make this automatically for all processes inside the <code>Procfile</code> since it doesn’t make any sense to me to have to do this on my own since it’s already done for the <code>web</code> process.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>AWS Elastic Beanstalk is simple, and flexible enough to do anything I want. When AWS Lambda cannot be used, this is my goto platform.</p>
<p>In this article I put everything I had in my notes from using Elastic Beanstalk in my own projects over the years.
There are some rough edges but the main issue I encounter is that the documentation is spread all over the place. Having this information structured makes it a bit easier to use and troubleshoot things.</p>
<p>In the worst case, even if nobody else needs this, future me will be very happy I took the time to write this down 😎</p>
<h3 id="references"><a href="#references">References</a></h3><ul>
<li>For all platforms based on Amazon Linux 2<ol>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/platforms-linux-extend.html">Extending Elastic Beanstalk Linux platforms</a></li>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers.html">Configuring Elastic Beanstalk environments</a></li>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/beanstalk-environment-configuration-advanced.html">Configuring Elastic Beanstalk environments (advanced) with .ebextensions</a></li>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/AWSHowTo.cloudwatchlogs.html">Using Elastic Beanstalk with Amazon CloudWatch Logs</a></li>
</ol>
</li>
<li>For Go platform<ol>
<li><a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-environment.html">Using the Elastic Beanstalk Go platform</a></li>
</ol>
</li>
</ul>

</article>
<article>
<h1>Serverless platforms into 2022 — AWS, Cloudflare Workers, Netlify, Vercel, Fly.io</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Thu, 20 Jan 2022 00:00:00 GMT</p>
<p>I love <strong>#serverless</strong> platforms 🚀 I have used most of the public offerings over the years, and I always keep an eye out for new services.
In this article I explain the different types of compute, edge vs regional, and make a comparison among the most popular providers. I know there are more providers out there offering serverless products but I included the ones I personally tried. Anything else can be mapped to one of these anyway for comparison.</p>
<p><em>Disclaimer: All data shown below is as of Jan 20, 2022.</em></p>
<h2 id="compute-edge-vs-regional"><a href="#compute-edge-vs-regional">Compute @ Edge vs Regional</a></h2><p>There are a few confusing terms floating around, not uncommon for our industry unfortunately, so let’s try to clarify things.
One of the most important differences among the different platforms is their answer to the question <strong>where does my code run?</strong></p>
<p>Check <a href="https://aws.amazon.com/cloudfront/features/#Global_Edge_Network">Amazon CloudFront’s presence map</a>:</p>
<p></p>
<p>Looking at the legend, and then the map, we can immediately see that the most important distinction between Edge locations and Regions is how many there are of each. We have 300+ <strong>Edge</strong> locations, or Points of Presence (PoP), and 13 <strong>Regional Caches</strong> (bigger orange circles). The Regional Caches are mostly backed by standard <a href="https://aws.amazon.com/about-aws/global-infrastructure/regions_az/">AWS regions</a>.</p>
<p>For comparison let’s look at <a href="https://www.cloudflare.com/en-gb/network/">Cloudflare’s global map</a> of more than 250 <strong>edge</strong> locations:</p>
<p></p>
<p>So, what’s the main difference between an edge location, and a regional location? Well, there are a lot more edge locations, and they are more spread around the world. This means that users are more likely to be closer to an edge location rather than a regional location.</p>
<ul>
<li><strong>Compute - Edge</strong>: When platforms say they offer edge compute, the reasonable assumption is that they run our code in all of these edge locations in their underlying CDN infrastructure.</li>
<li><strong>Compute - Regional</strong>: When we have the traditional regional compute it means that we decide which region will run our code, and then all users have to reach that single region.</li>
<li><strong>Compute - Multi-regional</strong>: We also have the middle ground where our code runs in the regional locations, but we don’t explicitly specify which one. This implies that the provider will run our code in the region that is the closest to the user making the request.</li>
</ul>
<p>The selling point of running on an edge compute platform is that our customers will get faster response times since our code runs closer to them. However, keep in mind that if the code needs access to a database or calls into external services, it means that those services are now a bottleneck and add to the overall latency. Everything has tradeoffs and the right choice depends on several factors.</p>
<p>Now that the terms are understood, let’s see how the different platforms compare.</p>
<h2 id="the-platforms"><a href="#the-platforms">The platforms</a></h2><p>In this section I will briefly enumerate all the platforms, and what they offer.
AWS has multiple serverless products, each with different features, so I will list them separately.</p>
<h3 id="aws-lambda-regional"><a href="#aws-lambda-regional">AWS Lambda - Regional</a></h3><blockquote>
<p>Run code without thinking about servers or clusters</p>
</blockquote>
<ul>
<li><a href="https://aws.amazon.com/lambda/">https://aws.amazon.com/lambda/</a></li>
</ul>
<p>The service that started it all in 2014! Easily the most popular serverless product, and with the most available integrations both inside the AWS ecosystem, but also with other SaaS products.
AWS Lambda over the years got many features, and at the moment it <a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html">supports many runtimes</a>.</p>
<ul>
<li><strong>Languages (natively)</strong>: Go, Node.js, Python, Ruby, Java, C#, PowerShell</li>
<li><strong>Languages (custom runtime)</strong>: Any language as long as you implement the <a href="https://docs.aws.amazon.com/lambda/latest/dg/runtimes-custom.html">custom runtime API</a> (since 2018)</li>
<li><strong>Containers</strong>: Any <a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-images.html">Docker container</a> implementing the <a href="https://docs.aws.amazon.com/lambda/latest/dg/runtimes-api.html">Runtime API</a>, <a href="https://docs.aws.amazon.com/lambda/latest/dg/runtimes-extensions-api.html">Extensions API</a>, and <a href="https://docs.aws.amazon.com/lambda/latest/dg/runtimes-logs-api.html">Logs API</a> (since 2020)</li>
<li><strong>Memory</strong>: 128MB up to 10GB</li>
<li><strong>Runtime</strong>: 29s (HTTP integration with API Gateway), 15min (anything else)</li>
<li><strong>Uploaded Bundle size</strong>: 50MB (zipped), 250MB (unzipped), 10GB (container image)</li>
</ul>
<p>Looking at the above features, it’s quite clear that AWS Lambda can run pretty much anything now 🚀 But you should be aware of the <a href="https://aws.amazon.com/blogs/compute/operating-lambda-performance-optimization-part-1/">cold starts…</a></p>
<h3 id="aws-lambda-edge-multi-regional"><a href="#aws-lambda-edge-multi-regional">AWS Lambda@Edge - Multi-regional</a></h3><blockquote>
<p>Run your code closer to your users</p>
</blockquote>
<ul>
<li><a href="https://aws.amazon.com/lambda/edge/">https://aws.amazon.com/lambda/edge/</a></li>
</ul>
<p>Lambda@Edge launched soon after Cloudflare Workers (see below) came out so many consider this a rushed response to Workers. Its goal is to run our code in the region closest to the user making the request. Therefore, despite its name containing the word <strong>edge</strong>, it’s not an edge compute product as per our above definition. Yes, this definitely has many people fooled.</p>
<p>Lambda@Edge is a product that can be used only with a CloudFront distribution, and you specify which events to handle while the request flows through the CloudFront cache systems, e.g. before checking the cache, or after the origin returned a response. <a href="https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/lambda-cloudfront-trigger-events.html">Read the docs</a> for details about the different integration triggers.</p>
<ul>
<li><strong>Languages</strong>: Node.js, Python</li>
<li><strong>Memory</strong>: 128MB (Viewer triggers), 10GB (Origin triggers - same as AWS Lambda)</li>
<li><strong>Runtime</strong>: 5s (Viewer triggers), 30s (Origin triggers)</li>
<li><strong>Uploaded Bundle size</strong>: 1MB (Viewer triggers), 50MB (Origin triggers)</li>
</ul>
<p>I have used Lambda@Edge extensively while working at Amazon/AWS and it’s great. The <code>1MB</code> bundle size limitation was the only issue we bumped into until we minified our JavaScript.</p>
<h3 id="amazon-cloudfront-functions-edge"><a href="#amazon-cloudfront-functions-edge">Amazon CloudFront Functions - Edge</a></h3><blockquote>
<p>lightweight functions in JavaScript for high-scale, latency-sensitive CDN customizations</p>
</blockquote>
<ul>
<li><a href="https://aws.amazon.com/cloudfront/features/#Edge_Computing">https://aws.amazon.com/cloudfront/features/#Edge_Computing</a></li>
</ul>
<p>CloudFront Functions launched last year in 2021 as one yet another competitor to Cloudflare Workers, but yet again a limited one in my opinion.
CloudFront Functions is a proper edge compute product, since our code runs in all of CloudFront’s edge locations. I do use it in my own website (the one you are reading now) and it does the job, but its runtime restrictions limit the potential use-cases, especially since you are not allowed to make any external API calls or even access a filesystem.</p>
<ul>
<li><strong>Languages</strong>: Custom <a href="https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/functions-javascript-runtime-features.html">JavaScript runtime</a></li>
<li><strong>Memory</strong>: 2MB</li>
<li><strong>Runtime</strong>: 5s (Viewer triggers)</li>
<li><strong>Uploaded Bundle size</strong>: 10KB</li>
</ul>
<p>It’s clear this is a pretty restricted environment, which to be fair is implicitly acknowledged by AWS judging by <a href="https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/cloudfront-functions.html">the use-cases they list</a> as ideal for the product.</p>
<h3 id="cloudflare-workers-edge"><a href="#cloudflare-workers-edge">Cloudflare Workers - Edge</a></h3><blockquote>
<p>Deploy serverless code instantly across the globe to give it exceptional performance, reliability, and scale.</p>
</blockquote>
<ul>
<li><a href="https://workers.cloudflare.com/">https://workers.cloudflare.com/</a></li>
</ul>
<p>Workers was one of the first true edge compute offerings and pushed the whole industry forward. The main differentiator is that Workers run in <a href="https://developers.cloudflare.com/workers/learning/how-workers-works">V8 isolates</a> eliminating cold starts completely, and allowing near-instantaneous execution very close to users. Note that the V8 environment is not a full <a href="https://nodejs.org/en/">Node.js</a> runtime which means it’s not possible to run everything.</p>
<ul>
<li><strong>Languages</strong>: JavaScript/TypeScript, WebAssembly</li>
<li><strong>Memory</strong>: 128MB</li>
<li><strong>Runtime</strong>: 50ms (Bundled Plan), 30s (Unbundled Plan - HTTP), 15min (Unbundled Plan - Cron trigger)</li>
<li><strong>Uploaded Bundle size</strong>: 1MB</li>
</ul>
<p>One important feature of Cloudflare Workers is that the runtime is measured in <a href="https://developers.cloudflare.com/workers/platform/limits#cpu-runtime">CPU consumption</a> for the Bundled Usage plan, meaning that external API requests (<a href="https://developers.cloudflare.com/workers/runtime-apis/fetch"><code>fetch</code> requests</a>) do not count towards the limit. However, for the Unbundled Usage plan the runtime is <a href="https://developers.cloudflare.com/workers/platform/limits#duration">measured in wall-clock time</a>, so we are charged for the whole duration of the worker running, including external calls.</p>
<p>The biggest advantage of Workers over its competitors is that it’s extremely nicely integrated with the rest of Cloudflare. It natively supports <a href="https://developers.cloudflare.com/workers/runtime-apis/cache">accessing and updating the CDN cache</a>, it provides <a href="https://developers.cloudflare.com/workers/runtime-apis/durable-objects">Durable Objects</a> which are very powerful, and it has a <a href="https://developers.cloudflare.com/workers/runtime-apis/kv">Key-Value store</a> built-in. </p>
<p>Another recent addition is the native integration of <a href="https://developers.cloudflare.com/pages/platform/functions">Workers with Cloudflare Pages</a> which enable full-stack application development completely on Cloudflare. With <a href="https://blog.cloudflare.com/introducing-r2-object-storage/">Cloudflare R2 Storage</a> around the corner this is going to be a very big threat to AWS Lambda dominance.</p>
<h2 id="netlify-vercel-edge-regional"><a href="#netlify-vercel-edge-regional">Netlify / Vercel - Edge + Regional</a></h2><blockquote>
<p>Develop. Preview. Ship.</p>
</blockquote>
<ul>
<li><a href="https://netlify.com/">https://netlify.com/</a></li>
<li><a href="https://vercel.com/">https://vercel.com/</a></li>
</ul>
<p>Netlify and Vercel are the most popular serverless products among the frontend community. Their focus is entirely on improving the developer experience for frontend developers, and they are doing an incredibly amazing job.
Vercel standardised the phrase <strong>Develop. Preview. Ship.</strong> and Netlify straight up competes with them at that.</p>
<p>I have been using both of them on-and-off for several years and honestly feature-wise they are identical. From my perspective the differences between them are in their satellite (or secondary) features. Vercel focuses on extending their platform’s features (e.g. <a href="https://vercel.com/live">Next.js Live</a>), and adding native optimisations for <a href="https://nextjs.org/">Next.js</a> (their amazing React-based framework) e.g. <a href="https://vercel.com/docs/concepts/next.js/image-optimization">Image Optimization</a>. Netlify on the other hand provides several features that are complementary to a frontend application, like <a href="https://docs.netlify.com/visitor-access/identity/">Identity</a>, <a href="https://www.netlify.com/products/forms/">Forms</a>, <a href="https://docs.netlify.com/site-deploys/split-testing">Split Testing</a> and more.</p>
<p>The interesting fact about these platforms is that they are built on-top of the products I went through above.</p>
<p>Their <strong>serverless functions</strong> products (<a href="https://docs.netlify.com/functions/overview/">Netlify</a>, <a href="https://vercel.com/docs/concepts/functions/serverless-functions">Vercel</a>) are directly build on-top of AWS Lambda, and their <strong>edge functions</strong> are built on-top of Cloudflare Workers (<a href="https://docs.netlify.com/edge-handlers/overview/">Netlify</a>, <a href="https://vercel.com/docs/concepts/functions/edge-functions">Vercel</a>). Weirdly though, Netlify lists their edge handlers memory limit as 256MB, whereas Cloudflare Workers tops at 128MB, so I am not sure what’s going on there, if there is a special agreement between them, or if they actually run these on their own network!</p>
<p>So, are Netlify and Vercel the best products to use? Should everyone migrate their AWS (or other) serverless apps to them since they combine both? Well… Nope 😅</p>
<p>These products provide top-notch developer experience but it comes at a cost. Their serverless offerings are more limited than their native counterparts:</p>
<ul>
<li>Serverless functions:<ul>
<li>Netlify: only <code>us-east-1</code> region, 1GB of memory, 10s synchronous execution</li>
<li>Vercel: only <code>IAD1</code> region, 1GB of memory, 5s synchronous execution (15s on Pro, 30s on Enterprise)</li>
</ul>
</li>
<li>Edge functions:<ul>
<li>Netlify: 256MB of memory, 50ms runtime</li>
<li>Vercel: 1.5s runtime (after returning a response it can still run up to 30s)</li>
</ul>
</li>
</ul>
<p>Some of the above limits can be lifted for their paid plans, but mostly the Enterprise ones.</p>
<p><a href="https://pages.cloudflare.com/">Cloudflare Pages</a> gets an honorary mention in this section since it now competes in this space by offering a seamless integration with source code repositories, and serving <a href="https://jamstack.org/">Jamstack websites</a>, while at the same time having built-in <a href="https://developers.cloudflare.com/pages/platform/functions">integration with Cloudflare Workers</a>.</p>
<p>The above limitations are why I personally still use AWS Lambda directly when I want an API, and combine it with Cloudflare Pages for the frontend. However, if your application does not need more resources than what these platforms offer, which to be fair most websites don’t, then you are fine just using them and focusing on your application. This is exactly what I did when I built <a href="https://temp.minibri.com/">Minibri Temp</a> with Netlify.</p>
<h2 id="fly-io-multi-regional"><a href="#fly-io-multi-regional">Fly.io - Multi-regional</a></h2><blockquote>
<p>Fly keeps your performance up by sending users on the shortest, fastest path to where your application is running.</p>
</blockquote>
<ul>
<li><a href="https://fly.io/docs/getting-started/">https://fly.io/docs/getting-started/</a></li>
</ul>
<p>Fly.io is a relatively new player (launched in 2020) but it definitely attracted many people already. It currently has <a href="https://fly.io/docs/reference/regions/">20 regions</a> and the selling pitch is that you specify a Docker image for your application and depending on where the user is they will spin up an instance of it to the closest region and serve the request. Very similar to how Lambda@Edge works, but incredibly more flexible!</p>
<ul>
<li><strong>Containers</strong>: <a href="https://fly.io/docs/reference/builders/">Docker images</a>, <a href="https://fly.io/docs/reference/builders/#buildpacks">Cloud Buildpacks</a></li>
<li><strong>Memory</strong>: 256MB-2GB (shared-cpu), 2-64GB (dedicated-cpu)</li>
<li><strong>Runtime</strong>: No limit, pay per second</li>
<li><strong>Uploaded Bundle size</strong>: n/a</li>
</ul>
<p>This is more like a traditional container product, but I included it because they do offer the multi-regional flavor, and they even have <a href="https://fly.io/docs/getting-started/multi-region-databases/">multi-region Postgres databases</a> with a nice architecture. If their scaling is fast enough not to have long cold-starts then this should be a very nice middle-ground between fully edge or fully single-regional.</p>
<p>I haven’t used Fly.io a lot but I do like its approach and based on the feedback from others I definitely plan to give it a go with one of my side projects.</p>
<p>Fun fact: While writing this post, <a href="https://twitter.com/flydotio/status/1484278935726788608">Fly.io announced 3GB Postgres or persisted volumes</a> in the free tier 🥳</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p></p>
<p>The diagram above summarises the platforms we discussed. Each one has its benefits and drawbacks, but the choice we have nowadays is amazing! I am pretty sure things will continue to evolve, improve, and we will see a lot more innovation in this space.</p>
<p>I have my favourites, you have yours, so let’s just keep on building 😉</p>

</article>
<article>
<h1>Kotlin http4k (via GraalVM Native Image) and Golang</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 26 Sep 2021 00:00:00 GMT</p>
<p>First things first, this article is not a scientific performance comparison between <a href="https://kotlinlang.org/">Kotlin</a> and <a href="https://golang.org/">Go</a> (aka Golang). I want to show that someone can use lightweight frameworks with Kotlin to achieve small enough binaries that execute fast, and quite comparable to Go’s executables. One of my favourite deployment platforms is AWS Lambda, and it really makes a difference having small artifacts to upload, with very fast startup times, otherwise <a href="https://aws.amazon.com/blogs/compute/operating-lambda-performance-optimization-part-1/">cold start invocations</a> will essentially push latencies to the roof.</p>
<p>As part of a new project, I am trying to decide what stack to use as the backend. Let’s see how Go and Kotlin compare based on my very subjective and opinionated experience.</p>
<p><strong>Go</strong> (favourite backend choice)</p>
<ul>
<li>PRO: Great developer experience (i.e. tooling)</li>
<li>PRO: Great standard library</li>
<li>PRO: Small binary executables</li>
<li>PRO: Amazing performance (especially the low memory usage!)</li>
<li>CON: Some things are not available in quality libraries, hence might need boilerplate, or complete development (e.g. Machine Learning)</li>
</ul>
<p><strong>Kotlin</strong> (favourite language)</p>
<ul>
<li>PRO: Amazingly expressive language</li>
<li>PRO: Great standard library & third-party libraries for anything</li>
<li>PRO: Good developer experience (at least when Gradle et al. work as expected)</li>
<li>PRO: Good overall performance</li>
<li>CON: Very high memory usage due to <a href="https://en.wikipedia.org/wiki/Java_virtual_machine">Java Virtual Machine (JVM)</a></li>
</ul>
<p>The rest of the article will be a walkthrough for implementing a small API in Go and Kotlin, with a comparison of their performance against 200 simultaneous connections of 10s continuous traffic.</p>
<h2 id="api"><a href="#api">API</a></h2><p>A very simple API with two endpoints, and <a href="https://www.sqlite.org">SQLite</a> as the database.</p>
<ul>
<li><code>/?name=<requested-name></code><ul>
<li>Checks the database to see if <code><requested-name></code> exists in the <code>users</code> table and responds accordingly.</li>
</ul>
</li>
<li><code>/add?name=<new-name></code><ul>
<li>Tries to add a new entry in the <code>users</code> table with <code>name=<new-name></code> (without checking if it exists) and responds accordingly.</li>
</ul>
</li>
</ul>
<p>The table schema:</p>
<pre><code class="language-sql">sqlite> .schema
CREATE TABLE users (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT,
    UNIQUE(name)
);
</code></pre>
<h2 id="go"><a href="#go">Go</a></h2><p>I will not paste here the full code for Go since it’s quite a few lines, and it’s not the main purpose of this article.</p>
<p>Check out the full source code at <a href="https://github.com/lambrospetrou/code-playground/blob/master/go-vs-kotlin-datastores/golang/main.go">https://github.com/lambrospetrou/code-playground/blob/master/go-vs-kotlin-datastores/golang/main.go</a>.</p>
<p>The only third-party dependency I use is <a href="https://pkg.go.dev/crawshaw.io/sqlite"><code>crawshaw.io/sqlite</code></a> library by David Crawshaw as the SQLite driver, who also <a href="https://crawshaw.io/blog/go-and-sqlite">wrote an amazing article</a> about it.</p>
<h3 id="stats"><a href="#stats">Stats</a></h3><p>Some interestings facts of the Go implementation.</p>
<ul>
<li>Binary size: <code>9.7MB</code></li>
</ul>
<pre><code>$ ls -hl v1
-rwxr-xr-x  1 lambros  lambros   9.7M 26 Sep 20:10 v1
</code></pre>
<ul>
<li>RSS memory at startup: <code>6.6MB</code></li>
</ul>
<pre><code>$ ps -eo pid,rss,%mem,%cpu,command | grep -e "./v1"
90176   6660  0.0   0.0 ./v1
</code></pre>
<ul>
<li>Replay <code>GET /add?name=i_am_ironman</code> for 10s with 2 threads of 100 connections each.<ul>
<li>Average requests per second: <code>15.11k</code> (total: <code>303834</code>)</li>
<li>RSS memory after replay: <code>17.9MB</code> (<strong>!</strong>)</li>
</ul>
</li>
</ul>
<pre><code>$ wrk -t2 -c100 -d10s --latency --timeout 1s http://localhost:8080/add\?name\=i_am_ironman
Running 10s test @ http://localhost:8080/add?name=i_am_ironman
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.89ms   15.52ms 675.26ms   97.49%
    Req/Sec    15.11k     2.10k   18.45k    58.91%
  Latency Distribution
     50%    3.00ms
     75%    3.62ms
     90%    4.51ms
     99%   51.22ms
  303834 requests in 10.10s, 52.45MB read
  Non-2xx or 3xx responses: 303833
Requests/sec:  30075.57
Transfer/sec:      5.19MB
</code></pre>
<ul>
<li>Replay <code>GET /?name=i_am_ironman</code> for 10s with 2 threads of 100 connections each.<ul>
<li>Average requests per second: <code>18.58k</code> (total: <code>373386</code>)</li>
<li>RSS memory after replay: <code>18.2MB</code> (<strong>!</strong>)</li>
</ul>
</li>
</ul>
<pre><code>$ wrk -t2 -c100 -d10s --latency --timeout 1s http://localhost:8080/\?name\=i_am_ironman
Running 10s test @ http://localhost:8080/?name=i_am_ironman
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.65ms  702.95us   9.48ms   73.62%
    Req/Sec    18.58k     2.45k   26.61k    74.26%
  Latency Distribution
     50%    2.60ms
     75%    2.99ms
     90%    3.50ms
     99%    4.75ms
  373386 requests in 10.10s, 53.06MB read
Requests/sec:  36960.38
Transfer/sec:      5.25MB
</code></pre>
<p>Small binary, fast startup, and extremely low memory usage 🚀</p>
<h2 id="kotlin"><a href="#kotlin">Kotlin</a></h2><p>I am a huge fan of the <a href="https://www.http4k.org/">http4k</a> framework for building APIs with Kotlin. It is very small, modular, extremely versatile, and due to its philosophy “Server as a Function” it’s just an absolute joy to work with. It works both locally, and in AWS Lambda, and runs inside the JVM but also supports compiling down to native binary through <a href="https://www.graalvm.org/reference-manual/native-image/">GraalVM Native Image</a>.</p>
<p>As a driver for SQLite I used <a href="https://github.com/sqldelight/sqldelight">SQLDelight</a>. This is the first time I used it, and I am very pleasantly surprised by how nice it is. Definitely using it for all my SQL needs in the JVM. I tried to use <a href="https://github.com/JetBrains/Exposed">JetBrains Exposed</a> before SQLDelight but faced issues during the native image compilation, and its custom modelling is a no-go for me.</p>
<p>Check out the full source code at <a href="https://github.com/lambrospetrou/code-playground/tree/master/go-vs-kotlin-datastores/kotlin-sqlite">https://github.com/lambrospetrou/code-playground/tree/master/go-vs-kotlin-datastores/kotlin-sqlite</a>.</p>
<p>As I mentioned above, I really love Kotlin “the language”, and in combination with <code>http4k</code>, it enables very lean servers. This is the entire server code.</p>
<pre><code class="language-kotlin">package com.example

import com.squareup.sqldelight.db.SqlDriver
import com.squareup.sqldelight.sqlite.driver.JdbcSqliteDriver
import org.http4k.core.HttpHandler
import org.http4k.core.Method.GET
import org.http4k.core.Response
import org.http4k.core.Status
import org.http4k.core.Status.Companion.BAD_REQUEST
import org.http4k.core.Status.Companion.NOT_FOUND
import org.http4k.core.Status.Companion.OK
import org.http4k.routing.bind
import org.http4k.routing.routes
import org.http4k.server.ApacheServer
import org.http4k.server.asServer
import org.sqlite.SQLiteConfig

fun makeApp(db: Database): HttpHandler = routes(
    "/ping" bind GET to {
        Response(OK).body("pong")
    },

    "/add" bind GET to {
        val name = (it.query("name") ?: "").trim()
        when {
            name.isEmpty() ->
                Response(BAD_REQUEST).body("Invalid 'name' given: $name")
            else -> {
                val userQueries = db.usersQueries
                try {
                    userQueries.transaction {
                        userQueries.addUser(name = name)
                    }
                    Response(OK).body("Welcome to our community $name!")
                } catch (e: Exception) {
                    Response(Status.INTERNAL_SERVER_ERROR).body("Sadly we failed to register you: $name")
                }
            }
        }
    },

    "/" bind GET to {
        val name = (it.query("name") ?: "").trim()
        when {
            name.isEmpty() ->
                Response(BAD_REQUEST).body("Invalid 'name' given: $name")
            else -> {
                val userQueries = db.usersQueries
                userQueries.selectByName(name = name).executeAsOneOrNull()?.let {
                    Response(OK).body("Boom! We found you: $name")
                } ?: Response(NOT_FOUND).body("Sadly we could not find you: $name")
            }
        }
    }
)

fun main() {
    val port = (System.getenv("PORT") ?: "9000").toInt()

    val config = SQLiteConfig().apply {
        setSharedCache(true)
        setJournalMode(SQLiteConfig.JournalMode.WAL)
    }
    val driver: SqlDriver = JdbcSqliteDriver("jdbc:sqlite:/tmp/users.sqlite3", properties = config.toProperties())
    Database.Schema.create(driver)

    val app: HttpHandler = makeApp(db = Database(driver))
    val server = app.asServer(ApacheServer(port)).start()

    println(Runtime.getRuntime().availableProcessors())
    println("Server started on " + server.port())
}
</code></pre>
<h2 id="stats-jvm"><a href="#stats-jvm">Stats - JVM</a></h2><p>In this section I am going to show the stats for running the above Kotlin code inside the JVM, the standard way.
The following stats are without any custom JVM argument, just <code>java -jar build/libs/HelloWorld.jar</code>.</p>
<p>For context, I use the JVM provided by GraalVM:</p>
<pre><code>$ java -version
openjdk version "16.0.2" 2021-07-20
OpenJDK Runtime Environment GraalVM CE 21.2.0 (build 16.0.2+7-jvmci-21.2-b08)
OpenJDK 64-Bit Server VM GraalVM CE 21.2.0 (build 16.0.2+7-jvmci-21.2-b08, mixed mode, sharing)
</code></pre>
<ul>
<li>Fat-JAR size: <code>15MB</code></li>
</ul>
<pre><code>$ ls -hl build/libs/HelloWorld.jar
-rw-r--r--  1 lambros  lambros    15M 26 Sep 21:42 build/libs/HelloWorld.jar
</code></pre>
<ul>
<li>RSS memory at startup: <code>143.9MB</code></li>
</ul>
<pre><code>$ ps -eo pid,rss,%mem,%cpu,command | grep -e "HelloWorld.jar"
 3460 143924  0.4   0.0 java -jar build/libs/HelloWorld.jar
</code></pre>
<ul>
<li>Replay <code>GET /add?name=i_am_ironman</code> for 10s with 2 threads of 100 connections each.<ul>
<li>Average requests per second: <code>7.17k</code> (total: <code>142653</code>)</li>
<li>RSS memory after replay: <code>1400MB</code> (<code>1.4GB</code>)</li>
</ul>
</li>
</ul>
<pre><code>$ wrk -t2 -c100 -d10s --latency --timeout 1s http://localhost:9000/add\?name\=i_am_ironman
Running 10s test @ http://localhost:9000/add?name=i_am_ironman
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    68.57ms  154.96ms 950.92ms   88.32%
    Req/Sec     7.17k     2.18k   12.00k    65.50%
  Latency Distribution
     50%  138.00us
     75%   39.81ms
     90%  269.81ms
     99%  722.75ms
  142653 requests in 10.01s, 27.75MB read
  Socket errors: connect 0, read 0, write 0, timeout 278
  Non-2xx or 3xx responses: 142653
Requests/sec:  14253.41
Transfer/sec:      2.77MB
</code></pre>
<ul>
<li>Replay <code>GET /?name=i_am_ironman</code> for 10s with 2 threads of 100 connections each.<ul>
<li>Average requests per second: <code>17.37k</code> (total: <code>345639</code>)</li>
<li>RSS memory after replay: <code>2500MB</code> (<code>2.5GB</code>)</li>
</ul>
</li>
</ul>
<pre><code>$ wrk -t2 -c100 -d10s --latency --timeout 1s http://localhost:9000/\?name\=i_am_ironman
Running 10s test @ http://localhost:9000/?name=i_am_ironman
  2 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.41ms    4.36ms 139.93ms   93.48%
    Req/Sec    17.37k     3.04k   20.22k    92.00%
  Latency Distribution
     50%    2.68ms
     75%    4.68ms
     90%    6.92ms
     99%   15.54ms
  345639 requests in 10.01s, 1.01GB read
  Non-2xx or 3xx responses: 340797
Requests/sec:  34526.79
Transfer/sec:    103.35MB
</code></pre>
<p>So much memory used…</p>
<p>I also tried restricting the heap memory to <code>256MB</code> (with <code>java -Xmx256M -jar build/libs/HelloWorld.jar</code>) but the results were roughly the same, even though the memory heap part was capped.
The <code>ApacheServer</code> used here seems to be using lots of memory due to the many connections.</p>
<h2 id="stats-graalvm-native-image"><a href="#stats-graalvm-native-image">Stats - GraalVM Native Image</a></h2><p>This section shows the boost we can get by compiling our code down to a native binary using GraalVM Native Image. This is a great piece of technology and hopefully it will keep evolving and improving.</p>
<p>This is the command I use to compile the fat-JAR down to a binary:</p>
<pre><code>$ native-image --no-fallback -H:+ReportExceptionStackTraces --enable-url-protocols=https -jar build/libs/HelloWorld.jar build/libs/HelloWorld-native
</code></pre>
<ul>
<li>Binary size: <code>46MB</code></li>
<li>RSS memory at startup: <code>19.7MB</code> (<strong>!</strong>)</li>
<li>Replay <code>GET /add?name=i_am_ironman</code> for 10s with 2 threads of 100 connections each.<ul>
<li>Average requests per second: <code>6.50k</code> (total: <code>129385</code>)</li>
<li>RSS memory after replay: <code>721.1MB</code></li>
</ul>
</li>
<li>Replay <code>GET /?name=i_am_ironman</code> for 10s with 2 threads of 100 connections each.<ul>
<li>Average requests per second: <code>17.17k</code> (total: <code>341736</code>)</li>
<li>RSS memory after replay: <code>1700MB</code> (<code>1.7GB</code>)</li>
</ul>
</li>
</ul>
<p>As we can see, the memory usage is reduced significantly compared to the JVM runs, especially at the start, without introducing significant performance regression. Also, even though the binary size is larger than the fat-JAR, this is a standalone binary we can copy to any system and just run. In the fat-JAR case we still need to have the Java JVM installed in the system running the code. Although, to be fair, with the fat-JAR being less than <code>20MB</code> cold starts in AWS Lambda should be non-existent. </p>
<h3 id="binary-compression-with-upx"><a href="#binary-compression-with-upx">Binary compression with UPX</a></h3><blockquote>
<p><a href="https://upx.github.io/">UPX</a> is a free, portable, extendable, high-performance executable packer for several executable formats. </p>
</blockquote>
<p>Running <code>upx -7 -k build/libs/HelloWorld-native</code> will essentially take the GraalVM binary and compress it down significantly. The first time the executable runs, it decompresses the content and then proceeds to execution.</p>
<p>The size of the binary went from <code>46MB</code> down to <code>18.9MB</code>, without affecting the startup time in any meanigful way.</p>
<h2 id="stats-summary"><a href="#stats-summary">Stats - Summary</a></h2><table>
<thead>
<tr>
<th></th>
<th>Artifact Size</th>
<th>Initial RSS memory</th>
<th>Final RSS memory</th>
<th>Replay 1 Total Requests</th>
<th>Replay 2 Total Requests</th>
</tr>
</thead>
<tbody><tr>
<td>Go</td>
<td>9.7MB</td>
<td>6.6MB</td>
<td>18.2MB</td>
<td>303834</td>
<td>373386</td>
</tr>
<tr>
<td>JVM Fat-JAR</td>
<td>15MB</td>
<td>143.9MB</td>
<td>2500MB</td>
<td>142653</td>
<td>345639</td>
</tr>
<tr>
<td>GraalVM Native Image</td>
<td>46MB</td>
<td>19.7MB</td>
<td>1700MB</td>
<td>129385</td>
<td>341736</td>
</tr>
<tr>
<td>UPX</td>
<td>18.9MB</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
</tbody></table>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>Please don’t get too scientific on me about the results. I know (and acknowledge) there are many nuances that affect the differences in performance, especially with SQLite parameters used, connection pooling in Go but not in Kotlin, etc.</p>
<p>The throughput & latency of all server versions examined above is satisfactory to me, so that’s not a concern.</p>
<p>The huge memory usage difference though is what still concerns me when using the JVM, even when the code gets compiled down to an executable binary.
I expected that with GraalVM Native Image the memory use would be reduced a lot more, but I guess there is still lots of space for improvement.</p>
<p>I really like Kotlin the language though, so I am looking forward to the moment when memory won’t be an issue anymore. For the time being, I think I will stick with Go for long-running servers that will have thousands of parallel open connections. On the other hand, for AWS Lambda deployments Kotlin has become a very viable solution! Each Lambda invocation serves exactly one connection at a time, which means we only care about the binary size, and overall performance, and both of these are in the OK-territory (albeit the extra memory usage).</p>

</article>
<article>
<h1>Makefiles for execution coordination</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sat, 25 Sep 2021 00:00:00 GMT</p>
<p>As the title implies, this article is about the now ancient tool <code>make</code> [<a href="https://en.wikipedia.org/wiki/Make_(software)">1</a>] [<a href="https://www.gnu.org/software/make/manual/make.html">2</a>]. This little gem of software is used in our industry for at least 45 years now.</p>
<p>To be fair, I haven’t personally used it in big projects at work, but I regularly use it for my personal side projects since it gets the job done, and it provides a language agnostic common interface which means I can use it for Javascript projects, Kotlin projects, Golang projects, and others.</p>
<p>Enough with the intro though, in this article I will describe a very simple problem I wanted to solve recently and the good old <code>make</code> turned out to be the best solution, versus all the newer, shinier, and way more bloated modern alternatives.</p>
<h2 id="problem"><a href="#problem">Problem</a></h2><p>The problem at hand is simple, and common to many people. I want a simple workflow tool to run some local programs in a specific order, each one potentially generating some output files to be used as input by subsequent programs in the workflow, and ideally parallel execution for the steps that do not depend on each other. A simple file processing data pipeline, to fully run locally on my laptop.</p>
<p>Simple eh… 🤪 😅</p>
<p>I did spend a few days looking for available tools but I was surprised that most of the modern workflow tooling is extremely bloated, complicated, and most of them made for “web scale distributed systems”, even though most people don’t really need that.</p>
<p>Candidate solutions:</p>
<ul>
<li><a href="https://snakemake.github.io">https://snakemake.github.io</a><ul>
<li>This is the best available tool I found, and if you notice its name it hints to the <code>make</code> tool. It has simple declarative way of expressing the workflow steps, their inputs and outputs, and the tool figures out the order in which to execute them. I really liked its documentation but I ended up not using it since it is built ontop of Python, and I don’t like Python…</li>
</ul>
</li>
<li>Do-it-myself<ul>
<li>The first thought I actually had was to just write a quick script in <a href="https://golang.org/">Go</a> and <a href="https://pkg.go.dev/os/exec@go1.17.1#Cmd.Output">shell out</a> to the programs I wanted to execute, with as much parallelism and dynamism I needed. For a one-time script this is my goto, but I wanted to find a way to define the workflow declaratively, and learn something that could be used in the future for other more complicated flows as well.</li>
</ul>
</li>
</ul>
<h2 id="make"><a href="#make">make</a></h2><p>I chatted with a friend about this and he instantly said “just use make”. I knew <code>make</code> was the main build tool for C/C++ projects, with fancy dynamic rules, running only the steps that have to run in the right order, and with good performance. I never used it though solely as an orchestration tool, with nothing to compile…</p>
<p>If you are not familiar with <code>make</code>, these are great resources on writing Makefiles:</p>
<ul>
<li><a href="https://makefiletutorial.com/">https://makefiletutorial.com/</a></li>
<li><a href="https://swcarpentry.github.io/make-novice/reference.html">https://swcarpentry.github.io/make-novice/reference.html</a></li>
<li><a href="https://www.gnu.org/software/make/manual/make.html">https://www.gnu.org/software/make/manual/make.html</a> (the actual manual)</li>
</ul>
<h2 id="flow-1-parallel-file-processing"><a href="#flow-1-parallel-file-processing">Flow 1 - Parallel File Processing</a></h2><p>The first example workflow does the following:</p>
<ol>
<li>Process all JSON files inside a directory <code>data/</code>. For each JSON file <code>F</code> we want to call some command <code>C</code> that will process <code>F</code>. This ideally should be done in parallel since there could be lots of files.</li>
<li>Once all the files are processed, run another command which can continue the workflow execution.</li>
</ol>
<p>Even though the workflow is simple, it still showcases how to use all the things we need:</p>
<ul>
<li>Parallel execution of independent steps</li>
<li>Ordered execution of dependent steps</li>
</ul>
<p></p>
<p>The following Makefile (filename <code>flow1.mk</code>) implements the above workflow.</p>
<pre><code class="language-makefile">MAKEFILE_NAME = flow1.mk
DATAFILES = $(wildcard data/*.json)

default: boom

# Trigger a recursive `make` for the target `datafiles_all`.
# The critical argument is `--always-make` which will force the run all the time, 
# otherwise `make` will not do anything since the data files are not modified!
boom:
    $(MAKE) datafiles_all --always-make -f $(MAKEFILE_NAME)

# A trampoline target that depends on all the data files to force their processing.
datafiles_all: $(DATAFILES)
    @echo :: 'datafiles_all' finished!

# The target that corresponds to each JSON file in the `data/` directory.
data/%.json:
    @echo "processing single file:" $@
    @cat $@
</code></pre>
<p>If we run the above makefile using <code>make -f flow1.mk</code> we get the following output:</p>
<pre><code>$ make -f flow1.mk
processing single file:  data/a.json
{"a": 1}
processing single file:  data/b.json
{"b": 2}
processing single file:  data/c.json
"c"
:: datafiles_all finished!
</code></pre>
<p>Pretty clear output showing that all three files in the <code>data/</code> directory were processed.</p>
<h3 id="parallelism"><a href="#parallelism">Parallelism</a></h3><p>If we run the above makefile with the additional <code>-j 3</code> arguments, then all three files will be processed in parallel.
This is not clear with the above example, so let’s make it a bit more complicated to showcase this as well.</p>
<pre><code class="language-makefile"># Run with `make -f flow1-parallel.mk -j 3` for parallelism 3
# or with `make -f flow1-parallel.mk -j $(nproc)` to use all processors.
# The default is to process each target one after the other.

MAKEFILE_NAME = flow1-parallel.mk
DATAFILES = $(wildcard data/*.json)

default: boom

# Trigger a recursive `make` for the target `datafiles_all`.
# The critical argument is `--always-make` which will force the run all the time, 
# otherwise `make` will not do anything since the data files are not modified!
boom:
    $(MAKE) datafiles_all --always-make -f $(MAKEFILE_NAME)

# A trampoline target that depends on all the data files to force their processing.
datafiles_all: $(DATAFILES)
    @echo :: 'datafiles_all' finished!

# Special override target for this specific file to simulate a long/slow execution.
data/a.json:
    @echo "processing slow file:" $@
    @sleep 2
    @echo still processing $@ ...
    @sleep 2
    @echo finished processing $@ ...

# The target that corresponds to each JSON file in the `data/` directory.
data/%.json:
    @echo "processing single file:" $@
    @cat $@
</code></pre>
<p>The only difference in this makefile is the newly added explicit target rule for <code>data/a.json</code>.
By explicitly adding that rule, <code>make</code> will execute those commands instead of the generic statements defined by the <code>data/%.json</code> rule.</p>
<p>Let’s see how this runs with the default <code>make</code> invocation:</p>
<pre><code>$ make -f flow1-parallel.mk
processing slow file: data/a.json
still processing data/a.json ...
finished processing data/a.json ...
processing single file: data/b.json
{"b": 2}
processing single file: data/c.json
"c"
:: datafiles_all finished!
</code></pre>
<p>As we can see, we need to completely finish the slow processing of <code>data/a.json</code> before proceeding with the rest files.
Now let’s run it with parallelism <code>2</code>:</p>
<pre><code>$ make -f flow1-parallel.mk -j 2
processing slow file: data/a.json
processing single file: data/b.json
{"b": 2}
processing single file: data/c.json
"c"
still processing data/a.json ...
finished processing data/a.json ...
:: datafiles_all finished!
</code></pre>
<p>In this case, we can see that processing for <code>data/a.json</code> started as before, but before we even get to see the <code>still processing data/a.json ...</code> printout, the other two files are already processed, and then finally <code>data/a.json</code> completes.</p>
<p><code>make</code> is smart enough with parallelism and no matter how many files, or target rules, we have in the Makefile it will respect the parallelism we specify with the <code>-j N</code> argument and execute the target steps accordingly without exceeding the specified parallelism.</p>
<h2 id="flow-2-dag"><a href="#flow-2-dag">Flow 2 - DAG</a></h2><p>Another simple flow I want to show is how to make a pipeline of processes where some depend on each other, and some use local files/pipes as their communication mechanism.</p>
<p></p>
<p>What the diagram above means in plain English?</p>
<ol>
<li>Initially we start with the <code>t1</code> and <code>t2</code> targets executing in parallel.</li>
<li><code>t1</code> writes an output file <code>f1.txt</code> and then proceeds to execute <code>t3</code> which will read the <code>f1.txt</code> file as input (<code>t3</code> depends on <code>t6</code> though, so cannot start until <code>t6</code> finishes as well).</li>
<li>In parallel to <code>t1/t3</code>‘s execution flow, once <code>t2</code> completes it will trigger <code>t6</code>, but <code>t6</code> depends on <code>t4</code> and <code>t5</code> targets to have been completed first.</li>
<li><code>t4</code> and <code>t5</code> execute in parallel, and the output of <code>t4</code> is being read by <code>t5</code> through the file pipe <code>/tmp/comm.fifo</code>. Note that this is blocking communication, meaning that in order for <code>t4</code> to manage to finish, <code>t5</code> must also run in parallel to consume the content produced by <code>t4</code>.</li>
<li>Once <code>t6</code> finishes, <code>t3</code> can execute.</li>
<li>Once both <code>t6</code> and <code>t3</code> are complete, <code>t7</code> will execute and complete our workflow.</li>
</ol>
<p>The above workflow is modelled by the following makefile (<code>flow2-dag.mk</code>).
Note how each target defines the dependencies it needs in order to properly coordinate the execution.</p>
<pre><code class="language-makefile">.PHONY: t1 t2 t3 t4 t5 t6 t7
default: t7

t1:
    @echo "t1"
    @echo "t1-content-output" > f1.txt
    @echo "t1 output file written!"

t2:
    @echo "t2"

t3: t1 t6
    @echo "t3"
    @cat f1.txt
    @echo "t1 output file printed!"

t4_5_setup:
    @rm -f /tmp/comm.fifo
    @mkfifo /tmp/comm.fifo

t4: t4_5_setup
    @echo "t4"
    @cat /usr/share/dict/words > /tmp/comm.fifo

t5: t4_5_setup
    @echo "t5"
    @echo "Total lines: " && wc -l < /tmp/comm.fifo

t6: t2 t4 t5
    @echo "t6"

t7: t3 t6
    @echo "t7"
</code></pre>
<p>Let’s run it and see if it works.</p>
<pre><code>$ make -f flow2-dag.mk -j 4
t1
t4
t2
t5
Total lines:
t1 output file written!
  235886
t6
t3
t1-content-output
t1 output file printed!
t7
</code></pre>
<p>Boom 🥳 It works as expected 🚀 (exercise to the reader to confirm…)</p>
<p><strong>Note:</strong> This workflow requires parallelism of at least <code>2</code>, otherwise targets <code>t4</code> and <code>t5</code> will deadlock. Target <code>t4</code> will be the only one running, and once it fills up the file pipe it will block until someone consumes it. But since <code>t5</code> is not running the workflow will be stuck. In scenarios like this, either always make sure to run this with some parallelism, or use normal files as communication, as we did between <code>t1</code> and <code>t3</code>.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>There is no doubt that <code>make</code> is amazingly powerful and flexible enough to achieve any kind of workflow execution. I barely even covered its functionality. I expected it to be much more cumbersome to use due to the complexity I see in how most of the C/C++ projects are using it.</p>
<p>I was pleasantly surprised! For these use-cases when I don’t want complicated bloated tools, <code>make</code> fits the bill perfectly!
The fact that <code>make</code> is available on almost any system, with super fast execution, makes it a great tool in my toolkit.</p>

</article>
<article>
<h1>Keeping the brain engaged</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 12 Sep 2021 00:00:00 GMT</p>
<p>During the past few weeks I have been feeling that I am not as engaged as I want to be at work, and neither at my outside of work activites. I decided to do some kind of retrospective, going over the past few years to remember what kept my brain excited and engaged. Not really to my surprise, but I realised that it’s not always about projects at work, nor hobbies, nor friends; it could be any of them.</p>
<p>When at times I was working on exciting projects at work, I left the office but my brain was still crunching, even subconciously. While riding the London Underground home I might have been evaluating different solutions to the problem at hand, or thinking about new features to improve the product, or trying to think through what went wrong in that solution that should work but didn’t. It wasn’t rare that I would wake up suddenly in the middle of the night with a brain flash about a software bug I had overlooked, even ones I wasn’t even aware they existed before going to bed, along with its fix which was quite obvious at that moment.</p>
<p>This happens a few times with my hobbies as well, like when I play volleyball. I get flashes later of what I should do better, things I overlooked on the opponent’s side, or how the game situation demanded a certain play but I played differently. Discussions with friends are also a good trigger for these unconcious sudden revelations, especially when we discuss ideas and projects about potential startups. We start from a simple pitch, and end up with business plans and features that need years to implement — talk about a long-term vision. Or when I am evaluating different companies or funds to invest in the stock market, and trying to think of how they would evolve in the future, areas they would expand, potential fall-throughs. And in all these cases the majority of the thinking is done after the fact, not while it’s taking place, but at times when my brain is able to drift freely.</p>
<p>Coincidentally, last week I stumbled upon the following essays by Paul Graham that really stroke a cord and resonated immediately with all the above.</p>
<h3 id="what-doesn-t-seem-like-work"><a href="#what-doesn-t-seem-like-work">What Doesn’t Seem Like Work</a></h3><ul>
<li><a href="http://www.paulgraham.com/work.html">http://www.paulgraham.com/work.html</a></li>
</ul>
<p>Starts talking about how his father loved solving math problems, how he never got bored of them, and concludes with:</p>
<blockquote>
<p>It seemed curious that the same task could be painful to one person and pleasant to another, […]. I didn’t realize how hard it can be to decide what you should work on, and that you sometimes have to figure it out from subtle clues, like a detective solving a case in a mystery novel. So I bet it would help a lot of people to ask themselves about this explicitly. What seems like work to other people that doesn’t seem like work to you?</p>
</blockquote>
<h3 id="the-top-idea-in-your-mind"><a href="#the-top-idea-in-your-mind">The Top Idea in Your Mind</a></h3><ul>
<li><a href="http://www.paulgraham.com/top.html">http://www.paulgraham.com/top.html</a></li>
</ul>
<p>The first paragraph nicely summarizes what I described in the intro of this article:</p>
<blockquote>
<p>I realized recently that what one thinks about in the shower in the morning is more important than I’d thought. I knew it was a good time to have ideas. Now I’d go further: now I’d say it’s hard to do a really good job on anything you don’t think about in the shower.</p>
</blockquote>
<p>Then at some point he mentions the following, referring to the unforced thinking our brain does unconciously.</p>
<blockquote>
<p>I think most people have one top idea in their mind at any given time. That’s the idea their thoughts will drift toward when they’re allowed to drift freely. And this idea will thus tend to get all the benefit of that type of thinking, while others are starved of it. Which means it’s a disaster to let the wrong idea become the top one in your mind.</p>
</blockquote>
<p>And concludes with (emphasis is mine):</p>
<blockquote>
<p>I suspect a lot of people aren’t sure what’s the top idea in their mind at any given time. I’m often mistaken about it. I tend to think it’s the idea I’d want to be the top one, rather than the one that is. But it’s easy to figure this out: just take a shower. <strong>What topic do your thoughts keep returning to? If it’s not what you want to be thinking about, you may want to change something.</strong> </p>
</blockquote>
<h3 id="a-projects-of-one-s-own"><a href="#a-projects-of-one-s-own">A Projects of One’s Own</a></h3><ul>
<li><a href="http://www.paulgraham.com/own.html">http://www.paulgraham.com/own.html</a></li>
</ul>
<p>This is an awesome article and I felt what Paul describes here a lot during the past few months, missing this excitement and satisfaction of working on projects of mine. Projects where you are in control, you do them because you like and want to, without useless and unnecessary processes or people slowing you down.</p>
<p>Putting some excerpts below that I find very interesting.</p>
<blockquote>
<p>Working on a project of your own is as different from ordinary work as skating is from walking. It’s more fun, but also much more productive.</p>
</blockquote>
<blockquote>
<p>There is something special about working on a project of your own. I wouldn’t say exactly that you’re happier. A better word would be excited, or engaged.</p>
</blockquote>
<blockquote>
<p>You feel as if you’re an animal in its natural habitat, doing what you were meant to do — not always happy, maybe, but awake and alive.</p>
</blockquote>
<blockquote>
<p>People who’ve never experienced the thrill of working on a project they’re excited about can’t distinguish this kind of working long hours from the kind that happens in sweatshops and boiler rooms, but they’re at opposite ends of the spectrum. That’s why it’s a mistake to insist dogmatically on “work/life balance.” Indeed, the mere expression “work/life” embodies a mistake: it assumes work and life are distinct. For those to whom the word “work” automatically implies the dutiful plodding kind, they are. But for the skaters, the relationship between work and life would be better represented by a dash than a slash. I wouldn’t want to work on anything I didn’t want to take over my life.</p>
</blockquote>
<blockquote>
<p>If you can find the right people, you only have to tell them what to do at the highest level. They’ll handle the details. Indeed, they insist on it. For a project to feel like your own, you must have sufficient autonomy. You can’t be working to order, or slowed down by bureaucracy.</p>
</blockquote>
<blockquote>
<p>[…] Ideally we can have the best of both worlds: to be deliberate in choosing to work on projects of our own, and carelessly confident in starting new ones.</p>
</blockquote>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>My take-away from the above is very simple. I need, and want, to maximise the time that my brain is engaged in things I like, things I get excited about, and things I get satisfaction from. At work I need to chase certain projects and avoid others. With hobbies I need to drop the ones that are just time-wasting and focus on the ones that bring excitement. And with side projects, well, I need to restart doing!</p>

</article>
<article>
<h1>Bits of Unsolicited Advice by Kevin Kelly</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Mon, 26 Apr 2021 00:00:00 GMT</p>
<p>Today, I stumbled upon two articles with some amazing pieces of advice from Kevin Kelly.</p>
<p>I loved them so much I wanted to share them here as well. However, all bragging rights and attribution should be given to the original articles:</p>
<ul>
<li><a href="https://kk.org/thetechnium/68-bits-of-unsolicited-advice/">68 Bits of Unsolicited Advice - April 28, 2020</a></li>
<li><a href="https://kk.org/thetechnium/99-additional-bits-of-unsolicited-advice/">99 Additional Bits of Unsolicited Advice - April 19, 2021</a></li>
</ul>
<p>I also added these to my <a href="/read-watch-listen">Read-Watch-Listen list</a>.</p>
<h2 id="68-bits-of-unsolicited-advice"><a href="#68-bits-of-unsolicited-advice">68 Bits of Unsolicited Advice</a></h2><ul>
<li>Learn how to learn from those you disagree with, or even offend you. See if you can find the truth in what they believe.</li>
<li>Being enthusiastic is worth 25 IQ points.</li>
<li>Always demand a deadline. A deadline weeds out the extraneous and the ordinary. It prevents you from trying to make it perfect, so you have to make it different. Different is better.</li>
<li>Don’t be afraid to ask a question that may sound stupid because 99% of the time everyone else is thinking of the same question and is too embarrassed to ask it.</li>
<li>Being able to listen well is a superpower. While listening to someone you love keep asking them “Is there more?”, until there is no more.</li>
<li>A worthy goal for a year is to learn enough about a subject so that you can’t believe how ignorant you were a year earlier.</li>
<li>Gratitude will unlock all other virtues and is something you can get better at.</li>
<li>Treating a person to a meal never fails, and is so easy to do. It’s powerful with old friends and a great way to make new friends.</li>
<li>Don’t trust all-purpose glue.</li>
<li>Reading to your children regularly will bond you together and kickstart their imaginations.</li>
<li>Never use a credit card for credit. The only kind of credit, or debt, that is acceptable is debt to acquire something whose exchange value is extremely likely to increase, like in a home. The exchange value of most things diminishes or vanishes the moment you purchase them. Don’t be in debt to losers.</li>
<li>Pros are just amateurs who know how to gracefully recover from their mistakes.</li>
<li>Extraordinary claims should require extraordinary evidence to be believed.</li>
<li>Don’t be the smartest person in the room. Hangout with, and learn from, people smarter than yourself. Even better, find smart people who will disagree with you.</li>
<li>Rule of 3 in conversation. To get to the real reason, ask a person to go deeper than what they just said. Then again, and once more. The third time’s answer is close to the truth.</li>
<li>Don’t be the best. Be the only.</li>
<li>Everyone is shy. Other people are waiting for you to introduce yourself to them, they are waiting for you to send them an email, they are waiting for you to ask them on a date. Go ahead.</li>
<li>Don’t take it personally when someone turns you down. Assume they are like you: busy, occupied, distracted. Try again later. It’s amazing how often a second try works.</li>
<li>The purpose of a habit is to remove that action from self-negotiation. You no longer expend energy deciding whether to do it. You just do it. Good habits can range from telling the truth, to flossing.</li>
<li>Promptness is a sign of respect.</li>
<li>When you are young spend at least 6 months to one year living as poor as you can, owning as little as you possibly can, eating beans and rice in a tiny room or tent, to experience what your “worst” lifestyle might be. That way any time you have to risk something in the future you won’t be afraid of the worst case scenario.</li>
<li>Trust me: There is no “them”.</li>
<li>The more you are interested in others, the more interesting they find you. To be interesting, be interested.</li>
<li>Optimize your generosity. No one on their deathbed has ever regretted giving too much away.</li>
<li>To make something good, just do it. To make something great, just re-do it, re-do it, re-do it. The secret to making fine things is in remaking them.</li>
<li>The Golden Rule will never fail you. It is the foundation of all other virtues.</li>
<li>If you are looking for something in your house, and you finally find it, when you’re done with it, don’t put it back where you found it. Put it back where you first looked for it.</li>
<li>Saving money and investing money are both good habits. Small amounts of money invested regularly for many decades without deliberation is one path to wealth.</li>
<li>To make mistakes is human. To own your mistakes is divine. Nothing elevates a person higher than quickly admitting and taking personal responsibility for the mistakes you make and then fixing them fairly. If you mess up, fess up. It’s astounding how powerful this ownership is.</li>
<li>Never get involved in a land war in Asia.</li>
<li>You can obsess about serving your customers/audience/clients, or you can obsess about beating the competition. Both work, but of the two, obsessing about your customers will take you further.</li>
<li>Show up. Keep showing up. Somebody successful said: 99% of success is just showing up.</li>
<li>Separate the processes of creation from improving. You can’t write and edit, or sculpt and polish, or make and analyze at the same time. If you do, the editor stops the creator. While you invent, don’t select. While you sketch, don’t inspect. While you write the first draft, don’t reflect. At the start, the creator mind must be unleashed from judgement.</li>
<li>If you are not falling down occasionally, you are just coasting.</li>
<li>Perhaps the most counter-intuitive truth of the universe is that the more you give to others, the more you’ll get. Understanding this is the beginning of wisdom.</li>
<li>Friends are better than money. Almost anything money can do, friends can do better. In so many ways a friend with a boat is better than owning a boat.</li>
<li>This is true: It’s hard to cheat an honest man.</li>
<li>When an object is lost, 95% of the time it is hiding within arm’s reach of where it was last seen. Search in all possible locations in that radius and you’ll find it.</li>
<li>You are what you do. Not what you say, not what you believe, not how you vote, but what you spend your time on.</li>
<li>If you lose or forget to bring a cable, adapter or charger, check with your hotel. Most hotels now have a drawer full of cables, adapters and chargers others have left behind, and probably have the one you are missing. You can often claim it after borrowing it.</li>
<li>Hatred is a curse that does not affect the hated. It only poisons the hater. Release a grudge as if it was a poison.</li>
<li>There is no limit on better. Talent is distributed unfairly, but there is no limit on how much we can improve what we start with.</li>
<li>Be prepared: When you are 90% done any large project (a house, a film, an event, an app) the rest of the myriad details will take a second 90% to complete.</li>
<li>When you die you take absolutely nothing with you except your reputation.</li>
<li>Before you are old, attend as many funerals as you can bear, and listen. Nobody talks about the departed’s achievements. The only thing people will remember is what kind of person you were while you were achieving.</li>
<li>For every dollar you spend purchasing something substantial, expect to pay a dollar in repairs, maintenance, or disposal by the end of its life.</li>
</ul>
<p>-Anything real begins with the fiction of what could be. Imagination is therefore the most potent force in the universe, and a skill you can get better at. It’s the one skill in life that benefits from ignoring what everyone else knows.</p>
<ul>
<li>When crisis and disaster strike, don’t waste them. No problems, no progress.</li>
<li>On vacation go to the most remote place on your itinerary first, bypassing the cities. You’ll maximize the shock of otherness in the remote, and then later you’ll welcome the familiar comforts of a city on the way back.</li>
<li>When you get an invitation to do something in the future, ask yourself: would you accept this if it was scheduled for tomorrow? Not too many promises will pass that immediacy filter.</li>
<li>Don’t say anything about someone in email you would not be comfortable saying to them directly, because eventually they will read it.</li>
<li>If you desperately need a job, you are just another problem for a boss; if you can solve many of the problems the boss has right now, you are hired. To be hired, think like your boss.</li>
<li>Art is in what you leave out.</li>
<li>Acquiring things will rarely bring you deep satisfaction. But acquiring experiences will.</li>
<li>Rule of 7 in research. You can find out anything if you are willing to go seven levels. If the first source you ask doesn’t know, ask them who you should ask next, and so on down the line. If you are willing to go to the 7th source, you’ll almost always get your answer.</li>
<li>How to apologize: Quickly, specifically, sincerely.</li>
<li>Don’t ever respond to a solicitation or a proposal on the phone. The urgency is a disguise.</li>
<li>When someone is nasty, rude, hateful, or mean with you, pretend they have a disease. That makes it easier to have empathy toward them which can soften the conflict.</li>
<li>Eliminating clutter makes room for your true treasures.</li>
<li>You really don’t want to be famous. Read the biography of any famous person.</li>
<li>Experience is overrated. When hiring, hire for aptitude, train for skills. Most really amazing or great things are done by people doing them for the first time.</li>
<li>A vacation + a disaster = an adventure.</li>
<li>Buying tools: Start by buying the absolute cheapest tools you can find. Upgrade the ones you use a lot. If you wind up using some tool for a job, buy the very best you can afford.</li>
<li>Learn how to take a 20-minute power nap without embarrassment.</li>
<li>Following your bliss is a recipe for paralysis if you don’t know what you are passionate about. A better motto for most youth is “master something, anything”. Through mastery of one thing, you can drift towards extensions of that mastery that bring you more joy, and eventually discover where your bliss is.</li>
<li>I’m positive that in 100 years much of what I take to be true today will be proved to be wrong, maybe even embarrassingly wrong, and I try really hard to identify what it is that I am wrong about today.</li>
<li>Over the long term, the future is decided by optimists. To be an optimist you don’t have to ignore all the many problems we create; you just have to imagine improving our capacity to solve problems.</li>
<li>The universe is conspiring behind your back to make you a success. This will be much easier to do if you embrace this pronoia.</li>
</ul>
<h2 id="99-additional-bits-of-unsolicited-advice"><a href="#99-additional-bits-of-unsolicited-advice">99 Additional Bits of Unsolicited Advice</a></h2><ul>
<li>That thing that made you weird as a kid could you make great as an adult — if you don’t lose it.</li>
<li>If you have any doubt at all about being able to carry a load in one trip, do yourself a huge favor and make two trips.</li>
<li>What you get by achieving your goals is not as important as what you become by achieving your goals. At your funeral people will not recall what you did; they will only remember how you made them feel.</li>
<li>Recipe for success: under-promise and over-deliver.</li>
<li>It’s not an apology if it comes with an excuse. It is not a compliment if it comes with a request.</li>
<li>Jesus, Superman, and Mother Teresa never made art. Only imperfect beings can make art because art begins in what is broken.</li>
<li>If someone is trying to convince you it’s not a pyramid scheme, it’s a pyramid scheme.</li>
<li>Learn how to tie a bowline knot. Practice in the dark. With one hand. For the rest of your life you’ll use this knot more times than you would ever believe.</li>
<li>If something fails where you thought it would fail, that is not a failure.</li>
<li>Be governed not by the tyranny of the urgent but by the elevation of the important.</li>
<li>Leave a gate behind you the way you first found it.</li>
<li>The greatest rewards come from working on something that nobody has a name for. If you possibly can, work where there are no words for what you do.</li>
<li>A balcony or porch needs to be at least 6 feet (2m) deep or it won’t be used.</li>
<li>Don’t create things to make money; make money so you can create things. The reward for good work is more work.</li>
<li>In all things — except love — start with the exit strategy. Prepare for the ending. Almost anything is easier to get into than out of.</li>
<li>Train employees well enough they could get another job, but treat them well enough so they never want to.</li>
<li>Don’t aim to have others like you; aim to have them respect you.</li>
<li>The foundation of maturity: Just because it’s not your fault doesn’t mean it’s not your responsibility.</li>
<li>A multitude of bad ideas is necessary for one good idea.</li>
<li>Being wise means having more questions than answers.</li>
<li>Compliment people behind their back. It’ll come back to you.</li>
<li>Most overnight successes — in fact any significant successes — take at least 5 years. Budget your life accordingly.</li>
<li>You are only as young as the last time you changed your mind.</li>
<li>Assume anyone asking for your account information for any reason is guilty of scamming you, unless proven innocent. The way to prove innocence is to call them back, or login to your account using numbers or a website that you provide, not them. Don’t release any identifying information while they are contacting you via phone, message or email. You must control the channel.</li>
<li>Sustained outrage makes you stupid.</li>
<li>Be strict with yourself and forgiving of others. The reverse is hell for everyone.</li>
<li>Your best response to an insult is “You’re probably right.” Often they are.</li>
<li>The worst evils in history have always been committed by those who truly believed they were combating evil. Beware of combating evil.</li>
<li>If you can avoid seeking approval of others, your power is limitless.</li>
<li>When a child asks an endless string of “why?” questions, the smartest reply is, “I don’t know, what do you think?”</li>
<li>To be wealthy, accumulate all those things that money can’t buy.</li>
<li>Be the change you wish to see.</li>
<li>When brainstorming, improvising, jamming with others, you’ll go much further and deeper if you build upon each contribution with a playful “yes — and” example instead of a deflating “no — but” reply.</li>
<li>Work to become, not to acquire.</li>
<li>Don’t loan money to a friend unless you are ready to make it a gift.</li>
<li>On the way to a grand goal, celebrate the smallest victories as if each one were the final goal. No matter where it ends you are victorious.</li>
<li>Calm is contagious.</li>
<li>Even a foolish person can still be right about most things. Most conventional wisdom is true.</li>
<li>Always cut away from yourself.</li>
<li>Show me your calendar and I will tell you your priorities. Tell me who your friends are, and I’ll tell you where you’re going.</li>
<li>When hitchhiking, look like the person you want to pick you up.</li>
<li>Contemplating the weaknesses of others is easy; contemplating the weaknesses in yourself is hard, but it pays a much higher reward.</li>
<li>Worth repeating: measure twice, cut once.</li>
<li>Your passion in life should fit you exactly; but your purpose in life should exceed you. Work for something much larger than yourself.</li>
<li>If you can’t tell what you desperately need, it’s probably sleep.</li>
<li>When playing Monopoly, spend all you have to buy, barter, or trade for the Orange properties. Don’t bother with Utilities.</li>
<li>If you borrow something, try to return it in better shape than you received it. Clean it, sharpen it, fill it up.</li>
<li>Even in the tropics it gets colder at night than you think. Pack warmly.</li>
<li>To quiet a crowd or a drunk, just whisper.</li>
<li>Writing down one thing you are grateful for each day is the cheapest possible therapy ever.</li>
<li>When someone tells you something is wrong, they’re usually right. When someone tells you how to fix it, they’re usually wrong.</li>
<li>If you think you saw a mouse, you did. And, if there is one, there are more.</li>
<li>Money is overrated. Truly new things rarely need an abundance of money. If that was so, billionaires would have a monopoly on inventing new things, and they don’t. Instead almost all breakthroughs are made by those who lack money, because they are forced to rely on their passion, persistence and ingenuity to figure out new ways. Being poor is an advantage in innovation.</li>
<li>Ignore what others may be thinking of you, because they aren’t.</li>
<li>Avoid hitting the snooze button. That’s just training you to oversleep.</li>
<li>Always say less than necessary.</li>
<li>You are given the gift of life in order to discover what your gift <em>in</em> life is. You will complete your mission when you figure out what your mission is. This is not a paradox. This is the way.</li>
<li>Don’t treat people as bad as they are. Treat them as good as you are.</li>
<li>It is much easier to change how you think by changing your behavior, than it is to change your behavior by changing how you think. Act out the change you seek.</li>
<li>You can eat any dessert you want if you take only 3 bites.</li>
<li>Each time you reach out to people, bring them a blessing; then they’ll be happy to see you when you bring them a problem.</li>
<li>Bad things can happen fast, but almost all good things happen slowly.</li>
<li>Don’t worry how or where you begin. As long as you keep moving, your success will be far from where you start.</li>
<li>When you confront a stuck bolt or screw: righty tighty, lefty loosey.</li>
<li>If you meet a jerk, overlook them. If you meet jerks everywhere everyday, look deeper into yourself.</li>
<li>Dance with your hips.</li>
<li>We are not bodies that temporarily have souls. We are souls that temporarily have bodies.</li>
<li>You can reduce the annoyance of someone’s stupid belief by increasing your understanding of why they believe it.</li>
<li>If your goal does not have a schedule, it is a dream.</li>
<li>All the greatest gains in life — in wealth, relationships, or knowledge —come from the magic of compounding interest — amplifying small steady gains. All you need for abundance is to keep adding 1% more than you subtract on a regular basis.</li>
<li>The greatest breakthroughs are missed because they look like hard work.</li>
<li>People can’t remember more than 3 points from a speech.</li>
<li>I have never met a person I admired who did not read more books than I did.</li>
<li>The greatest teacher is called “doing”.</li>
<li>Finite games are played to win or lose. Infinite games are played to keep the game going. Seek out infinite games because they yield infinite rewards.</li>
<li>Everything is hard before it is easy. The day before something is a breakthrough, it’s a stupid idea.</li>
<li>A problem that can be solved with money is not really a problem.</li>
<li>When you are stuck, sleep on it. Let your subconscious work for you.</li>
<li>Your work will be endless, but your time is finite. You cannot limit the work so you must limit your time. Hours are the only thing you can manage.</li>
<li>To succeed, get other people to pay you; to become wealthy, help other people to succeed.</li>
<li>Children totally accept — and crave — family rules. “In our family we have a rule for X” is the only excuse a parent needs for setting a family policy. In fact, “I have a rule for X” is the only excuse you need for your own personal policies.</li>
<li>All guns are loaded.</li>
<li>Many backward steps are made by standing still.</li>
<li>This is the best time ever to make something. None of the greatest, coolest creations 20 years from now have been invented yet. You are not late.</li>
<li>No rain, no rainbow.</li>
<li>Every person you meet knows an amazing lot about something you know virtually nothing about. Your job is to discover what it is, and it won’t be obvious.</li>
<li>You don’t marry a person, you marry a family.</li>
<li>Always give credit, take blame.</li>
<li>Be frugal in all things, except in your passions splurge.</li>
<li>When making something, always get a few extras — extra material, extra parts, extra space, extra finishes. The extras serve as backups for mistakes, reduce stress, and fill your inventory for the future. They are the cheapest insurance.</li>
<li>Something does not need to be perfect to be wonderful. Especially weddings.</li>
<li>Don’t let your email inbox become your to-do list.</li>
<li>The best way to untangle a knotty tangle is not to “untie” the knots, but to keep pulling the loops apart wider and wider. Just make the mess as big, loose and open as possible. As you open up the knots they will unravel themselves. Works on cords, strings, hoses, yarns, or electronic cables.</li>
<li>Be a good ancestor. Do something a future generation will thank you for. A simple thing is to plant a tree.</li>
<li>To combat an adversary, become their friend.</li>
<li>Take one simple thing — almost anything — but take it extremely seriously, as if it was the only thing in the world, or maybe the entire world is in it — and by taking it seriously you’ll light up the sky.</li>
<li>History teaches us that in 100 years from now some of the assumptions you believed will turn out to be wrong. A good question to ask yourself today is “What might I be wrong about?”</li>
<li>Be nice to your children because they are going to choose your nursing home.</li>
<li>Advice like these are not laws. They are like hats. If one doesn’t fit, try another.</li>
</ul>

</article>
<article>
<h1>Minibri Temp - Asciidoc support</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Wed, 17 Mar 2021 00:00:00 GMT</p>
<p><a href="https://temp.minibri.com">Minibri Temp</a> now supports files formatted in <a href="https://asciidoc.org/">Asciidoc</a>.
The uploaded file needs to have <code>.adoc</code> or <code>.asciidoc</code> extension to go through Asciidoc conversion.</p>
<p>Check out an example of a rendered document at <a href="https://temp.minibri.com/view/asciidoc-guide">https://temp.minibri.com/view/asciidoc-guide</a>.</p>
<p>The source file of 6000+ lines which generated the above output can be found at <a href="https://gist.github.com/lambrospetrou/36af630d6a3c0393f1bcd964f72e2ebd">this gist</a>.</p>

</article>
<article>
<h1>The Read-Watch-Listen list</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 14 Mar 2021 00:00:00 GMT</p>
<p>I often share links to content I read, watch, or listen in my social media accounts (mostly <a href="https://twitter.com/LambrosPetrou">Twitter</a>).
However, many times I am trying to find a link I know I shared before, but don’t remember exactly where it was hosted or who created it. And then I end up searching or scrolling my tweets in an attempt to find it. Looking for a needle in a haystack…</p>
<p>I decided to create a page where I will keep a list of content that I find interesting, and is worthy of sharing with others. This will be a curated list of things, and not just a dump of my tweets, with just what I consider great.</p>
<p>I spent a few hours scrolling through my tweets for the past three years and I extracted the content that I still find interesting, and it’s not behind a paywall at the moment. You can find the new page at <a href="/read-watch-listen">https://www.lambrospetrou.com/read-watch-listen</a>, and at the corresponding navigation link at the top of all pages on this website. The list is ordered with the content I most recently read at the top, therefore new items will always go at the top.</p>
<p>I also created a <a href="/feed/read-watch-listen.rss.xml">dedicated RSS feed</a> separate from my <a href="/feed/rss.xml">articles RSS feed</a> in order to provide an easy way for people to consume this list using their favourite feed reader.</p>
<p>I hope more people find the content of this list interesting. Feel free to get in touch if you have any feedback, or want to share great content with me.</p>
<p>And just for the history records, a screenshot of the list at the article’s time of writing 👇🏼</p>
<p></p>

</article>
<article>
<h1>A journey through email providers (Zoho Mail, Private Email, Gmail, HEY, Google Domains)</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 07 Mar 2021 00:00:00 GMT</p>
<h2 id="context"><a href="#context">Context</a></h2><p>I use several domains for my websites and side projects, and for some of them I want to have email addresses that use the correspoding domains, instead of advertising a Gmail address. For example, I want my personal domain to use the <code>@lambrospetrou.com</code> domain, instead of <code>@gmail.com</code>.</p>
<p>There are hundreds of email providers available, and over the past year I tried a couple of them, and read about tens more.</p>
<p>The process setting up the custom domain email is roughly the same across all providers:</p>
<ol>
<li>You signup and pay for their service.</li>
<li>The service provider gives you their <a href="https://www.cloudflare.com/learning/dns/dns-records/dns-mx-record/">MX records</a> to put in your domain’s DNS configuration.</li>
<li>You update your domain’s DNS and boom, your custom domain email address is ready.</li>
</ol>
<p>However, the email experience you get from each provider differs substantially…</p>
<h2 id="namecheap-private-email"><a href="#namecheap-private-email">Namecheap Private Email</a></h2><p>Almost a year ago I decided to go with the cheapest option, whilst also having good ratings.
For my personal domain (this website 👋) I used <a href="https://www.namecheap.com/hosting/email/">Namecheap Private Email</a> since I was happy with Namecheap as a domain registrar a couple of times.</p>
<p>The offering cannot be simpler for around £9 ($12) per year 🤯:</p>
<ul>
<li>1 free mailbox included</li>
<li>5GB for emails</li>
<li>2GB for files</li>
</ul>
<p>I only used <a href="http://privateemail.com/">Private Email’s website</a>, and even though it did not dazzle me, it was also not missing something important. Apart from one thing…</p>
<p>The spam filter algorithm is horrible, entirely inexistent. I have a few articles about AWS Lambda, Amazon S3, and AWS in general, and since they got picked up by some spammers, I am getting tens of emails that can be considered poster emails for testing spam algorithms.</p>
<p>So, since the year is almost up, I decided to not renew again. Next one…</p>
<h2 id="zoho-mail"><a href="#zoho-mail">Zoho Mail</a></h2><p>For a different domain, a side project, I used <a href="https://www.zoho.com/mail/">Zoho Mail</a>. The Zoho Suite is sometimes referred to as the cheaper Google G-Suite, since it is a collection of services and email is just one of them.</p>
<p>Similar to the other domain, I used the cheapest plan offered, <a href="https://www.zoho.com/mail/zohomail-pricing.html?src=hd">Zoho Mail Lite</a> at £12 ($16):</p>
<ul>
<li>5GB total</li>
<li>Up to 250MB attachments</li>
<li>Lots of other stuff I don’t care…</li>
</ul>
<p>I will be honest, this email is of very low use, and I didn’t publish it in any way so I cannot directly compare it with Private Email in terms of spam filters. However, in roughly 6 months I haven’t seen a spam email at all.</p>
<p></p>
<p>Once again, I only used the website to access this email. The email view is quite standard, but I found the rest UI too complicated, and I got the sense that Zoho tries too much to look like a toolbox rather than a focused tool. Especially when I was navigating through the settings in the first few days to setup everything I got lost more times than I would like to admit… I haven’t used any of the Calendar, Tasks, Notes, and other services, so cannot comment on those.</p>
<p>Overall, I have mixed feelings for Zoho Mail. It’s cheap, simple for just email, but I have to use it more often to see if the rest of the services in the package make up for the intrinsic complexity.</p>
<h2 id="google-workspace-g-suite"><a href="#google-workspace-g-suite">Google Workspace - G-Suite</a></h2><p><a href="https://workspace.google.com/">Google Workspace</a> is what is famously known as G-Suite until last year when Google did a rebranding.
G-Suite is probably the most popular among professional email providers, due to Gmail’s popularity, and the rest of the Google services (Docs, Drive, Calendar, etc.). The only one close in my opinion is the <a href="https://www.microsoft.com/en-gb/microsoft-365">Microsoft 365</a> family of services. I personally hate Microsoft’s plan breakdown, and I don’t find Outlook to be on par with Gmail.</p>
<p></p>
<p>I have considered subscribing to the <a href="https://workspace.google.com/pricing.html">Business Starter</a> plan for Google Workspace for months now (still haven’t):</p>
<ul>
<li>30GB cloud storage per user</li>
<li>£4.60 (~$6.50) per month</li>
<li>Integration with all the Google services with some business oriented features (e.g. for Google Meet)</li>
</ul>
<p>One thing that holds me back is that some of the business versions of the services lack certain features that are available in the corresponding consumer versions. Since I am mostly interested in Gmail, it’s not an actual issue to be honest, especially when compared with other email providers. But if I would be paying that monthly amount then I would like to use all the features I get with that. At times I would probably get into conflict, between my personal Google account and the business one, having to choose which one to use since I am a heavy Google services user.</p>
<p>Google Workspace (G-Suite) is my top choice so far for a dedicated email provider, and I am sure I will find a way to juggle my data between my personal and the business account.</p>
<h2 id="hey-for-you"><a href="#hey-for-you">HEY for You</a></h2><p><a href="https://hey.com/">HEY for You</a> is a brand new email service provided by the great team at <a href="http://basecamp.com/">Basecamp</a>. HEY’s price is almost double than Google Workspace’s at roughly £87 ($120 - $99 + 20% VAT) per year. The big promise from HEY justifying the high price is that it revolutionizes how you use email, taking some modern techniques cued from social media services (e.g. <a href="https://hey.com/features/the-feed/">The Feed</a>, the <a href="https://hey.com/flow/">Just let it flow</a> philosophy), and avoids adding “management features” in an effort to make email simple.</p>
<p></p>
<p>I read through the whole <a href="https://hey.com/features/">feature list</a> and watched the <a href="https://www.youtube.com/watch?v=UCeYTysLyGI">YouTube videos presenting HEY</a>, and it’s quite intriguing. Having said that, I think that one of the reasons I am even considering this product at such a steep price is just because of Basecamp, the team, the company. Last year I read three of the <a href="https://basecamp.com/books">books they wrote</a> (Getting Real, REWORK, Shape Up) and loved them, so I might be biased thinking I would like their email product too 😅</p>
<p>I am definitely going to at least signup for the trial and give this a go in the next few days. I haven’t yet pulled the trigger because custom domain support is still only available for Hey for Work, the business version of the service, which is even more expensive.</p>
<h2 id="google-domains"><a href="#google-domains">Google Domains</a></h2><p>I stumbled upon this solution today, out of nowhere, while reading some Reddit comments.</p>
<p><a href="https://domains.google">Google Domains</a> is Google’s new domain registrar service, still carrying the <code>BETA</code> status. However, today I was pleasantly surprised when I found out that it also <a href="https://domains.google/intl/en_uk/learn/how-to-use-email-forwarding/">provides a free email forwarding service</a>.</p>
<p>As it turns out, you can have up to 100 aliases per registered domain, and each can be forwarded to a different email address, including personal Gmail accounts. I immediately transferred two of my domains to Google Domains and tried the email forwarding feature, and it actually works great 🥳</p>
<p>In addition, according to the docs, email forwarding can also work alongside G-Suite so you can have both enabled at the same time if needed for the same domain.</p>
<p>Email forwarding is obviously not an apples-to-apples comparison to all the services I previously mentioned, but since the original goal was custom domain email address, it fully satisfies my needs. I just forward my domain’s email addresses to my personal Gmail address and boom, everything ends up in the same account which is easy to manage, and with the ability to <a href="https://support.google.com/domains/answer/9437157">send and reply emails using the domain address</a> it’s complete.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>Google Domains solved my immediate problem in the best way possible since I don’t have to login and use any other mediocre email provider, I have an amazing spam filter, and it’s all within Gmail which I love.</p>
<p>Having said that, I am still in search for a dedicated offering for my non-personal domains used for some side projects, future mini-businesses. Google Workspaces and HEY are my top two finalists, and I guess I will have to finally tryout HEY before picking the winner.</p>

</article>
<article>
<h1>Minibri Temp — The Deep Dive</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Mon, 01 Mar 2021 00:00:00 GMT</p>
<p><a href="https://temp.minibri.com" target="_blank" title="Visit Minibri Temp"></a></p>
<p>Try <strong>Minibri Temp</strong> at <a href="https://temp.minibri.com">https://temp.minibri.com</a>.</p>
<h2 id="inception-story"><a href="#inception-story">Inception Story</a></h2><p>Minibri Temp allows you to upload a file which contains either HTML code (including CSS and JavaScript) or content in one of the supported formats, e.g. Markdown, and soon Asciidoc. Once the file is uploaded, the content is converted to HTML, and you get a link to view the rendered HTML content. You also specify a duration, after which the URL will expire and the uploaded content will be deleted.</p>
<p>This service seems incredibly simple, because it is 😅 Markdown converters are very popular, but my plan is to add support for a few more source formats, which is what differentiates this from everything else since I cannot really find anything supporting non-Markdown content.</p>
<p>I had several moments last year that I wanted to share a link with a friend or a colleague to show them some content. To do that, I had to run some local command line tool on my laptop to convert it to HTML, then login to my AWS account (or Github) and upload the file to my website, and then finally share a link with them. Minibri Temp just makes this simple use-case a 1-step action! 🥳 Google Drive and Dropbox also make this easy but I don’t have them on all my machines, so they are not there when I need them.</p>
<p>I developed the application over two weekends, so it’s nothing record-breaking or something that’s going to win the Alan Turing award, but in this article I would like to do a deep dive into its architecture and some technical decisions. I want to show that using just a few core services allows someone to build useful applications, whilst also having fun, and at a cost of almost zero.</p>
<h2 id="architecture-overview"><a href="#architecture-overview">Architecture Overview</a></h2><p></p>
<p>As the diagram above shows the user’s browser interacts with three main systems.</p>
<ol>
<li>The static assets are loaded from <a href="https://www.netlify.com/products/edge/">Netlify Edge</a>, which is Netlify’s smart Content Delivery Network (CDN).</li>
<li>The uploaded content is sent to a <a href="https://www.netlify.com/products/functions/">Netlify Function</a>, which is Netlify’s serverless compute platform built ontop of <a href="https://aws.amazon.com/lambda/">AWS Lambda</a>. This function does some content validation, and creates one HTML file with the converted content.</li>
<li>The created HTML file is immutable and is stored on <a href="https://aws.amazon.com/s3/">Amazon S3</a>, Amazon’s object storage offering on AWS. The page returned to the user directly fetches the converted HTML file from Amazon S3 (using <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/ShareObjectPreSignedURL.html">presigned URLs</a>).</li>
</ol>
<p>The last step is a bit debateable and I could serve the converted HTML files through Netlify Edge as well, having the S3 bucket acting as an origin. I did not go with this approach for now to avoid having to make the files stored on S3 public, therefore all access to the files uses presigned URLs, which gives me control on who can access them and for how long.</p>
<h2 id="no-javascript-required"><a href="#no-javascript-required">No JavaScript required</a></h2><p>It might be a bit unusual nowadays, but I want the website to be fully functional without JavaScript required in the user’s browser.
Obviously, I will introduce some JavaScript in the future, e.g. for drag & drop the file instead of manual selection, but I want 100% of the functionality to work without JavaScript needed.</p>
<h2 id="content-server-inside-an-iframe"><a href="#content-server-inside-an-iframe">Content server inside an iframe</a></h2><p>If you noticed in the architecture overview I mentioned that the output HTML is fetched using an S3 presigned URL. In addition, if you examine the actual page returned when viewing the generated content (e.g. <a href="https://temp.minibri.com/view/sample">this sample</a>) you will see that it’s basically a very simple skeleton page that contains only an <code></code> with its <code>src</code> attribute set to the S3 presigned URL.</p>
<p>Let’s see why.</p>
<p>I have the following requirements:</p>
<ul>
<li>The generated HTML for the converted content should be immutable. After its creation I should not need to ever touch it again, so that it can be served from S3 or any CDN service.</li>
<li>I want to be able to update the skeleton of the view page, for example to customize the header, or maybe in the future add a footer, etc. These changes are not related to the converted content, but the surrounding parts.</li>
</ul>
<p>I thought about three possible solutions to satisfy the immutability requirement:</p>
<ol>
<li>Generate the HTML file for the converted content, and also include all the HTML needed for my own purposes (e.g. header). Each HTML file will be self-contained completely, and will be the only thing needed to be served to the user.</li>
<li>Generate the HTML file for the converted content, and include an <code></code> at the header section as a placeholder which has its <code>src</code> attribute set to the URL serving the header.</li>
<li>Generate the HTML file for the converted content and store it on its own in S3. The HTML served to the user would be a different HTML page that contains an <code></code> with its <code>src</code> attribute set to the HTML file in S3 containing the converted content.</li>
</ol>
<p>Solution 1 is the easiest, and probably the one with the best caching since it can be cached literally forever. However, it does not allow for updates which is a deal-breaker (see second requirement). Solution 2 is OK since it allows me to update the header section anytime I want, and still benefits from immutable, cacheable HTML files. Solution 3 provides flexibility, and keeping the actual HTML file for the converted content separate from the skeleton page is good for future-proofing. Therefore, I went with solution 3.</p>
<h2 id="direct-download-from-amazon-s3"><a href="#direct-download-from-amazon-s3">Direct Download from Amazon S3</a></h2><p>This is something I might change in the future, but this section describes the initial implementation. When you upload a file, the serverless function handling the request will convert the content to HTML, and then store the generated HTML file on Amazon S3.</p>
<p>One of the goals I had was that content should not be served directly from the serverless functions to avoid incurring high cost, since the <a href="https://aws.amazon.com/lambda/pricing/">pricing for AWS Lambda</a>, and hence <a href="https://www.netlify.com/pricing/#add-ons-functions">Netlify Functions</a>, is based on the number of requests and the duration of the function execution (rounded to the nearest millisecond precision). Furthermore, Amazon S3 is much more suited at serving static files versus AWS Lambda, which is good for our lightweight compute needs.</p>
<p>Another goal of the final implementation was that I wouldn’t allow public access to the S3 bucket.</p>
<p>Let’s go through the possible solutions and see what works and what doesn’t:</p>
<ol>
<li>Use <a href="https://aws.amazon.com/cloudfront/">Amazon CloudFront</a> (Amazon’s CDN) in front of S3</li>
<li>Use Netlify Edge (CDN) in front of S3</li>
<li>Fetch from S3 directly</li>
</ol>
<p>Using Amazon CloudFront satisfies both goals, since it’s a global CDN and by using something called <a href="https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-restricting-access-to-s3.html">Origin Access Identity</a> we avoid opening access to our S3 bucket. However, using Amazon CloudFront would add another service to the mix, and since Netlify is a CDN already I didn’t want to introduce another. Next one.</p>
<p>I haven’t been able to find something akin to the Origin Access Identity for Netlify Edge, and therefore solution 2 is ruled out since it would require public access to S3. If there was a way, this would be my ideal solution.</p>
<p>So we are left we solution 3, exposing files from S3 directly. But, didn’t I say that I didn’t want to give public access? 😒 The final approach is that when the serverless function handling the file upload responds, it basically returns a dummy HTML that contains an <a href="https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe"><code></code></a> with its <code>src</code> attribute set to a presigned URL for the actual HTML file in S3. This seems like a round-about solution, but it actually satisfies both of my goals nicely. I have total control over who has access to each file and for how long, and I do not expose the file structure on S3 directly.</p>
<p>There is one drawback in this solution, namely S3’s bandwidth cost. Usually for high traffic downloads it would be much cheaper to use a CDN in front of S3 to cache the content, rather than always hitting S3 directly. However, considering that the traffic is going to be close to zero, who cares 🙃 😜</p>
<h2 id="content-expiration"><a href="#content-expiration">Content Expiration</a></h2><p>One of the features I wanted from the beginning was content expiration. Ideal for sharing content that is not ready for prime view or work in-progress, or even stupid notes to a friend.</p>
<p>This feature is implemented using <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-expire-general-considerations.html">Amazon S3 Expiring Objects</a>. This feature allows you to create a <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/intro-lifecycle-rules.html">Lifecycle rule</a> where you specify a filename prefix, or specific tags associated with your object, that will trigger its expiration after a given amount of days.</p>
<p>For example, if I stored the content in different virtual directories we could have different prefixes based on the expiration, e.g. <code>s3://my-bucket-name/expiration-days-1/sample-file1.html</code> and <code>s3://my-bucket-name/expiration-weeks-1/sample-file2.html</code>. Then, you can create a Lifecycle Rule specifying that anything under the <code>/expiration-days-1/*</code> should expire after 1 day, and similarly for <code>/expiration-weeks-1/*</code> after 1 week.</p>
<p>In my case, I prefer to use tagging instead of a certain directory layout to stay flexible in the way I name and store files. Therefore, I assign the tag <code>expiration=Days1</code>, or <code>expiration=Weeks1</code>, etc. to each object and create the corresponding Lifecycle rules to check the <code>expiration</code> tag value.</p>
<p>Note that Amazon S3’s expiration is not precise to the minute, or even to the hour. The cleanup runs once a day and therefore an object could be accessible for several hours passed its expiration. For Minibri Temp, this is fine, and we don’t need anything more precise.</p>
<h2 id="url-generation"><a href="#url-generation">URL Generation</a></h2><p>The URL generation was probably the most fun feature to develop. The URL could not be very short like URL shortener services, because there is no database to hold the mapping, and just querying S3 until a non-existing filename is found (in case of collisions) would be very slow, and very expensive.</p>
<p>Therefore, the URL should be generated in a way to avoid collisions but still be user-friendly, that is less than 80 characters, and without querying S3. As mentioned before, the generated HTML files are immutable, and once the URL is generated it should always point to the same content.</p>
<p>Right from the start I used the content hash as the URL basis using <a href="https://en.wikipedia.org/wiki/SHA-2">SHA512</a>, and encoding the hash digest using <a href="https://en.wikipedia.org/wiki/Base62">Base 62</a>, which resulted in roughly 86 characters. This was a bit longer than I liked. Going one step lower and using SHA256 would result in roughly 43 characters which was very nice, but I thought that over time it could lead to several collisions.</p>
<p>In the end, I went with a hybrid approach of using the SHA256 digest encoded in Base 62 as the URL suffix, along with the hour the file was uploaded, and the expiration selected as the prefix. This creates a nice 2-level scoping for each content hash which makes it extremely unlikely to ever hit a collision.</p>
<p>One other reason I have chosen to use the creation date (precision to the hour) is that it automatically sorts the content in the S3 Console, which is a nice bonus while debugging or troubleshooting issues.</p>
<pre><code>Original URL format:
https://temp.minibri.com/view/I65a3dvY1pDbeSQf1NHmZvUR4S3FNszuM5WZyovifwpil9YbMl5NHi3bphnt9H9AfjVxHlRbpUJKmXGXyldVIz

Final URL format:
https://temp.minibri.com/view/c1614636000-e1-rbPYezZ4mwnb3fANNZjBGcwEyNHAsvKDQRR0olVCOIG
</code></pre>
<h2 id="why-netlify"><a href="#why-netlify">Why Netlify?</a></h2><p>I play and experiment with many serverless platforms over time, and Netlify is one of my most favourite services at the moment. I wrote <a href="https://www.lambrospetrou.com/articles/battle-of-jamstack-platforms-netlify-vercel-aws/">an article last year comparing it to Vercel and AWS</a>, but I haven’t used it for anything more complex than just static sites before. I decided to give it a go for a full-fledged application, and I really loved its quality and development experience!</p>
<h3 id="notable-highlights"><a href="#notable-highlights">Notable highlights</a></h3><ul>
<li><a href="https://www.netlify.com/products/dev/">Netlify Dev</a> is essentially a local server simulating what Netlify does in the cloud. It is amazing! 🤩 I set it up to work with Next.js (the static site part) along with my custom Webpack configuration for the serverless functions, and it has really been working fantastically.</li>
<li><a href="https://docs.netlify.com/site-deploys/overview/#branches-and-deploys">Pull-request Deploy Previews</a> have been complementary to Netlify Dev in order to fully test the application without affecting the live version in production. For every Github pull request it gives you a URL to access a fully-working application, including the serverless functions which hold the main business logic in my case. Full testing before production achieved! 🏆</li>
<li><a href="https://docs.netlify.com/routing/redirects/">Rewrites and Redirects</a> is such a subtle feature that you think you don’t need it, until you use it. I have been using <a href="https://aws.amazon.com/lambda/edge/">AWS Lambda@Edge</a> with Amazon CloudFront to do simple routing for years, but with Netlify redirects and rewrites this becomes a pleasure. For example, in this application when someone visits <code>/create-content</code>, Netlify transparently forwards the request to <code>/.netlify/functions/create-content</code> which is a serverless function handling the request. And you can even use this to proxy requests to servers on different domains.</li>
<li><a href="https://www.netlify.com/products/functions/">Netlify Functions</a> is another greatly executed feature by the Netlify team. I have used AWS Lambda and Amazon API Gateway for years, inside and outside of Amazon, and avoiding all the hassle of messing with API Gateway to setup basic endpoints is so refreshing. I still build the lambda bundles myself though, rather than letting it <a href="https://docs.netlify.com/functions/build-with-javascript/#unbundled-javascript-function-deploys">go full autopilot</a> in order to control exactly what goes into the bundle.</li>
</ul>
<h2 id="aws-resources"><a href="#aws-resources">AWS Resources</a></h2><p>As you probably realised at this point, the application needs an Amazon S3 bucket, outside of Netlify’s control. And since this bucket will be into my account, we also need credentials to access it from Netlify Functions, thus we need an <a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users.html">IAM User</a>. I provision the AWS resources used by the application using <a href="https://docs.aws.amazon.com/cdk/latest/guide/home.html">AWS CDK</a> which is great compared to raw CloudFormation, or even other infrastructure-as-code tools.</p>
<h2 id="total-cost"><a href="#total-cost">Total Cost $$$</a></h2><p>Depends on how many people will actually use this, but unless it gets hundreds of thousands of uploads per day, it’s not going to cost me more than a Hot Chocolate ☕ per month.</p>
<h2 id="future-plans"><a href="#future-plans">Future Plans</a></h2><p>These are just some ideas I might implement in the near, and far, future.</p>
<ul>
<li>Password protected content 🔐</li>
<li>Support more source formats, which was one of the original project goals 📋</li>
<li>Add an editor to write content directly on the website ✏️</li>
<li>Never-expiring content (paid feature?) 💰</li>
<li>Multiple files, including images, as part of the same shareable link (paid feature?) 💰</li>
<li>Delete after read (remember Inspector Gadget’s self-destructed mission papers?) 💣</li>
</ul>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>My favourite AWS services, AWS Lambda and Amazon S3, along with Netlify’s superb simplicity and great development experience, are all we need to build powerful applications! 🥳 🚀</p>
<h3 id="changelog"><a href="#changelog">Changelog</a></h3><ul>
<li>2021-03-02<ul>
<li>Added sections <a href="#no-javascript-required">No JavaScript required</a>, <a href="#content-server-inside-an-iframe">Content server inside an iframe</a></li>
<li>Updated section <a href="#direct-download-from-amazon-s3">Direct Download from Amazon S3</a></li>
</ul>
</li>
</ul>

</article>
<article>
<h1>Goodbye 2020, Hello 2021</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Wed, 30 Dec 2020 00:00:00 GMT</p>
<p>I don’t usually write end-of-year articles, but 2020 was an eventful year.</p>
<h2 id="2020-look-back"><a href="#2020-look-back">2020 Look back</a></h2><ul>
<li><p>Early January 2020 I began preparing for job interviews. Having spent 5 years at Amazon/AWS, it was time to embrace new adventures. After two months of preparation, and one month of sporadic interviews, I accepted an offer from Facebook, and by the end of April I joined <a href="https://www.whatsapp.com/">WhatsApp</a>. It was definitely a weird, but exciting, time to join a new company during a pandemic lockdown, with all the nuances of onboarding in a new team over video calls 😅 Fun fact, one of my competing offers was from <a href="https://www.cloudflare.com/">Cloudflare</a>, which I loved as a company and as a team, and I am still debating if I made the right choice… Especially watching their stock price quadrupling in just one year 😱</p>
</li>
<li><p>Back in April, I <a href="https://www.lambrospetrou.com/articles/hobby-languages-for-2020/">wrote an article</a> about programming languages I wanted to play with during the rest of the year; ClojureScript, ReasonML, and Go. Fast forward to present, I pretty much gave up on ReasonML because it seemed to be a confused language with <a href="https://rescript-lang.org/bucklescript-rebranding">several rebrandings</a> throughout the year. Maybe I will revisit once the dust settles. ClojureScript is still nice, being a Lisp and all, but unless somebody uses Clojure on the backend as well, it’s much easier sticking to JavaScript or TypeScript on the frontend. Go (or golang) still remains one of my most productive languages. Also, joining WhatsApp meant that I spent a lot of time working with Erlang, and BEAM VM, and since Python is the defacto scripting language in Facebook, that’s two more languages for the year.</p>
</li>
<li><p>Around summer, the pandemic really kicked in and I spent most of the time going for walks, when the London weather allowed, but mostly watching Netflix and Amazon Video. <strong>Goodbye social life, hello big fat ass.</strong> I have to say that there is a lot of great content in the sidelines of these straming services, but there is a <strong>lot of garbage</strong> as well… I sort of regret spending so many months binge watching some things 😫</p>
</li>
<li><p>During summer, the idea of starting my own business/startup was really getting prevalent in my mind. I started reading lots of material and books on small companies, and got really intrigued by the concept of <strong>Company of One</strong>. This would be my ideal situation in the future, so I still have hope. Some of the best books I read during this period:</p>
<ul>
<li><a href="https://smile.amazon.co.uk/Company-One-Staying-Small-Business/dp/1328972356">Company of One: Why Staying Small is the Next Big Thing for Business</a></li>
<li><a href="https://basecamp.com/shapeup">Shape Up: Stop Running in Circles and Ship Work that Matters</a></li>
<li><a href="https://basecamp.com/books/rework">REWORK</a>: Worths its number of pages in gold. Amazing book.</li>
<li><a href="https://basecamp.com/books/getting-real">Getting Real</a>: This is similar to REWORK, but explicitly adapted and extended for building web applications. <strong>A trully marvelous book!</strong></li>
<li><a href="https://refactoringui.com/book/">Refactoring UI</a>: Another gem for anyone making web applications. Insightful tips on great UI design that even developers can follow to build amazing products.</li>
<li><a href="https://smile.amazon.co.uk/Lean-Startup-Innovation-Successful-Businesses/dp/0670921602">The Lean Startup: How Constant Innovation Creates Radically Successful Businesses</a></li>
<li><a href="https://www.startupschool.org/curriculum">Startup School by Y Combinator</a>: These are the open lectures by the famous startup school in California. I went through all the lectures and the content is magnificent. Really down to earth, with invaluable tips, and many times surprising truths exposed about some of the megastartups we all use today.</li>
</ul>
</li>
<li><p>Entering the last quarter of the year, I finally pulled the plug and decided to launch a side-project I have been thinking for a while. <a href="https://www.lambrospetrou.com/articles/minibri-score-the-inception/">Minibri Score</a> is a sports prediction game that is actually fun; the 10-word pitch line. I was sidetracked a bit on this, but I am planning to release the first version over the next few weeks.</p>
</li>
<li><p>By the end of the year, I started getting into and learning applied Machine Learning. I am not overly fond of the underlying theory, and I am mostly interested in the applications and use-cases of Machine Learning. Same way that I am very interested learning new programming languages and using them, but I am not excited by programming language theory or compiler internals. I am mostly answering the questions, what are the most popular algorithms and approaches used, how can I use them, and how can I use them at scale. I look forward to getting more into this next year.</p>
</li>
</ul>
<h2 id="2021-look-forward"><a href="#2021-look-forward">2021 Look forward</a></h2><ul>
<li><p>Programming language wise for 2021 I want to focus on less languages and go a bit deeper. These will be <a href="https://www.rust-lang.org/">Rust</a> for <a href="https://rust-lang.org/what/networking">server backends</a> and <a href="https://www.rust-lang.org/what/wasm">WebAssembly</a>, and <a href="https://scikit-learn.org/stable/">Python for Machine Learning</a>. If I have time I want to start working with <a href="https://kotlinlang.org/">Kotlin</a> again since it’s one of the languages I enjoyed working with a lot in the past.</p>
</li>
<li><p>As mentioned in the previous section, I just started reading more about Machine Learning. I already watched some lectures, and read tens of articles, but I want to focus on some specifics over the next few months. I shortlisted my studying to the following, which is subject to change as I learn more:</p>
<ul>
<li><a href="https://smile.amazon.co.uk/gp/product/1492032646">Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems</a>: Seems to be exactly what I want. Applied ML using the most popular tools, and all the reviews I read are very positive.</li>
<li><a href="https://smile.amazon.co.uk/gp/product/149207294X/">Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python</a>: Some of the most used techniques in ML/AI are purely based on statistics, and this book seems to cover them nicely.</li>
<li>After learning the basics above, next comes using Deep Learning (DL) approaches. I bought some nice books last year but never went through them, so the following are specific to deep learning:<ul>
<li><a href="https://smile.amazon.co.uk/gp/product/1492045527">Deep Learning for Coders with fastai and PyTorch: AI Applications Without a PhD</a>: There is also the <a href="https://course.fast.ai/">Practical Deep Learning for Coders</a> course which complements the book.</li>
<li><a href="https://www.manning.com/books/deep-learning-with-python">Deep Learning with Python</a>: I read the first few chapters months ago, so I know it’s good enough to pick up again.</li>
<li><a href="https://www.manning.com/books/deep-learning-and-the-game-of-go">Deep Learning and the Game of Go</a></li>
</ul>
</li>
</ul>
</li>
<li><p>Launch the first version of <a href="https://score.minibri.com">https://score.minibri.com</a>. <strong><span style="color: black">Simple.</span> <span style="color: #146396">Friendly.</span> <span style="color: #ff1154">Fun.</span></strong></p>
</li>
<li><p>I want to finally convince myself to jump in the exciting world of starting my own business. I have a few ideas in mind, and a few others being discussed with some friends, so it’s just a matter of overcoming my fears mostly. Especially now, since due to the COVID-19 situation, there is not much to lose. The majority of days will be spent locked up inside the house anyway, so why not take advantage of that free time.</p>
</li>
<li><p>I read some amazing books in 2020, and I want to do it more in 2021. I plan to reduce the amount of time I spent on streaming, and increase reading time. My kindle library is getting bigger already 📚</p>
</li>
<li><p>2020 was good for my career, but horrible for my fitness. Staying at home the whole day, especially during winter or rainy months was a disaster. Hopefully, in 2021 I will manage to get my ass out of my chair and get active again. I am starting to have built-in swim rings, and it doesn’t look good 🐷</p>
</li>
</ul>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>2020 was quite the year for me, mostly in a good way. I am very thankful, and I acknowledge that this was a luxury not many people had this year.</p>
<p>Hoping for an even better 2021, and wishing that health and hope will become the trend of the new year, contrary to the current one.</p>
<p>Be safe, Be healthy 😉</p>

</article>
<article>
<h1>Life In The UK test — Tips for 2020</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Tue, 24 Nov 2020 00:00:00 GMT</p>
<h2 id="short-version"><a href="#short-version">Short version</a></h2><ul>
<li>I studied for roughly 2 full days, so even if you take it easy one week should be plenty.</li>
<li>Buy the <a href="https://smile.amazon.co.uk/dp/B00OYTAXYQ">Life in the United Kingdom: A Guide for New Residents, 3rd edition</a> book and read it at least once thoroughly.</li>
<li>Download the <a href="https://play.google.com/store/apps/details?id=com.bubblingiso.LifeintheUKTest">Life in the UK Test 2020</a> Android application (sorry, find an alternative in iOS). Do all of the 48 tests (as of November 2020), until you score 22+/24 in all of them, and then keep repeating them until your exam.</li>
</ul>
<h2 id="long-version"><a href="#long-version">Long version</a></h2><h3 id="the-book"><a href="#the-book">The book</a></h3><p>I strongly recommend buying the <a href="https://smile.amazon.co.uk/dp/B00OYTAXYQ">Life in the United Kingdom: A Guide for New Residents, 3rd edition</a> book 📖 Especially if you are like me and prefer reading a “story” rather than just memorizing 100 names and dates taken out of context.</p>
<p>I did a quick read of the whole book cover to cover, and then did a more focused re-read of topics like the law system, or some important leaders, i.e. Queen Victoria, Elizabeth, the wars with the French, etc. (1 day)</p>
<p>The idea is simple, every question on the test is literally a sentence taken out of the book, formed as multiple-choice. Therefore, reading the book at least once gets you 50-60% there.</p>
<p>The only thing I really wish the book provided is a simple timeline with just the names of all the different leaders/queens/kings on the same line in order, otherwise I found it hard to remember all the James and Charles. There is a Henry VIII for God’s sake.</p>
<p>Therefore, I did this stupidly simple timeline myself the night before the exam: <a href="https://gist.github.com/lambrospetrou/68241a9243312d95b19812ebf36109df">https://gist.github.com/lambrospetrou/68241a9243312d95b19812ebf36109df</a></p>
<p>Feel free to use it, or even submit edit requests if you want to add something missing or fix some inaccuracy. Please note that this not supposed to be a summary of the book, or even a list of all the events. Its main purpose is to list important people, along with “some” events.</p>
<h3 id="practice-tests"><a href="#practice-tests">Practice tests</a></h3><blockquote>
<p>Practice makes perfect! - someone wise</p>
</blockquote>
<p>After reading the book, your next step is to do practice tests, <strong>lots of them</strong>. There are many websites, and applications for both iOS and Android, but I suggest using one where you can see why an answer is wrong or correct. This is super important! Also, it’s important to find an app with many tests with a broad coverage of the book chapters.</p>
<p>I used the <a href="https://play.google.com/store/apps/details?id=com.bubblingiso.LifeintheUKTest">Life in the UK Test 2020</a> Android application, and it was <strong>immensely helpful to see the exact exerpt from the book</strong> that justifies why the answer is correct or wrong. Reading those sentences over and over, even the ones you already know, will just stick them in your brain for good 🏋🏽‍♀️ This application has a “cramming mode”, where it shows you the answer before-hand. I <strong>did not use that though</strong>, because putting some effort every time to make a choice forces you to think, and this forces your brain to start doing associations between names, dates, and events. Reading the answer will just magnify that reasoning.</p>
<p>The above application has 48 tests as of the time of writing. This was my practice methodology:</p>
<ol>
<li>Did every test in order, but before moving to the next one, I repeated the test until I scored at least 22 out of 24. Anything less, and I was doing the test again.</li>
<li>Once I did all the tests, I started doing them again in random order. However, same as before, if I scored below 22/24 I repeated the test.</li>
</ol>
<p>Note that there are no difficulty levels in this test. All of the questions are pretty much of the same difficulty depending on what you are good at remembering. For example, I am horrible at remembering all the sportswomen/men and poets, so I came up with memorable tricks to remember at least some of them, either to help pick the correct answer, or to eliminate the wrong ones. This memorable information does not have to be anything scientific. I am even proud to say that for some of them I associated the sound of their name to their invention or sport 😅</p>
<p>Overall, I spent roughly 6 hours just doing tests. The night before the exam I was at a point of completing a full test within 2 minutes, making just 1-2 mistakes, if any at all.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>Read the book, do a hell lot of tests, and use your imagination to remember stuff.</p>
<p>Good Luck 🥳 🙌</p>

</article>
<article>
<h1>Minibri Score — The Inception</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 25 Oct 2020 00:00:00 GMT</p>
<h2 id="tl-dr"><a href="#tl-dr">TL;DR</a></h2><p><strong>Minibri Score</strong> is a sports prediction game that is actually fun! </p>
<p>Visit <a href="https://score.minibri.com">https://score.minibri.com</a> to signup for early access 🚀 and stay tuned for its release ⏳</p>
<h2 id="why"><a href="#why">Why?</a></h2><p>Whenever there was a big football event, like <a href="https://en.wikipedia.org/wiki/UEFA_Euro_2016">UEFA Euro 2016</a> and <a href="https://www.fifa.com/worldcup/">FIFA World Cup</a>, me and my friends were doing contests on who can predict the score, top scorers, etc.</p>
<p>We were using a Google Spreadsheet (or Excel) with some formulas, and in some cases extended it with a bit of JavaScript code to calculate our special point awards and generate a ranking table. At the end of the tournament the player with the most points was the winner (you can put money down to make it more interesting 💸 but don’t get too addicted…).</p>
<p>The funny thing is that this is something that many groups of friends do, and it’s even popular among colleagues in companies, since it’s a nice bonding activity. Who doesn’t like some healthy competition!</p>
<p>Three years ago, one of my friends had the idea of taking what we were doing on spreadsheets, and make a product out of it. It didn’t seem exciting to me at first; who was going to use it, and who was even going to pay for it. We did some research and there were not a lot of similar alternatives on the market. So, we spend the next few weeks discussing the product on and off. After a few weeks of inactivity it basically went to sleep.</p>
<p>A few months ago, I wanted to start something on my own, outside of work, a side-project that I was going to have fun working with. If it got any traction, and customers used it, even better. Therefore, I revived that original idea, refined it, and decided to implement it. The timing now is actually quite good, since next year there will be several big sports events, and it will be a great opportunity to use it, even if just among my friends.</p>
<h2 id="what"><a href="#what">What?</a></h2><p>The idea behind Minibri Score is not new, it’s not innovative, and it does not have to be.</p>
<p>I spent a few weeks looking for alternatives, competitors, but apart from 1 or 2 others, everything else is way too bloated. Full of advertisements all over the screen, a million menus and complex options, and most of them really focused on just a couple of sports. They all try to be another fantasy league. For context, a fantasy league is where you basically focus into a specific league tournament (e.g. Premier League in England), and choose/buy/sell players, take points every time the players you chose score, or their teams win, etc.</p>
<p><strong>Minibri Score is not a fantasy league.</strong> I don’t want it to be one.</p>
<p>The only thing a user will have to do is predict the outcome of a fixture. All of the fun, and strategy, revolves around the point awards. There are several ways to get points, and they all use the score prediction versus the final score in some sort. Maybe you have to take a big risk with a low-chance score to win the big points, or always play safe and getting steady points.</p>
<h3 id="which-sport"><a href="#which-sport">Which Sport?</a></h3><p>One of the major drawbacks of the alternatives, which is a consequence of going with a fantasy league style of tournament, is the fact that you can only find a couple sports supported, mainly football. There is just too much work into the specifics of a single sport.</p>
<p>One of the main goals I have with Minibri Score is to allow any two-sided sport to be supported right from the start. There are several things that are sport specific, for example volleyball has sets instead of just a final score. There will be exciting point awards around the sport specifics, but the core functionality is going to be shared across all sports.</p>
<p><strong>👉🏽 As long as it’s a two-sided sport, Minibri Score just works 🙌🏽</strong></p>
<p>I consider this a big differentiator, and an advantage since many people will be able to have fun. You will be able to compete in a tournament with teams from your local town that you would never find in major fantasy leagues. Or even create a tournament for a sport that is not as popular, but still follows the same principles of two sides and a score.</p>
<h2 id="simple-friendly-fun"><a href="#simple-friendly-fun">Simple. Friendly. Fun.</a></h2><p>I want a product that is:</p>
<ul>
<li>Simple to use without complex menus and unnecessary options.</li>
<li>Friendly, without advertisements and flashy popups coming out from everywhere.</li>
<li>Fun among friends and colleagues alike.</li>
</ul>
<h2 id="when"><a href="#when">When?</a></h2><p>I am hoping to release the first version before the end of the year 🥳</p>
<p>Next year is going to be exciting 🎾 🏓 🏏 ⛹️‍♀️ ⚽️</p>
<p>👉🏿 Visit <a href="https://score.minibri.com">https://score.minibri.com</a> to signup and I will email you when it’s out and ready 🚀</p>
<p>You can also follow its development on Twitter at <a href="https://twitter.com/MinibriScore">https://twitter.com/MinibriScore</a> 🐦</p>

</article>
<article>
<h1>Simple and Low-cost Investing guide (for the UK)</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 04 Oct 2020 00:00:00 GMT</p>
<p>Over the past few years I read a lot of articles, blog posts, and a few books on stock market investing. I wanted a simple approach that would have good enough returns, but without spending lots of time maintaining a portfolio and risking losing money just because I was a few hours late selling some stock.</p>
<p>In this article I will describe what I do as of today, and why I think it’s good enough for anyone that also strives for a simple low-maintenance approach with good returns long-term (10+ years).</p>
<h2 id="tl-dr-the-short-version"><a href="#tl-dr-the-short-version">TL;DR - The short version</a></h2><ol>
<li>Max your <a href="https://www.gov.uk/workplace-pensions/what-you-your-employer-and-the-government-pay">company’s pension contributions</a> to get their maximum rate, it’s free money.</li>
<li>Open a <a href="https://www.vanguardinvestor.co.uk/investing-explained/stocks-shares-isa">Stocks and Shares ISA</a> account.</li>
<li>Setup a monthly direct debit to invest in low-cost, low-maintenance index funds like the <a href="https://www.vanguardinvestor.co.uk/investments/vanguard-ftse-global-all-cap-index-fund-gbp-acc">FTSE Global All Cap Index Fund</a> and <a href="https://www.vanguardinvestor.co.uk/investments/vanguard-ftse-developed-world-ex-uk-equity-index-fund-gbp-acc/overview">FTSE Developed World ex-U.K. Equity Index Fund</a> inside your ISA account.</li>
<li>Save money in <a href="https://www.nsandi.com/premium-bonds">NS&I Premium Bonds</a>, and win monthly tax-free prizes.</li>
<li>If you still have money after maxing out all the previous options, then awesome for you, keep investing in global index funds using a regular investing/trading account.</li>
</ol>
<p>That’s it, without too much effort we maximized our long-term returns in a relatively low-risk, low-cost approach 💸</p>
<h2 id="stocks-and-bonds"><a href="#stocks-and-bonds">Stocks and Bonds</a></h2><p>The most important book I read which I recommend everyone to also read is <a href="https://smile.amazon.co.uk/Little-Book-Common-Sense-Investing/dp/1119404509/">The Little Book of Common Sense Investing</a>. The main idea is that to avoid most of the risk, albeit reducing potential returns, is to invest <strong>in the whole stock market</strong> using low-cost index funds. This approach exposes you to the entirety of traded companies, which means that if a company goes under water it won’t affect you a lot, since some other company that did well will cancel out the losses.</p>
<p>Based on this simple idea, my current portfolio includes:</p>
<ul>
<li><a href="https://www.vanguardinvestor.co.uk/investments/vanguard-us-equity-index-fund-gbp-acc">U.S. Equity Index Fund</a> - This fund invests in the top (roughly) 3500 companies in the US. This fund is very simple, thus having very low annual maintenance cost at <code>0.10%</code>. This is the UK version of the super famous US fund <a href="https://investor.vanguard.com/mutual-funds/profile/VTSAX">Vanguard Total Stock Market Index Fund - VTSAX</a>.</li>
<li><a href="https://www.vanguardinvestor.co.uk/investments/vanguard-ftse-global-all-cap-index-fund-gbp-acc">FTSE Global All Cap Index Fund</a> - This fund invests in the top (roughly) 7000 companies in developed and emerging markets around the world. The annual maintenance cost is a bit higher on this fund (<code>0.23%</code>) but still well below the market average for actively-maintained funds. This is the UK version of the US fund <a href="https://investor.vanguard.com/mutual-funds/profile/VTWAX">Vanguard Total World Stock Index Fund - VTWAX</a>. Note that this fund is somewhat a superset of the above, and just using this one can be enough and simpler, depending on if you want exposure outside the US.<ul>
<li><a href="https://www.vanguardinvestor.co.uk/investments/vanguard-ftse-developed-world-ex-uk-equity-index-fund-gbp-acc/overview">FTSE Developed World ex-U.K. Equity Index Fund</a> - <strong>Update 2023:</strong> I switched from the Global All Cap Index fund to the Developed World Index fund. It excludes the Emerging Markets which had dissapointing growth over the recent years, and I prefer the more stable developed world. This fund also excludes the UK to avoid home-bias, which is OK for me, since living in the UK and getting paid in British Pounds (£) is enough exposure to the UK. It has a lower annual maintenance cost too at <code>0.14%</code>.</li>
</ul>
</li>
</ul>
<p>Both funds are relatively new (just a few years old) which explains why their total invested assets are way below the corresponding ones in the US (hundreds of billions), but hopefully as they get bigger, their cost will get further down, and maybe even reach the US-levels at some point 😅</p>
<p>All reputable investing resources suggest that someone invests in bonds as well, which have much lower returns but also much lower risk. I personally don’t invest in bonds, since I have several decades of investing ahead of me, so I don’t need the lower risk at the moment. But, if you want to have some exposure in bonds, the following is a good global fund.</p>
<ul>
<li><a href="https://www.vanguardinvestor.co.uk/investments/vanguard-global-bond-index-fund-gbp-hedged-acc">Global Bond Index Fund</a> - This fund invests in more than 12000 bonds from around the world, with an annual maintenance cost of 0.15%.</li>
</ul>
<p>There are more things to consider, but by investing in the above funds you are exposed to the whole market at a low cost, and with good enough potential returns.</p>
<p>Honestly, you could simplify this even further. If you really want a <strong>no-maintenance approach</strong>, then you can just invest in one of the <a href="https://www.vanguardinvestor.co.uk/investing-explained/what-are-lifestrategy-funds">LifeStrategy Funds</a> (or similar alternatives) and let the fund managers figure out which funds to invest in. You can choose the pecentage you want to be invested in stocks, the rest is invested in bonds, depending on your risk appetite. For example, the LifeStrategy 40% Fund invests 40% of your money in stocks, and 60% in bonds. Therefore, the easiest approach would be to invest in the LifeStrategy 80% or 100% funds early in your career, and switch to the 20% or 40% funds closer to your retirement.</p>
<p>I use the Vanguard funds because I find them to be among the cheapest and the best around, but keep in mind that most of the investment companies and brokers provide their own variations and alternatives, so do your own research when choosing a broker.</p>
<h3 id="us-versus-international"><a href="#us-versus-international">US versus International</a></h3><p>As I already mentioned, the U.S. Equity Index Fund (like VTSAX) is only investing in US companies, whereas the FTSE Global All Cap Index Fund (like VTWAX) invests in international markets as well.</p>
<p>There are huge and myriad debates among professionals and individual investors trying to decide what’s the best. Trying to answer the question of how much you should invest internationally, and if you even need to. Each side comes up with all sorts of (valid) arguments supporting their view. You can read some of the discussions at [<a href="https://investor.vanguard.com/investing/investment/international-investing">1</a>] [<a href="https://www.bogleheads.org/wiki/Domestic/International">2</a>] [<a href="https://www.reddit.com/r/Bogleheads/comments/i6041z/vtsaxvtiax_vs_vtwax/">3</a>] [<a href="https://www.reddit.com/r/financialindependence/comments/ans6sj/the_100_vtsax_approach_appears_inconsistent_with/">4</a>] [<a href="https://www.reddit.com/r/Bogleheads/comments/bi1q9t/100_vtwax/">5</a>] [<a href="https://www.bogleheads.org/forum/viewtopic.php?f=1&t=281152">6</a>].</p>
<p>Personally, I believe that it makes sense to have international exposure. However, I find the US market stronger at the moment, hence why I invest in both funds, so that I can tilt my weight more on the US market. Having said that, I gradually increase my share of the global fund because I believe that non-US markets will thrive over the next few years, maybe even more than the US.</p>
<p>This is just another speculation, my speculation, thus if you have no opinion just go with the global fund and relax.</p>
<h2 id="stocks-and-shares-isa"><a href="#stocks-and-shares-isa">Stocks and Shares ISA</a></h2><p>In the UK there are some special savings accounts called <a href="https://www.gov.uk/individual-savings-accounts">Individual Savings Accounts (ISA)</a> which allow you to save up to £20000 a year (as of 2020), and any interest you get is tax-free.</p>
<p>The great part is that you can have a <a href="https://www.vanguardinvestor.co.uk/investing-explained/stocks-shares-isa">Stocks and Shares ISA</a> account, which means that you can invest in the funds mentioned above and any return you get is going to be tax-free. As you can imagine, over the years, the reinvestment of compound tax-free interest can result in substantial money 💰</p>
<p>In my opinion, everyone should use a Stocks and Shares ISA, and if they exceed the yearly limit, then proceed with a normal stock trading account.</p>
<h3 id="lifetime-isa-lisa"><a href="#lifetime-isa-lisa">Lifetime ISA - LISA</a></h3><p>I haven’t personally used this (yet), but if you are planning on buying a home in the UK, or want to plan for retirement it could be a nice addition to your normal ISA account. You can put up to £4000 into a LISA account, and the government will add an additional 25% of your investment up to £1000, only applicable till you are 50 years old.</p>
<p>The £4000 comes out of the total ISA allowance, which is why it might not be useful for someone not planning on staying in the UK for long, since the only way to get the money out of the LISA account is to either buy a house, or withdraw the money once you reach 60 years old.</p>
<h2 id="ns-amp-i-premium-bonds"><a href="#ns-amp-i-premium-bonds">NS&I Premium Bonds</a></h2><p>I recently found out about the <a href="https://www.nsandi.com/premium-bonds">NS&I Premium Bonds</a>, which is a 100% safe way to store your money since they are backed by the government, and they have monthly money draws that give out <strong>tax-free prizes</strong>.</p>
<p>There is no guaranteed return on the premium bonds, but for every £1 you invest you get one entry in the monthly draw. Prizes range from £25 to millions, and therefore if you are lucky it can be very beneficial.</p>
<p>Keep in mind that unless you win the big prizes this is not going to match the returns of the stock market. However, in the few months I have been using this, I got more returns than any savings account rate a traditional bank would give, which makes this a winning option for me.</p>
<h2 id="bank-savings-accounts"><a href="#bank-savings-accounts">Bank Savings accounts</a></h2><p>A few years ago you could find traditional bank savings accounts, or even current accounts, with high interest rates in returns. For example, the Lloyds Club Current Account had 5% return up to a £5000 balance, which over the years got reduced down to 4%, 3%, and now it’s around 1%.</p>
<p>However, I can’t find any savings account these days that’s worth saving money in, therefore I recommend just putting a small amount of money into your Stocks and Shares ISA using one of the above low-maintenance funds instead.</p>
<h2 id="stock-trading-brokers"><a href="#stock-trading-brokers">Stock Trading Brokers</a></h2><p>I am not writing this article to advertise a specific broker, even though I solely provided links to Vanguard 😅</p>
<p>The only suggestion I have is to go with a broker that has great customer support, and low annual maintenance costs. These are the two things I care about, especially since I want a low-cost approach, and a good customer support is always helpful when I have questions.</p>
<p>As an example, <a href="https://www.vanguardinvestor.co.uk/what-we-offer/fees-explained">Vanguard’s low fees, clear costs</a> approach is one of the things I love about it. There is an annual account cost of 0.15%, and in addition to that the annual maintenance cost for the funds invested. So far, 0.15% account management cost is the lowest I have found, hence why I use and recommend them.</p>
<blockquote>
<p>One low account fee
Just 0.15% per year</p>
<p>Capped at £375 per year for accounts over £250,000</p>
<p>— By <a href="https://www.vanguardinvestor.co.uk/what-we-offer/fees-explained">Vanguard Investor</a></p>
</blockquote>
<p>As of 2023, <a href="https://www.interactivebrokers.co.uk/en/trading/isa-accounts.php">Interactive Brokers also introduced their ISA accounts</a>, which has even lower fees. It chargers £3 / €3 per trade for Western European stocks, with a minimum £3 charge per month.</p>
<blockquote>
<p>£3 / €3 per trade for Western European stocks, with no added spreads, account minimums or platform fees. These simplified commission rates are available with IB SmartRoutingSM, which optimizes the execution quality for clients by accessing the many exchanges and trading venues across the continent.1 Pricing on US stocks starts at just USD 0.005 per share.</p>
<p>There is a minimum monthly activity fee of £3 for a Stocks and Shares (adult) ISA and £1 for a JISA. You receive one free withdrawal per month and there are no custody fees for all account types. All ISA accounts are cash only, no margin.</p>
<p>— By <a href="https://www.interactivebrokers.co.uk/en/trading/isa-accounts.php">Interactive Brokers</a></p>
</blockquote>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>No matter how much money you can save per month, even if it’s just £50-100, it’s highly worth it to open up a Stocks and Shares ISA account, and invest that money into one of the low-cost, low-maintenance, global funds mentioned above (or similar alternatives) and enjoy the power of tax-free <a href="https://www.investopedia.com/terms/c/compoundinterest.asp">compound interest</a>. 💸⛱️💰🏡</p>

</article>
<article>
<h1>V8 Isolates for fast JavaScript execution in Go</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sat, 26 Sep 2020 00:00:00 GMT</p>
<p>I am working on a side project similar to Leetcode/HackerRank (more on that in another article) where I have to execute some given JavaScript code against hundreds of inputs. The first approach I thought was spawning a subprocess and executing <code>node -e "<code-to-evaluate>"</code> directly passing the script I want to run. This works fine and gets the job done in a low throughput scenario.</p>
<p>However, I remembered that <a href="https://workers.cloudflare.com/">Cloudflare Workers</a>, Cloudflare’s serverless offering, is using V8 Isolates to execute the submitted JavaScript code [<a href="https://developers.cloudflare.com/workers/learning/how-workers-works#isolates">1</a>][<a href="https://blog.cloudflare.com/cloud-computing-without-containers/">2</a>][<a href="https://blog.cloudflare.com/mitigating-spectre-and-other-security-threats-the-cloudflare-workers-security-model/">3</a>]. Therefore, I wanted to see how much faster that can be for my usecase.</p>
<h2 id="experiment"><a href="#experiment">Experiment</a></h2><p>I have written a very basic experiment where the JavaScript code calculates the sum of a given integer array (and some extra arguments). The code is as simple as below:</p>
<pre><code class="language-javascript">// f = (a: number[], b: number, c: string) => number
const f = (a, b, c) => {
  return a.reduce((c, acc) => acc + c, b) + c.length;
};
result = f(%+v, %+v, %+v);
result;
</code></pre>
<p>The weird looking arguments, <code>%+v</code>, are just placeholders which will be replaced in my Go program with actual values. Remember, the reason I wanted this in the first place is to run some code (in this case the function <code>f</code>) against lots of input arguments.</p>
<p>The last line evaluates to the value of <code>result</code>, just imagine that this is interpreted line by line.</p>
<h3 id="spawn-node"><a href="#spawn-node">Spawn Node</a></h3><p>The simplest approach I thought was to invoke <code>node</code> passing the JavaScript code to be evaluated.</p>
<pre><code class="language-go">func node(script string) string {
  output, e := exec.Command("node", "-e", script+"\nconsole.log(result)").Output()
  if e != nil {
    log.Fatalf("Error: %+v\n", e)
  }
  return string(output)
}
</code></pre>
<p>The <code>script+"\nconsole.log(result)"</code> might seem strange at first. We need to suffix our script with <code>console.log(result)</code> since we capture the output of the subprocess.</p>
<h3 id="v8-isolates"><a href="#v8-isolates">V8 Isolates</a></h3><p>I won’t go into the details of what a V8 Isolate is other than quoting <a href="https://developers.cloudflare.com/workers/learning/how-workers-works#isolates">Cloudflare’s documentation</a>…</p>
<blockquote>
<p>V8 orchestrates isolates: lightweight contexts that group variables with the code allowed to mutate them. You could even consider an isolate a “sandbox” for your function to run in.</p>
<p>A single runtime can run hundreds or thousands of isolates, seamlessly switching between them. Each isolate’s memory is completely isolated, so each piece of code is protected from other untrusted or user-written code on the runtime. Isolates are also designed to start very quickly. Instead of creating a virtual machine for each function, an isolate is created within an existing environment. This model eliminates the cold starts of the virtual machine model.</p>
</blockquote>
<p>And <a href="https://chromium.googlesource.com/chromium/src/+/master/third_party/blink/renderer/bindings/core/v8/V8BindingDesign.md">Chromium’s documentation</a>.</p>
<blockquote>
<p>An isolate is a concept of an instance in V8. In Blink, isolates and threads are in 1:1 relationship. One isolate is associated with the main thread. One isolate is associated with one worker thread.</p>
<p>A context is a concept of a global variable scope in V8. Roughly speaking, one window object corresponds to one context.</p>
</blockquote>
<p>Let’s see how we create and use a V8 Isolate in Go, using <a href="https://github.com/rogchap/v8go">https://github.com/rogchap/v8go</a>.</p>
<pre><code class="language-go">func v8isolates(script string, isolateOpt ...*v8go.Isolate) string {
  var isolate *v8go.Isolate
  if len(isolateOpt) > 0 {
    isolate = isolateOpt[0]
  }
  ctx, _ := v8go.NewContext(isolate) // Passing `nil` creates a new Isolate
  defer ctx.Close()
  output, e := ctx.RunScript(script, "function.js")
  if e != nil {
    log.Fatalf("Error: %+v\n", e)
  }
  return output.String()
}
</code></pre>
<p>The main method we use is <code>ctx.RunScript(script, filename)</code> which accepts the code to execute (argument <code>script</code>), and a (fake) filename. The filename will be used inside the generated error stacktrace in case the execution of the passed script fails for any reason. The filename itself does not need to actually exist on the filesystem. The returned value of <code>ctx.RunScript(...)</code> is the last evaluated expression of the given script, which explains the last line in our JavaScript code above which is just <code>result</code>.</p>
<h2 id="results"><a href="#results">Results</a></h2><p>Cloudflare makes bold statements about the performance of their Workers due to using Isolates, but I was still mind-blown by how fast they actually work 🤯</p>
<p>I wrote some basic benchmarks for the above two functions and used a simple input for the array (10 to 150 integer numbers).</p>
<p><strong>Macbook 16” 2020 (16-threads)</strong></p>
<pre><code>➜ go test -bench . -benchtime 1s -benchmem
goos: darwin
goarch: amd64
pkg: github.com/lambrospetrou/code-playground/golang-v8isolates
BenchmarkNode-16                              19      62746119 ns/op     43208 B/op     57 allocs/op
BenchmarkNodeParallel-16                     176       6861996 ns/op     43142 B/op     55 allocs/op
BenchmarkV8IsolatesReuse-16                 4171        290742 ns/op        75 B/op      6 allocs/op
BenchmarkV8IsolatesNoReuse-16                823       1414748 ns/op        88 B/op      7 allocs/op
BenchmarkV8IsolatesReuseParallel-16        30026         45703 ns/op        75 B/op      6 allocs/op
BenchmarkV8IsolatesNoReuseParallel-16       2079        983084 ns/op        83 B/op      7 allocs/op
PASS
ok      github.com/lambrospetrou/code-playground/golang-v8isolates    10.218s
</code></pre>
<p><strong>Surface Pro 2017 (4-threads)</strong></p>
<pre><code class="language-bash">$ go test -bench . -benchtime 1s -benchmem
goos: linux
goarch: amd64
pkg: github.com/lambrospetrou/code-playground/golang-v8isolates
BenchmarkNode-4                            27    47009289 ns/op      51359 B/op     92 allocs/op
BenchmarkNodeParallel-4                    76    15338893 ns/op      51548 B/op     92 allocs/op
BenchmarkV8IsolatesReuse-4               3342      385395 ns/op         74 B/op      6 allocs/op
BenchmarkV8IsolatesNoReuse-4              692     1956912 ns/op         84 B/op      7 allocs/op
BenchmarkV8IsolatesReuseParallel-4       5684      213542 ns/op         75 B/op      6 allocs/op
BenchmarkV8IsolatesNoReuseParallel-4      303     3631768 ns/op         83 B/op      7 allocs/op
PASS
ok    github.com/lambrospetrou/code-playground/golang-v8isolates      12.725s
</code></pre>
<p>The difference between <code>BenchmarkV8IsolatesReuse</code> and <code>BenchmarkV8IsolatesNoReuse</code> is that instead of creating a new Isolate per test run, we use one Isolate per thread and only create a new context per run.</p>
<p>As you can see, there is a <strong>huge performance boost</strong> when using V8 Isolates. At least one order of magnitude faster when creating new Isolate per run, and two orders of magnitude with Isolate reuse. It’s not unexpected since spawning a process is quite expensive but still…</p>
<p>We can do a lot of optimizations to how we use the spawned process as well. For example, we can spawn one <code>node</code> process per thread and then communicate with it over standard input/output, and having a small script that reads the standard input, evaluates it, and prints the result on standard output. This will probably be even faster than the V8 Isolates but apart from the fact that we completely lose execution isolation, it’s also a lot more work, so it’s out of scope.</p>
<p>You can find the code for the experiment at <a href="https://github.com/lambrospetrou/code-playground/tree/master/golang-v8isolates">https://github.com/lambrospetrou/code-playground/tree/master/golang-v8isolates</a>.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>V8 Isolates are amazing 🚀 </p>
<p>However, if you are going to use them on a real production multi-tenant system make sure to secure it further (read references below).</p>
<h2 id="references"><a href="#references">References</a></h2><ol>
<li><a href="https://developers.cloudflare.com/workers/learning/how-workers-works#isolates">https://developers.cloudflare.com/workers/learning/how-workers-works#isolates</a></li>
<li><a href="https://blog.cloudflare.com/cloud-computing-without-containers/">https://blog.cloudflare.com/cloud-computing-without-containers/</a></li>
<li><a href="https://blog.cloudflare.com/mitigating-spectre-and-other-security-threats-the-cloudflare-workers-security-model/">https://blog.cloudflare.com/mitigating-spectre-and-other-security-threats-the-cloudflare-workers-security-model/</a></li>
</ol>

</article>
<article>
<h1>AWS Lambda and SQLite3 over Amazon EFS</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sat, 11 Jul 2020 00:00:00 GMT</p>
<p>I am a huge fan of <a href="https://aws.amazon.com/lambda/">AWS Lambda</a> and serverless in general. I also love the reliability, speed, and simplicity of <a href="https://www.sqlite.org">SQLite</a>, and I was looking to find a few projects to use it.</p>
<p>The issue was that until recently you could not use AWS Lambda together with SQLite, other than bundling the whole database file into the Lambda bundle zip file itself. In practice this meant you only had a readonly database though, since each concurrent Lambda invocation would have its own copy of the database.</p>
<p>However, last month, AWS released <a href="https://aws.amazon.com/blogs/compute/using-amazon-efs-for-aws-lambda-in-your-serverless-applications/">integration between AWS Lambda and Amazon EFS</a>. Since Amazon EFS already supported <a href="https://aws.amazon.com/about-aws/whats-new/2017/03/amazon-elastic-file-system-amazon-efs-now-supports-nfsv4-lock-upgrading-and-downgrading/">NFSv4 lock upgrading/downgrading</a> which is needed by SQLite, it means that now we can access an SQLite database file stored on an EFS filesystem through AWS Lambda in a read-write mode 🔥 🚀</p>
<h2 id="quick-test"><a href="#quick-test">Quick Test</a></h2><p>I did a quick test using a Node.js Lambda function and it seems that the EFS connectivity adds around <strong>100ms</strong> of latency overhead to the Lambda invocations when accessing the database over EFS versus a local filesystem. This test is by no means scientific and performance will vary a lot depending on your database file size, the concurrent connections you have, and the number of concurrent writes to the database.</p>
<p>The <a href="https://github.com/lambrospetrou/code-playground/blob/master/aws-lambda-node-sqlite/local.js">simple test application (code available)</a> executes 100 <code>insert</code> statements using a transaction, a <code>select</code> query with a <code>where</code> condition, and then a <code>select *</code> query.</p>
<p><strong>AWS Lambda</strong></p>
<p></p>
<p>The execution time of the statements accessing the SQLite database is roughly <strong>140ms</strong>. There is some variation but that’s what I was averaging after 10+ runs. I also tried several AWS Lambda memory configurations to see if the more powerful CPU would affect the latency in any way but it didn’t. This makes me think that the overhead is pure network IO.</p>
<p><strong>Local - Surface Pro</strong></p>
<p></p>
<p>The execution time of the statements accessing the SQLite database is roughly <strong>30-40ms</strong>. Some variation expected here as well albeit lower since it’s a local filesystem.</p>
<h2 id="is-this-production-ready"><a href="#is-this-production-ready">Is this Production ready?</a></h2><p>Most probably not…</p>
<p>The correct answer is <strong>it depends!</strong></p>
<ul>
<li>Accessing an EFS filesystem adds latency to your Lambda invocations just due to the network overhead.</li>
<li>SQLite was designed to be the best in-process database in the world. SQLite’s website itself <a href="https://www.sqlite.org/whentouse.html">describes amazingly well the appropriate uses for SQLite</a>, including the inappropriate ones. Our experiment, since it introduces a network between the application (Lambda code) and the database file (EFS) falls into the inappropriate uses…</li>
<li>SQLite supports concurrent readers, but only one writer can be writing to the database, therefore only one Lambda invocation will be able to write to the database at any point in time which is going to be a throughput bottleneck.</li>
</ul>
<p>The above facts make it obvious that using SQLite with AWS Lambda is a very bad idea for production systems with thousands of users, or high-throughput of writes, or any system that needs low latency for that matter.</p>
<p>However, for side projects, small team internal projects, applications where latency is not an issue, or situations when you want to build a quick demo prototype it’s probably alright.</p>
<p>I am sure the AWS Lambda users will find <strong>interesting</strong> ways to abuse this integration, but that’s expected, and it’s half the fun 😜</p>

</article>
<article>
<h1>Make it Work, Make it Beautiful, Make it Fast</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sat, 04 Jul 2020 00:00:00 GMT</p>
<blockquote>
<p>The OTP team is respecting the old slogan: first make it work, then make it beautiful, and only if you need to, make it fast.</p>
</blockquote>
<p>I was reading the <a href="https://learnyousomeerlang.com/maps">Learn You Some Erlang</a> book this morning and at some point I stumbled upon the above sentence. I heard this phrase online for the first time about 2 years ago from Jose Valim (creator of <a href="https://elixir-lang.org/">Elixir</a>) during one of his <a href="https://www.twitch.tv/josevalim/videos">Advent of Code videos</a>.</p>
<p>This phrase really resonates with me because this is exactly how I approach writing code and solving problems as well. Let’s elaborate a bit more…</p>
<h2 id="make-it-work"><a href="#make-it-work">Make it work</a></h2><p>The main reason we write code (other than fun) is to solve a problem, and if our solution is not correct, then we didn’t do our job. <strong>Correctness comes first!</strong></p>
<p>During this step, I usually write code without too much thinking into abstractions, how to design the components, function names, or other appealing details. I just want to solve the problem.</p>
<h2 id="make-it-beautiful"><a href="#make-it-beautiful">Make it beautiful</a></h2><p>This step can be intepreted in many ways depending on the programming language used, the domain of the problem, your team, etc. </p>
<p>However, it could be refactoring the public API of your library/module, coming up with the right abstractions, splitting long functions into smaller more testable ones, and anything else that makes the code readable, extensible, and ready to be reviewed by your peers.</p>
<h2 id="make-it-fast"><a href="#make-it-fast">Make it fast</a></h2><p>This step can be considered optional in a lot of day-to-day cases. However, it’s always good practice, and over time it can become a habit to check your code for performance issues. </p>
<p>You solved the problem correctly (step 1), and the code is all pretty and ready to be reviewed (step 2), but is it fast enough? An even better question would be: <em>Is it slow enough to cause problems?</em></p>
<p>The time and effort you put into making your solution fast solely depends on your use-case, but always remember to ask this question.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>Always follow this approach when writing code:</p>
<ol>
<li>Make it work</li>
<li>Make it beautiful</li>
<li>Make it fast</li>
</ol>

</article>
<article>
<h1>Enjoyable browser automation with Puppeteer and Playwright</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sat, 16 May 2020 00:00:00 GMT</p>
<p>I have been doing a lot of web development over the past few years, both in my own time with my side projects but also at work. One of the things that you need at some point is to write a script to automate some kind of action usually done in the browser. For example, this could be navigating to certain pages and taking screenshots, or automated form submission, or automated UI testing.</p>
<p>One of the simplest use-cases I have for my own website is to generate a PDF version of <a href="https://www.lambrospetrou.com/cv/">my CV page</a>. There are several reasons why a PDF file could be better sometimes than a website, so it’s useful to have one in hand. Since I already had the HTML version, what I used to do for several years was to use Chrome’s <strong>Print to PDF</strong> functionality.</p>
<p>However, a couple of years ago, Google released <a href="https://developers.google.com/web/tools/puppeteer">Puppeteer</a> which allows someone to script a headless version of Chromium. From the second I saw it I knew it was going to be big! Easy, simple and headless browser automation through Node. The potential use-cases are infinite 🛸</p>
<p>Since then it has been used by companies to make amazing things.</p>
<ul>
<li>Automated robust UI tests inside headless Chrome, instead of the finicky Webdriver approaches that were the standard back then.</li>
<li>Performance profiling and benchmarking of web applications.</li>
<li>Scripting of manual tasks that people were doing like form submission.</li>
<li>Web crawling through a normal browser to avoid issues with previous approaches that were using browser-like fakes.</li>
<li>Server-side rendering of dynamic single page JavaScript application to avoid SEO issues.</li>
<li>Many other things that required an actual browser running a website.</li>
</ul>
<p>One of the problems with Puppeteer is that it only supports Chromium so far.</p>
<p>I was extremely happy to see a few days ago that Microsoft finally released the v1 version of <a href="https://playwright.dev/">Playwright</a>. Playwright is extremely similar to Puppeteer, but it extends its amazing feature-set by supporting all the three major browsers, Chromium, Firefox, and Webkit 😍 Not only that, but they added very cool features like auto-waiting for element selection which greatly simplifies DOM manipulation.</p>
<p>There are a few features that are not supported by all the browsers, so keep that in mind if something fails in your scripts, make sure that the browser you use supports it.</p>
<h2 id="code"><a href="#code">Code</a></h2><p>From the little time I spent with Playwright its API seems extremely similar to Puppeteer’s so it’s easy for someone to jump between the two or migrate from one to the other.</p>
<p>As an example, see below an excerpt of the code I use to generate the PDF version of my website.</p>
<p>The full code can be found at <a href="https://github.com/lambrospetrou/lpcv/blob/master/html/build-tool/generate-pdf.js">my CV repository</a>.</p>
<h3 id="local-server"><a href="#local-server">Local server</a></h3><p>While I am making changes to my CV I want to be able to generate the PDF and test it without having to push my changes to the live website. The following code starts a local server at port <code>12345</code> that serves the assets from inside the <code>build/</code> directory.</p>
<pre><code class="language-javascript">const handler = require('serve-handler');
const http = require('http');
const path = require('path');
const WS_BUILD=path.join(__dirname, './build');

const server = http.createServer((request, response) => {
    return handler(request, response, {
        public: WS_BUILD
    });
})
server.listen(12345)
</code></pre>
<h3 id="puppeteer"><a href="#puppeteer">Puppeteer</a></h3><p>The following code is what we need to make Puppeteer visit our local server, wait for the page to fully load, and then generate a PDF named <code>cv.pdf</code>.</p>
<pre><code class="language-javascript">const puppeteer = require('puppeteer');

async function generatePdfPuppeteer() {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('http://127.0.0.1:12345', {waitUntil: 'networkidle2'});
    await page.pdf({
        path: path.join(WS_BUILD, 'cv.pdf'), 
        format: 'A4',
        margin: {
            top: '0.39in',
            left: '0.39in',
            right: '0.38in',
            bottom: '0.38in'
        }
    });
    await browser.close();
}
</code></pre>
<h3 id="playwright"><a href="#playwright">Playwright</a></h3><p>The code for using Playwright is (almost) identical to Puppeteer in our simple use-case.</p>
<pre><code class="language-javascript">const { chromium } = require('playwright');

async function generatePdfPlaywright() {
    const browser = await chromium.launch();
    const ctx = await browser.newContext();
    const page = await ctx.newPage();
    await page.goto('http://127.0.0.1:12345', {waitUntil: 'networkidle'});
    await page.pdf({
        path: path.join(WS_BUILD, `cv-playwright.pdf`), 
        format: 'A4',
        margin: {
            top: '0.39in',
            left: '0.39in',
            right: '0.38in',
            bottom: '0.38in'
        }
    });
    await browser.close();
}
</code></pre>
<h3 id="issues-with-wsl-on-windows-10"><a href="#issues-with-wsl-on-windows-10">Issues with WSL on Windows 10</a></h3><p>I am using the <a href="https://docs.microsoft.com/en-us/windows/wsl/about">Windows Subsystem for Linux (WSL)</a> version 1, and I had issues while running both Puppeteer and PlayWright due to some incompatibilities with the browser binaries.</p>
<p>You can find more information for this issue in these discussions:</p>
<ul>
<li><a href="https://github.com/puppeteer/puppeteer/blob/master/docs/troubleshooting.md#setting-up-chrome-linux-sandbox">https://github.com/puppeteer/puppeteer/blob/master/docs/troubleshooting.md#setting-up-chrome-linux-sandbox</a></li>
<li><a href="https://github.com/loteoo/hyperstatic/pull/20/files">https://github.com/loteoo/hyperstatic/pull/20/files</a></li>
</ul>
<p>To make Chromium work the arguments <code>--no-sandbox</code>, <code>--disable-setuid-sandbox</code>, and <code>--single-process</code> have to be passed to the <code>launch()</code> method. See the modified lines below.</p>
<h4 id="puppeteer"><a href="#puppeteer">Puppeteer</a></h4><pre><code class="language-javascript">const browser = await puppeteer.launch({args: ['--no-sandbox', '--disable-setuid-sandbox', '--single-process']});
</code></pre>
<h4 id="playwright"><a href="#playwright">Playwright</a></h4><pre><code class="language-javascript">const browser = await chromium.launch({args: ['--no-sandbox', '--disable-setuid-sandbox', '--single-process']});
</code></pre>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>Having the ability to write a few simple lines of code and automate a full-blown browser is amazing. What I discussed above only scratches the surface of what can be done and I am always excited to see how people use these stuff.</p>
<p>Enjoy 😉</p>

</article>
<article>
<h1>The worklog format 1.0</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Tue, 12 May 2020 00:00:00 GMT</p>
<p>Last week I wrote about <a href="https://www.lambrospetrou.com/articles/digital-braindump-and-productivity-tools/">note taking being my productivity tool</a>, and then about <a href="https://www.lambrospetrou.com/articles/best-tip-the-worklog/">the worklog</a> being my most important daily document.</p>
<p>Since then, I had several friends and colleagues reaching out asking specifics about the worklog’s format and how I structure its content. I will try to briefly describe what I use hoping that it can help others as well.</p>
<p>Before we dive in, I would like to reiterate once again that I don’t really believe in the <strong>one size fits all</strong> idea, and what works for me might not work for you. However, seeing the structure I use might inspire you to find one that works for you.</p>
<h2 id="what"><a href="#what">What?</a></h2><p>First of all, I urge you to read <a href="https://www.lambrospetrou.com/articles/best-tip-the-worklog/">the original article explaining what the worklog is</a> to better understand the rest of this article.</p>
<p>The worklog contains essentially three things:</p>
<ol>
<li>Meeting notes</li>
</ol>
<ul>
<li>Who was in the meeting</li>
<li>When was the meeting</li>
<li>What was said during the meeting</li>
</ul>
<ol start="2">
<li>Activity log</li>
</ol>
<ul>
<li>What did I do in each day</li>
</ul>
<ol start="3">
<li>Todo items</li>
</ol>
<ul>
<li>Tasks for the next 1-3 weeks (quite fine-grained)</li>
<li>Couple of long-term things that I need to break down</li>
</ul>
<h2 id="how"><a href="#how">How?</a></h2><p>I use the following format for my worklog file:</p>
<pre><code class="language-markdown"># Worklog 2020-01-01 to present

## Meetings

### 1:1 with manager <userxxx@abc.com> @ 2020-05-10

- topic 1
- topic 2
- action item A
- action item B
- ...

### Meeting for feature XXX @ 2020-04-29

- Who
  + nameA <usera@abc.com>
  + nameB <userb@abc.com>
- thing 1
- thing 2
- action item A
- agreements
  + a
  + b
- open questions...

## Activity log

### 2020-04-29

- Meeting for XXX
  - Notes: <URL>
- Worked on feature Y
  - <Task URL> 
  - <Code review URL> implements ZZZ

### 2020-04-30

- Watched training videos
  + infra: <link>
  + networking
- Investigated YYY
  + <link to the file containing details or to the internal issue>
.
.
.

## TODO

- implement DDD
  + <internal issue link>
- training videos
  + <link>
- do a deep dive on XXX
</code></pre>
<p>As you see, it’s quite simple. Three separate ordered lists.</p>
<p>You will notice that the meetings section orders the meetings by date in descending order (latest at the top), whereas the activity log section is ordered by date in ascending order (latest at the bottom).</p>
<p>This is super handy for me because over time the file grows, and in my productive months it can get hundreds of lines—each worklog file spans my tenure in a team/company (can be several months or years long). This structure allows me to instantly scroll to the top and start writing about the current meeting, or scroll to the bottom and start writing about today’s activities, without wasting time figuring out where I need to write. Almost all the text editors and IDEs have shortcuts to move at the top and bottom of a file which makes this approach efficient.</p>
<p>Regarding the todo section, I try to keep it short and not have more than 10 items, since at that point it becomes a backlog. I want items that I will work on during the next 2-3 weeks ordered in priority, with top being the next one to be done. Having said that, I almost always find myself having 1-2 items that are long-term oriented, for which I need to do some kind of investigation. It’s useful to have them here to remind myself to do a bit of work every day.</p>
<p>Every morning, I open the worklog, remind myself what needs to be done by going over the todo section, and I make sure it is synced with my team’s backlog in case priorities shifted.</p>
<p>That’s it folks, I hope this is helpful!</p>
<h2 id="changelog"><a href="#changelog">Changelog</a></h2><ul>
<li>2023-04-02: Updated the format of the activity log to use separate sections per day.</li>
</ul>

</article>
<article>
<h1>Best tip I received — The worklog</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sat, 09 May 2020 00:00:00 GMT</p>
<p>In a previous article where I wrote about <a href="https://www.lambrospetrou.com/articles/digital-braindump-and-productivity-tools/">my favourite productivity tool</a> I briefly mentioned about my <strong>daily work log</strong> habbit. This article is its origin story.</p>
<h2 id="origin"><a href="#origin">Origin</a></h2><p>About 5 years ago I returned to Amazon (did an internship the year before) as a full-time employee after finishing my Masters degree. During one of my first 1:1 meetings with my manager we were talking about career growth and to this day I remember him saying the following.</p>
<blockquote>
<p>If I can only give you one tip, it’s to keep notes of what you do every day. Not for the usual reasons like performance reviews, but because there will be days that you will feel unproductive, thinking that you don’t produce anything, and it will make you feel bad. Those notes will be the only way to go back and see all the work you did during those days.</p>
</blockquote>
<p>This is by far the best tip I received in my (albeit short) career, and it’s the first tip I give to every new hire joining my team as well.</p>
<p>I call these notes <strong>the worklog</strong>, and it’s essentially a big text file with everything I do throughout the day in chronological order. Anything can go into the worklog like tasks or issues I worked on, code reviews I opened, meetings I attended (including notes), or research I do on new features.</p>
<p>My rule is that anything taking away my time should go into the worklog.</p>
<h2 id="what-s-the-big-deal"><a href="#what-s-the-big-deal">What’s the big deal?</a></h2><p>I consider the worklog to be the most important document I keep regarding work.</p>
<ul>
<li>I use this worklog for every performance review and/or promotion document preparation since I can literally scan through it and pull out a list of the important stuff I did for a certain period amount of time.</li>
<li>I cannot count the times I <a href="https://en.wikipedia.org/wiki/Grep">grepped</a> the worklog for things I remember I did in the past, but vaguely remember exactly what it was or how it was solved. It’s been invaluable!</li>
<li>Occasionally, a colleague will ask about something I did and the answer will be in the worklog since information is spread in several internal tools making discovery hard.</li>
<li>Those rainy days I mentioned above will come no matter how genius you are, and the worklog will be there for you.</li>
</ul>
<h2 id="want-a-tip"><a href="#want-a-tip">Want a tip?</a></h2><p>Keep notes of everything you do throughout the day. <strong>Embrace the worklog!</strong></p>

</article>
<article>
<h1>My digital braindump and productivity tools</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Fri, 08 May 2020 00:00:00 GMT</p>
<p>I thought about writing this article many times over the years but I always thought it was not really something worthy of writing. However, after I saw the announcement for <a href="https://github.com/features/codespaces">Github Codespaces</a> a few days ago I decided to write it because that announcement resonated so much with my daily workflow.</p>
<p>I will describe what I have been using for the past 5 years as my <strong>productivity tool</strong>, to keep my thoughts, notes, and ideas organised.</p>
<h2 id="what-do-i-write-down"><a href="#what-do-i-write-down">What do I write down?</a></h2><p>See below a non-exchaustive list of things that I take notes for over time:</p>
<ul>
<li>I write down everything I do during my day at work (<strong>extremely crucial</strong> and deserves an article on its own)</li>
<li>Draft articles for my website</li>
<li>Ideas for side projects</li>
<li>Business ideas I think (and then reject)</li>
<li>Material to read, watch or listen for several subjects</li>
<li>Research I do on investments</li>
<li>Notes for flats, houses, jobs when I am in the lookout</li>
<li>Many many more things…</li>
</ul>
<p>I am pretty sure everyone encounters at least a few of the items listed throughout the day.</p>
<p>There are many productivity tools available, especially mobile apps, that are supposed to simplify your life, but almost everything I used, and I tried hundreds of them, always has something that ruins it for me.</p>
<p>As a software engineer I spend a lot of time in text editors working with source code. At some point I realised that the best productivity tool was sitting right in front of me… That is <strong>text files</strong>.</p>
<p>In the old days people were using pen and paper to take notes, and many still do, including myself! Pen and paper is easy, quick, and everywhere.</p>
<p>In addition, one of the requirements of any productivity workflow is to have your notes available on all of your devices. The best ideas always come at the weirdest time and place.</p>
<h2 id="what-do-i-use"><a href="#what-do-i-use">What do I use?</a></h2><p>A text file is very similar with pen and paper. Text file editing is available on every device type and it’s quick and easy to write anything.</p>
<p>File synchronization across devices is a solved problem nowadays with services like <a href="https://drive.google.com/">Google Drive</a> and <a href="http://dropbox.com/">Dropbox</a>. In the software world we also have source code versioning software like <a href="https://git-scm.com/">Git</a>.</p>
<p>Many of the applications available restrict you in the type of notes you can take, the format to write them in, searchability might be limited, they have size limit restrictions, and they might not even support your device or operating system.</p>
<p>Having said all this… What do I use?</p>
<p>My productivity tool is basically a directory named <code>notes</code> consisting of several text files, organized sometimes in other sub-directories. Initially, I had this directory inside Dropbox, then Google Drive, and now in a <a href="https://www.lambrospetrou.com/articles/self-hosted-private-git/">self-hosted git repository</a>.</p>
<p>I love this approach because not only it’s extremely easy to write down anything at any time, but I can also put additional files or material inside that directory without any restriction. All text editors provide amazing search capabilities, from simple file name searching, to fuzzy text search of the content of your notes.</p>
<p>You can write your notes in formats like <a href="https://guides.github.com/features/mastering-markdown/">Markdown</a> or <a href="https://asciidoctor.org/docs/what-is-asciidoc/">AsciiDoc</a> and have your editor render them nicely. Many editors also support displaying images or PDFs inline, and you have the full power of your device to open any type of file inside this directory. Again, there is no restriction of what files to put in your directory.</p>
<p>So, long story short:</p>
<ul>
<li>I use <a href="https://code.visualstudio.com/">Visual Studio Code</a> as my editor rooted at the <code>notes</code> directory.</li>
<li>I write most of my notes in Markdown (<code>.md</code>) format.</li>
<li>I have one file named <code>scratchpad.md</code> which acts as my temporary canvas. This is probably my most edited file. I do anything temporary in here, from drafting medium-to-long emails before putting them in Outlook, writing long messages before pasting them in Slack/Mattermost/Facebook, and whatever else that needs a temporary space.</li>
<li>I have my passwords inside <a href="https://www.lambrospetrou.com/articles/encrypt-files-with-password-linux/">encrypted files</a>.</li>
<li>I use a self-hosted git repository, but I am considering moving to a <a href="https://github.blog/2019-01-07-new-year-new-github/">Github private repository</a>. Using git is powerful for me because I can immediately jump to any previous version of any file in the directory from the beginning of time.</li>
</ul>
<h3 id="what-else"><a href="#what-else">What else?</a></h3><p>The approach described above covers almost the entirety of my needs. However, there are two things that I do in other tools.</p>
<ol>
<li>Tiny notes that I need to take while on the move, i.e. walking to the train or the bus, or standing in the elevator, etc. I use <a href="https://keep.google.com/">Google Keep</a> for this but any other lightweight note taking app with synchronization across devices works absolutely fine. Once I am on my desk, I move things into my <code>notes</code> directory as needed.</li>
<li>Calendar events. I still believe that events belong into a traditional calendar app providing core features like reminders and recurring events. I use <a href="http://calendar.google.com/">Google Calendar</a>.</li>
</ol>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>I know there are a million productivity tools, I tried most of them, and I also know that what I described is not rocket science distilled. It’s not even something new, people have been using similar approaches for years. For example, I find <a href="https://www.youtube.com/watch?v=oJTwQvgfgMM">this Emacs Org-mode presentation</a> from 12 years ago fascinating.</p>
<p>However, I do hope that this article will convince you to try a simple approach, or maybe combine 2-3 complimentary tools together to get the most out of your day. There is no need to force yourself to go all-in with an application that puts restrictions on you because it’s supposed to be the best productivity tool of the month.</p>
<p>Now you can probably see why Github Codespaces triggerred my brain. My favourite editor will be available across all my devices (laptop, mobile, tablet) through a web browser allowing me to edit my notes directly (assuming I use Github) from anywhere 🚀</p>

</article>
<article>
<h1>Battle of the Jamstack platforms — Netlify, Vercel, AWS</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Tue, 05 May 2020 00:00:00 GMT</p>
<h2 id="overview"><a href="#overview">Overview</a></h2><p>If you <a href="https://twitter.com/LambrosPetrou">follow me</a> online you probably know that I like serverless platforms a lot. Serverless nowadays is a pretty ambiguous term and can mean anything; from <a href="https://www.sqlite.org/serverless.html">SQLite coining the term</a> to <a href="https://aws.amazon.com/serverless/">AWS enumerating their serverless offerings</a>.</p>
<p>In this article, serverless corresponds to a single page application (SPA) website that is accompanied by a lightweight API. This setup exploded in popularity recently, and it even got a name, <a href="https://jamstack.org/">Jamstack</a>, for <strong>J</strong>avaScript, <strong>A</strong>PIs, and <strong>M</strong>arkup.</p>
<p>I experimented with many platforms over the past few years and I am going to briefly go over my current top 3 platforms to deploy your application, Netlify, Vercel, and AWS.</p>
<h3 id="tl-dr"><a href="#tl-dr">TL;DR</a></h3><p>All three platforms are great and I recommend all of them, depending on how much time you are willing to spend, and how much extensibility you think you might need in the future.</p>
<h2 id="netlify"><a href="#netlify">Netlify</a></h2><p>Website: <a href="https://netlify.com">https://netlify.com</a></p>
<p>I discovered Netlify couple of years ago, and have been tracking their progress ever since. The first time I saw it, I was simply stunned by how simple and fun it was to use while at the same time having powerful features.</p>
<p>Netlify has a lot of features and it feels more like an ecosystem of pluggable functionality (what they call add-ons), with flexible pricing.</p>
<p>Most notable add-on features:</p>
<ul>
<li>Serverless Functions</li>
<li>Instant-forms</li>
<li>Identity</li>
<li>Analytics</li>
</ul>
<p>The core product of the Netlify Platform is the combination of <strong>Netlify Build</strong> and <strong>Netlify Edge</strong>. Netlify Build is the ability to easily connect your Netlify project to your code repository (Github, Gitlab, BitBucket) and deploy your changes after every commit with a unique URL for each deployment. Netlify Edge is the application delivery network (ADN) which propagates the project’s artifacts in locations across the globe, similar to a normal content delivery network (CDN) but much smarter, and faster.</p>
<p>The whole process to get started is so simple that you can <a href="https://app.netlify.com/drop">just drag-and-drop</a> your project folder onto their website and deploy it in seconds!</p>
<p>I cannot possibly enumerate all the features Netlify provides but for the sake of this article we will focus on the core platform and the <a href="https://docs.netlify.com/functions/overview/">Serverless Functions add-on</a>.</p>
<p>Serverless functions use <a href="https://aws.amazon.com/lambda/">AWS Lambda</a> behind the scenes, but abstract it away so that we don’t have to fiddle with API Gateway, IAM role permissions, and all the nitty gritty AWS boilerplate.</p>
<p>For example, if we take JavaScript as an example, simply creating a file <code>functions/hello-world.js</code> with the content below will create an API accessible at <code>/.netlify/functions/hello-world</code>.</p>
<pre><code class="language-javascript">exports.handler = function(event, context, callback) {
  callback(null, {
    statusCode: 200,
    body: "Hello, World"
  });
}
</code></pre>
<p>That’s it for Netlify, let’s move on to the next one.</p>
<h2 id="vercel"><a href="#vercel">Vercel</a></h2><p>Website: <a href="https://vercel.com">https://vercel.com</a></p>
<p>Vercel was <a href="https://vercel.com/blog/zeit-is-now-vercel">until recently known as Zeit Now</a>, and is extremely similar to Netlify in terms of target audience. Vercel however puts a lot of emphasis on their <strong>zero config</strong> deployments, and you can see it mentioned all over their website and docs. By zero-config deployment we mean that their system is <a href="https://vercel.com/docs/v2/build-step">trying to be smart and guess the build system or framework that your project is using</a> based on your files and automatically do what it needs to do without you specifying anything. It works very well most of the time, apart from a <a href="https://github.com/zeit/now/discussions/4132">small issue</a> I discovered with the custom build system.</p>
<p>Vercel provides a similar experience as Netlify, where you can connect your repository and instantly build and deploy your project after every commit, and also includes a delivery network.</p>
<p>A big feature is once again their <a href="https://vercel.com/docs/v2/serverless-functions/introduction">Serverless Functions</a> offering, which is also using AWS Lambda under the covers. However, Vercel is a step up from Netlify, with <a href="https://vercel.com/docs/v2/serverless-functions/supported-languages">more languages</a> and <a href="https://vercel.com/docs/v2/edge-network/regions">more regions supported</a>.</p>
<p>Its delivery network is also quite powerful, and more feature-rich than Netlify’s, since apart from the static assets, it can also <a href="https://vercel.com/docs/v2/serverless-functions/edge-caching">cache serverless function responses</a>.</p>
<p>Just for completeness, a serverless function example in JavaScript requires the following content in the file <code>api/hello.js</code> in order to expose the API at <code>/api/hello?name=xxx</code>.</p>
<pre><code class="language-javascript">module.exports = (req, res) => {
  const { name = 'World' } = req.query
  res.status(200).send(`Hello ${name}!`)
}
</code></pre>
<p>As you can see, even though it’s based on AWS Lambda, Vercel decided to use custom function signatures for the handler in contrary to Netlify which is using AWS’s format.</p>
<p>Before we finish with Vercel, I would like to briefly mention <a href="https://nextjs.org/">Next.js</a>, the React framework they developed which is <strong>simply amazing</strong>. I recently migrated my blog to use this and I cannot emphasize <a href="https://nextjs.org/blog/next-9-3#next-gen-static-site-generation-ssg-support">how great it is</a>, and in conjunction with Vercel’s platform they make a killer combination 🚀</p>
<h2 id="amazon-web-services-aws"><a href="#amazon-web-services-aws">Amazon Web Services (AWS)</a></h2><p>Even though AWS does not provide a nice coherent Jamstack platform (I don’t like <a href="https://aws.amazon.com/amplify/">AWS Amplify</a> at all), it provides all the necessary services to build your application.</p>
<p>For the past 5 years I have been using this myself, and the main services we need are <a href="https://aws.amazon.com/s3/">Amazon S3</a> for storage of the static assets, <a href="https://aws.amazon.com/cloudfront/">Amazon CloudFront</a> as our CDN, and <a href="https://aws.amazon.com/lambda/">AWS Lambda</a> with <a href="https://aws.amazon.com/api-gateway/">API Gateway</a> for our serverless functions API.</p>
<p>AWS has an advantage over both Netlify and Vercel, because of <a href="https://aws.amazon.com/lambda/edge/">Lambda@Edge</a> which is basically a slightly restricted version of AWS Lambda running on the edge locations of Amazon CloudFront, so much closer to the customers compared to the normal Lambda functions which are in the regional datacenters. I have been using Lambda@Edge for years now, both in personal projects but also while I was working at Amazon and I love it!</p>
<p>As you can see, it involves more moving pieces but they are all super robust services used by thousands of customers, serving billions of requests every year without issues. Some of the AWS services themselves are being built on-top of these services, which proves their reliability and that AWS bets on them working as expected!</p>
<p>Finally, I would strongly recommend using <a href="https://aws.amazon.com/cdk/">AWS CDK</a> to provision your resources for all the above services in code, referred to as Infrastructure as Code (remember CloudFormation?).</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>I only scratched the surface of each platform’s feature set, however it’s quite clear that some are better at something and worse at something else.</p>
<p>If you focus only on Jamstack applications, my recommendation would be to <del>go with Vercel</del> go with Netlify (see update below). Recommending <a href="https://nextjs.org/">Next.js</a> for one more time 😃 </p>
<p><strong>Update 2021-02-22:</strong> I have been experiencing slow load times (~2 seconds) on Vercel for two of my projects for a few weeks, and it seems that it’s happening when the page is rarely visited (although it shouldn’t according to docs, and Vercel support). Therefore, I now recommend Netlify over Vercel as a first choice. Netlify <a href="https://github.com/netlify/next-on-netlify">also works with Next.js</a> even if you do not use the export mechanism, so feature parity is good.</p>
<p>If you are going to find uses for Netlify’s add-ons, then it’s a great choice as well! Really, Netlify and Vercel are very similar and you cannot go wrong with either. They even published an article yesterday on <a href="https://www.netlify.com/blog/2020/05/04/building-a-markdown-blog-with-next-9.3-and-netlify/">how to deploy a Next.js application on Netlify</a>.</p>
<p>Finally, if your application is going to need additional cloud infrastructure to support it, like control over the AWS Lambda functions, queueing systems, databases, detailed monitoring, or anything else not provided by these Jamstack oriented platforms, then AWS is your best bet. You can always make requests from Netlify/Vercel’s serverless functions to other AWS resources, but it’s a matter of control and flexibility.</p>
<p>That’s it for this article, maybe I will do a deep-dive in the future for certain features…</p>

</article>
<article>
<h1>Hobby Languages for 2020</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sat, 25 Apr 2020 00:00:00 GMT</p>
<h2 id="overview"><a href="#overview">Overview</a></h2><p>Over the past few years I played around with many languages, starting with C/C++ at college to Java and JavaScript professionally, and to many others during my own time out of interest like Go, Scala, Python, Elixir, Clojure, Racket, Ruby and more.</p>
<p>During the last year I wanted to focus more on functional languages, specifically <a href="http://elixir-lang.org/">Elixir</a> and <a href="https://clojure.org/">Clojure</a>. Unfortunately, I didn’t manage to do much due to work and other things taking up my time, but I really loved both languages.</p>
<p>After trying and playing with many languages I realised that there are certain things I am looking for in order to keep me interested for more than a few days.</p>
<p>My hobby language needs to:</p>
<ul>
<li>Be fun to write</li>
<li>Have a good ecosystem to manage projects without a lot of ceremony</li>
<li>Be general purpose enough to write scripts and CLIs, but also web servers/clients</li>
<li>Be fast enough</li>
<li>Optional - Functional programming</li>
<li>Optional - Concurrency and Parallelism constructs</li>
</ul>
<p>Based on the above requirements, this year I decided to focus on the following languages, and hopefully I will be able to spend a few months with each one and do some projects with them.</p>
<h2 id="reasonml"><a href="#reasonml">ReasonML</a></h2><p>I had <a href="https://ocaml.org/">OCaml</a> on my to-dive-deep list for years, but never jumped into it.
When Facebook released <a href="https://reasonml.github.io/">ReasonML</a> a while ago I thought it was a nice bridge/opportunity to get me into the OCaml world.</p>
<p>It’s no secret that I love the <a href="https://nodejs.org/">Node</a> ecosystem, although it’s something most people hate these days. I truly believe that when you learn how to navigate around without getting disoriented by the huge amount of available modules, it provides a lot of things that other languages lack in a simple and easy way to use!</p>
<p>ReasonML <a href="https://reasonml.github.io/docs/en/installation">has a fantastic integration with NPM</a> which makes things easy.</p>
<p>The first project I am working on with ReasonML is to write an interpreter for the <a href="https://monkeylang.org/">Monkey language</a>, following the book <a href="https://interpreterbook.com/">Writing an Interpreter in Go</a> by Thorsten Ball. I highly recommend this book, it’s simply awesome!</p>
<h2 id="clojurescript"><a href="#clojurescript">ClojureScript</a></h2><p>Around 2 years ago I played around with Lisps, specifically <a href="https://racket-lang.org/">Racket</a> and Clojure. </p>
<p>Racket was surprisingly refreshing and I really enjoyed learning it, with great libraries and amazing documentation. However, it was quite slow for some things I did back then, but I know that some work was done to make it faster nowadays so I need to revisit it at some point.</p>
<p>Clojure, as a language trully impressed me. It has everything a functional language needs, with core and efficient datastructures, and ideas that are very powerful once learnt! However, although I have been using JVM languages professionally at work for 5+ years, I find the ecosystem to require too much ceremony to do the simplest thing, so I avoid it for my hobby projects.</p>
<p><a href="https://clojurescript.org/">ClojureScript</a> on the other hand, is the same (almost) language as Clojure, but compiles to JavaScript. This means that you can use the NPM ecosystem to consume and produce software written in ClojureScript alongside JavaScript.</p>
<p>In addition, since <a href="http://clojure-goes-fast.com/blog/clojures-slow-start/">Clojure has a very slow startup</a> due to the way it handles class loading it’s not really suitable for quick scripts I write every now and then, whereas ClojureScript can use Node as its runtime leading to zero startup latency.</p>
<p>There are some amazing projects to make the ClojureScript integration with NPM easier, like <a href="https://shadow-cljs.github.io/docs/UsersGuide.html">Shadow CLJS</a> which is my go-to; I even wrote a project template generator for it, <a href="https://github.com/lambrospetrou/create-shadow-cljs-app">create-shadow-cljs-app</a>. Coincidentally, David Nollen also wrote an article yesterday about a new <a href="https://clojurescript.org/guides/webpack">feature in ClojureScript to integrate better with JavaScript bundlers</a> which will hopefully bring more people.</p>
<p>Clojure(Script) is so nice that the community wrote standalone REPLs to make it usable directly from the command line for scripting. My favourite is <a href="https://github.com/planck-repl/planck">planck-repl</a>, but <a href="https://github.com/borkdude/babashka">babashka</a> has been making the rounds recently as well and I will have to look into it.</p>
<p>I wrote a few small things in ClojureScript but I want to write something bigger to get a better feel of the language itself.</p>
<p>I will definitly have hard time deciding between ReasonML and ClojureScript since they both target Node, both can be used for similar things, and they are both functional languages. However, while they are very similar in some aspects, they are two very different languages, coming from different language families, Lisp vs ML.</p>
<h2 id="go"><a href="#go">Go</a></h2><p>I started writing in <a href="https://golang.org/">Go</a> back at 2014, and since then I wrote scripts, API servers, several static site generators for my website and other tools.</p>
<p>The language itself has nothing extraordinary, but it really feels nice using it and you can get many things done just by using the standard library, which is amazing on its own.</p>
<p>In addition, being a compiled language means it runs very fast, and it has excellent concurrency constructs that make writing concurrent software very easy, which is nice for the situations where you need to put those CPU cores into work.</p>
<p>I tried replacing Go with some other language over the past few years to be my quick go-to language but I always come back. Maybe ReasonML or ClojureScript will do the trick, but the simplicity, the performance, and the concurrency features are really unmatched so far.</p>
<p>Even though I already wrote several things in Go, I want to get back to it this year since I haven’t used it a lot recently.</p>
<h2 id="others"><a href="#others">Others</a></h2><p>As I said in the overview, I really like playing around with many languages. Some of them lose me from the hello world, but many of them end up in my to-dive-deep list.</p>
<p>Other languages I really liked and I would like to spend more time with them, probably not a lot this year, are Elixir and <a href="https://www.rust-lang.org/">Rust</a>.</p>
<p>Elixir has everything I love in a language apart from single-threaded performance really. It’s amazing for anything that requires a lot of juggling around since it supports millions of processes running independently, with their own garbage collection, preemptive scheduling, and other great features provided by its runtime, the <a href="https://blog.stenmans.org/theBeamBook/">BEAM VM</a>. As a language, and as an ecosystem it is very high on my list! There is some amazing work being done now <a href="https://www.phoenixframework.org/blog/build-a-real-time-twitter-clone-in-15-minutes-with-live-view-and-phoenix-1-5">with Phoenix LiveViews</a> which is very interesting, and I am eager to see where it will lead.</p>
<p>Rust is making waves recently a lot! Both in my social circles, but also professionally it gets adopted by many big companies for their performance-critical systems. It has a nice combination of functional programming concepts together with performance oriented features like memory management to avoid runtime issues. In addition, it is among the best languages to use in order to compile to <a href="https://www.rust-lang.org/what/wasm">WebAssembly</a> which is something I am extremely interested in.</p>
<p>I wish the days were longer :) Much to learn, so little time…</p>

</article>
<article>
<h1>Amazon Leadership Principles — Choose 3</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sat, 18 Apr 2020 00:00:00 GMT</p>
<h2 id="overview"><a href="#overview">Overview</a></h2><p>After almost 5 years at Amazon, I really think that the <a href="https://www.amazon.jobs/en-gb/principles">Amazon Leadership Principles</a> are among the few things that are always going to be engraved in me.</p>
<p>In the yearly reviews (so-called Forte) we choose up to 3 LPs for some of our colleagues to describe their superpowers and their growth opportunities, so in a similar fashion I will enumerate the 3 LPs that I consider to be the most crucial, and which have been the barometer to everything I was doing.</p>
<h2 id="customer-obsession"><a href="#customer-obsession">Customer Obsession</a></h2><blockquote>
<p>Leaders start with the customer and work backwards. They work vigorously to earn and keep customer trust. Although leaders pay attention to competitors, they obsess over customers.</p>
</blockquote>
<p>Not much to say… Customers come first, so the customer should be the sole focus of anything you do. Choosing the right problems to solve, making the right decision between trade-offs, and delivering the final solution should all start and finish with the customer in mind.</p>
<h2 id="insist-on-the-highest-standards"><a href="#insist-on-the-highest-standards">Insist on the Highest Standards</a></h2><blockquote>
<p>Leaders have relentlessly high standards — many people may think these standards are unreasonably high. Leaders are continually raising the bar and driving their teams to deliver high quality products, services, and processes. Leaders ensure that defects do not get sent down the line and that problems are fixed so they stay fixed.</p>
</blockquote>
<p>I cannot emphasize this enough. <strong>Keeping the bar high</strong>, and always insisting on high standards is not only important for the short-term but it’s absolutely crucial for the long-term.</p>
<p>We face the temptation of doing something quick-and-dirty pretty much in every line of code we write or decision we make. Choosing this path enough times though means that the product will degrade over time, but even worse, we make the job of our future self or future colleagues super hard.</p>
<p>How many times did you want to <em>have two words</em> with the author of some part of a system you worked on… :)</p>
<h2 id="deliver-results"><a href="#deliver-results">Deliver Results</a></h2><blockquote>
<p>Leaders focus on the key inputs for their business and deliver them with the right quality and in a timely fashion. Despite setbacks, they rise to the occasion and never compromise.</p>
</blockquote>
<p>This is the culmination of all the Leadership Principles together in my opinion. You have to focus on the <strong>key inputs for your business</strong>, and deliver them with the <strong>right quality</strong>, and in a <strong>timely fashion</strong>, while at the same time <strong>never compromising</strong>.</p>
<p>Yeah, I pretty much re-stated the same thing, but this is the gist of it. At the end of the day, you have to deliver something to your customers, but the difficulty lies in doing it in a timely fashion without compromising quality.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p><strong>I honestly think that all 14 (as of now) LPs are super important!</strong></p>
<p>Having said that, it’s almost impossible for every individual to have the same definition and the same bar for all 14 of them, and this is something I have seen a lot over the past few years. Each person will interpret each LP in their own way, which is fair and expected. </p>
<p>The magic though happens when a team does not lose focus on the Leadership Principles and always comes back and re-evaluates everything against them!</p>
<p>I am glad I had the chance to be an Amazonian!</p>

</article>
<article>
<h1>Meiosis pattern - Unidirectional data flow web applications using streams</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Fri, 14 Sep 2018 00:00:00 GMT</p>
<h2 id="problem"><a href="#problem">Problem</a></h2><p>After using <a href="https://reactjs.org/">ReactJS</a> for two years, along with all its ecosystem (<a href="https://redux.js.org/">Redux</a> for state management, <a href="https://redux-saga.js.org/">Redux-saga</a> for managing side-effects, etc.) I wanted to experiment with a simpler approach where I was not forced to use many framework specific libraries, but at the same time having the same uni-directional flow of data and the ease of UI creation using components.</p>
<p>A few months ago, I came across <a href="https://mithril.js.org/">Mithril</a>, a lightweight component based framework which uses pure Javascript objects for its components. I was instantly hooked and wanted to use it in all my personal projects.</p>
<p><em><strong>How can I get the state management I want without being tied to a framework so that I can use Mithril or React interchangeably without issues?</strong></em></p>
<p><strong>TL;DR</strong></p>
<p>My answer is using <a href="https://meiosis.js.org/">Meiosis pattern</a>.</p>
<h2 id="streams"><a href="#streams">Streams</a></h2><p>In order to properly understand the solution let’s go over some basics regarding streams.</p>
<p>There are several Javascript stream libraries but for the rest of the article I will use <a href="https://mithril.js.org/stream.html">Mithril/stream</a>.</p>
<p>Meiosis requires three basic operations on streams:</p>
<ol>
<li><a href="https://mithril.js.org/stream.html#streams-as-variables">Updating the stream value</a></li>
<li><a href="https://mithril.js.org/stream.html#streammap">Mapping over the stream</a> in order to listen for updates to the stream’s value</li>
<li><a href="https://mithril.js.org/stream.html#streamscan">Scanning over the stream</a> in order to calculate accumulated value after every update</li>
</ol>
<h3 id="updating-the-stream-value"><a href="#updating-the-stream-value">Updating the stream value</a></h3><pre><code class="language-javascript">import * as ms from 'mithril/stream'

const s = ms()
s(1)
console.log(s())  // 1
s(s()+10)
console.log(s())  // 11
</code></pre>
<p>The example above shows that just calling a stream as a function will return the latest value of the stream, and supplying a value to the function application will update the stream’s value.</p>
<p>Also, the fact that the stream itself is a function can be super useful since it can be composed into more complex functions.</p>
<h3 id="stream-map"><a href="#stream-map">Stream.map</a></h3><pre><code class="language-javascript">import * as ms from 'mithril/stream'

const s = ms()

s.map(value => console.log(`updated value: ${value}`))

s(1)    // prints "updated value: 1"
s(20)   // prints "updated value: 20"
s(311)  // prints "updated value: 311"
</code></pre>
<p>The <a href="https://mithril.js.org/stream.html#streammap">Stream.map</a> operation is simple. It just executes a given function every time the stream value is updated, and also returns a new stream. The return value of the given function after every execution becomes the new value for the stream returned by the call to <code>map()</code>.</p>
<h3 id="stream-scan"><a href="#stream-scan">Stream.scan</a></h3><pre><code class="language-javascript">import * as ms from 'mithril/stream'

const accumulate = (accumulator, value) => accumulator * value
const processValue = value => console.log(value)

const s = ms()
const scanned = ms.scan(accumulate, 1, s)
scanned.map(processValue)

s(1)    // prints 1
s(2)    // prints 2
s(5)    // prints 10
s(100)  // prints 1000
s(0.5)  // prints 500
</code></pre>
<p>The distilled functionality of <a href="https://mithril.js.org/stream.html#streamscan">Stream.scan</a> is that whenever the value of the source stream (<code>s</code>) changes, the <code>accumulate</code> function is called being passed the current accumulator value and the new source stream value. The return value will become the new value of the <code>scanned</code> stream and the accumulator value being passed to the next <code>accumulate</code> invocation.</p>
<p>The fascinating idea here is that a stream can have functions as its values.</p>
<p>In the following example we pass functions as values and as a result we can apply different operations on the accumulator value each time.</p>
<pre><code class="language-javascript">import * as ms from 'mithril/stream'

const update = (accumulator, fn) => fn(accumulator)
const processValue = value => console.log(value)

const s = ms()
const scanned = ms.scan(update, 1, s)
scanned.map(processValue)

s(value => value + 1)         // prints 2
s(value => value - 1)         // prints 1
s(value => value * 10)        // prints 10
s(value => 1234)              // prints 1234
s(value => value / 2)         // prints 617
</code></pre>
<p>Spend a few minutes to <strong>fully</strong> understand the above example because it’s the bread and butter of the Meiosis pattern.</p>
<p>Notice how we pass functions in the stream and by the use of <code>scan</code> they are applied in order transforming the stream accumulated value.</p>
<h2 id="meiosis-pattern"><a href="#meiosis-pattern">Meiosis Pattern</a></h2><p>The <a href="https://meiosis.js.org/">Meiosis website</a> is doing an amazing job describing the pattern and includes a brief video showcasing how to use the pattern.</p>
<p>Meiosis was inspired by similar projects striving for unidirectional flow of data to make state management easy like <a href="http://sam.js.org/">SAM Pattern</a>, <a href="https://guide.elm-lang.org/architecture/index.html">Elm Architecture</a>, <a href="http://www.christianalfoni.com/articles/2016_04_06_CycleJS-driven-by-state">CycleJS</a>, and others.</p>
<p>The whole Meiosis pattern is contained in the following code:</p>
<pre><code class="language-javascript">import * as ms from 'mithril/stream'

class AppModel {
  counter: number = 0
}
type ModelUpdateFunction = (model: AppModel) => AppModel
type UpdateStream = ms.Stream<ModelUpdateFunction>

const createApp = (update: UpdateStream) : MeiosisApp => ({ 
  initialModel: ..., view: ..., render: ...
})

const setupMeiosis = (
  createApp: (s: UpdateStream) => MeiosisApp,
  container: Element
) => {
  const update = ms<ModelUpdateFunction>()
  const modelUpdate = (model: AppModel, fn: ModelUpdateFunction) => fn(model)

  const app = createApp(update)

  const models = ms.scan(modelUpdate, app.initialModel(), update)
  models.map(model => app.render(container, app.view(model)))
}

setupMeiosis(createApp, document.body.querySelector('#app') as Element)
</code></pre>
<p>Let’s examine each section separately to fully understand the pattern.</p>
<pre><code class="language-javascript">const update = ms<ModelUpdateFunction>()
</code></pre>
<p>The code above creates the <code>update</code> stream that has functions as its values. These functions receive an <code>AppModel</code> and return a new <code>AppModel</code>. These functions are going to be our actions, that are triggered either by a user interaction or by an asynchronous task, fetch, etc.</p>
<pre><code class="language-javascript">const app = createApp(update)
</code></pre>
<p>This creates our application, which receives the <code>update</code> stream that will be used by the application’s actions.</p>
<p>In addition, the <code>app</code> exposes three methods:</p>
<ul>
<li><code>initialModel()</code> returns the initial <code>AppModel</code> which will act as the bootstrapping model for the application.</li>
<li><code>view(model)</code> receives the latest <code>AppModel</code> and returns the view to be rendered. This view can be of any framework (e.g. React, Mithril).</li>
<li><code>render(element, view)</code> receives the application view, and the DOM element to which the view will be rendered. Meiosis does not care about the framework used as long as there is functionality to render a view on-demand. Mithril, React, Preact, Inferno, etc. are able to do this efficiently with the use of virtual dom.</li>
</ul>
<pre><code class="language-javascript">const models = ms.scan(modelUpdate, app.initialModel(), update)
models.map(model => app.render(container, app.view(model)))
</code></pre>
<p>This is the juicy part of the pattern.</p>
<p>The <code>models</code> stream is a scanned stream over the <code>update</code> stream. This means that any function passed into the <code>update</code> stream, due to the <code>scan</code> functionality described previously, will be used to transform the current application model into a new model, and then set that model as value in the <code>models</code> stream.</p>
<p>The second line applies a <code>map</code> operation over the <code>models</code> stream. This will render the application view using the updated application model every time there is a new value pushed to the <code>update</code> stream. Remember that the values of the <code>update</code> stream are functions that transform the model.</p>
<p>This deviates slightly from traditional frameworks in the sense that when an action occurs, instead of publishing the updated model itself, the transformation functions are pushed, and the <code>Stream.scan</code> functionality takes care of applying those transformations to the model/state.</p>
<h2 id="demo"><a href="#demo">Demo</a></h2><p>The following code is a complete demo implementing a counter, with <code>plus/minus</code> buttons.</p>
<pre><code class="language-javascript">import * as m from 'mithril'
import * as ms from 'mithril/stream'

class AppModel {
  counter: number = 0
}
type ModelUpdateFunction = (model: AppModel) => AppModel
type UpdateStream = ms.Stream<ModelUpdateFunction>

interface MeiosisApp {
  model: () => AppModel
  view: (model: AppModel) => m.Vnode
  render: (el: Element, v: m.Vnode) => void
}

const createActions = (update: UpdateStream) => ({
  inc: (value: number) => update(model => ({counter: model.counter + value}))
})

const createApp = (update: UpdateStream): MeiosisApp => {
  const actions = createActions(update)
  return {
    model: () => ({counter: 0}),
    view: (model: AppModel) => m("div", [
      m('p', model.counter),
      m('button', {onclick: () => actions.inc(1)}, 'plus'),
      m('button', {onclick: () => actions.inc(-1)}, 'minus')
    ]),
    render: (container: Element, v: m.Vnode) => m.render(container, v)
  }
}

const setupMeiosis = (
  createApp: (s: UpdateStream) => MeiosisApp,
  container: Element
) => {
  const update = ms<ModelUpdateFunction>()
  const modelUpdate = (model: AppModel, fn: ModelUpdateFunction) => fn(model)

  const app = createApp(update)

  const models = ms.scan(modelUpdate, app.model(), update)
  models.map(model => app.render(container, app.view(model)))
}

setupMeiosis(createApp, document.body.querySelector('#app') as Element)
</code></pre>
<p><a href="https://github.com/lambrospetrou/code-playground/tree/master/meiosis-mithril-ts">Demo repository on Github</a></p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>After working with most of the Javascript frameworks the last few years, I enjoy having the following while developing an application:</p>
<ul>
<li>Immutability of state/data</li>
<li>Unidirectional flow of data (actions => update state => render updated view => actions)</li>
<li>Fast component rendering (the best solutions as of now use virtual dom diffing)</li>
</ul>
<p>Adopting Meiosis pattern in my applications, means that all of the above are possible, and the only thing tied to a specific framework is really just the view components. Taking the above code, and replacing Mithril with React is just a matter of a few minutes.</p>
<p>I hope that more engineers will start adopting patterns that are applicable to all kinds of Javascript applications instead of focusing on framework specific solutions.</p>

</article>
<article>
<h1>Run multiple services on a single EC2 instance using AWS Elastic Beanstalk (Go and Multicontainer Docker platforms)</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 11 Mar 2018 00:00:00 GMT</p>
<h2 id="problem"><a href="#problem">Problem</a></h2><p>Many times I want to run multiple services on the same EC2 instance. Sometimes I am doing a toy project and I don’t want to pay for resources just hosting each project on its own, and other times, even in production systems, I need to deploy multiple microservices on the same instance, e.g. Nginx proxy, the application web service, and maybe some other monitoring service.</p>
<p>I am a huge fan of <a href="https://aws.amazon.com/elasticbeanstalk/">AWS Elastic Beanstalk</a> and I will explain how we can achieve multi-service on same instance setup using the <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-environment.html">Go Platform</a> and the more flexible <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_docker_ecs.html">Multicontainer Docker Platform</a>.</p>
<h3 id="desired-result"><a href="#desired-result">Desired result</a></h3><p>The example application I will deploy has two Go web services running which just serve some static HTML and an Nginx proxy doing the routing between them.</p>
<p><strong>Web service 2</strong> handles any request under the <code>/web2</code> path, and <strong>Web Service 1</strong> handles everything else.</p>
<pre><code class="language-bash">> curl http://multiplegoservices-env.mzkfjw36fh.eu-west-1.elasticbeanstalk.com/web2
Service 2 Path, "/web2"

> curl http://multiplegoservices-env.mzkfjw36fh.eu-west-1.elasticbeanstalk.com/web1
Service 1 Path, "/web1"

> curl http://multiplegoservices-env.mzkfjw36fh.eu-west-1.elasticbeanstalk.com/
Service 1 Path, "/"
</code></pre>
<p>The code for our services is <strong>exactly the same</strong> between the two platforms, and the only difference is among the files specific for each platform configuration.</p>
<h2 id="go-platform"><a href="#go-platform">Go Platform</a></h2><p>The Go Platform in Elastic Beanstalk is pretty simple, with the most important concepts being the <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-procfile.html">Procfile</a>, the <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-buildfile.html">Buildfile</a>, and the special <code>.ebextensions</code> folder which we will use to provide <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-nginx.html">custom configuration to the Nginx proxy</a> deployed automatically on the instances.</p>
<p><a href="https://github.com/lambrospetrou/aws-playground/tree/master/elastic-beanstalk-multiple-applications">Source code available on Github</a>.</p>
<h3 id="procfile"><a href="#procfile">Procfile</a></h3><p>This file contains a simple enumeration of the services to start (i.e. executables to run). For the example application, <code>Procfile</code> contains the following:</p>
<pre><code>web_service1: bin/web-service-1
web_service2: bin/web-service-2
</code></pre>
<p>One thing you need to know is that Elastic Beanstalk will start the first service setting the environment variable <code>PORT=5000</code>, and that is the port your service should listen for requests. Each subsequent service will receive <code>PORT</code> values in <strong>100 increments</strong> from the last one, i.e. <code>web-service-2</code> will have <code>PORT=5100</code>.</p>
<h3 id="buildfile"><a href="#buildfile">Buildfile</a></h3><p>This file contains a simple enumeration of commands to run during the deployment artifact build time. This can be used to build your code and generate the executables that <code>Procfile</code> will execute, but in my case I like building locally (or in a pipeline) and just deploy the executables as the Elastic Beanstalk artifact.</p>
<p>Just for the sake of using <code>Buildfile</code>, I call a bash script that prints <code>Hello world!</code>, and it looks like below:</p>
<pre><code>command_to_run_during_build: bin/hello.sh
</code></pre>
<p>As a general guideline, the <code>Buildfile</code> can be used for any arbitrary task that needs to run before the services are started.</p>
<h3 id="nginx-proxy-configuration"><a href="#nginx-proxy-configuration">Nginx proxy configuration</a></h3><p>For the proxy configuration we just need a <code>server {}</code> Nginx directive to provide the routing between the two services. To achieve this, we create a <strong>.conf</strong> file inside the <code>.ebextensions/nginx/conf.d/</code> directory which will be included by Nginx during startup.</p>
<p>The following configuration is enough to do the job, and is in the file <code>.ebextensions/nginx/conf.d/01_proxy.conf</code>.</p>
<pre><code>server {
    server_name .elasticbeanstalk.com;
    listen 80;

    location /web2 {
        proxy_pass http://127.0.0.1:5100;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location / {
        proxy_pass http://127.0.0.1:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
</code></pre>
<h3 id="deployment-artifact"><a href="#deployment-artifact">Deployment artifact</a></h3><p>All the paths used in the files above are based on the following artifact (the <code>.zip</code> file) I deploy to Elastic Beanstalk. Full listing of the contents below:</p>
<pre><code class="language-bash">> unzip -l build/bundle.zip 
Archive:  build/bundle.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
       64  2018-03-11 13:44   Procfile
        0  2018-03-11 13:44   .ebextensions/
        0  2018-03-11 13:44   .ebextensions/nginx/
        0  2018-03-11 13:44   .ebextensions/nginx/conf.d/
      319  2018-03-11 13:44   .ebextensions/nginx/conf.d/01_proxy.conf
        0  2018-03-11 13:44   bin/
  6218916  2018-03-11 13:44   bin/web-service-1
       38  2018-03-11 13:44   bin/hello.sh
  6218916  2018-03-11 13:44   bin/web-service-2
       42  2018-03-11 13:44   Buildfile
---------                     -------
 12438295                     10 files
</code></pre>
<h2 id="multicontainer-docker-platform"><a href="#multicontainer-docker-platform">Multicontainer Docker Platform</a></h2><p>The multicontainer Docker platform uses <a href="https://aws.amazon.com/ecs/">Amazon Elastic Container Service</a> under the covers, but as I said before, deploying through Elastic Beanstalk makes things a lot easier!</p>
<p>There is only one important configuration file in the Multicontainer Docker platform and that is the <a href="https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_docker_v2config.html">Dockerrun.aws.json</a>.</p>
<p>This file contains Docker specific definitions, e.g. the Docker images we want for each service, the volume definitions mapping to source paths in our deployment artifact, etc.</p>
<p><a href="https://github.com/lambrospetrou/aws-playground/tree/master/elastic-beanstalk-multicontainer-docker">Source code available on Github</a>.</p>
<h3 id="dockerrun-aws-json"><a href="#dockerrun-aws-json">Dockerrun.aws.json</a></h3><p>In order to achieve the same result as with the <strong>Go Platform</strong> I use the following <code>Dockerrun.aws.json</code> file.</p>
<pre><code class="language-json">{
  "AWSEBDockerrunVersion": 2,
  "volumes": [
    {
      "name": "web1",
      "host": {
        "sourcePath": "/var/app/current/web-service-1"
      }
    },
    {
      "name": "web2",
      "host": {
        "sourcePath": "/var/app/current/web-service-2"
      }
    },
    {
      "name": "nginx-proxy-conf",
      "host": {
        "sourcePath": "/var/app/current/proxy/conf.d"
      }
    }
  ],
  "containerDefinitions": [
    {
      "name": "web1",
      "image": "golang:1.10",
      "essential": true,
      "memory": 128,
      "mountPoints": [
        {
          "sourceVolume": "web1",
          "containerPath": "/var/app"
        }
      ],
      "portMappings": [
        {
          "hostPort": 5000,
          "containerPort": 5000
        }
      ],
      "environment": [
        {
          "name": "PORT",
          "value": "5000"
        }
      ],
      "command": ["/var/app/web-service-1"]
    },
    {
      "name": "web2",
      "image": "golang:1.10",
      "essential": true,
      "memory": 128,
      "mountPoints": [
        {
          "sourceVolume": "web2",
          "containerPath": "/var/app"
        }
      ],
      "portMappings": [
        {
          "hostPort": 5100,
          "containerPort": 5100
        }
      ],
      "environment": [
        {
          "name": "PORT",
          "value": "5100"
        }
      ],
      "command": ["/var/app/web-service-2"]
    },
    {
      "name": "nginx-proxy",
      "image": "nginx",
      "essential": true,
      "memory": 128,
      "portMappings": [
        {
          "hostPort": 80,
          "containerPort": 80
        }
      ],
      "links": [
        "web1", "web2"
      ],
      "mountPoints": [
        {
          "sourceVolume": "nginx-proxy-conf",
          "containerPath": "/etc/nginx/conf.d"
        },
        {
          "sourceVolume": "awseb-logs-nginx-proxy",
          "containerPath": "/var/log/nginx"
        }
      ]
    }
  ]
}
</code></pre>
<p><strong>Notes</strong></p>
<ul>
<li><p><code>/var/app/current</code> is the directory on the host machine that contains our deployment artifact, i.e. the <code>.zip</code> file unzipped.</p>
</li>
<li><p>In order to allow the Nginx image to communicate with the two services running we need to <strong>link</strong> those images to the <code>nginx-proxy</code> image, and instead of using <code>http://127.0.0.1</code> in the Nginx <code>.conf</code> file we should use <code>http://web1</code> and <code>http://web2</code> as shown below.</p>
<pre><code>server {
    server_name .elasticbeanstalk.com;
    listen 80;

    location /web2 {
        proxy_pass http://web2:5100;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location / {
        proxy_pass http://web1:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
</code></pre>
</li>
</ul>
<h3 id="deployment-artifact"><a href="#deployment-artifact">Deployment artifact</a></h3><p>Full listing of the deployment artifact contents is as follows:</p>
<pre><code class="language-bash">> unzip -l build/bundle.zip 
Archive:  build/bundle.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     1960  2018-03-10 19:01   Dockerrun.aws.json
        0  2018-03-10 19:01   proxy/
        0  2018-03-10 19:01   proxy/conf.d/
      326  2018-03-10 19:01   proxy/conf.d/default.conf
        0  2018-03-10 19:01   web-service-1/
  6218916  2018-03-10 19:01   web-service-1/web-service-1
        0  2018-03-10 19:01   web-service-2/
  6218916  2018-03-10 19:01   web-service-2/web-service-2
---------                     -------
 12440118                     8 files
</code></pre>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>AWS Elastic Beanstalk is an amazing service which abstracts a lot of stuff that are not really part of the application, e.g. load balancers, autoscaling groups, logging, alarms, and there is even a super helpful dashboard right out-of-the box.</p>
<p>My simplistic guideline on what platform to use is as follows:</p>
<p><strong>Go Platform</strong></p>
<ul>
<li>Your services are written in Go</li>
<li>Your services can be compiled into binary executables that run on Amazon Linux</li>
</ul>
<p><strong>Multicontainer Docker Platform</strong></p>
<ul>
<li>Anything else</li>
</ul>
<p>Have fun microservicing with AWS Elastic Beanstalk!</p>

</article>
<article>
<h1>Encrypt files with password on Linux</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sat, 06 Jan 2018 00:00:00 GMT</p>
<h2 id="problem"><a href="#problem">Problem</a></h2><p>I have some important private files that I want to store in <a href="https://www.google.com/drive/">Google Drive</a> and on my USB flash drive, but I don’t want them to be in plain sight for anyone to see.</p>
<p>I would like to at least password-protect them before storing them, but without too much hassle with asymmetric cryptography where I need to fiddle with keys.</p>
<h2 id="solution"><a href="#solution">Solution</a></h2><p>It turns out pretty much all UNIX systems have <a href="https://www.gnupg.org/">GnuPG</a> installed which allows me to just run a command to encrypt a file using a passphrase, and a corresponding command to decrypt it when I need to open it.</p>
<p>I found out that this method is also used <a href="https://www.nas.nasa.gov/hecc/support/kb/using-gpg-to-encrypt-your-data_242.html">inside NASA when transferring files</a>.</p>
<p>In order to <strong>encrypt and password-protect a file</strong> run the following command:</p>
<pre><code class="language-bash">gpg -c --cipher-algo AES256 private-file.txt
</code></pre>
<p>The <code>-c</code> option specifies that we want to do symmetric encryption using a passphrase. The <code>--cipher-algo 256</code> option specifies that we want to use the <a href="https://en.wikipedia.org/wiki/Advanced_Encryption_Standard">AES256 cipher</a> instead of the default <a href="https://en.wikipedia.org/wiki/CAST-128">CAST5 cipher</a>, although this is not required.</p>
<p>The above command will ask you for the passphrase to use, and then will create a new file named <code>private-file.txt.gpg</code>, which is the encrypted and password-protected file we want to store.</p>
<p>In order to <strong>decrypt the file</strong> run the following command:</p>
<pre><code class="language-bash">gpg private-file.txt.gpg
</code></pre>
<p>Once you enter the passphrase used during the encryption of the file, you will get back the decrypted file which will have the same name without the <code>.gpg</code> extension, hence <code>private-file.txt</code>.</p>
<p><strong>Tips</strong></p>
<ul>
<li>If you want to encrypt a whole directory (folder), then you have to first zip/tar the folder into a single file and then apply the same command above to the zipped/tarred file.</li>
<li>Use long passphrases for important files consisting of multiple words with letters, spaces, symbols, and numbers to maximise the entropy and security of the encryption.</li>
</ul>
<h2 id="references"><a href="#references">References</a></h2><ul>
<li><a href="https://www.nas.nasa.gov/hecc/support/kb/using-gpg-to-encrypt-your-data_242.html">NASA - Using GPG to Encrypt Your Data</a></li>
<li><a href="https://www.cyberciti.biz/tips/linux-how-to-encrypt-and-decrypt-files-with-a-password.html#comment-4392">How to use a password stored in separate file as passphrase</a></li>
</ul>

</article>
<article>
<h1>Export environment variables from multiple files using Bash on Linux</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 19 Nov 2017 00:00:00 GMT</p>
<h2 id="problem"><a href="#problem">Problem</a></h2><p>Recently, I have updated the <a href="https://github.com/lambrospetrou/lambrospetrou.github.io">deployment scripts of my website</a> and for some tasks I wanted to have certain environment variables available in my scripts. For example, I wanted to <a href="http://docs.aws.amazon.com/cli/latest/reference/cloudfront/create-invalidation.html">invalidate the Cloudfront distribution</a> so that I don’t have to wait for the cache to expire before serving the new content, and I needed the <code>--distribution-id</code>.</p>
<p>I looked for the easiest way to export environment variables from key-value pairs in a file, and I was delighted to see that it can done with a single line!</p>
<h2 id="solution"><a href="#solution">Solution</a></h2><p>The solution is based on <a href="https://stackoverflow.com/a/36456837/1066790">this StackOverflow answer</a>.</p>
<p>Assuming the following <code>aws.environment</code> file:</p>
<pre><code class="language-bash">DISTRIBUTION_ID=xxxxxxxxx
S3_BUCKET=www.example.com
</code></pre>
<p>The following script will export the above key-value pairs as environment variables.</p>
<pre><code class="language-bash">#!/usr/bin/env bash

source <(sed -E -n 's/[^#]+/export &/ p' aws.environment)

# ... Commands that use the above variables i.e. echo "$DISTRIBUTION_ID"
</code></pre>
<p>In addition, I didn’t want to limit myself only to one file of environment variables because I would like to version control some of them in git. So, I wanted to scan and export variables from all the files in a given directory. This can be done using the following script.</p>
<pre><code class="language-bash">#!/usr/bin/env bash

source <(find ./build-tool/env/ -type f -exec sed -E -n 's/[^#]+/export &/ p' {} +)

# ... Commands that use the above variables
</code></pre>
<p>This final adaptation uses the <a href="https://ss64.com/bash/find.html">find command</a> to read all files in the <code>./build-tool/env</code> directory and for each one exports its key-value pairs as environment variables.</p>
<p>That’s it! Once again, knowledge of some nifty Linux commands saves the day.</p>

</article>
<article>
<h1>npm package to easily run arbitrary compiled binaries in your applications or AWS Lambda</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Wed, 18 Oct 2017 00:00:00 GMT</p>
<p>I recently posted a detailed article on <a href="/articles/aws-lambda-meets-racket/">how to efficiently run Racket compiled native binaries on AWS Lambda</a>, and the same process can be used for any compiled binary that runs on Amazon Linux (including binaries from languages like Go, Racket, OCaml, Rust, C/C++, etc.).</p>
<p>The Lambda wrapper code was very easy to understand but a bit long, and if you have lots of lambdas using binaries you might end up copy-pasting those lines, which as we know is not good. Remember the <a href="https://en.wikipedia.org/wiki/Don%27t_repeat_yourself">DRY</a> principle?</p>
<p>As a result, to make it easier for me, and anyone implementing the solution described in that article, I created an npm package to help you reduce the boilerplate of your Lambda wrapper code significantly.</p>
<h2 id="aws-lambda-binary-npm-package"><a href="#aws-lambda-binary-npm-package">AWS-Lambda-Binary npm package</a></h2><p>You can find the package at <a href="https://www.npmjs.com/package/aws-lambda-binary">https://www.npmjs.com/package/aws-lambda-binary</a></p>
<p>Its usage is very easy and the package itself provides sufficient documentation of the API. In addition, couple of ready-to-upload AWS Lambda examples are provided inside the <code>_examples</code> folder.</p>
<p>In a nutshell, the following describes how to use this package to start the Linux command <a href="https://ss64.com/bash/cat.html">cat</a> which will be used as an echo-back process.</p>
<p>First of all if you don’t have an existing lambda project, just create one using the following commands:</p>
<pre><code class="language-bash">mkdir lambda-test && cd lambda-test
npm init -y
</code></pre>
<p>Now that you have a project, install the npm package:</p>
<pre><code class="language-bash">npm install aws-lambda-binary
</code></pre>
<p>Copy and paste the following code into a file named <code>wrapper.js</code>.</p>
<pre><code class="language-javascript">const application = require('aws-lambda-binary').spawnLineByLine({
    spawn: { command: 'cat' }
});

exports.handler = function (event, context) {
    application.ensureIsRunning();

    application.stdout(result => context.done(null, result));

    application.stdin(JSON.stringify({event, context}));
};
</code></pre>
<p>Finally, prepare the zip file to upload to AWS Lambda by running the command below:</p>
<pre><code class="language-bash">zip -r bundle.zip wrapper.js node_modules/
</code></pre>
<p>In the <a href="https://console.aws.amazon.com/lambda/home">AWS Lambda console</a>, create a new function with <strong>Runtime</strong> <code>Node.js 6.10</code> (or higher), and with <strong>Handler</strong> <code>wrapper.handler</code>.</p>
<p>Upload the <code>bundle.zip</code> file you created above and <strong>Save and Test</strong>. The above lamba code takes the <strong>event</strong> of the lambda invocation, along with the <strong>context</strong> object which contains some metadata, serialises them into JSON string and sends them to the <code>cat</code> process through standard input. The process returns the result back through standard output and we successfully finish the lambda invocation passing the line of text received as the result.</p>
<p>There are more examples in the npm package but as you can see above, once you have all your logic into a compiled binary, starting the binary and communicating with it is a matter of ~5 lines.</p>
<p>Enjoy, and feel free to contribute to the package.</p>
<p>SEO tags: AWS, AWSLambda, binary, racket, go, ocaml, rust</p>

</article>
<article>
<h1>Command prompt alternatives for Windows</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Mon, 16 Oct 2017 00:00:00 GMT</p>
<p>Since I have been fiddling around with <a href="https://msdn.microsoft.com/en-gb/commandline/wsl/install_guide">Windows Subsystem for Linux</a> I realised how bad <strong>Command Prompt</strong> is compared to <a href="https://gnometerminator.blogspot.co.uk/p/introduction.html">Terminator</a>, my go-to terminal solution for Linux.</p>
<p>I wanted to find an alternative which at least offered multiple tabs, proper text selection and in general being a better terminal for Windows.</p>
<p>I tried several ones but for now I settled with <a href="https://conemu.github.io/">ConEmu</a> which has the extra benefit of being open-source.</p>
<p>One problem I had with it, is that when you go into the WSL mode, using the <code>bash</code> command, the arrow keys do not work. This was weird and confusing at first since they work if you go into WSL mode using <strong>Command Prompt</strong>.</p>
<p>The solution is to pass an additional argument when entering the WSL mode through ConEmu, as shown below.</p>
<pre><code class="language-bash"># `p1` refers to tab-1.
# If you are entering this command to a different tab in ConEmu
# you have to use the proper tab-id.
bash -cur_console:p1

# Or you can open a new tab straight into WSL mode and avoid the tab-id
bash -new_console
</code></pre>
<p>Now all the keys should work properly.</p>
<p>This nice terminal application, along with the ability to use editors straight from the Linux subsystem as described in a <a href="/articles/windows-linux-subsystem-editor-setup/">previous article</a> make Windows 10 a very viable solution for developers that love the command line.</p>

</article>
<article>
<h1>Windows Subsystem for Linux - Editor (GUI) Setup</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 15 Oct 2017 00:00:00 GMT</p>
<h2 id="problem"><a href="#problem">Problem</a></h2><p>While I was testing out the <a href="https://msdn.microsoft.com/en-gb/commandline/wsl/install_guide">Windows Subsystem for Linux - WSL</a> I setup all my favourite programming languages inside the Linux system instead of Windows. Everything was working smoothly until I had to use an editor to work on some projects.</p>
<p>I initially installed <a href="https://code.visualstudio.com/">Visual Studio Code</a> in my Windows system but realised that I don’t have access to the <code>PATH</code> or the binaries of my languages installed inside WSL. The solution is to have an X-server running in Windows and then run the editor from inside the WSL but rendering to the X-server display (easier than it sounds).</p>
<p>BTW, if you are a VIM user you can stop reading right now since VIM works out of the box.</p>
<h2 id="solution"><a href="#solution">Solution</a></h2><p>Firstly, we need to install an X-server in Windows. I installed <a href="https://sourceforge.net/projects/xming/">Xming X Server for Windows</a> and it works like a charm, with all the default options. Just download, install, and run!</p>
<p>If the server is running successfully, you should see an icon in the notification area of the taskbar that says <strong>Xming Server:0.0</strong>.</p>
<p>Once the X-server is running we are ready to go. In order to start an application with a GUI from inside WSL and have it render in Windows we need to set the environment variable <code>DISPLAY=:0</code> and then run the command.</p>
<pre><code class="language-bash"># Either set it right before the command
DISPLAY=:0 command-with-a-gui

# or set it for the whole terminal session
export DISPLAY=:0
command-with-a-gui
</code></pre>
<h3 id="visual-studio-code"><a href="#visual-studio-code">Visual Studio Code</a></h3><p>I started using Visual Studio Code for the most part of last year and I was really happy with it, but it seems that it cannot properly run through an X server and it is <a href="https://github.com/Microsoft/vscode/issues/13138">not a priority of its team to get it working</a>.</p>
<p>Not a huge issue since my previously favourite editor is perfectly working, but still a bummer since Microsoft’s own editor does not work with their system!</p>
<h3 id="sublime-text-3"><a href="#sublime-text-3">Sublime Text 3</a></h3><p>The <a href="https://www.sublimetext.com/docs/3/linux_repositories.html">Sublime Text 3 installation instructions</a> are straightforward but I copy the commands needed here to make the article more complete.</p>
<pre><code class="language-bash">wget -qO - https://download.sublimetext.com/sublimehq-pub.gpg | sudo apt-key add -
echo "deb https://download.sublimetext.com/ apt/stable/" | sudo tee /etc/apt/sources.list.d/sublime-text.list
sudo apt-get update && sudo apt-get install sublime-text
</code></pre>
<p>The command to run Sublime Text 3 is <code>subl</code>. So in conjunction with the X-server display variable we can run Sublime using the following command:</p>
<pre><code class="language-bash">DISPLAY=:0 subl
</code></pre>
<p>The above command should open a window of Sublime Text inside Windows, and if you try to <strong>Open folder</strong> you should be seeing the Linux subsystem file-system.</p>
<p>The next step is to <a href="https://packagecontrol.io/installation">install the package control</a> for Sublime so that we can install our favourite plugins.</p>
<h2 id="next-steps"><a href="#next-steps">Next steps</a></h2><p>The font I use for all my text editors, and IDEs, is <a href="https://github.com/adobe-fonts/source-code-pro/">Source Code Pro</a>, which in Ubuntu-like systems can be installed easily using the following commands:</p>
<pre><code class="language-bash">git clone --depth 1 --branch release https://github.com/adobe-fonts/source-code-pro.git ~/.fonts/adobe-fonts/source-code-pro
fc-cache -f -v ~/.fonts/adobe-fonts/source-code-pro
</code></pre>
<p>In addition, there are some Sublime options that is good to change from the start. Below you can see the changes I did to my User settings file (<strong>Preferences</strong> -> <strong>Settings</strong>).</p>
<pre><code class="language-javascript">{
    "font_face": "Source Code Pro",
    "font_size": 10,
    "spell_check": true,
       "always_show_minimap_viewport": true,
       "highlight_line": true,

       "trim_trailing_white_space_on_save": true,
       "show_encoding": true,
       "show_line_endings": true,

       "dictionary": "Packages/Language - English/en_GB.dic",
}
</code></pre>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>I am really impressed by the work Microsoft did with the <strong>Windows Subsystem for Linux</strong>. I am a Linux user for almost a decade but I always had those days that I had to access a Windows system, either for Microsoft Office, or a BIOS update, or something else. As a result, I always setup my machines to dual-boot Windows and Linux.</p>
<p>With WSL, I have a good feeling that it could be a nice alternative to dual boot. Apart from some limitations with GUI applications not running through an X-server, I didn’t have any problem running my favourite languages and tools in WSL. Including Node.js, Racket, Go, Elixir and Erlang, and of course native Linux tools like SSH.</p>
<p>In conclusion, nicely done by Microsoft, and I really hope that this will continue to improve and get polished in order to reach the stability levels we require for day-to-day usage. This could be what Microsoft needed to make Windows a viable option for developers loving their Linux!</p>

</article>
<article>
<h1>AWS Lambda meets Racket (or any compiled language)</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Wed, 11 Oct 2017 00:00:00 GMT</p>
<h2 id="problem"><a href="#problem">Problem</a></h2><p>I recently started learning <a href="https://racket-lang.org/">Racket</a> and one of the first things I do with a new language is integrating it with <a href="https://aws.amazon.com/lambda/">AWS Lambda</a>.</p>
<p>One of my favourite languages is <a href="https://golang.org/">Go</a>, and since both Racket and Go can be compiled down to self-contained executable binaries we can re-use some of the knowledge running Go on AWS Lambda.</p>
<h2 id="aws-lambda-overview"><a href="#aws-lambda-overview">AWS Lambda Overview</a></h2><p>Before delving into the detailed solution that works best for me I will provide the different ways we can run code in Lambda.</p>
<h3 id="language-natively-supported-by-aws-lambda"><a href="#language-natively-supported-by-aws-lambda">Language natively supported by AWS Lambda</a></h3><p>If your language is supported natively by Lambda then it’s very easy and you should follow the AWS documentation for the language. As of the time of writing, the <a href="http://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html">supported languages</a> include <strong>Node.js</strong>, <strong>Java</strong>, <strong>Python</strong>, and <strong>.NET Core</strong>.</p>
<h3 id="language-can-be-compiled-to-c-shared-library"><a href="#language-can-be-compiled-to-c-shared-library">Language can be compiled to C shared library</a></h3><p>If your language of choice can be compiled down to a shared C-library binary, then the most performant way to run on Lambda, is to use a wrapper in Python that loads this library and directly make calls to your shared library.</p>
<p>This is the best way I have found to run Go code so far and I make use of the great library <a href="https://github.com/eawsy/aws-lambda-go">eawsy/aws-lambda-go</a> (For Go 1.8+ there is a <a href="https://github.com/eawsy/aws-lambda-go-shim">new version</a> which uses Go plugins).</p>
<p>This approach can be used to run any language compiled down to a C shared library.</p>
<h3 id="language-can-be-compiled-to-standalone-binary-that-runs-on-amazon-linux"><a href="#language-can-be-compiled-to-standalone-binary-that-runs-on-amazon-linux">Language can be compiled to standalone binary that runs on Amazon Linux</a></h3><p>If your language of choice cannot be compiled to shared library, but can be compiled to standalone binary that can run on the Amazon Linux system (which is the OS AWS Lambda uses behind the scenes) then it’s very easy to run the application binary through Lambda.</p>
<p>The easiest and simplest way to run your code is to spawn a new subprocess of your binary using some lambda wrapper code (in Python or Node.js) and pass the <code>event</code> and/or <code>context</code> as input to that process. Then, you read the standard output of the subprocess from inside the wrapper code and return it to the caller of your Lambda code.</p>
<p>The only downside of this approach is that each Lambda function invocation is pretty slow, since a new process has to be started every time your function runs.</p>
<p>For example, using the code from <a href="http://www.dbrunner.net/2015/08/27/running-racket-on-aws-lambda/">Daniel’s Brunner blog post</a> each Lambda invocation configured with <strong>128MB</strong> memory has an average runtime of <strong>~450-550ms</strong>. You can boost the performance of this if you increase your Lambda’s memory to the maximum of <strong>1536MB</strong>. This will bring the average runtime down to <strong>~50-150ms</strong>.</p>
<p>The reason behind this interesting fact is that increasing your Lambda function’s memory, you also increase its CPU power, hence leading to much faster subprocess spawning times.</p>
<p>This spawning per function invocation works fine, but it’s still slow for me and I don’t want to pay for the maximum Lambda memory to get good performance. My proposed solution is a very simple adaptation of this approach.</p>
<p>Since Lambda has a static initialisation section every time the underlying container is started, I will spawn a process during that time, and then in each function invocation I will communicate over <strong>standard input and output</strong> with the subprocess from inside the lambda wrapper code.</p>
<p>This has a tremendous speedup over creating a new process every time since the average invocation runtime for the <strong>128MB</strong> using this approach is <strong>1-100ms</strong>, and consistently stays under <strong>~10ms</strong> with the <strong>1536MB</strong> memory configuration. This speedup is significant for lambdas that run many times over a period of time because the overhead of spawning a process is only observed the first time our code will run in a specific container instance, and then the same process is re-used leading to these extremely fast times.</p>
<p>See proof below, using the 128MB memory configuration, and runtime of <strong>0.51ms</strong> (yes that’s less than a millisecond)!</p>
<p></p>
<h2 id="solution"><a href="#solution">Solution</a></h2><p>As explained in the previous section, my best solution so far which keeps the complexity to a minimum, is to spawn a subprocess of our Racket application binary during the static initialisation of the Lambda function and re-use that process during the individual lambda invocations, by communicating over stdio.</p>
<h3 id="racket-application"><a href="#racket-application">Racket application</a></h3><p>I will use the following Racket code as example, which just echoes back the input of the application.</p>
<pre><code class="language-scheme">#lang racket/base

;; This is the actual logic of our code!
(define (execute-logic data)
  (display (format "data: ~a~%" data))
  (flush-output))

;; The following code waits for one line of input and then dispatches it to the `execute-logic` function.
;; This way we can have full control over what we can do and there can be an arbitrarily complex protocol
;; between the caller and this application over **stdio**.
(define (loopInput)
  (execute-logic (read-line))
  (loopInput))
(loopInput)
</code></pre>
<p>The last four lines in the code above is just a loop that reads a line from standard input, and then calls our <code>execute-logic</code> function passing the data received. This is the only boilerplate needed by our application code.</p>
<p>Your logic can do whatever it wants with the input data and then just print the result to standard output. Here we just use the <a href="https://docs.racket-lang.org/reference/Writing.html#%28def._%28%28quote._~23~25kernel%29._display%29%29">display</a> function to print the input to standard output.</p>
<h3 id="wrapper-code-in-node-js"><a href="#wrapper-code-in-node-js">Wrapper code in Node.js</a></h3><p><strong>Update@2017-10-18:</strong> I created an npm package to significantly reduce the boilerplate code shown below, so <strong>after</strong> you have read this article and understood how the solution works, use the package <a href="https://www.npmjs.com/package/aws-lambda-binary">AWS-lambda-binary</a> in your production Lambda functions.</p>
<p>The wrapper code will be a bit longer but still remains very simple to understand.</p>
<pre><code class="language-javascript">const child_process = require('child_process');
const readline = require('readline');

/****************************
 * START OF WRAPPER CODE
 ****************************/
const execPath = './application';

let proc = initProc();

function initProc(options) {
    options = options || {};
    const p = child_process.spawn(execPath);

    // Add your own custom handler if you want to handle the errors differently.
    p.stderr.on('data', (err) => {
        console.error('proc stderr: ', err);
    });

    // You might want to use ```exit``` event instead of ```close``` if you don't
    // care about the ```stdio streams``` of the subprocess.
    // https://nodejs.org/api/child_process.html#child_process_event_close
    // https://nodejs.org/api/child_process.html#child_process_event_exit
    p.on('close', function (code) {
        if (code !== 0) {
            console.error(new Error(`Process closed with code: ${code}`));
        }
        const {handlerProcCloseCallback} = proc;
        proc = null;
        if (handlerProcCloseCallback) {
            handlerProcCloseCallback(code);
        }
    });

    // This is the part that you get the result back from your Racket application
    // I prefer to receive lines back from the application for simplicity
    // so I use https://nodejs.org/api/readline.html#readline_event_line
    // but you can adapt this to use binary data as well, exactly like we did with
    // `stderr` above.
    const rl = readline.createInterface({ input: p.stdout });
    rl.on('line', (line) => {
        const {handlerCallback} = proc;
        if (handlerCallback) {
            handlerCallback(line);
        }
    });

    return {
        p, rl,

        // Will be called for every **line** output from our application.
        handlerCallback: null,

        // Will be called when the application process is closed. You can use
        // this callback to restart it automatically or do something custom.
        handlerProcCloseCallback: null,
    };
}

function ensureProcRuns(options) {
    if (!proc) {
        proc = initProc(options);
    }
    if (options.resetCallbacks) {
        proc.handlerCallback = null;
        proc.handlerProcCloseCallback = null;
    }
}
/**************************
 * END OF WRAPPER CODE
 **************************/

exports.handler = function (event, context) {
    ensureProcRuns({resetCallbacks: true});

    // Register the handler we want for each line response!
    proc.handlerCallback = (result) => {
        console.log(`rkt: ${result}`);
        context.done(null, `result: ${result}`);
    };

    // Send the input to the Racket application
    proc.p.stdin.write(`${JSON.stringify({event, context})}\n`);
};
</code></pre>
<p>I don’t think it needs lots of explanation but I will explain some of the important bits.</p>
<p>Your main focus should be inside the <code>exports.handler = function(...) {...}</code> section, and everything before that is just the wrapper code around the subprocess spawning.</p>
<p>The first thing we need to do inside our handler code is to call <code>ensureProcRuns()</code> in order to make sure that there is a running subprocess of our Racket application. Ideally, this should always be instant since we already instantiate a process during the initialisation phase of the container. If the subprocess fails though and exits during an invocation, this will ensure that future invocations will re-spawn the process.</p>
<p>After that first line, we need to register a callback function that will handle the result back from the Racket application. In my case here we just read a single line and then finish the lambda invocation with success <code>context.done(...)</code> and with return value whatever the line we received was.</p>
<p>It is very important to understand that this solution is not limited to single line responses. You can have your own protocol that spans multiple lines of response from the application, or even going down to binary data instead of line-by-line.</p>
<p>Finally, once we set our handler callback, we write on standard input of the subprocess the input to our Racket application. Here I just pass a JSON serialised object containing both the <strong>event</strong> and the <strong>context</strong> of the lambda invocation, which should cover most of your use-cases. If you want to avoid JSON deserialisation in your Racket code, then you can parse the event in this wrapper code and just pass a line of comma-separated values down to the application. Again, here I use a single line of input, but you can easily adapt this to span multiple lines of input or binary data.</p>
<p>There is also an additional callback that you can set, the <code>handlerProcCloseCallback()</code>. This callback is called every time the subprocess’s standard input and output streams are closed (usually when the subprocess terminates). For example, you can use this callback to re-spawn the closed subprocess and speedup future invocations, as shown below.</p>
<pre><code class="language-javascript">proc.handlerProcCloseCallback = (code) => {
    ensureProcRuns({resetCallbacks: true});
};
</code></pre>
<h2 id="compile-and-bundle-your-code"><a href="#compile-and-bundle-your-code">Compile and bundle your code</a></h2><ul>
<li><p>Copy the Racket code from above and save it in a file named <code>application.rkt</code></p>
</li>
<li><p>Compile the Racket code into a standalone binary</p>
<pre><code class="language-bash">raco exe --orig-exe application.rkt
</code></pre>
</li>
<li><p>Copy the Node.js code from above and save it in a file named <code>wrapper.js</code></p>
</li>
<li><p>Bundle everything together <code>zip bundle.zip application wrapper.js</code></p>
</li>
</ul>
<p>You can upload <code>bundle.zip</code> now to your lambda function and test it.</p>
<p>Make sure that your lambda function has the following configuration:</p>
<table>
<thead>
<tr>
<th>Configuration Property</th>
<th>Value</th>
</tr>
</thead>
<tbody><tr>
<td><strong>Runtime</strong></td>
<td>Node.js 6.10</td>
</tr>
<tr>
<td><strong>Handler</strong></td>
<td>wrapper.handler</td>
</tr>
</tbody></table>
<p>Test around with different memory configuration to find a good balance between latency and cost for your application.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>As you can see, we did not invent quantum computing. We just applied old-fashioned programming principles to AWS Lambda. We have our custom application logic written in Racket, or any other compiled language, that accepts data from standard input, and writes to standard output. Anybody used Linux command line tools before? Yes, it is the same!</p>
<p>This solution is very flexible and allows you to either go with a minimalistic protocol of single line input and single line output as I have done above, or even with a super complicated custom protocol. After all, it’s just standard input and standard output.</p>
<p>The most fascinating thing about this approach is that now your custom code can run in AWS Lambda with super low latencies!</p>
<p>SEO tags: #AWS #Racketlang - #AWSLambda #Meets #Racket</p>

</article>
<article>
<h1>Self-hosted git repository for privacy and control</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Wed, 13 Sep 2017 00:00:00 GMT</p>
<h2 id="problem"><a href="#problem">Problem</a></h2><p>Few months ago I started receiving the following warning from <a href="https://github.com">Github</a>, <strong>Your GitHub academic discount coupon has expired</strong>, and if I wanted to continue working with my private repositories I would have to pay.
In addition to this, for the last couple of months I was investigating and trying different products for personal and <strong>private</strong> note taking. I found a lot of interesting products that were integrating with git services like Github, or <a href="https://gitlab.com">Gitlab</a> and I was really close to settle with them, but I went deeper into their paid offerings and they actually did not provide encryption at rest, so it was a <strong>no-go</strong>.</p>
<p>Of course, someone can say they are used by millions, so why not trust them. Well, I don’t, and in conjunction with their availabilities issues I decided to find alternatives. Actually, while I was writing this article Gitlab was <strong>out of service</strong> with the website returning <strong>500 internal server</strong> errors.</p>
<p>My next solution was a custom website I would develop that would have a super-simple UI, including an editor and a list of all my notes, which would use either <a href="https://www.google.com/drive/">Google Drive</a> or <a href="https://aws.amazon.com/s3/">Amazon S3</a> as it’s backend, and since I would code the whole thing I would add client encryption to my super important notes (like passwords).</p>
<p>I started looking around for websites similar to this, and found plenty, but none provided the privacy I wanted since they would pass my data through their servers, and although they would say they don’t look at them, well I was skeptical! I found some very good products though which I would gladly use if it wasn’t for the privacy issues, like <a href="http://writeboxapps.com/">Write-box</a>, <a href="https://app.standardnotes.org/">Standard Notes</a> and <a href="https://dynalist.io/">Dynalist</a>.</p>
<p>So long story short, nothing suited my needs which are summarised below:</p>
<ul>
<li><strong>Privacy</strong> of my data with encryption at rest</li>
<li><strong>Accessibility</strong> and <strong>Availability</strong> of my notes from my laptops and my smartphone</li>
</ul>
<p>I was close to start implementing my solution, but then I looked closely into every coder’s friend, <a href="https://git-scm.com/">Git</a>.</p>
<h2 id="solution"><a href="#solution">Solution</a></h2><p>I decided to use git repositories for storing my notes, and files. After investigating several self-hosted solutions, including gitlab’s, and bitbucket’s server offerings, I settled on an open source super lightweight self-hosted git server, <a href="https://gitea.io/">Gitea</a>. Gitea is written in <a href="https://golang.org/">Go</a> which allows for amazingly low resource consumption (it uses less than 0.3% CPU on my T2.nano EC2 instance).</p>
<p>In addition to the self-hosted git server, I found out about the <strong>git bare repos</strong> which basically act like a git server when hosted on a system accessible over SSH. I use these bare repos for super simple stuff where I don’t need the accessibility over my phone, like coding side projects.</p>
<h3 id="git-bare-repositories"><a href="#git-bare-repositories">Git bare repositories</a></h3><p>Due to the design of git, you can have a directory acting as the <strong>remote location</strong> of all your working directories, where you can push and pull your files.</p>
<p>How to setup a git repository which can be used as remote location?</p>
<ol>
<li><p>On the server (EC2 instance, or any other system accessible through SSH) run the following command to create the git directory:</p>
<pre><code class="language-bash">mkdir -p $HOME/git-repos/<REPO_NAME>.git && cd $HOME/git-repos/<REPO_NAME>.git
git init --bare
</code></pre>
<p> The first command creates an empty directory where our files will be and the second one initialises that directory to hold a git repository. The <code>--bare</code> argument is the important bit, which tells git to create a repository without a <strong>working directory</strong>, meaning that nobody will use this directory as his workspace to work on the files. It will be simply used as a remote location for other clones of this git repository.</p>
</li>
<li><p>On any system that you want to work with the above repository just clone it and work with it as you would with any other git repo.</p>
<pre><code class="language-bash">git clone username@serverHostname:~/git-repos/<REPO-NAME>.git
</code></pre>
</li>
</ol>
<p>The good thing of git bare repositories is that they can be used from any system that has access to the server hosting them. I know that lots of people are using this approach but instead of hosting their <strong>bare repos</strong> on a server, they store them in Dropbox or Google Drive. They achieve the accessibility and availability aspect as long as they keep their systems in-sync, but the privacy aspect still resides in the amount of trust you have on Dropbox and Google. BTW, I am a huge fan of Google services like Google Drive and Google Keep, which I use for pretty much all my storage and note taking needs. However, there are some stuff for which I need some extra privacy (i.e. nuclear codes)!</p>
<h3 id="self-hosted-git-server"><a href="#self-hosted-git-server">Self hosted git server</a></h3><p>As I said, my git server of choice, at least for the time being, is Gitea, especially due to the low resource consumption on my server. It provides a nice Github-like web interface which you can use to read and edit your files using any web browser, and also has simple organisation of the repositories it creates just like any normal git bare repo which you can use like explained above if you want to bypass the server.</p>
<p>It is amazingly simple to setup as described in <a href="https://docs.gitea.io/en-us/install-from-binary/">https://docs.gitea.io/en-us/install-from-binary/</a>.</p>
<pre><code class="language-bash"># Download the git server
wget -O gitea https://dl.gitea.io/gitea/1.0.1/gitea-1.0.1-linux-amd64 && chmod u+x gitea

# Run the server
./gitea web
# Or Run the server in the background
nohup ./gitea web &
</code></pre>
<p>Once you have the server running, access it through a web-browser and you will be prompted for the first setup, which will ask you to register a user (automatically becomes the admin user as well), and to choose your database (SQLite3 should be sufficient without other dependencies).</p>
<p>Obviously, you need to have access to your server over HTTP or HTTPS preferably.</p>
<p>As a side note, I have to mention that Gitea stores all the git repositories you create through its web interface in a directory you define during the first setup and these repositories are just like any git bare repository you create on your own. In a nutshell, Gitea is just a web proxy to git bare repos that does user management and provides a web interface to them.</p>
<h3 id="backups-and-encryption"><a href="#backups-and-encryption">Backups and encryption</a></h3><p>OK, I achieved availability and accessibility since I can access my git server from anywhere using a web browser or over an SSH connection. We achieved incredibly easy management of my files through any editor and any system that can manage git repositories, without forcing us to use a single website UI. But what happens with encryption and privacy?</p>
<p>In my case, I use a T2.nano EC2 instance as my server which I use for personal projects few years now. The <a href="https://aws.amazon.com/ebs/">EBS volume</a> backing the instance is using <a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSEncryption.html">Amazon EBS Encryption</a> so we achieve our encryption needs. The security of the server itself is another topic but following AWS best-practices around IAM security rules it should be fine.</p>
<p>In addition, just to avoid any problems when that horribly dark day comes when my host will die or stop working, I take backups of all my repos and store them in <a href="http://docs.aws.amazon.com/AmazonS3/latest/dev/serv-side-encryption.html">S3 with server-side encryption</a> after every <code>push</code> I do on the repos. I also take weekly backups of the whole EBS which will make it very easy to just spin up another server with the exact configuration if this one fails.</p>
<p>The script I use for my repository backups is as shown below, and it assumes that all the git repositories I create, on my own or through Gitea, are under the <code>$HOME/git-repos</code> path.</p>
<pre><code class="language-bash">#!/bin/bash

set -e

GIT_REPO_DIR="git-repos"
S3_BUCKET="s3://your-s3-bucket-name/git-repos/"

TIMENOW="$(date --rfc-3339=seconds | tr ' ' '_' | tr ':' '-' | tr '+' '.')"
FILENAME="git-repos.$TIMENOW.tgz"

DEST_TAR="/tmp/$FILENAME"

tar -cvzf "$DEST_TAR" -C "$HOME" "$GIT_REPO_DIR"

aws s3 cp "$DEST_TAR" "$S3_BUCKET$FILENAME" --storage-class STANDARD_IA --sse
</code></pre>
<p>If you don’t want to get into the trouble of enabling webhooks in Gitea to backup after every push, you can just do periodic backups using cronjobs. Assuming that the above backup script file is located in <code>/home/user/bin/backup-git-repos-s3.sh</code> a simple cronjob to take a backup every day at 20:00 is shown below.</p>
<pre><code class="language-bash">crontab -e

0 20 * * * /home/user/bin/backup-git-repos-s3.sh > /tmp/crontab.backup-git-repos.log 2>&1
</code></pre>
<p>I personally use Amazon S3 for my backups cause I already use it for hosting all my static websites and for general backup storage since it’s dirty cheap. You could easily modify the script above to upload the file in Google Drive, or any other preferred service.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>The first initial setup for the Git self hosted server and the backup automation might seem more work than just signing up on an online service, but for me every it totally paid off, especially since I already had a server running. I have a super flexible way to keep track of personal projects I don’t want to have public, or to store sensitive information without being afraid of prying employee eyes. And with the backups, in case something goes wrong with my server I have perfectly good archives to work with.</p>
<p>Although I solved my problem, I still wish there was a <strong>completely client-side web application</strong> integrated with Google Drive or Amazon S3, which will provide capability to encrypt my notes before storing them, without transmission of any data to third-parties (apart of course from the storage service itself).</p>
<p>To be honest, I was super amazed that there are so many online products integrating with Github, Dropbox, Google Drive, and others, but still none of them provides privacy and client-side encryption before sending the data over. Therefore, I still plan to implement that super-duper web application, so keep an eye for it over the next coming months :)</p>

</article>
<article>
<h1>Solve the eight queens problem with Elixir</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Mon, 01 May 2017 00:00:00 GMT</p>
<h2 id="problem"><a href="#problem">Problem</a></h2><p>While I was learning Elixir I wanted something to use the <a href="http://elixir-lang.org/getting-started/comprehensions.html">for-comprehension</a> to solve a problem. I remembered the <a href="https://en.wikipedia.org/wiki/Eight_queens_puzzle">Eight queens puzzle</a> back from my college courses so I decided to give it a go.</p>
<p>I adapted the problem a bit, so my solution takes as inputs the following:</p>
<pre><code>Inputs
======
n: number of queens to position on the board
m: size of the board side

Outputs
=======
List of all the solutions (List[List[int]])
</code></pre>
<h2 id="naive-solution"><a href="#naive-solution">Naive Solution</a></h2><p>The solution is simple, and uses <strong>backtracking</strong> to iterate over all possibilities and select the valid ones.</p>
<pre><code class="language-elixir">defmodule Queens do

  @doc """
  Given n number of queens and m the size of the checkerboard, find all solutions to 
  position each queen so that it does not collide with any other queen vertically, 
  horizontally or diagonally.
  """
  def solve(0, _m), do: [[]]
  def solve(n, m) do
    for done_queens <- solve(n-1, m),
        avail_pos <- (Enum.to_list(1..m) -- done_queens),
        safe_pos(avail_pos, done_queens, 1), 
      do: [avail_pos | done_queens]
  end

  defp safe_pos(_, [], _), do: true
  defp safe_pos(pos, [queen | queens], distance) do
    (pos != queen + distance) and 
    (pos != queen - distance) and 
    safe_pos(pos, queens, distance+1)
  end

end
</code></pre>
<p>Obviously, this is <strong>not</strong> the fastest algorithm to solve this problem (exponential complexity), but it shows how elegant the solution can be using Elixir’s comprehensions.</p>
<p>Sample output in <strong>iex</strong>:</p>
<pre><code class="language-elixir">iex(47)> c("queens.ex")
[Queens]
iex(48)> :io.write Queens.solve(4, 4)
[[3,1,4,2],[2,4,1,3]]:ok
iex(49)> :io.write Queens.solve(3, 4)
[[2,4,1],[1,4,2],[4,1,3],[3,1,4]]:ok
iex(50)> :io.write Queens.solve(5, 5)
[[4,2,5,3,1],[3,5,2,4,1],[5,3,1,4,2],[4,1,3,5,2],[5,2,4,1,3],[1,4,2,5,3],[2,5,3,1,4],[1,3,5,2,4],[3,1,4,2,5],[2,4,1,3,5]]:ok
iex(51)> :io.write Queens.solve(1, 5)
[[1],[2],[3],[4],[5]]:ok
</code></pre>
<p><strong>Demo:</strong> <a href="http://elixirplayground.com?gist=688be58a64712a172878a58683ed0eda">http://elixirplayground.com?gist=688be58a64712a172878a58683ed0eda</a></p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>Every day I learn more and more Elixir (and Erlang) and I can say that it’s among the few languages that managed to keep me interested and excited for more than 3-4 months (looking at you Scala and Python).</p>

</article>
<article>
<h1>AWS S3 sync - only modified files, using git status</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sat, 24 Sep 2016 00:00:00 GMT</p>
<h2 id="problem"><a href="#problem">Problem</a></h2><p>As you know I build this website using my custom site generator builder so the whole website is being re-created every time I change something. Similar tools are widely used by many people, like <a href="https://jekyllrb.com/">Jekyll</a>.</p>
<p>Then I use the <a href="http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html">AWS S3 sync</a> command to update the public version of the site in my S3 bucket. The only problem with this is that since the local website is fully re-built every time it seems to be <strong>newer</strong> than the remote website in the S3 bucket (due to newer modification timestamps), resulting in extra uploads for all files. </p>
<h2 id="git-solution"><a href="#git-solution">Git Solution</a></h2><p>The easiest solution to solve this problem is to use <strong>git</strong> to handle diff changes and then just pass along the modified files to the <strong>aws s3 sync</strong> command that will sync them against the remote S3 bucket.</p>
<p>So, long story short, assuming you have the site’s directory git-versioned the following script will sync the directory with the remote S3 bucket, including adding new files, removing deleted files, etc.</p>
<pre><code class="language-bash">#!/bin/bash
set -ex

FILES=()
for i in $( git status -s | sed 's/\s*[a-zA-Z?]\+ \(.*\)/\1/' ); do
    FILES+=( "$i" )
done
#echo "${FILES[@]}"

CMDS=()
for i in "${FILES[@]}"; do
    CMDS+=("--include=$i""*")
done
#echo "${CMDS[@]}"

echo "${CMDS[@]}" | xargs aws s3 sync . s3://www.lambrospetrou.com --dryrun --delete --exclude "*" 
</code></pre>
<p><strong>Important</strong></p>
<p>You have to remove the <code>--dryrun</code> option in order to actually apply the changes remotely, otherwise it will just fake them.</p>
<h3 id="explanation"><a href="#explanation">Explanation</a></h3><p>The important part of the above script is the <code>--include</code> and <code>--exclude</code> filters. The order of the filters <strong>matters</strong>, that’s why we have the exclude first, and the includes last. In case the exclude was last nothing would be updated.</p>
<p>The two for-loops generate the required <code>--include=FileX</code> arguments, which are expanded using the <code>"${CMDS[@]}"</code> trick. Then <strong>xargs</strong> takes care of sending them as last arguments to the aws s3 sync command, also taking care of very long list of files that exceed the command line length limit.</p>
<p>In addition, I have to use <code>git status</code> instead of <code>git diff</code> otherwise new files will not be synced, since they are not part of the index tree.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>Using <strong>git</strong> along with <strong>aws CLI</strong> it’s very easy to maintain my website and only upload the real <strong>diff</strong>, modified files, each time. One can imagine that this can be used in a much more advanced scenario with <a href="https://developer.github.com/webhooks/">Github webhooks</a> integrated with <a href="https://aws.amazon.com/codepipeline/">AWS CodePipeline</a> or any other CI tool that will release your website automatically.</p>
<h2 id="references"><a href="#references">References</a></h2><ul>
<li><a href="http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html">AWS S3 sync documentation</a></li>
<li><a href="https://git-scm.com/docs/git-status">git status documentation</a></li>
</ul>

</article>
<article>
<h1>Transfer your contacts to an old Nokia Symbian S40 mobile phone</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Thu, 30 Jun 2016 00:00:00 GMT</p>
<p>Every now and then we have to change phones and transfer contacts from one device to another. It’s a trivial thing to do nowadays with Android, iOS and Windows Phone, but try to do it on an old Nokia Symbian S40 phone. It’s 2016 and this simple thing is just <strong>fuxxing too complex</strong>. And it’s standards inconsistencies and Nokia to blame!</p>
<h2 id="problem"><a href="#problem">Problem</a></h2><p>I have my contacts in <a href="https://contacts.google.com">Google Contacts</a>, how they got there from an older S40 device is an even more fascinating story but let’s leave it for another post. Now I just want to import these contacts into a <a href="http://www.gsmarena.com/nokia_301-5323.php">Nokia 301</a>, which runs Nokia Symbian S40, and yes some people still use this kind of phones, obviously not me :)</p>
<h2 id="solution-1-nokia-pc-suite"><a href="#solution-1-nokia-pc-suite">Solution 1 - Nokia PC Suite</a></h2><p>Nokia PC Suite was a nice tool by Nokia, <strong>if and only if</strong> it worked with your phone. In order to make it to recognise the device I had to install specific versions of the software (Nokia PC Suite version 7.1.180.94) and specific version of the Nokia Connectivity Cable driver (version 7.1.182.0).</p>
<p>OK, we are connected and we can manage the device through the PC suite. Now, what?</p>
<p>First of all, I needed to have the contacts in a form that can be imported somehow by the PC suite, so I exported all the contacts from Google to <strong>.csv</strong> and <strong>.vcf</strong> formats just in case one did not work.</p>
<p>There is a handy option in PC suite when you open the contacts section:</p>
<pre><code>File => Import
</code></pre>
<p>which allows you to select any .csv, .vcf file to directly import into your phone. Of course this would be too easy :) so neither the .csv nor the .vcf import worked using the files from Google Export.</p>
<p>The problem with <strong>.csv</strong> is that Google’s csv format follows a different structure compared to Nokia’s PC suite so that’s a <strong>no-go</strong>. The problem with <strong>.vcf</strong> is that this one file was containing ~1000 contacts but the import functionality was only importing 1 contact (the same happens when you send this .vcf file directly to the phone via Bluetooth).</p>
<p>After spending several hours going through myriads of forums with people asking for a solution I found an article written in 2009 that provides a solution to split this single <code>.vcf</code> file into multiple <code>.vcf</code> files, one for each contact. The tool mentioned in the article, which I also used is <a href="http://www.philipstorry.net/software/vcardsplit">vCard Split</a> by <a href="http://www.philipstorry.net/">Philip Storry</a>.</p>
<p><strong>Very important</strong></p>
<ul>
<li>In step 3 (see screenshot below) you have to select <strong>Force version 2.1</strong> otherwise Nokia PC suite will complain and fail to import the contacts. And also select/check the <strong>Remove Type information</strong> option because Google puts a TYPE section in each .vcf which is again not recognizable by PC suite.</li>
</ul>
<p></p>
<p>Awesome, we have all our contacts splitted in a single .vcf file per contact. Now the last thing to do is just select all these .vcf files, COPY them (CTRL + C) and then PASTE them (CTRL + V) into the PC suite contacts section, or simply use the <strong>File => Import</strong> wizard and select all the .vcf files from the opened window.</p>
<p>Great, the tool seems to be importing the files and finishes successfully. <strong>But</strong>, there is a catch :) It seems that Nokia only cares about english/latin characters and any Greek character is converted into a weird symbol which makes your contacts <strong>useless</strong>. </p>
<p>I searched for quite some time to find a way to bypass this limitation but it’s not possible, so PC suite is out of the question. If you want Greek contact names, Nokia PC suite is a <strong>BIG FAIL</strong>.</p>
<h2 id="solution-2-nokia-transfer-app-and-an-android-phone"><a href="#solution-2-nokia-transfer-app-and-an-android-phone">Solution 2 - Nokia Transfer app and an Android phone</a></h2><p>To be honest this was the first solution I tried but it instantly failed so I went on with PC suite, until that one also failed and I returned to this one. This time I noticed that it was just a small step missing to make it work.</p>
<p>Nokia bundles with the Nokia 301 an application named <strong>Tansfer</strong> which you can also manually install on your own by downloading the <strong>.jar/.jad</strong> setup files and using PC suite to install them on the device. I manually installed the latest version <strong>Transfer 1.0.11</strong>.</p>
<p>This application’s functionality is transferring data from <strong>some</strong> phones over to your device via <strong>bluetooth</strong>. I have a Motorolla Moto X and Moto G3 so I tried to use this and transfer the contacts directly from the Android phone to the Nokia phone. The problem was that I was always getting the error <strong>This service is not supported by the phone</strong> with both Androids, which was weird.</p>
<p>Then I went with PC suite route but we know how that ended up. So I decided to give another go on this simple method. This time, the magic was that I deleted the <strong>bluetooth pairing</strong> between the phones on both of them. </p>
<p>The following steps did the job:</p>
<ol>
<li>Turn on bluetooth on Moto G3</li>
<li>Turn on bluetooth on Nokia 301</li>
<li>On Nokia 301 go to bluetooth trusted devices and search for new one. Select the Moto G3 and accept the prompts in both phones, <strong>checking/selecting</strong> the option shown in Moto G3 which says <strong>Allow Nokia 301 to read contact information</strong>.</li>
<li>Open <strong>Nokia Transfer</strong> app on Nokia 301 (inside the Applications menu option)</li>
<li>Select <strong>Add new device</strong> and wait until Moto G3 appears and select it</li>
<li>Select Moto G3 from the populated list</li>
<li>Select <strong>Import contacts</strong></li>
<li>Wait for the import to finish :)</li>
</ol>
<p>And now I finally have all the contacts from Google into the Nokia 301.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>If you have a compatible device with Nokia Transfer app it is by far the easiest way to import your contacts into a new Nokia S40 phone, but remember to delete the pairing and then do it properly allowing access to contacts.</p>
<p>If you want to use PC Suite, unfortunately you cannot import contacts with Greek names, and most probably any contact name with non-ANSI characters.</p>
<p><strong>But please</strong>, upgrade your phones to something modern, preferably Android :) </p>
<h2 id="references"><a href="#references">References</a></h2><ul>
<li><a href="http://www.philipstorry.net/software/vcardsplit">vCard Split</a></li>
<li><a href="http://answers.microsoft.com/en-us/mobiledevices/forum/mdasha/nokia-suite-support-for-nokia-301/58a39104-d1ff-4ff1-91af-983ded51ea27">Nokia Suite Support for Nokia 301 - Microsoft Answers</a></li>
</ul>

</article>
<article>
<h1>Update Route53 record set with EC2 instance public IP for a DIY load balancer</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Wed, 29 Jun 2016 00:00:00 GMT</p>
<p>In this small tutorial I provide a code snippet that allows you to update a record set in <a href="https://aws.amazon.com/route53/">Route53</a> with the public IP of an <a href="https://aws.amazon.com/ec2/">EC2</a> instance programmatically.</p>
<h2 id="why-just-use-an-elastic-load-balancer"><a href="#why-just-use-an-elastic-load-balancer">Why? Just use an Elastic Load Balancer!</a></h2><p><a href="https://aws.amazon.com/elasticloadbalancing/">Elastic Load Balancing</a> is an awesome service that handles load balancing of the traffic to your instances very well. But, it comes with a price, literally. An ELB charges a small amount for every hour running, pretty much like a <strong>T2.small</strong> instance (as of time of writing). This is dirty-cheap when you have tens or thousands of instances behind it, but for 1-2 <strong>T2.nano</strong> instances it might seem overkill.</p>
<p>I play around a lot with AWS and I create small projects, websites mostly, where a single node is more than enough to handle all the load, thus paying for an additional load balancer is too much.</p>
<p>Moreover, there are many people that have use cases where ELB is not the right solution and they prefer to have <strong>Route53</strong> act as their load balancer. Loggly team wrote a <a href="https://www.loggly.com/blog/why-aws-route-53-over-elastic-load-balancing/">very nice article</a> describing this approach and I recommend it for anyone interested in real-world advantages.</p>
<h2 id="alternatives"><a href="#alternatives">Alternatives?</a></h2><p>There are several custom solutions to avoid using an ELB. Most of them take advantage of <a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/elastic-ip-addresses-eip.html">Elastic IPs</a> and Route53 API to programmatically update record sets.</p>
<h3 id="solution-1-elastic-ip-only"><a href="#solution-1-elastic-ip-only">Solution 1 - Elastic IP only</a></h3><p>This solution is as simple as possible and only applies to single instance applications. When the EC2 instance boots you provide a user data script which will associate your Elastic IP (you should create this before booting the instance) with the instance.</p>
<p>Find more information for the exact CLI command at the <a href="http://docs.aws.amazon.com/cli/latest/reference/ec2/associate-address.html#examples">official EC2 CLI reference</a>.</p>
<p>I recently discovered that this is how <strong>single-node</strong> configuration of Elastic Beanstalk is implemented (very similar).</p>
<h3 id="solution-2-route-53-only"><a href="#solution-2-route-53-only">Solution 2 - Route 53 only</a></h3><p>This solution requires a bit more work from you in terms of scripting but it is more flexible than having an Elastic IP for each instance. Inside the user data script of your launch-configuration you have full access to the AWS CLI. Therefore, you can pretty much do anything! </p>
<p>Let’s assume that you have an application at <strong>test.lambrospetrou.com</strong> served by a <strong>single</strong> instance. You want to programmatically update the <strong>A</strong> record of this record set to point to any new instance being created by your <strong>max-1-min-1-desired-1</strong> auto-scaling group.</p>
<p>It turns out it is very simple to do :)</p>
<p>First of all we need to find the public IP of the running instance. AWS provides several <a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html#instancedata-data-retrieval">metadata information to every EC2 instance</a> and public IP is one of them.</p>
<pre><code class="language-bash">curl http://169.254.169.254/latest/meta-data/public-ipv4
</code></pre>
<p>The next step is to update the <strong>test.lambrospetrou.com</strong> record set. It is pretty easy to navigate through the <a href="http://docs.aws.amazon.com/cli/latest/reference/route53/change-resource-record-sets.html">Route53 - ‘change-resource-record-sets’ documentation</a> which we find that the JSON we need is pretty much the following:</p>
<pre><code class="language-json">{
  "Comment": "Update the A record set",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "test.lambrospetrou.com",
        "Type": "A",
        "TTL": 300,
        "ResourceRecords": [
          {
            "Value": "127.0.0.1"
          }
        ]
      }
    }
  ]
}
</code></pre>
<p>In the above JSON you can see that we want to do an <strong>UPSERT</strong> (update) for the <strong>test.lambrospetrou.com</strong> record set, with type <strong>A</strong> since we want to point to an IP address, and with value 127.0.0.1.</p>
<p>In order to do the record set update you need the following command (assume that the above JSON is the content of a file named <strong>update-route53-A.json</strong>):</p>
<pre><code class="language-bash">aws route53 change-resource-record-sets --hosted-zone-id "$HOSTED_ZONE_ID" --change-batch file://./update-route53-A.json
</code></pre>
<p>In the above command you have to put the correct <strong>Hosted zone id</strong> where the record set resides. You can find this in the <a href="https://console.aws.amazon.com/route53/">Route53 console</a> or with the following command:</p>
<pre><code class="language-bash">aws route53 list-hosted-zones-by-name
</code></pre>
<p>If you try and play with the above command you will notice that the update is pretty much instant, which means that as soon as the instance is up and running, your application will be available using the domain.</p>
<p>There is one variation of the above command which accepts a string instead of a file, which makes it easier to use through scripting. The only difference is that the JSON we examined above needs to be enclosed in a field named <strong>ChangeBatch</strong>:</p>
<pre><code class="language-json">{ "ChangeBatch": { /*INPUT_JSON_LIKE_BEFORE*/ } }
</code></pre>
<p>and the command to use this JSON string is as follows:</p>
<pre><code class="language-bash">aws route53 change-resource-record-sets --hosted-zone-id "$HOSTED_ZONE_ID" --cli-input-json "$INPUT_JSON_STR"
</code></pre>
<p>To summarise, these are the only two commands we need to do what we want:</p>
<ol>
<li><p>Get the public IP of the running instance</p>
<pre><code class="language-bash">curl http://169.254.169.254/latest/meta-data/public-ipv4
</code></pre>
</li>
<li><p>Update the Route53 record set</p>
<pre><code class="language-bash">aws route53 change-resource-record-sets --hosted-zone-id "$HOSTED_ZONE_ID" --cli-input-json "$INPUT_JSON_STR"
</code></pre>
</li>
</ol>
<p>Of course you will want to do some more scripting to replace the <strong>127.0.0.1</strong> value with the proper IP and programmatically find the hosted zone id of your domain. The following snippet is a quick/hacky way of accomplishing this. Feel free to use it if you are bored to come up with a better one :)</p>
<pre><code class="language-bash">#!/bin/sh

if [ -z "$1" ]; then 
    echo "IP not given...trying EC2 metadata...";
    IP=$( curl http://169.254.169.254/latest/meta-data/public-ipv4 )  
else 
    IP="$1" 
fi 
echo "IP to update: $IP"

HOSTED_ZONE_ID=$( aws route53 list-hosted-zones-by-name | grep -B 1 -e "lambrospetrou.com" | sed 's/.*hostedzone\/\([A-Za-z0-9]*\)\".*/\1/' | head -n 1 )
echo "Hosted zone being modified: $HOSTED_ZONE_ID"

INPUT_JSON=$( cat ./update-route53-A.json | sed "s/127\.0\.0\.1/$IP/" )

# http://docs.aws.amazon.com/cli/latest/reference/route53/change-resource-record-sets.html
# We want to use the string variable command so put the file contents (batch-changes file) in the following JSON
INPUT_JSON="{ \"ChangeBatch\": $INPUT_JSON }"

aws route53 change-resource-record-sets --hosted-zone-id "$HOSTED_ZONE_ID" --cli-input-json "$INPUT_JSON"
</code></pre>
<p>As an additional note, the above example updates the A record with a single IP. You can easily adapt the script to retrieve the current IPs of the record set and append the new one to them. This way you can even achieve simple round robin load balancing between nodes exploiting Route53’s weighted routing. Imagination is the limit to what you can achieve :)</p>
<h3 id="solution-3-route53-amp-lambda"><a href="#solution-3-route53-amp-lambda">Solution 3 - Route53 & Lambda</a></h3><p>Another interesting and even more flexible way of achieving what I explained above, and way much more, is utilising <a href="https://aws.amazon.com/lambda/">AWS Lambda</a> functions. The combination of Lambda with events and Route53 is <strong>super-powerful</strong> and can implement very complex configuration updates that are usually very difficult.</p>
<p>A lot of people are already playing with Lambda for doing network configuration changes triggered by <strong>autoscaling</strong>. For example, instead of doing the update of the Route53 record set in the user data section of the EC2 launch-configuration, you could setup a Lambda function to be invoked when a new instance has been created (or terminated) and do your changes using the official AWS SDKs available in Lambda functions.</p>
<p>Many AWS tutorials are available in the <a href="https://aws.amazon.com/blogs/">AWS blogs</a> but you can find some in the References section below.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>The above solution is by no means comprehensive or suitable for everyone and every use-case. Most of the time you will <strong>want</strong> to use Elastic Load Balancing which provides cross-AZ balancing, monitoring, and transparent load balancing in front of several auto-scaling groups.</p>
<p>In some cases though you want to keep it simple or <strong>cheap</strong>, and this is where these solutions are preferred!</p>
<p>Feel free to contact me for any mistakes you find in the snippets above or if you have a better solution for the aforementioned problem.</p>
<p>Happy AWS Clouding :)</p>
<h2 id="references"><a href="#references">References</a></h2><ul>
<li><a href="https://www.loggly.com/blog/why-aws-route-53-over-elastic-load-balancing/">Why Loggly Chose Amazon Route 53 over Elastic Load Balancing</a></li>
<li><a href="https://aws.amazon.com/blogs/compute/using-aws-lambda-with-auto-scaling-lifecycle-hooks/">Using AWS Lambda with Auto Scaling Lifecycle Hooks</a></li>
<li><a href="https://aws.amazon.com/blogs/compute/building-a-dynamic-dns-for-route-53-using-cloudwatch-events-and-lambda/">Building a Dynamic DNS for Route 53 using CloudWatch Events and Lambda</a></li>
<li><a href="http://docs.aws.amazon.com/autoscaling/latest/userguide/lifecycle-hooks.html">Auto Scaling Lifecycle Hooks</a></li>
<li><a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/elastic-ip-addresses-eip.html">Elastic IP Addresses</a></li>
<li><a href="http://docs.aws.amazon.com/cli/latest/reference/ec2/associate-address.html#examples">Official EC2 CLI - Associate Address reference</a></li>
</ul>

</article>
<article>
<h1>Banana or Human, and Marginal Degradation</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sat, 11 Jun 2016 00:00:00 GMT</p>
<p>A few months ago I attended a talk given by Monotype’s <a href="http://monotype.de/studio/steve-matteson">Steve Matteson</a> who designed awesome fonts like <strong>Open Sans</strong>, Android’s <strong>Droid family</strong> and Microsoft’s <strong>Segoe family</strong> among others, and by <a href="http://markboulton.co.uk/">Mark Boulton</a> who is now Design Director at <a href="http://monotype.com/">Monotype</a> and founder of <a href="http://www.markboultondesign.com/">Mark Boulton Design</a>, which was acquired by Monotype :)</p>
<p>The talk was pretty great and as you can imagine from their background, it was about Typefaces and their importance in a brand’s or a product’s value. <strong>Yes</strong>, I am a software engineer but I <strong>do</strong> care about the design and the aesthetics of my work! As much as I hate Apple, <strong>Steve Jobs</strong> was the leading exemplar of the fusion between technology and design, and will always be one of my role models.</p>
<p>Anyway, the one thing that stood out and impressed me the most out of the talk was Mark’s speech about <strong>Marginal Degradation</strong>. I found it so basic and simple but at the same time so powerful that it just sticked to my mind. After a few days I decided to read more about it and found a <a href="http://markboulton.co.uk/journal/marginal-degredation">related article in Mark’s blog</a> which is basically a scripted version of what he said. I always paid attention to the smallest detail, but after reading and hearing about this I just made it a habit! I will give you a brief summary below but I encourage you to read the whole article.</p>
<p></p>
<blockquote>
<p>Here’s a few fun facts… - Gorillas share 98.4% of our DNA - Goldfish share 68% - Bananas share 50%</p>
<p>Bananas. Are 50% the same as us.</p>
</blockquote>
<p><strong>Read</strong> the above quote, then read it <strong>again</strong>, and then <strong>again one last time!</strong></p>
<p>Only 1.6% of DNA differentiates us from gorillas and we are half-way to becoming bananas! Don’t get all fancy science genius with me now, I don’t know how much really that is in DNA world, but the fact still holds. Even the slightest marginal degradation of our DNA can make us inferior.</p>
<p>Now consider this to whatever you do. I write code as a living, you might be writing literature, or you might be designing the next website for the most awesome startup ever. You <strong>have to</strong> pay attention to detail, even the smallest one! Do not let anyone tell you “Just do it quickly for now, the manager wants it urgently. We can revisit this later.” <strong>No</strong>, you will <strong>not</strong> revisit it later and you will <strong>not</strong> fix this little sucker. It is there and it will stay there, degrading and sucking out the value and quality of your product.</p>
<blockquote>
<p>Your brand or design is supposed to be a human, but people perceive it as a gorilla. Or a banana.</p>
</blockquote>
<p>You have to really think the consequences of everything you sacrifice by not giving 100% of yourself in doing something. It might seem like a tiny thing at the moment, but if you take into account <strong>all</strong> those little tiny things you sacrificed over the years, you will find out how much you deviated from your original goal. </p>
<p>Nobody wants his product or brand to be seen as a banana, when it was supposed to be a human!</p>
<h2 id="takeouts"><a href="#takeouts">Takeouts</a></h2><ul>
<li>Pay attention to the details!</li>
<li>Do not sacrifice quality for a moment’s weakness or lack of time!</li>
<li>Always have in mind the <strong>original</strong> goal/target of what you want to achieve, otherwise every little deviation might lead to something entirely different!</li>
</ul>
<h2 id="references"><a href="#references">References</a></h2><ul>
<li><a href="http://markboulton.co.uk/journal/marginal-degredation">The difference between a goldfish and a human</a></li>
</ul>

</article>
<article>
<h1>Keyboard shortcut for opening context menu</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Tue, 07 Jun 2016 00:00:00 GMT</p>
<p>This is going to probably be the shortest article I wrote, or will write!</p>
<p>Have you ever wanted to open the right click menu on something without lifting your hands from the keyboard? I did a lot of times, especially when navigating through videos wanting to <strong>open</strong> them <strong>with</strong> the non-default application (which can be done with the <strong>Open with</strong> option in the right click context menu).</p>
<p>Well, you can do it with the following keyboard shortcut:</p>
<pre><code class="language-bash">Shift + F10
</code></pre>
<p><strong>Warning:</strong> It does not work in <strong>all</strong> the applications, but it does in the file manager I use which is enough for me :)</p>
<p>Alternatively, you can set keyboard shortcuts to trigger mouse events including right click, but this is out of this article’s scope.</p>

</article>
<article>
<h1>Backup your Wordpress and serve it as a static website</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 05 Jun 2016 00:00:00 GMT</p>
<p>A few months ago I started migrating all my projects and websites into AWS. The last thing standing was my old wordpress blog (hosted at <strong>mastergenius.net</strong>, not alive anymore). Today, I decided to make a final backup and create a static version out of it and ditch the VPS I have running just for that.</p>
<p>It turned out to be extremely easy :) </p>
<h2 id="backup-all-content"><a href="#backup-all-content">Backup all content</a></h2><p>Apart from serving the content as static website I wanted to have the posts in a form that can be easily imported later back in Wordpress and also a format which is easy to parse and read using a text editor.</p>
<p>I just wanted the <strong>text</strong>, which is the heart of any article anyway, so I just did the usual <strong>export</strong> feature of Wordpress as described in the <a href="https://en.support.wordpress.com/export/">official export documentation</a>.</p>
<p>A full folder copy is recommended too, you know just in case I want to revisit any files, images or code!</p>
<h2 id="serve-as-static-website"><a href="#serve-as-static-website">Serve as static website</a></h2><p>There are a lot of ways to dump an active wordpress installation and several plugins that allow you to do this conversion easily. I just took the simplest way and used the well-known <strong>wget</strong> Linux command line tool (original Quora question link with the command is in the References section).</p>
<pre><code class="language-bash">wget -k -K  -E -r -l 10 -p -N -F --restrict-file-names=windows -nH http://active-wordpress.domain.com
</code></pre>
<p>After executing the above command I deleted all the unnecessary files, stripping down the whole site in some MBs.</p>
<p>I just uploaded the static files to <a href="http://aws.amazon.com/s3/">Amazon Simple Storage Service - S3</a> and that’s all!</p>
<p>Zero-cost hosting :)</p>
<h2 id="references"><a href="#references">References</a></h2><ul>
<li><a href="https://www.quora.com/How-do-you-export-a-WordPress-site-to-a-static-HTML">How do you export a WordPress site to a static HTML?</a></li>
<li><a href="https://en.support.wordpress.com/export/">Wordpress export documentation</a></li>
</ul>

</article>
<article>
<h1>Fun with Typescript, Webpack and ReactJS</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 22 May 2016 00:00:00 GMT</p>
<p>I decided to play around with <a href="https://www.typescriptlang.org/">Typescript</a> instead of <a href="https://babeljs.io/">BabelJS</a>, which I already tried in the past, for this weekend’s project. </p>
<h2 id="reasons-for-typescript"><a href="#reasons-for-typescript">Reasons for Typescript</a></h2><p>During the last year I explored several Javascript <strong>languages/flavors</strong> (transpilers, etc.). If you read my previous articles you should know that I also like <a href="https://www.dartlang.org/">Dart</a>. I started using it in 2014 along with <a href="https://github.com/angular/angular.dart">AngularDart</a> throughout its journey to the stable version 1.0, always going through refactorings and fixing breaking changes. But Dart made me feel again productive and gave me a sensible platform to do front-end development. Unfortunately, it did not pick up as many fans as Google (and me) wanted so it was kinda left behind.</p>
<p>I really believe that the future lies in languages that compile to Javascript or even something else, <strong>WebAssembly</strong> maybe? (<a href="https://developer.mozilla.org/en-US/docs/WebAssembly">WebAssembly</a> and <a href="https://medium.com/javascript-scene/what-is-webassembly-the-dawn-of-a-new-era-61256ec5a8f6#.pb2824qir">Article</a> )</p>
<p>I already gave a go to Babel, so this time it was time for Typescript. Why choose Typescript and not stay with Babel is justified by the points below.</p>
<ul>
<li>I love static typed languages :) and Typescript is much closer to that rather than JS</li>
<li>I prefer compilation errors much more than runtime exceptions</li>
<li>I believe it will be alive for at least 3 more years since it is backed by <strong>Microsoft</strong> and <strong>Google</strong> and is used as the main language for <strong>Angular 2</strong></li>
<li>Better IDE support (if I decide to use one)</li>
</ul>
<h2 id="project"><a href="#project">Project</a></h2><p>I wanted to not spend more than 1-2 days so I have chosen something super simple to implement. I decided to develop a <strong>numerology</strong> calculator of your life path number :) Fancy stuff!</p>
<p><a href="/articles/webdev-typescript-webpack-react-numerology/demo/">Calculate your life path number at the demo page</a></p>
<h2 id="build-process"><a href="#build-process">Build Process</a></h2><p>My main goal with the project was to come up with a build process using <strong>Typescript</strong> with <strong>ES6</strong>, <strong>Webpack</strong> and <strong>ReactJS</strong>. I also wanted to use <strong>Autoprefixer</strong> and <strong>SASS</strong> so I added <strong>Gulp</strong> to the mix too, but I tried to keep it at a bare minimum.</p>
<p>The final process is explained in the following sections.</p>
<h3 id="code"><a href="#code">Code</a></h3><p>I wanted from the beginning to consolidate the code building into one tool, and the decision was pretty easy since <a href="http://webpack.github.io/">Webpack</a> is the big winner (at least for now).</p>
<p>The requirements are simple:</p>
<ul>
<li>Write my code in Typescript</li>
<li>Use ES6 features</li>
<li>Support browsers that only have ES5 support</li>
<li>As a bonus I also wrote the gulpfile in Typescript and used a hack-around to use it to actually compile the Typescript code :)</li>
</ul>
<p>For <strong>ReactJS</strong> I needed to install <strong>react</strong> and <strong>react-dom</strong> along with their typings to allow Typescript to resolve the React types. Pretty smooth process and I also found out something I did not know before. With webpack I can use the npm modules during development but <strong>do not</strong> bundle them along with my app code to keep the footprint small. By specifying that these modules are external I just need to use <strong>script</strong> tags in my HTML to import the public ReactJS libraries and take advantage of CDNs and local cached versions.</p>
<pre><code class="language-bash">npm install typescript typings react-dom react --save-dev
typings install dt~react dt~react-dom --global --save
</code></pre>
<p>In order to be able to use some of the ES6 features, like <strong>Object.assign()</strong>, I had to include <a href="https://cdnjs.com/libraries/es5-shim">es5-shim</a> and <a href="https://cdnjs.com/libraries/es6-shim">es6-shim</a> in my HTML code right before I imported the Typescript compiled bundled which targets <strong>ES5</strong>. I also needed to install the typings for ES6 in order to stop Typescript compilation errors when using ES6 features in the code.</p>
<pre><code class="language-bash">typings install dt~es6-shim --global --save
</code></pre>
<p>So, the Typescript compiler will use the typings and compile the code into ES5-valid Javascript and then Webpack will take all the files and output 2 bundles. I like to have separation between the app code which is Typescript and other JS code I might have, so I compile a bundle for the Typescript code <strong>ts.bundle.js</strong> and one for normal JS-code <strong>js.bundle.js</strong>.</p>
<p><strong>Use Typescript for the Gulpfile</strong></p>
<p>I could have used regular ES5 Javascript to write the Gulpfile but I wanted to play more with Typescript. I spend some time to be honest searching for solutions for this and I ended up following <a href="https://medium.com/@pleerock/create-a-gulpfile-and-write-gulp-tasks-using-typescript-f08edebcac57#.55q6zomio">this awesome guide</a> adapting it where needed.</p>
<h3 id="sass-and-build"><a href="#sass-and-build">SASS and Build</a></h3><p>No surprises here! Gulp is used to clean the build directory, to use <a href="https://github.com/postcss/autoprefixer">Autoprefixer</a> to prefix new/useful CSS with browser-prefixes and to build the final CSS using <a href="http://sass-lang.com/">SASS</a>.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>I spend some time back and forth with the Typescript documentation, expected since this is the first time I tried it, but I liked it. I am going to stick with this for a while until I find something much more productive and powerful, <a href="https://www.scala-js.org/">ScalaJS</a> might be that one  :)</p>
<p>Finally, I liked ReactJS too, and this is the second fun-project I do using it, but since I am an <strong>Angular die-hard</strong> I have to try the stable version of <a href="https://angular.io/">Angular 2</a> (which just released RC) to decide <strong>if</strong> there is a winner between them.</p>
<p>If you have a better pipeline for the above tools please contact me :)</p>
<p><a href="https://github.com/lambrospetrou/numerology/">Source code is hosted at Github</a></p>

</article>
<article>
<h1>AWS Certified Developer & Solution Architect Associate Certification Tips</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 15 May 2016 00:00:00 GMT</p>
<p>I wanted to write this article for a while now but I waited until I actually passed the certification exam before I started giving tips to others :) </p>
<p>This article is mostly a reference for myself, but it might also be helpful to anyone preparing for the <a href="https://aws.amazon.com/certification/">AWS Certifications</a>. I used the following resource material myself over the last couple of months and since I passed the exam with a pretty high mark I can assume they are good enough to recommend them :)</p>
<p>First of all, you have to bookmark the official <a href="http://aws.amazon.com/certification/certification-prep/">AWS Certification Preparation Guide</a>.</p>
<h2 id="aws-videos-a-name-aws-videos-a"><a href="#aws-videos-a-name-aws-videos-a">AWS videos <a name="aws-videos"></a></a></h2><p>The first thing I started doing when preparing for the exams, <a href="http://aws.amazon.com/certification/certified-solutions-architect-associate/">Solutions Architect</a> and <a href="http://aws.amazon.com/certification/certified-developer-associate/">Developer</a>, is watching all the re:Invent videos from 2015.</p>
<p>It might seem that they are just too many to watch or that you will spend huge amount of time, but trust me, they are going to help you more than any documentation or tips or online trainings. They will not only give you insights for individual services but also architectural design best-practices which you can apply into your own work environment even if you are not AWS-hosted.</p>
<p>Personally, I watched pretty much all the 2015 re:Invent videos over the span of 3 months, but you can just focus on the <strong>fundamentals</strong> and the <strong>deep dives</strong> and skip the <strong>war stories</strong> from several companies.</p>
<p>Below you will find the links to the <strong>AWS re:Invent 2015</strong> playlist on Youtube and to the <strong>AWS Summit 2016 - Chicago</strong> playlist which is just a few weeks old :)</p>
<p>The re:Invent videos are marked with a department and a 3-digit code denoting the pre-requisite knowledge you need to have in order to being able to follow the speaker. Remember the college course naming scheme? It’s the same!</p>
<h3 id="aws-re-invent-2015-a-name-aws-reinvent-2016-videos-a"><a href="#aws-re-invent-2015-a-name-aws-reinvent-2016-videos-a">AWS re:Invent 2015 <a name="aws-reinvent-2016-videos"></a></a></h3><p>You can find the videos at <a href="https://www.youtube.com/user/AmazonWebServices/playlists?view=50&shelf_id=15&sort=dd">AWS re:Invent 2015 Youtube playlist</a></p>
<p>Some of my favourites are (no particular order):</p>
<ul>
<li><a href="https://www.youtube.com/watch?v=_wiGpBQGCjU&list=PLhr1KZpdzukc9aw8-gnLmyralfsBv7zcR&index=11">(SEC302) IAM Best Practices to Live By</a></li>
<li><a href="https://www.youtube.com/watch?v=Du478i9O_mc&list=PLhr1KZpdzukc9aw8-gnLmyralfsBv7zcR&index=13">(SEC305) How to Become an IAM Policy Ninja in 60 Minutes or Less</a></li>
<li><a href="https://www.youtube.com/watch?v=3HDQsW_r1DM&list=PLhr1KZpdzukdTMmq1gkXs7g6WIIXtL5r9&index=1">(STG201) State of the Union: AWS Storage Services</a></li>
<li><a href="https://www.youtube.com/watch?v=1TvJCLl9NNg&list=PLhr1KZpdzukdTMmq1gkXs7g6WIIXtL5r9&index=8">(STG401) Amazon S3 Deep Dive and Best Practices</a></li>
<li><a href="https://www.youtube.com/watch?v=gUAuhdtHacI&list=PLhr1KZpdzukdTMmq1gkXs7g6WIIXtL5r9&index=13">(STG206) Using Amazon CloudFront For Your Websites & Apps</a></li>
<li><a href="https://www.youtube.com/watch?v=WL2xSMVXy5w&index=10&list=PLhr1KZpdzukdRxs_pGJm-qSy5LayL6W_Y">(ARC307) Infrastructure as Code</a></li>
<li><a href="https://www.youtube.com/watch?v=4trGuelatMI&list=PLhr1KZpdzukfVW6NrpDzdT6Sej0p5POkN&index=6">(CMP201) All You Need To Know About Auto Scaling</a></li>
<li><a href="https://www.youtube.com/watch?v=SZAvtbrIBAk&list=PLhr1KZpdzukfVW6NrpDzdT6Sej0p5POkN&index=11">(CMP402) Amazon EC2 Instances Deep Dive</a></li>
<li><a href="https://www.youtube.com/watch?v=4VfIINg9DYI&index=6&list=PLhr1KZpdzukeMbjRqGswHX38DCqOHZ5GA">(DAT407) Amazon ElastiCache: Deep Dive</a></li>
<li><a href="https://www.youtube.com/watch?v=ggDIat_FZtA&index=16&list=PLhr1KZpdzukeMbjRqGswHX38DCqOHZ5GA">(DAT401) Amazon DynamoDB Deep Dive</a></li>
<li><a href="https://www.youtube.com/watch?v=5_bQ6Dgk6k8&index=1&list=PLhr1KZpdzukcjwZgFBBTmSNPjf_gImgfx">(NET201) VPC Fundamentals and Connectivity Options</a></li>
<li><a href="https://www.youtube.com/watch?v=SMvom9QjkPk&index=2&list=PLhr1KZpdzukcjwZgFBBTmSNPjf_gImgfx">(NET406) Deep Dive: AWS Direct Connect and VPNs</a></li>
</ul>
<p>The above are just a few selections to <strong>start with</strong> and by no means conclusive. I also recommend you watching <strong>re:Invent 2014 and 2013</strong> videos especially <strong>deep-dives</strong>.</p>
<h3 id="aws-summit-2016-chicago-a-name-aws-summit-2016-chicago-videos-a"><a href="#aws-summit-2016-chicago-a-name-aws-summit-2016-chicago-videos-a">AWS Summit 2016 - Chicago <a name="aws-summit-2016-chicago-videos"></a></a></h3><p>You can find the videos at <a href="https://www.youtube.com/playlist?list=PLhr1KZpdzukc2_5o7YTT7e2dlKBEKR1ez">AWS Summit 2016 - Chicago</a></p>
<p>This videos are not as structured as the re:Invent ones but you can as easily filter out the service-oriented ones from the war-story-oriented ones.</p>
<p>Again a few selections:</p>
<ul>
<li><a href="https://www.youtube.com/watch?v=9-7azhB27So&list=PLhr1KZpdzukc2_5o7YTT7e2dlKBEKR1ez&index=23">Deep Dive on Amazon Relational Database Service</a></li>
<li><a href="https://www.youtube.com/watch?v=652Wf1KKedk&list=PLhr1KZpdzukc2_5o7YTT7e2dlKBEKR1ez&index=17">DevOps on AWS</a></li>
<li><a href="https://www.youtube.com/watch?v=MDeKncXDAgk&list=PLhr1KZpdzukc2_5o7YTT7e2dlKBEKR1ez&index=41">Deep Dive on Amazon Elastic Block Store</a></li>
</ul>
<p>There is some overlap with re:Invent videos in most of the summit ones but there is also additional valuable content.</p>
<h2 id="aws-whitepapers-a-name-aws-whitepapers-a"><a href="#aws-whitepapers-a-name-aws-whitepapers-a">AWS Whitepapers <a name="aws-whitepapers"></a></a></h2><p><strong>Whitepapers</strong> are my second favourite resources after the re:Invent videos. They present some of the most important topics of deploying services on AWS and provide best-practices and pitfalls to avoid in the process.</p>
<p>You can find them all at <a href="http://aws.amazon.com/whitepapers/">AWS Whitepapers</a>.</p>
<p>Recommended reading:</p>
<ul>
<li><a href="http://d0.awsstatic.com/whitepapers/Building%20Static%20Websites%20on%20AWS.pdf">Building Static Websites on AWS - an Astonishing Modern Architecture</a></li>
<li><a href="http://d0.awsstatic.com/whitepapers/AWS%20Storage%20Services%20Whitepaper-v9.pdf">AWS Cloud Storage Services Overview</a></li>
<li><a href="http://d0.awsstatic.com/whitepapers/architecture/AWS_Well-Architected_Framework.pdf">AWS Well-Architected Framework</a></li>
<li><a href="http://d0.awsstatic.com/whitepapers/AWS_Cloud_Best_Practices.pdf">Architecting for the AWS Cloud: Best Practices</a></li>
<li><a href="http://d0.awsstatic.com/whitepapers/Security/Intro_to_AWS_Security.pdf">Introduction to AWS Security</a></li>
<li><a href="http://d0.awsstatic.com/whitepapers/aws-security-best-practices.pdf">AWS Security Best Practices</a></li>
<li><a href="http://d0.awsstatic.com/whitepapers/overview-of-deployment-options-on-aws.pdf">Overview of Deployment Options on AWS</a></li>
<li><a href="https://d0.awsstatic.com/whitepapers/AWS_Securing_Data_at_Rest_with_Encryption.pdf">AWS Securing Data at Rest with Encryption</a></li>
</ul>
<h2 id="documentation-and-faqs-a-name-doc-faq-a"><a href="#documentation-and-faqs-a-name-doc-faq-a">Documentation and FAQs <a name="doc-faq"></a></a></h2><p>Another great resource is obviously the documentation of each service and its FAQ. I would strongly advise you to go over the <strong>FAQ</strong> of the most important services, depending on your exam, and if you have time then go over the docs. </p>
<p>I have to warn you :) <strong>FAQs</strong> are a <strong>must!!!</strong></p>
<p>You can find links to most of the FAQs at <a href="https://aws.amazon.com/faqs/">https://aws.amazon.com/faqs/</a>
Or you can visit the FAQ of the service you want directly by visiting the appropriate link, <strong>aws.amazon.com/SERVICENAME/faqs/</strong>. For example, the Simple Storage Service (S3) FAQ will be at <strong>aws.amazon.com/s3/faqs/</strong>.</p>
<p>Personally, I read the FAQs while travelling in the aeroplane or in the tube/buses. As a result, I created a small script to extract the main content out of the website of each FAQ and I put them on my Kindle :) resulting in a much better and easier read!</p>
<p>You can find the small script on Github: <a href="https://github.com/lambrospetrou/aws-faq-client/">AWS FAQ client</a></p>
<h2 id="aws-training-classes-a-name-aws-training-class-a"><a href="#aws-training-classes-a-name-aws-training-class-a">AWS Training classes <a name="aws-training-class"></a></a></h2><p>The official <a href="http://aws.amazon.com/certification/certification-prep/">AWS Certification Preparation Guide</a> suggests to attend training classes depending on the exam you want to pass.</p>
<p>I attended two of them so far, the <a href="http://aws.amazon.com/training/course-descriptions/developing/">Developing on AWS</a> and the <a href="http://aws.amazon.com/training/course-descriptions/sysops/">System Operations on AWS</a>. I have to admit that the developing one apart from some advices and tips from the trainer himself it was very basic and I would not recommend it to someone that already uses AWS. However, if you are just starting with AWS it is a <strong>great</strong> introduction to the <strong>core</strong> services. The <strong>SysOps</strong> training on the other hand was <strong>amazing</strong>. I highly recommend this training to anyone because it covers things from advanced monitoring, to tagging, to cost-optimization, to custom AMIs and deployment options and pretty much things that I doubt you will ever do on your own or if you are not already working as a systems administrator.</p>
<p><strong>Update @2016-05-30</strong></p>
<p>Last week, I attended the <a href="http://aws.amazon.com/training/course-descriptions/architect/">Architecting on AWS</a> training course. I would really recommend it to anyone that will attempt the certification exam since it is a very nice overview of all the architectural aspects of AWS. It goes over specific use-cases, optimizes and finds bottlenecks of the architecture for certain scenarios which requires you to know about several services and how they can be used together.</p>
<p>Overall, if money is no problem (most companies cover the expenses anyway), I would really recommend you to attend these 3-day training courses. But, it really depends on your knowledge and expertise level. </p>
<h2 id="aws-qwiklabs-a-name-aws-qwiklabs-a"><a href="#aws-qwiklabs-a-name-aws-qwiklabs-a">AWS Qwiklabs <a name="aws-qwiklabs"></a></a></h2><p>I highly recommend you to do as many <a href="https://run.qwiklab.com/">AWS Qwiklabs</a> as you can. They range from quick to long practical labs on most of the services and basic concepts you will need for the exams and for actual work on AWS.</p>
<p>Start by doing the <strong>free</strong> ones first, but I really encourage you to try some of the advanced ones too. Even better if you could convince your company to pay for them. Not only they cover in-depth concepts but they allow you to use expensive services that would be cost-prohibitive to use and play with using your own personal account. </p>
<h2 id="projects-projects-projects-a-name-projects-a"><a href="#projects-projects-projects-a-name-projects-a">Projects, projects, projects <a name="projects"></a></a></h2><p><strong>Most Important</strong></p>
<p>As with anything you want to learn, actual practice provides the best learning experience.</p>
<p>I did several small projects, and wrote about them in previous articles, where I actually used most of the services covered by the exams. This provided me with invaluable knowledge and insights as to how some services work which I could not acquire just by reading stuff. After all, the exam is called <strong>Developer Associate</strong> because you need to be able to develop and build systems, not just know about them :) I can assure you that the exam contains questions that you cannot answer correctly just by reading the docs.</p>
<p><strong>Qwiklabs</strong> can significantly contribute to this if you opt-in for the advanced labs but again, from personal experience, if you do not build something from scratch on your own you do not really learn.</p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>I provided the most important resources I used to prepare for the exam and get myself acquainted with most of the AWS services. I am pretty sure if you follow my advices you will have great results, but by no means consider this an exhaustive list of content or tips.</p>
<p>Watch as many videos as you can from all the <strong>re:Invents</strong>, do as many <strong>qwiklabs</strong> as you can, and most importantly try to do small (or big) projects utilising a variety of AWS services.</p>
<p>The content of the different examinations has a significant amount of overlap, <strong>but</strong> focuses on different aspects of each topic. For example, the <strong>developer</strong> exam will concentrate on performance, API, and implementation details of a service whereas the <strong>architect</strong> exam will focus on orchestrating and utilising the correct services to solve a problem.</p>
<p><strong>Good luck :)</strong></p>

</article>
<article>
<h1>Yet another Spito re-write, on the Cloud (AWS)</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Mon, 25 Apr 2016 00:00:00 GMT</p>
<p>Finally, I found some time to re-write my super-duper URL shortener yet once again :)</p>
<p>As part of my preparation for the <a href="https://aws.amazon.com/certification/">AWS Certifications</a> I plan on taking, I wanted to make an architectural re-design of <strong>Spi.to</strong> completely cloud-based on AWS. Last weekend, I finally convinced myself to stop watching shows on <a href="https://www.amazon.co.uk/av">Amazon Video</a> and did it.</p>
<h2 id="architecture-overview"><a href="#architecture-overview">Architecture overview</a></h2><p>In the following diagram we can see an overview of the application’s architecture. It is is one of the simplest applications you can make in a weekend but at the same time it allows you to use many cloud services following best-practices in order to achieve fault-tolerance, high availability and durability, which is similar to what you would do in a real super-scalable service.</p>
<p></p>
<p>You can see that it is a pretty basic setup but without sacrificing performance, availability or durability!</p>
<h2 id="services"><a href="#services">Services</a></h2><p>The services I use in this application can be identified from the diagram above but read below for a small description as to the <strong>why</strong> use each service.</p>
<p>For more information about <strong>Route 53</strong>, <strong>Cloudfront</strong>, and <strong>S3</strong> regarding hosting a static website on AWS you can <a href="/articles/migrate-to-aws-static-website/">read my previous article</a>.</p>
<h3 id="amazon-route-53"><a href="#amazon-route-53">Amazon Route 53</a></h3><p><a href="https://aws.amazon.com/route53/">Route 53</a> is a cloud-based DNS management service which I use to manage my domain <strong>spi.to</strong>. I do not use advanced features of the service for this application but you should check it out because it is simply awesome.</p>
<h3 id="amazon-cloudfront"><a href="#amazon-cloudfront">Amazon Cloudfront</a></h3><p><a href="https://aws.amazon.com/cloudfront/">Cloudfront</a> is one of those services that once you understand how it works and play a little with it, you just love it. It is Amazon’s CDN solution but at the same time can act as a reverse proxy to your backend or as a faster gateway to your services instead of going throughout the public network.</p>
<p>I use it in front of my static website server (explained in S3 section) and my REST API service (explained in Elastic Beanstalk section) which are the two origins servicing my application. </p>
<p>In cloudfront I specify certain behaviors for caching depending on the files but I also include some path patterns to direct each request to the appropriate backend (S3 or API). You can see below a snapshot of the rules I have at the moment.</p>
<p></p>
<p>The above rules are evaluated top-down until the path matches one rule and that rule is applied without going further down the rule-chain.</p>
<p>You can observe that the first rule ensures that all <code>/api/</code> calls are going to our <strong>API backend</strong> whilst whatever request comes with <code>.</code> (dots) or <code>/</code> (slashes) will go to the <strong>website client</strong> backend. The fourth rule ensures that whatever request comes with <strong>at least</strong> one character, I want to match the <strong>hash ids</strong> now, will go to the API backend. The last rule ensures that the root path <code>spi.to/</code> without any path (since we covered all other cases) will go to the <strong>website client</strong>, which is the homepage.</p>
<p>There might be a simpler solution, but I could not find a way to use classes of characters in path patterns. If you find one or I missed something in the documentation please contact me :)</p>
<p>Also, another important thing with Cloudfront that a <strong>lot</strong> of people ignore is that you <strong>CAN</strong> use it with dynamic services. In my case the API is strictly dynamic since the spits have expiration dates and they cannot be just cached. You can specify that a path will have no caching, which means that you will just use Cloudfront as a proxy to your service, and as I said before it might be beneficial to your users because the communication to your servers will be done inside the AWS network which has all sorts of optimizations.</p>
<h3 id="amazon-s3-simple-storage-service"><a href="#amazon-s3-simple-storage-service">Amazon S3 - Simple Storage Service</a></h3><p>I use <a href="https://aws.amazon.com/s3/">Amazon S3</a> in order to serve the web client (website) which you can access by visiting <a href="http://spi.to">http://Spi.to</a>. The client is a static web application written in Dart which consumes the <strong>Spito API</strong> (REST API explained below).</p>
<p>Amazon S3 as discussed in my previous article, <a href="https://lambrospetrou.com/articles/migrate-to-aws-static-website/">Hosting a static website at AWS</a>, is just amazing for static content (especially when combined with Amazon Cloudfront).</p>
<h3 id="amazon-dynamodb"><a href="#amazon-dynamodb">Amazon DynamoDB</a></h3><p><a href="https://aws.amazon.com/dynamodb/">Amazon DynamoDB</a> is used to store the text or the URL that you upload to the service. I was considering to use S3 again for that but Dynamo is much faster for small texts. I will update the API in the future and use S3 for large-sized text (now the service only allows you to post text up to 128KB).</p>
<p>DynamoDB has very low latency and allows the service to provide super-fast retrieval of the so-called <strong>Spits</strong> (my naming for the uploaded text) and also provides durability and high-availability of the data.</p>
<h3 id="aws-elastic-beanstalk"><a href="#aws-elastic-beanstalk">AWS Elastic Beanstalk</a></h3><p><a href="https://aws.amazon.com/elasticbeanstalk/">Elastic Beanstalk</a> is a tool that I just learnt recently, and I loved it instantly. Super-easy to use and you get all the benefits of autoscaling and custom bootstrapping out-of-the-box. You, as a user, just need to upload your source code or binary of the application.</p>
<p>Beanstalk handles the <strong>Spito API</strong> servers inside a managed auto-scaling group which by its turn is behind a managed elastic load balancer, all handled by beanstalk itself.</p>
<p>For this service I use <strong>T2.nano</strong> instances in order to keep the costs at a minimal, in case you were interested :)</p>
<p>The Spito API is written in <a href="https://golang.org/">#Go</a> (source code found in <em>Conclusion</em> section).</p>
<h3 id="aws-cloudformation"><a href="#aws-cloudformation">AWS CloudFormation</a></h3><p><a href="https://aws.amazon.com/cloudformation/">AWS Cloud Formation</a> is a <strong>future</strong> feature I will add to the project in order to automate the infrastructure creation. As developers we love code, so everything needs to be in code :)</p>
<p><em>Coming soon</em></p>
<h2 id="conclusion"><a href="#conclusion">Conclusion</a></h2><p>Through this project I learnt a ton about <strong>Elastic Beanstalk</strong> and how to use <strong>Cloudfront</strong> even for dynamic requests, and also re-used the knowledge and tricks from my last project (the aforementioned article about static websites).</p>
<p>I open-sourced the application, both the website and the REST API, but keep in mind that this is <strong>only</strong> the application. I still haven’t added the CloudFormation template which creates and bootstraps the required AWS services in an automated way.</p>
<p>Source code</p>
<ul>
<li><a href="https://github.com/lambrospetrou/spitoweb">Spitoweb client</a></li>
<li><a href="https://github.com/lambrospetrou/spito">Spito API</a></li>
</ul>
<p><em><strong>Warning</strong>: The client (spitoweb) has been written a long time ago so it clearly does not follow Dart best-practices, however it could be useful to read if you are into Dart too :)</em></p>
<p><strong>Ah,</strong> and before I forget, you can find <strong>Spito</strong> at <a href="http://spi.to">http://spi.to</a>.</p>
<p>Any comment or feedback is appreciated. </p>
<h2 id="references"><a href="#references">References</a></h2><ul>
<li><a href="https://lambrospetrou.com/articles/migrate-to-aws-static-website/">Migrate to AWS - Make a static website using S3, Cloudfront and Route 53</a></li>
<li><a href="http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/go-environment.html">Deploying Applications on the Go Platform</a></li>
<li><a href="http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/eb-cli3-configuration.html">Configure the EB CLI</a></li>
<li><a href="http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/distribution-web-values-specify.html#DownloadDistValuesPathPattern">Values that You Specify When You Create or Update a Web Distribution - Cloudfront</a></li>
<li><a href="https://d0.awsstatic.com/whitepapers/Building%20Static%20Websites%20on%20AWS.pdf">Whitepaper - Hosting Static Websites on AWS</a></li>
<li><a href="https://d0.awsstatic.com/whitepapers/overview-of-deployment-options-on-aws.pdf">Whitepaper - Overview of Deployment Options on AWS</a></li>
</ul>

</article>
<article>
<h1>Import Gmail messages into Google Keep as notes</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Sun, 14 Feb 2016 00:00:00 GMT</p>
<p>A few weeks ago a friend asked me for a way to import his Apple Notes into Google Keep. It seems that no easy straighforward way exists as of today so I decided it was a great opportunity to make my first Chrome extension and create my own importer.</p>
<p>It turned out to be a bit more complicated than I expected though, hence no plugin already exists. The reason is that Google as of now does not provide an open API for developers to use in order to import messages which makes the whole project impossible from a programmatible point of view. In the end I managed to overcome this issue and created the extension which you can download from the Chrome store and try it yourself (<a href="https://chrome.google.com/webstore/detail/lp-gmail-keep-importer/ingomolknmgnfbafknpkmklapabaednn">direct link to Chrome Store</a>).</p>
<p>In this tutorial I will first provide instructions how to use the extension for the impatient ones, and then I will go into more details regarding its implementation and the issues encountered during the process.</p>
<h2 id="how-to-use-the-lp-gmail-to-keep-importer"><a href="#how-to-use-the-lp-gmail-to-keep-importer">How to use the LP Gmail to Keep Importer</a></h2><p>The most updated state and instructions about the plugin will always be accessible at its <a href="https://github.com/lambrospetrou/gmail-keep-importer">Github repository</a> where you can also download the source code. Any feedback and pull-requests are welcomed.</p>
<p>The following instructions are valid and work with the Google Keep website as of the date of this article.</p>
<ol>
<li><p><strong>Install</strong> the plugin from the Chrome store: <a href="https://chrome.google.com/webstore/detail/lp-gmail-keep-importer/ingomolknmgnfbafknpkmklapabaednn">Download LP Gmail Keep Importer</a></p>
</li>
<li><p>In order for the plugin to work you <strong>have to</strong>, read again, <strong>have to</strong> be at the Google Keep website (reason explained below). Visit <a href="https://keep.google.com/">Google Keep</a> and login with your account.</p>
</li>
<li><p>Once you are at the Keep website, and you have installed the extension you should be able to see the <strong>LP G2K</strong> icon in Chrome’s toolbar.</p>
</li>
<li><p><strong>Click</strong> the extension’s icon in order to open the plugin and you should see a small window like the picture below.</p>
<p> </p>
</li>
<li><p>The authorization is required only the first time you use the extension. You have to authorize it to access your emails (<strong>read only</strong> don’t worry). </p>
<ul>
<li>Click <strong>Authorize</strong> to be redirected to Google’s authorization page. </li>
<li>Click <strong>Allow</strong> to allow the extension to use your account’s email. </li>
<li>The window will stay open with a blank page. This behavior is a bug from Google’s side and they are looking into a fix, but for now just close the window yourself and refresh the page.</li>
</ul>
</li>
<li><p><strong>Click</strong> again the extension’s icon and now you should see the normal view with the text box (from now on this is what you should see since you already authorised it).</p>
</li>
<li><p>In the text box you have to type the <strong>Gmail label name</strong> you want to import. As you know Gmail organizes your mails, messages, in labels so just find out the label you want to import and type it here or create a new one if you don’t have one. <strong>Apple Notes</strong> automatically saves your notes as emails under the label name <strong>Notes</strong>, so use that if you want to import your Apple notes.</p>
</li>
<li><p>Click <strong>Import GMail Label</strong>. Once the plugin fetches your messages from Gmail, it will show you a small result view with the number of messages found (like the picture below) or it will show an error message if there was a problem during the fetching process.</p>
<p> </p>
</li>
<li><p>Click <strong>Start Importing messages</strong> in order to start adding your messages as notes into Google Keep. If you have hundreds of messages please <strong>be patient</strong> and wait until you see a response, either an error or a success message.</p>
</li>
<li><p>When your messages have been imported you will actually see them in the background to be inside your Keep notes but also you will see a confirmation like the following picture.</p>
<p></p>
</li>
<li><p>You have successfully imported your Gmail messages into Google Keep :)</p>
</li>
</ol>
<p>If you encountered any problems or your experience was different from what I described please contact me in order to fix the issue. As you will read later, the extension is highly experimental and it might not be working as expected in all operating systems (I had some issues with Mac OSX but I tried to resolve them).</p>
<p>In addition, it goes without saying that the importer works for plain emails. If you try to import emails that have attachments or fancy HTML websites embedded only the visible text of the message will be imported. As of version v0.13 the first line of the email becomes the title of the note in Keep.</p>
<h2 id="deep-dive-into-the-juicy-details"><a href="#deep-dive-into-the-juicy-details">Deep dive into the juicy details</a></h2><p>This section contains more details about the problem itself, its issues, and the approach that I used to solve it. If you discovered a better solution contact me or even better do a pull-request at Github with your solution and I will adjust it accordingly.</p>
<h3 id="problem-overview"><a href="#problem-overview">Problem overview</a></h3><p>Initially the problem at hand was to import Apple Notes into Google Keep. Apple Notes stores your notes as messages at Gmail (assuming you use a Google account of course) under the label <strong>Notes</strong> (yes, very innovative name by Apple that will not conflict with anything :D ). This makes it easier for us since the problem now can be reduced to importing from Gmail to Google Keep, which seemed trivial at first glance. Therefore, our problem now is two-fold: <strong>a)</strong> Fetch messages from Gmail and <strong>b)</strong> Create a note in Google Keep with code programmatically. </p>
<p>Gmail has a very nicely explained and easy to use <a href="https://developers.google.com/gmail/api/quickstart/js">Javascript API</a> so the first task of fetching the emails under a label name is trivial and I will not go into more details since there are hundreds of articles explaining how to do this. Personally, I find Google’s documentation to be superb.</p>
<p>The problem is with the second part, creating notes into Google Keep. Google does <strong>not</strong> provide an open API for Keep as it does with Gmail so far, maybe because it is a new (couple of years) service or just because they don’t want developers yet to create apps around it. As a web developer myself I instantly had an idea to overcome this problem. Let’s use the Keep website and use the existing form programmatically, thus typing the note, the title and clicking the Done button, all with code. <strong>Nifty idea right? :)</strong></p>
<h3 id="the-journey-inside-google-keep-website"><a href="#the-journey-inside-google-keep-website">The journey inside Google Keep website</a></h3><p>My first goal was to come up with a way to identify and reference the three major components we need:</p>
<ul>
<li>Title text box</li>
<li>Content text box</li>
<li>Done button (more problematic since the actual text might be different in each language)</li>
</ul>
<h4 id="identification-selectability"><a href="#identification-selectability">Identification - Selectability</a></h4><p>I dived into the HTML source code of the website, always using the awesome Chrome web developer tools, and I tried to detect something distinctive for each one of them. After trying several things the only solution that worked for me since I did not want to use actual text due to translations, is using <strong>CSS classnames</strong>. The only concern I had, have, and will have regarding this is that the classnames are not actual words so I guess Google uses some kind of builder to generate them, so at any moment in the future when Google modifies the Keep website there is a high probability that the classnames will change, thus breaking the plugin. Since I could not find any other identifiable characteristic I decided to go with it and in case this happens in the future I will just update the extension with the new names, until Google release a proper API.</p>
<h4 id="add-note-with-code"><a href="#add-note-with-code">Add note with code</a></h4><p>So far so good, we have a way to reference each HTML element on the website. Let’s try to add some dummy content into the text boxes and simulate the click on the submit button to see what happens. </p>
<pre><code>1. Adding text programmatically: Works
2. Clicking Done: Does not work!
</code></pre>
<p>Hmmm, this was not expected at first because the button <strong>was</strong> clicked, but nothing happened and the note disappeared instead of being added. After diving deeper into the code I was examing the HTML dom of the elements in question throughout a proper manual note creation and saw what changed, where the content was added in reality, and what was the button’s behavior. I discovered that Google uses two elements per input field instead of one in order to allow expandable text areas. Interesting I thought, so I did a little research and found <a href="http://alistapart.com/article/expanding-text-areas-made-elegant">this great article by A List Apart</a> which explains in more details how this is done.</p>
<p><strong>Attempt number two</strong>. Having seen the issue I know referenced the proper element for the input boxes when adding the text, the ones which are really used during the submission of the note. Let’s try again.</p>
<pre><code>1. Adding text programmatically: Works
2. Clicking Done: Does NOT work!
</code></pre>
<p>OK! Google Keep vs Lambros, 2 - 0.</p>
<p>Something was going wrong right? I went over my code several times to make sure I was not being stupid, and strangely enough I wasn’t so the problem was somewhere else. I spent several hours investigating the problem and I had an idea. Maybe it was something special happening when the submit button was pressed in reality that I was missing by doing it with code. Again, Chrome developer tools to the rescue.</p>
<p>I discovered a great tool that allows you to inspect the fired HTML/Javascript events of the page and even put breakpoints in your code to the exact point when it happens. 
If you want to see this and test it:</p>
<ul>
<li>Open the developer tools in Chrome</li>
<li>Click the <strong>Sources</strong> tab</li>
<li>You can see on the right-hand side now a section named <strong>Event Listener Breakpoints</strong></li>
<li>Play with it :)</li>
</ul>
<p>Fast-forward a few hours and I discovered that in order to have a successful note submitted the button when clicked does not only fire the <strong>click</strong> event, but it has to also fire the <strong>mousedown</strong> followed by the <strong>mouseup</strong> events, in this specific order. Additionally, the same has to happen with the input boxes. All the events I use as of now to make a proper submission are shown below.</p>
<ul>
<li>Input boxes (title, content)<ol>
<li>Add the text to the right element</li>
<li>Fire events: <strong>change</strong> => <strong>mousedown</strong> => <strong>mouseup</strong></li>
</ol>
</li>
<li>Submit button<ol>
<li>Fire events: <strong>mousedown</strong> => <strong>mouseup</strong></li>
</ol>
</li>
</ul>
<p>The implementation source code of this logic and the whole note creation is in the <a href="https://github.com/lambrospetrou/gmail-keep-importer/blob/master/chrome-logic-keep.js">chrome-logic-keep.js</a> file at Github.</p>
<p><strong>Attempt number three</strong>. </p>
<pre><code>1. Adding text programmatically: Works
2. Clicking Done: Works
</code></pre>
<p>And voila! We can now add notes into Keep programmatically.</p>
<p>After this, I spent several hours figuring out how Chrome extensions work and went ahead and created my first extension. If you encounter any errors, please contact me or contribute to the project on Github. I have already said that it is a <strong>highly experimental</strong> project and it might break at any time if even the smallest detail changes on Google Keep’s website.</p>
<p>Google Keep vs Lambros, win for Lambros with KO!</p>
<h3 id="as-jeff-bezos-says-it-s-still-day-one"><a href="#as-jeff-bezos-says-it-s-still-day-one">As Jeff Bezos says, “It’s Still Day-One”</a></h3><p>It was a very nice little project, and learned tons of new stuff. <strong>Web is amazing and ever-changing</strong> so we have to always keep learning. </p>

</article>
<article>
<h1>Migrate to AWS - Make a static website using S3, Cloudfront and Route 53</h1>
<p>lambros@lambrospetrou.com (Lambros Petrou) — Thu, 11 Feb 2016 00:00:00 GMT</p>
<p>A few months ago I started studying most of the Amazon Web Services (AWS) and as I said in a previous post one of my goals for this year is to transition all my apps and microservices into AWS.</p>
<p>This tutorial is the first part of a series and explains in detail how to setup a static website with unlimited throughput, durability and availability using <a href="https://aws.amazon.com/s3/">Amazon S3</a>. Additionally, we exploit <a href="https://aws.amazon.com/cloudfront/">Amazon Cloudfront</a> to provide caching and super-fast downloads all over the world and finally, we use <a href="https://aws.amazon.com/route53/">Route 53</a> to serve the website behind our custom domain name (e.g. lambrospetrou.com).</p>
<p>The reason I decided to write this post is because although the documentation in AWS is very good, the majority of all the articles and threads in several forums are a bit outdated and do not cover the new features of the AWS services we are going to use. For example, Cloudfront’s ability to serve GZIP-ed compressed content and its free HTTPS support. The reason is that Amazon releases hundreds of features every year and it is impossible for everyone to catch-up ;)</p>
<p>I will try to structure this tutorial in a way that each part is self-contained and builds upon the previous part, in order to allow you to pick-and-choose only the parts that you want for your website.</p>
<h2 id="overview"><a href="#overview">Overview</a></h2><p>The most important service when it comes to static content is Amazon S3, which stands for Simple Storage Service. This is one of the <strong>best</strong> web services available as of this moment and not just among Amazon’s offerings but other competitors’ too. It is a file storage solution that offers super high-availability, extreme durability (eleven nines, 99.999999999%) and it is very cheap. <strong>Amazon S3 alone allows you to have a proper static website ready in a few minutes.</strong></p>
<p>The other two services, Cloudfront and Route 53 are going to extend our website to provide caching in edge locations all over the world, to allow users to download the files from locations closer to them, and with Route 53 we will allow the usage of custom domains instead of using the default AWS ones (e.g. lambrospetrou.com.s3-website-region.amazonaws.com).</p>
<h2 id="make-a-static-website-using-amazon-s3"><a href="#make-a-static-website-using-amazon-s3">Make a static website using Amazon S3</a></h2><p>Before we dive into the step-by-step guide let’s create our scenario.</p>
<p>Assume that throughout this tutorial we want to make a website for <code>lambrospetrou.com</code>. Also we want the www-prefixed domain, <code>www.lambrospetrou.com</code> to redirect the users to the APEX domain, the non-www domain, <code>lambrospetrou.com</code>.</p>
<ol>
<li><p>First we want to create the bucket that will hold our website files. Visit the AWS management console and navigate to the S3 service. (<a href="https://console.aws.amazon.com/s3/">direct link to S3</a>).</p>
</li>
<li><p>Click <strong>Create Bucket</strong> and type in the <strong>Bucket Name</strong>, which in my case it was <code>lambrospetrou.com</code>. It is <strong>very important</strong> that you give your bucket a name that matches exactly your domain. Additionally, choose the region where you want your bucket to reside (choose the one closer to your users).</p>
</li>
<li><p>Click <strong>Create</strong> and your new bucket should be visible under the list of All Buckets.</p>
</li>
<li><p>Now click on the bucket you just created and select <strong>Properties</strong> from the tabs in the right side of the dashboard (you should see options like the picture below).</p>
<p> </p>
</li>
<li><p>The next step is to make this bucket act like a website.</p>
<ul>
<li>Click on the <strong>Static Website Hosting</strong> option and select <strong>Enable Website hosting</strong>.</li>
<li>Type <strong>index.html</strong> in the <strong>Index Document</strong> option (and optionally fill the <em>Error Document</em> option if you plan to have an error page.). This option specifies that when a user navigates to a directory we want the index.html file of that website to be downloaded.</li>
<li>Click <strong>Save</strong>.</li>
</ul>
</li>
<li><p>Now the last step is to add permissions to the bucket to allow <strong>anyone</strong> to access its files.</p>
<ul>
<li>Click the <strong>Permissions</strong> option.</li>
<li>Click <strong>Add bucket policy</strong> and paste in the following snippet (remember to replace <code>lambrospetrou.com</code> with your bucket’s name):</li>
</ul>
<pre><code>{
    "Version": "2012-10-17",
        "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::lambrospetrou.com/*"
        }
    ]
}
</code></pre>
<ul>
<li>Click <strong>Add CORS Configuration</strong> and paste in the following snippet:</li>
</ul>
<pre><code><?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="https://s3.amazonaws.com/doc/2006-03-01/">
    <CORSRule>
        <AllowedOrigin>*</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
        <MaxAgeSeconds>3000</MaxAgeSeconds>
        <AllowedHeader>Authorization</AllowedHeader>
    </CORSRule>
</CORSConfiguration>
</code></pre>
<ul>
<li>Click <strong>Save</strong>. The above snippets allow everyone to request all files inside this bucket.</li>
</ul>
</li>
<li><p>That’s it! You have now a working website without the need to provision any server or any other infrastructure to support your static website. Whatever you upload in this bucket is going to be accessible from all over the world and rest assured that your website will always be online ;)</p>
</li>
<li><p>The link to use in order to access your website is under the <strong>Static Website Hosting</strong> option, where it says <strong>Endpoint</strong>. It should be something similar to <code>lambrospetrou.com.s3-website-region.amazonaws.com</code>. I will refer to this endpoint throughout the rest of the tutorial as <strong>S3-Endpoint</strong>, so take a note of it.</p>
</li>
<li><p>Try to open that link using your browser. Of course you will see nothing since you haven’t uploaded anything to your bucket yet ;p</p>
</li>
</ol>
<h3 id="make-the-www-subdomain-to-redirect-to-the-apex-domain"><a href="#make-the-www-subdomain-to-redirect-to-the-apex-domain">Make the www-subdomain to redirect to the APEX domain</a></h3><p>In this section I will describe how we can make all requests to <code>www.lambrospetrou.com</code> to be redirected to <code>lambrospetrou.com</code>. Again we only need to use Amazon S3 for this part. The idea is very simple. We will create another bucket named <code>www.lambrospetrou.com</code> but now instead of putting files into this bucket we will specify redirection rules to the bucket we created before.</p>
<ol>
<li><p>Create another bucket following the previous steps 1-4 but this time use the www-prefixed domain (e.g. <code>www.lambrospetrou.com</code>) as the bucket name.</p>
</li>
<li><p>The next step is to add the redirection rules.</p>
<ul>
<li>Click the <strong>Static Website Hosting</strong> option.</li>
<li>Select <strong>Redirect all requests to another host name</strong></li>
<li>Fill the input box named <strong>Redirect all requests to:</strong> with your non-www domain name, which is also the bucket name your created before (e.g. <code>lambrospetrou.com</code> like the picture below).</li>
</ul>
<p> </p>
<ul>
<li>Click <strong>Save</strong></li>
</ul>
</li>
<li><p>Finished!</p>
</li>
</ol>
<h3 id="test-your-website"><a href="#test-your-website">Test your website</a></h3><ol>
<li><p>Upload your website to the non-www bucket you created first, or just create an <code>index.html</code> file with the following snippet:</p>
<pre><code class="language-html"><html>
    <head>
        <title>Awesome website on AWS S3
    
    
        
Hello world!

    



Try to access your website using your S3-Endpoint (lambrospetrou.com.s3-website-region.amazonaws.com). You should see your website, if not there is a problem so contact me if you cannot figure out what step went wrong.

Try to access your www-prefixed endpoint and make sure that you are redirected to the non-www one.


Use a custom domain with our website instead of the Amazon S3 endpoints
In this section, I will describe how you can use your domain name (e.g. lambrospetrou.com) to point to your website on S3 using Route 53, which is Amazon’s DNS solution offering.

Register your domain with any registrar or Route 53 itself if you do not have a domain in hand.

Open the Route 53 management console (direct link).

Click Hosted zones from the left navigation menu.

Click Create Hosted Zone in order to setup our domain.

For Domain Name you have to specify the non-www domain (e.g. lambrospetrou.com)
Click Create


Open the hosted zone you just created and make sure that it has the NS and SOA record sets.

Now you have to set the nameservers of your domain to the servers specified under the NS record. This is required only if your registrar is not Route 53. You can continue with the tutorial but in order for you to be able to see the changes you have to change the nameservers of your domain.

Now we have to creare a Record Set for our domain in order to point to the S3 bucket we created above.

Click Create Record Set
Leave the Name as blank (empty).
Set the type to A - IPv4 address
Select Alias - YES
Type into the Alias Target your S3-Endpoint for your non-www bucket (e.g lambrospetrou.com.s3-website-region.amazonaws.com).
VERY IMPORTANT: do not select from the suggested targets, type the full S3-Endpoint yourself (I will explain later on why).
If this gives you an error about an invalid value, i.e Alias target contains an invalid value then use the S3 website region endpoint for alias target, like s3-website-eu-west-1.amazonaws.com. (note the dot at the end!).
Click Create.


Repeat step 7 in order to create a Record Set for your www-prefixed domain but now you have to use www as the Name of the Record Set, and the www-prefixed S3-Endpoint as an Alias Target.

You should allow a few minutes (or hours) for your changes to propagate through the network and assuming that your successfully changed the nameservers of your domain you should be able to visit your naked domain, lambrospetrou.com, and see your website as you would through the S3-Endpoint. Also, when you visit the www-prefixed domain you should be redirected to the naked one.


You can stop now if you want since you managed to create a static website that is very durable, highly-available, cheap, and that uses your custom domain. The rest of the tutorial will cover Amazon Cloudfront and how caching can affect your users all over the world.
Use Amazon Cloudfront to provide super-fast latencies all over the world
Again, we will use Cloudfront to provide a cache layer in front of our website for several reasons.
First of all, Cloudfront is a CDN (content delivery network) with edge locations scattered around the world which means that our users will always download the files from locations close to them avoiding long routing trips to the other side of the world (the location of your website is the region you specified during the S3 bucket creation).
In addition, Cloudfront is very useful because it allows us to use GZIP compression on most of the static files out-of-the-box, which is a new feature released in 2015 (related link).
Another very important advantage is that we can use HTTPS to access our website. By default S3 supports only HTTP requests through the custom domain, although using the S3-Endpoint allows for HTTPS. With Cloudfront we can use HTTPS from the user to AWS and then Cloudfront internally will use HTTP to communicate with S3 and return the content back to the user over HTTPS again. This feature in conjuction with the fact that now Cloudfront can be associated with an Amazon SSL key allows us to use HTTPS for Free. Both features were released in January 2016 (AWS Certificate Manager, Cloudfront Origin Security Features).
Let’s go ahead and create our first Cloudfront distribution.

Open the Cloudfront management console (direct link).

Click Create Distribution and then click Get Started under the Web section (see picture below).
 

Now you have to be very careful here.

Option Origin Domain Name should be set to the full S3-Endpoint of your non-www bucket (e.g. lambrospetrou.com.s3-website-region.amazonaws.com)
Leave the other options default for now since you can change these later as your website evolves (see two pictures below).

 
 

The only option you might want to change for now is the Forward Query String depending on whether your website uses query strings from users. By default Cloudfront will just strip out the query string before forwarding the request to S3 so enable this if your want it.
Additionally you will most probably want to enable the option Compress Objects Automatically since it will GZIP most of the static files, thus leading to even faster download times.


In the Distribution Settings choose the Price Class you want depending on your user-base but bear in mind that the more edge locations you support the higher prices you pay.

In the Alternate Domain Names (CNAMEs) you have to write in the custom domain names you want to use to point to this distribution. In my case I just have the naked domain lambrospetrou.com.

The next step is to setup our HTTPS certificate (if you do not want HTTPS just skip this step and leave the defaults)

Select Custom SSL Certificate under the SSL Certificate option.
If you have already uploaded your own custom certificate to AWS use that, otherwise click Request an ACM certificate to get an Amazon certificate for free.
In the opened page type in your naked and www-prefixed domains (e.g. both lambrospetrou.com and www.lambrospetrou.com, see picture below).

 

Click Review and Request and make sure to confirm the certificate activation.
Now you should be able to select the newly created certificate from the dropdown menu (sometimes you might have to refresh the page to make the certificate available to the dropdown)


Click Create Distribution and wait for the status to become Deployed instead of In Progress.

Once the distribution status is Deployed use the Domain Name of your distribution to access your website (e.g. xxxxxxxxxxxx.cloudfront.net). If we did everything right you should now be able to see the exact page as if you visited your website through your domain or through the S3-Endpoint.


Setup a Cloudfront distribution for the www-prefixed domain
Now we need to create another distribution for the www-prefixed domain in order to use the second bucket we created that just redirects to the first.

Repeat steps 1-6 but now use the www-prefixed domain in steps 3 and 5.

Once your distribution is deployed try to visit it using the distribution’s Domain Name and make sure that the redirection works properly.


Set your domain in Route 53 to point to Cloudfront distribution
The last step is to make our custom domains (e.g. lambrospetrou.com and www.lambrospetrou.com) point to the Cloudfront distributions rather than the S3 buckets directly.
Since we have already setup Record Sets in Route 53 we just need to update them.

Open the Route 53 management console (direct link).

Click Hosted Zones from the left navigation menu.

Select your domain hosted zone (e.g. lambrospetrou.com)

Click on the naked (non-www) domain record set and update the Alias Target now to have the Domain Name of the corresponding Cloudfront distribution (e.g. xxxxxxxxxxx.cloudfront.net). Click Save Record Set.

Repeat step 4 for the www-prefixed domain to use the second Cloudfront distribution.


Finally, Done!
Congratulations, you have created a website that is impossible to hack, it will always be available, it will always have exceptional performance, it is amazingly easy to update, you have HTTPS, you have GZIP compression for static files, you use your custom domain names and it should not cost you more than a few dollars (mainly for Cloudfront) unless you have billions of visitors, in which case I guess you make enough money with the content of the website ;)
The most interesting part of this architecture is that although this scenario uses entirely static websites, you can very easily extend this to support dynamic websites too and at the same time keep the caching layer for all the benefits it brings. You will just need to create another Cloudfront distribution that will handle a specific prefix of your domain, or even a subdomain, and which will point to your server (EC2, Elastic Load Balancer, etc.) in order to create the dynamic content of the website. This scenario and much more are explained in high detail in an amazing whitepaper by AWS which you can download for free named Hosting Static Websites on AWS. It covers the topics we implemented in this tutorial along with more advanced topics like A/B testing, user sharding, logging, security and many more.
I hope you liked the tutorial, but even if you did not I would really appreciate your feedback to improve it (I am pretty sure you can find where to send me a message).
Yes, my website at the moment (as of the time of this article) is hosted on S3, with Cloudfront distributions as defined above and Route 53 to handle my custom domain.
Notes

The reason that we use the full S3-Endpoint when we refer to it either from Cloudfront distributions or from Route 53 Record Sets is simple. When S3 acts like a website, like in our scenario, it handles automatically the redirects and the retrieval of index.html files when the request is done on a directory. In order to use these features when behind of a custom domain we have to use the full S3-Endpoint otherwise if we use the links suggested in the AWS documentation or those suggested form the dropdown menus in the console these features will not work.

You can play along with the caching behavior of Cloudfront distributions. With the default values, if you do not specify a Cache-Control header on an object in S3, the defaults are going to be used which based on the documentation use 1 day of caching. Depending on your website you can specify to cache for less time, forward query strings, forward custom headers or not, etc.

You can use the S3-Endpoint during development for easier testing since Cloudfront will serve the cached version to you which will annoy you during testing ;p Otherwise you can invalidate some files in your distribution, but even that takes time so think carefully about your caching technique (times, versionings, etc.) and use S3-Endpoints to ease up your dev process.


References

AWS Cloudfront Documentation - Working with Web Distributions
AWS S3 Documentation - Setting Up a Static Website




Create a contact form using Amazon Web Services (AWS) Simple Notification Service (SNS)
lambros@lambrospetrou.com (Lambros Petrou) — Wed, 28 Oct 2015 00:00:00 GMT
A common feature among most websites is a way to contact the owner for any comments or feedback (or complaints). In previous versions of my website and blog I used several techniques to provide a contact me form. In almost all the cases I had to implement the actual email submission on the website’s hosting server or in some cases used a 3rd party service that provided me this functionality by querying an HTTP endpoint they gave me with the actual message and they sent the email for me, to me.
Today I decided to roll out my own contact-form service using AWS. As it turns out, it is amazingly easy to setup and it only takes you 15 minutes from the moment you sign up for the free-tier account to the point where you have an active form sending you emails. Awesome time we live in!
Assumptions

You have an active AWS account (if not do yourself a favor and register for the 1-year free tier at AWS).
You have a website somewhere or you know how to use Simple Storage Service (S3) from AWS to host your website files.

Services
In order to implement the contact form we are gong to use two services from AWS.

Simple Notification Service (SNS) (SNS service)
 Amazon SNS is a fast, flexible, fully managed pub-sub messaging service. Use it as a cloud-based mobile app notification service to send push notifications, email, and SMS messages; or as an enterprise-messaging infrastructure. (description as per AWS documentation)

Amazon Cognito (Cognito service)
 Amazon Cognito gives you unique identifiers for your end users and then lets you securely store and sync user app data in the AWS Cloud across multiple devices and OS platforms. You can do this with just a few lines of code, and your app can work the same, regardless of whether a user’s devices are online or offline. When new data is available in the sync store, a user’s devices can be alerted by a silent push notification so that your app can sync the new data automatically. (description as per AWS documentation)


Procedure
The whole procedure is very simple. We will create an SNS topic (imagine it as an endpoint) where the website will send the message of the contact form. As soon as SNS receives the message it will automatically send it over to our email, which we will have configured through the AWS console to be a subscriber of that specific topic. Amazon Cognito is used just as a way to get temporary credentials for the website in order to be able to publish the message to the SNS topic.
I will explain the implementation in three steps, a) Amazon Cognito, b) SNS and c) Securing our SNS topic to allow only publishings from our website domain.
Amazon Cognito
We use Cognito just for the sake of getting temporary credentials for our website in order to be able to use any AWS service through the Javascript SDK.

Login to the AWS console and open the Amazon Cognito service (direct link)

Select Create new identity pool
 

Type in a name for your pool (i.e. Website_Contact_Form) and enable Unauthenticated identities.

You should have something along the lines of the following picture.
 

Click Create Pool and then on the next page you get the message “Your Cognito identities require access to your resources”. Just click Allow (bottom right) to proceed.

You should be in the Edit identity pool page now where you see the information you provided above. From the left side menu choose Sample code and in the Platform dropdown select Javascript.

Download the AWS SDK since you will use it on your website later.

You can use the code provided to get temporary credentials for the newly created identity pool. In my case the code is similar to what is shown below.
// Initialize the Amazon Cognito credentials provider
AWS.config.region = 'eu-west-1'; // Region
AWS.config.credentials = new AWS.CognitoIdentityCredentials({
    IdentityPoolId: 'eu-west-1:e4c24108-5050-42f8-ac0b-761c46aa947f',
});



That’s it for the Cognito service!
Simple Notification Service (SNS)
SNS is a bit longer procedure but nothing complex. The philosophy behind SNS is that you have a topic (endpoint) with some publishers and some subscribers. When a publisher posts something to that topic then all the subscribers are notified automatically by the service with the newly published message. I guess by now you should have figured out that the subscriber to our SNS topic will be our email and the publisher of the topic will be our website.
Let’s dive in!

Login to the AWS console and open the SNS service (direct link)
 

Select Create Topic and then just fill out the Topic name with whatever you want (i.e. com-website-contact-form). You should now see the Topic Details page for your new topic (similar to the picture below).
 

Click on the dropdown Other topic actions and select Edit topic policy.

From the Edit topic policy screen make sure you are in the Basic View and modify the options as per the picture below and click Update. Basically we allow everyone to become publisher of the topic but only we (the owner) can subscribe to it.
 

Now you successfully have a topic that anyone can publish messages to.

Time to add our email as a subscriber to the topic. From the Topic Details page, click on the Create Subscription button.

Make sure to change the Protocol to Email and fill out your email on the corresponding box and click Create Subscription. An email will be sent to your email in order for you to confirm that you would like to subscribe to this topic.
 


Perfect, we are almost there!
Code
Here I will provide the very minimal code required to publish a message from your website. I will not include any styling or validation or anything since the scope of the article is just to connect the different AWS services together.
Below you can find the whole HTML and Javascript code used to send a message to your email from the website.


    
    


    
        
        
    



I deployed locally the above code and as you can see from the screenshot below it works perfectly! Good for us!

Secure our SNS topic
The only issue we have right now is that anyone can publish messages to the SNS topic. Therefore we could receive a lot of spamming if someone inspected our Javascript code and extracted the endpoints for the SNS topic and Cognito identity pool.
The solution to this problem is to add a policy rule in our SNS topic to only allow our website to act as a publisher, which is trivial.

Login to the SNS console (direct link)

From the left side menu select Topics

From the list of topics select the one we created before, then click at Actions and from the dropdown select Edit topic policy.
 

Now instead of the Basic View you should go to the Advanced View

In Advanced View you can directly edit the topic policy rules in detail. We want to restrict access for publishing only to our domain. Make sure that the second statement of the rules is as the following excerpt.
{
  "Sid": "__console_pub_0",
  "Effect": "Allow",
  "Principal": {
    "AWS": "*"
  },
  "Action": "SNS:Publish",
  "Resource": "arn:aws:sns:eu-west-1:717437904155:com-website-contact-form",
  "Condition": {
    "StringLike": {
      "aws:Referer": [
        "https://www.lambrospetrou.com/*",
        "https://lambrospetrou.com/*"
      ]
    }
  }
}

 The above configuration adds the Condition to the existing rule which is what is restricting access to the Publish permissions of the topic only to our domain. Make sure that you use your own Resource arn url which can be found in the topics details page and also your own domains.

Done. Your website is the only one with rights to publish messages into your SNS topic (except of course yourself through the console). Please be advised though that it is not difficult for programmers to create TCP packets with the proper headers to fool this policy. At least we make it a bit harder for the usual user :)


Conclusion
In conclusion, we have seen how easy it is with Simple Notification Service (SNS) and Amazon Cognito services to create our own mailing system. You can extend this configuration to send SMS, Push notifications to your devices or anything else you can imagine just by playing with the different SNS subscription types.
There are many features built into SNS that you can use in order to make your messages more advanced and more powerful. You can read all about them in the AWS SNS documentation.
If you spot any errors in the tutorial or your experience is different from what is described here please don’t hesitate to contact me.
Disclaimer
The details used above for the SNS topic and Cognito identity pool were deleted after completing this tutorial.



After-school effect
lambros@lambrospetrou.com (Lambros Petrou) — Sun, 25 Oct 2015 00:00:00 GMT
Time passes and goes at amazing speed. Finally, after twenty-five years, school has reached its end. I still don’t know if this is bad or good, since I am now thrown into the wild jungle known as life.
Anyway, enough with the philosophical talk. Now that I graduated from university I have some time to revisit some of my old projects and refactor them. As always, when I want to try something new I re-write my website. During last week I completely re-wrote my website and blog from scratch, lambrospetrou.com.
As I described in my last article, dated in 2014-July-20, I am in love with Go aka #golang so it was a natural choice to use it again. I decided to go away from the dynamic nature of a blog and revert to a static blog with dynamic content ;) Well, let’s just say that whenever I create write an article I regenerate the website. Static Site Generators have seen a lot of popularity increase in the last couple of months, maybe because they are easy to use or because the site is easier to deploy than messing around with Wordpress, Joomla, plugins and all the sh**t brought with them. Additionally, as a Software Engineer I am always fond of my own creations. You can find the generator for this blog at Github: Micro-blog generator repository. I have to warn you that it is not exactly designed for generic usage. However, if you like its simplicity and want to try it just check out the source code of this blog at Lambros Petrou blog source code and replace it with your own content.
Apart from my website and blog, another important project I had in pending state was Spi.To, my own custom URL shortener and text share tool. During last year, I was experimenting with Polymer and Paper elements to give it a material design makeover. However, I found it to be pretty unstable (it was still in development stages so it might have fixed its issues with its current stable version) and very heavy for the simple and minimalistic product I wanted. I ported most of the stuff I needed into native Dart and now I find it to be much more responsive.
I plan to add more features to Spi.to during the coming months and I really want to replace the back-end with a server-less infrastructure at Amazon Web Services (AWS). AWS is among my top priorities in the To-Learn list for this year and I would love to create more projects utilizing the myriad services available by Amazon, especially the new API Gateway, Lambda functions and of course the awesome Simple Storage Service (S3).
One thing I wanted to do for a long time was to find a way to edit my CV (resume) easily without booting into Windows just to use Microsoft Word and without searching for a Latex template that would most probably not match the design I wanted. I decided to turn to the easiest platform for customizations and easy updates, the web. I replicated the design of my old CV using only HTML and CSS, leading to a pixel-perfect document which is amazingly easy to update and to customize in-detail. You can find my new CV version at lambrospetrou.com/cv/.
In conclusion, I would like to take advantage of my time working at Amazon and be certified as an AWS Certified Professional, which will boost my knowledge and skills around cloud computing and at the same time open many more doors career-wise. As it seems AWS is the dominant cloud computing provider and it has no plans to cease existing for the foreseeable future.
It is nice to be in production mode again :)



Micro-blog service written in Go as a personal blog
lambros@lambrospetrou.com (Lambros Petrou) — Sun, 20 Jul 2014 00:00:00 GMT
I have been a user of Wordpress since 2006 in order to provide my own blog, latest version found at MasterGenius.NET. I invested a lot of time in it learning PHP, web development (HTML, CSS, Javascript and the likes) and I thank it since it made my way into the web as a developer. There have been many times I wanted to switch out from it and make my own micro-blog service but due to time constraints and other things popping up I always halted the process.
Recently, I started using Go language as my primary tool for my latest projects and I am really amazed by how much you can create in a small amount of time and at the same time having a substantial performance. All other languages that tried to do something similar failed either in performance or in coding-style, at least for my taste. Go is the sweet spot ;)
I wanted a project in order to learn Go further than reading and doing some tutorials so the micro-blog service project kicked-off. This blog is the result of those 3-4 days and I am going to use it as a more personal blog with thoughts and opinions on variety of topics and also post some useful tutorials (without going too much out-of-topic though like my previous blog).
This blog is written entirely in pure Go #golang language without any 3rd party library (i.e gorilla/mux or martini) that many Go articles suggested and I have to admit that I am very impressed. The core standard library is very sufficient and provides you all the tools to create anything you want easily. As a back-end I am using Couchbase server 2.x which is one of the best NoSQL databases I have tried so far, and speedy-demon too. As for the content of the articles I made a very minimalistic and simple panel where I can write them in Markdown format. I have chosen Markdown for the articles because it lets me write in plain text without any HTML tags or CSS classes getting involved, thus I can concentrate on the content. And it is good to know since it is being used in many other services too, like Github. The Markdown content is compiled in HTML when posted by the Go back-end.
The old blog is still live, at least for now, and I will try to find some noteworthy tutorials I wrote back in the days and post them here too, like I already did with some of them but I will keep this blog cleaner and more personal.
I hope you like this new blog and if you have any suggestions don’t hesitate to contact me ;)



Wordpress permissions guide
lambros@lambrospetrou.com (Lambros Petrou) — Mon, 18 Nov 2013 00:00:00 GMT
If you have ever installed wordpress on your own server, at some time you faced permissions problems. Especially when you are migrating a wordpress blog from one host to another and there is file copying involvement through ftp, scp etc.
If not, lucky you! :)
Scenario



Description
Value



username
lpuser


apache2 groupname
www-data


wordpress dir
/home/lpuser/wordpress/


Let’s say you are lpuser and your server that runs the wordpress blog belongs to the www-data permissions group.
Wordpress Permissions
Everything should have the 0755 permission for the user, apart from the .htaccess files and the wp-admin/index.php file that need to have 0644.
Let’s do this:
sudo chmod -R 0755 /home/lpuser/wordpress/
sudo chmod 0644 /home/lpuser/wordpress/.htaccess
sudo chmod 0644 /home/lpuser/wordpress/wp-admin/index.php

Now if you execute the following command you should verify that everything has rwxr-xr-x, and the two other files rw-r--r--.
If you try now to make changes to your theme through the wordpress editor then you won’t be able to save since the server does not have write permissions.
We can find that apache runs under the user www-data and the same group, as seen by the following commands:
ps aux | grep apache
groups www-data

So an easy solution to fix the problems is to add ourselves to the group www-data and then make the files to be owned by us, and the new group we belong.
This can be achieved with the following commands:
sudo usermod -a -G www-data lpuser
sudo chown -R lpuser:www-data /home/lpuser/wordpress/wp-content/themes
chmod -R 0775 /home/lpuser/wordpress/wp-content/themes

The last command allows the group to read/write/execute on the themes folder only.
If you have another way don’t hesitate to share it. The scenario is that we have a username different than the server’s that runs the wordpress blog.



Share your Dropbox folder between Windows and Ubuntu on dual-boot system
lambros@lambrospetrou.com (Lambros Petrou) — Thu, 16 Aug 2012 00:00:00 GMT
Many users are dual booting Windows and Ubuntu for numerous reasons today. Well a problem that may arise is that if you have Dropbox, you should have 2 folders of your Dropbox files, 1 for Windows and 1 for Ubuntu.
Following the guide below you can consolidate them into one and reclaim your wasted space back.
NOTE: I am assuming that the Windows partition is mounted at startup when you boot into Ubuntu.
Guidelines

Set up your Dropbox folder inside Windows first and verify that it is syncing your files without problems.

Boot into Ubuntu and stop Dropbox from syncing.
pkill dropbox


Delete your Dropbox folder from Ubuntu (the command below assumes it’s located in your home directory).
rm -rf /home/_username_/Dropbox/


Create a symlink to your Windows Dropbox folder (this will create a symbolic link  in your home directory that links to your Dropbox folder in Windows).
ln -s /media/WindowsOS/User/_username_/Dropbox /home/_username_/Dropbox

 /media/WindowsOS/User/_username_/Dropbox
 the path to the Dropbox folder in your Windows partition mounted in Ubuntu  
 /home/_username_/Dropbox
 the default folder for Dropbox to synchronize

Reboot and you’ re finished. Now you maintain only one copy of your Dropbox folder which needs to be synced only once, no matter which OS is booted at any time.


That’s it, I hope you won’t find any problems. If you do though, contact me :)



Install JDK 7 (Java Development Toolkit) on Ubuntu 12.04 LTS
lambros@lambrospetrou.com (Lambros Petrou) — Fri, 27 Jul 2012 00:00:00 GMT
Since version 7 of Java Ubuntu no longer has in its packages the required files for the Oracle JDK 7 due to new license terms by Oracle. The default option for Ubuntu is OpenJDK but if you really want the standard Java from Oracle (former Sun) then you have no easy way to do it on your own.
Therefore I prepared a simple script that will install the Java 7 development toolkit on your system with no effort at all.

Download the correct package from Oracle website: JavaSE Downloads.
Choose the JDK version and afterwards Linux x86 or Linux x64 depending on your system. It must be the file with the .tar.gz extension.

Download my script from here: Oracle-JDK-Installer Repository
Click the button ZIP and after downloading it extract it anywhere, and remember the path.

Put the package you downloaded at step 1 in the same directory as the jdk-installer.sh you downloaded (step 2).

Type
```bash
sudo ./jdk-installer.sh PATH_TO_PACKAGE SYSTEM_ARCHITECTURE
```

You can read the instructions at my github repository (step 2) for more information.

Verify your installation:
java -version

You should get a message mentioning the version you downloaded.

Visit Verify Java Version and click Verify Java version in order to check that the browser plugin works.


That’s it folks!



Control the fan speed of you Thinkpad T430 running Ubuntu 12.04 LTS
lambros@lambrospetrou.com (Lambros Petrou) — Fri, 20 Jul 2012 00:00:00 GMT
After researching for a few hours online to find a guide on how to control my Thinkpad’s fan speed I realized that the new models have some differences from previous models and the guides available are not complete if not wrong. So, I am making this tutorial for anyone that has a new Thinkpad (x30/x20 models) and needs to control his fan in order to keep the noise down and get more battery life.
Every step below uses the terminal so open an instance with the combination CRTL + ALT + T

The first thing we will do is to install a program that will provide us information about the sensors of the laptop and their temperatures
sudo apt-get install lm-sensors

 Configure the application in order to find every available sensor
sudo sensors-detect

 Answer Yes to every question and the last confirmation for saving the changes made.

Install thinkfan which is our main program:
sudo apt-get install thinkfan


Add the coretemp module to the startup list. It will provide us the temperature inputs.
echo coretemp >> /etc/modules


Load the coretemp module.
sudo modprobe coretemp


The next step is to find your temperature inputs so take note the results of the following command.
sudo find /sys/devices -type f -name "temp*_input"

 If you don’t get any outputs ( similar to the next step ) please Reboot and continue from this step.

It’s time to edit our thinkfan configuration
sudo vim /etc/thinkfan.conf

 Go to the line where it says #sensor /proc/acpi/ibm/thermal … and below that line (which should be commented since thermal is not supported in the new thinkpads) insert something like the following:
sensor /sys/devices/platform/coretemp.0/temp1_input
sensor /sys/devices/platform/coretemp.0/temp2_input
sensor /sys/devices/platform/coretemp.0/temp3_input
sensor /sys/devices/virtual/hwmon/hwmon0/temp1_input

 The above lines are the results from Step 5 prefixed with ‘sensor ‘.

Time to set the temperature rules. The format is: FAN_LEVEL, LOW_TEMP, HIGH_TEMP meaning that each FAN_LEVEL will start when the highest temperature reported by all the sensors meets its LOW_TEMP and if it surpasses its HIGH_TEMP it will go to the next FAN_LEVEL rule. If it goes below the LOW_TEMP it will fallback to the previous FAN_LEVEL rule.
 Please take notice that the HIGH_TEMP of a rule must be between the LOW_TEMP & HIGH_TEMP of the rule that follows.
My settings are:
#(FAN_LEVEL, LOW, HIGH)
(0,    0,    60)
(1,    57,    63)
(2,    60,    66)
(3,    64,    68)
(4,    66,    72)
(5,    70,    74)
(7,    72,    32767)

 NOTE: I am not responsible for any problems you encounter with these rules. They are fine as per my configuration so please test them before using them and if necessary adjust them to your needs.

Now, we must add a configuration file into the modprobe.d
sudo echo "options thinkpad_acpi fan_control=1" >> /etc/modprobe.d/thinkpad.conf


If you want to start thinkfan automatically at boot-time please type the following:
sudo vim /etc/default/thinkfan

 Change the line START=no to START=yes. If the line does not exist, add it yourself.

RESTART your laptop and everything should work as expected. Test your laptop’s temperatures (using sensors command) under different workloads and verify that the fan speed is as per the rules you defined.


If you encounter a typing mistake or a step not working for you please contact me.



nVidia Optimus Support in Ubuntu 12.04 LTS - How to get the most of your battery life guide
lambros@lambrospetrou.com (Lambros Petrou) — Fri, 20 Jul 2012 00:00:00 GMT
Anyone that purchased a laptop with nVidia Optimus is facing problems with Linux support. In a few words, Optimus is not supported officially ( the best project so far is Bumblebee ). As a result the power draw is more than double compared to integrated card-only systems because both cards are enabled and loaded but only one is in use, the Integrated one.
Bumblebee is a project that tries to bring Optimus support in Ubuntu and so far it had good results but the battery impact was inevitable.
The latest update though changed that and now linux fans can have their Optimus configuration without the enormous battery draw. Follow the steps below to install bumblebee but please note that I am assuming you removed any nVidia drivers you may installed on your own.

We must add the repositories for the bumblebee packages so please type the following in your terminal CRTL + ALT + T
sudo add-apt-repository ppa:ubuntu-x-swat/x-updates
sudo add-apt-repository ppa:bumblebee/stable
sudo apt-get update


Now install the bumblebee:
sudo apt-get install bumblebee bumblebee-nvidia

  NOTE: If you do not want the discrete card to be enabled at all please use the following command instead:
sudo apt-get install --no-install-recommends bumblebee

  This will not install the drivers for your discrete card  but if you want in the future to have it just enter the following and get it installed:
sudo apt-get install bumblebee-nvidia


Some people recommend to reboot now although I saw the results without doing so.

If you installed the nVidia driver too, in order to launch a program and use the discrete card you must use the following command:
optirun _appnamehere_ &



In order to get support for 32-bit applications with nVidia you should do the following:
sudo apt-get install virtualgl-libs-ia32

I hope that you will find this guide helpful. I got amazing results with it myself:
Power draw before: 26-28w
Power draw after: 11-13w
Bumblebee Project Website



Change your Computer Name (hostname) inside your network - Ubuntu
lambros@lambrospetrou.com (Lambros Petrou) — Sun, 11 Sep 2011 00:00:00 GMT
Your hostname or otherwise Computer Name is the name that your computer uses inside the network it is connected to in order to distinguish itself. Usually you setup your own during the setup of the Operating System but if for any reason you want to change it check out the guide below.

Open a terminal CRTL + ALT + T

Execute:
sudo vim /etc/hostname


Change it to whatever you want your name to be, Save and Close the file

Now we have to update our hosts file in order to route ourselves.
sudo vim /etc/hosts


Find the line where it says: 127.0.1.1 usernameHere and change it to what you entered above and then Save and Close the file.


Enjoy your new Computer Name :)

Description	Value
username	lpuser
apache2 groupname	www-data
wordpress dir	/home/lpuser/wordpress/

Lambros Petrou RSS Feed

Cloudflare Durable Objects are Virtual Objects

Large Language Model (LLM) prompting guides

DevConnect Series talk - Cloudflare Developer Platform Overview

How to do encryption and envelope encryption with KMS in Go

How to detect website text content changes with Skybear.NET

Latest vCard Update

element that has a child with the expected ID://h3[.//*[@id='latest-vcard-update']]

element from step 1://h3[.//*[@id='latest-vcard-update']]/following-sibling::ul[1] We will normalize the text content of the element and its children to remove leading and trailing whitespace simplifying our assertion (see normalize-space() docs):normalize-space(string( ... ))

Testing the Cloudflare D1 REST API with Hurl

Control and data plane architectural pattern for Durable Objects - Cloudflare Reference Architecture Diagram

Love letter to Hurl

Deploy your applications on a server with zero downtime

Building a global TiddlyWiki hosting platform with Cloudflare Durable Objects and Workers — Tiddlyflare

Skybear.NET Scripts secret variables, HTTP triggers, and replacing AWS Lambda with Fly.io - Changelog 2024-10-15

Building a feature flags service with Cloudflare Workers and Durable Objects

Durable Objects (DO) — Unlimited single-threaded servers spread across the world

Brag document and folder — feels good only

Skybear.NET Scripts landing page and Business plan - Changelog 2024-07-21

Skybear.NET Scripts response bodies and cron triggers - Changelog 2024-05-28

Ownership - High agency - Manager of One

SaaS Pricing - What I want vs What I offer

Skybear.NET Scripts private user accounts - Changelog 2024-02-18

element that has a child with the expected ID:
`//h3[.//*[@id='latest-vcard-update']]`

element from step 1:
`//h3[.//*[@id='latest-vcard-update']]/following-sibling::ul[1]`

We will normalize the text content of the
element and its children to remove leading and trailing whitespace simplifying our assertion (see `normalize-space()` docs):
`normalize-space(string( ... ))`