Studying Diffuse - A Cloud Music App

#study-session

2023-10-29

This method was inspired by and adapted from an analysis method that has nothing to do with programming, as we can neither hear nor feel our code. Instead, I applied it to what I consider the six pillars of software engineering and programming: Environment, Architecture, Framework or Libraries, Abstractions, Algorithms and Syntax.
This method is interesting because it allows for a structured, documented thought process when analysing other projects and cherry-picking new insights. It also focuses on exactly one observation per category, making for relatively short sessions.

Source Code Study of

Diffuse from GitHub

Why I chose this?

First of all, whenever I learn a new language I want to learn from an example. I also like the example to be somewhat related to multimedia or creative domains, in this case music streaming. It also showcases an interesting architecture with a clear frontend and backend, written in Elm, which is the language in question. I want to learn Elm as a purely functional alternative to JavaScript for the frontend, so this looks like the perfect fit. Since my goal is to learn about Elm overall, I will keep an open mind towards the full project.


Environment

The environment features multiple programming languages: JavaScript, Elm and Haskell,
each with their own dependencies and package managers: npm, cabal, stack.
On top of that comes a system that is still unfamiliar to me, although I have come across it before: Nix.

What caught my attention?

I was wondering how everything played together.

What is going on technically?

Nix is a reproducible build system. Having a look at the shell script located in nix/shell, we can see that it defines entire toolchains and languages, or rather language package managers. The building itself is done through Just, which looks very interesting: it has a minimal definition for the different build tasks, which can also be expressed natively in each language. Very cool. Nix is additionally run inside a Docker container, possibly making use of NixOS?

What is the effect? (What)

The effect is a purely functional and compact system that builds and manages your dependencies automatically.

What else supports this effect? (And?)

The interaction with Just makes this very powerful. First, Nix downloads the packages and sets up the dev environment. Then it runs Just, which in turn supports multiple build instructions to build the different parts of the system and run additional tasks.

How can I use this?

Get Nix and use it in multi-language projects. However, on Windows it is only supported through WSL. Just, on the other hand, could be an interesting alternative to Gulp, especially in polyglot projects.


Architecture

At the top level, the Elm source code is divided into Applications and Library. Applications in turn consists of Brain and UI. I really appreciate the clear separation.

What caught my attention?

The module "Brain": could it be a backend implemented in Elm (a frontend language)? Is that possible?

What is going on technically?

I was right, Brain seems to be a backend worker. This architecture reminds me of Electron. It uses Platform.worker as its entry point and has no "view" (obviously). It uses the Alien abstraction to communicate with the UI application. Brain seems to run in the background, taking care of heavy asynchronous tasks like downloading and connecting to service providers.
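
To make the worker idea concrete for myself, here is a minimal sketch of a headless Elm program built on Platform.worker. It is not taken from Diffuse: the module name, the port names toWorker and fromWorker, and the echo behaviour are assumptions for illustration only.

```elm
port module WorkerSketch exposing (main)

import Json.Encode as Encode


-- Hypothetical ports: messages arriving from the outside world (for example the UI).
port toWorker : (Encode.Value -> msg) -> Sub msg


-- Hypothetical ports: messages sent back to the outside world.
port fromWorker : Encode.Value -> Cmd msg


type alias Model =
    { handled : Int }


type Msg
    = Received Encode.Value


main : Program () Model Msg
main =
    -- Platform.worker takes init, update and subscriptions, but no view.
    Platform.worker
        { init = \_ -> ( { handled = 0 }, Cmd.none )
        , update = update
        , subscriptions = \_ -> toWorker Received
        }


update : Msg -> Model -> ( Model, Cmd Msg )
update msg model =
    case msg of
        Received value ->
            -- Echo the payload back; a real worker would do its heavy lifting here.
            ( { model | handled = model.handled + 1 }, fromWorker value )
```

The JavaScript side would still have to start this program and wire up the two ports, for example inside a web worker.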

What is the effect?

Like in Electron, you have a dedicated thread to off-load heavy operations without blocking the frontend and making the UI feel sluggish.

What else supports this effect?

As already mentioned, the Alien abstraction seems to be an important bridge between the Brain and the UI.

How can I use this?

It is good to know that I can have workers in Elm to implement heavy operations, and that I am not limited to a single render thread.


Library or Framework

The few Haskell modules found in "Build" apparently make use of something called "Shikensu" for building.

What caught my attention?

While Elm is inspired by Haskell, I am still very surprised to see Haskell in the project, used here for a library with an interesting name.

What is going on technically?

Shikensu is used to run a sequence of functions on in-memory representations of files.

What is the effect?

It is somehow used to write some metadata during the build process. My best guess from a quick glance is that the built files are modified by Shikensu after they have been written, without having to really open them. It is also aware of the file tree, which is processed before writing.

What else supports this effect?

It works by invoking stack build in the @system definition of the Justfile, first building and then executing. This definition is called in every build task and system-related task. It also seems to be related to the licensing step that follows. Maybe licensing is injected into each file?

How can I use this?

Generally speaking, I like this very colorful ecosystem of different languages and build steps, all combined neatly into one task runner without having to stick to a common interface, like streams in GulpJS. It seems elegant and very scalable, yet flexible. Haskell as a pipeline tool seems very intriguing!


Abstraction

The Alien abstraction in the Elm Library

What caught my attention?

Its unorthodox name, which nevertheless expresses a concept concisely: interfacing with external systems, or communicating between alien worlds.

What is going on technically?

It is a module responsible for the communication between workers, managing incoming and outgoing data. The type Event is defined with a tag (a string), data (a JSON-encoded value) and an error (a Maybe String). More importantly, it defines the sum type Tag as a collection of messages from and to the UI, plus more general events like SecretKey? The author then creates an enum by binding Tags to capitalized string messages.

Then the three functions broadcast, report and trigger are defined to, respectively, communicate a JSON value, report an error, or just send a tag. The functions tagDecoder, tagToJson and tagToString are just abstract wrappers around the respective enum implementations. A more interesting function is hostDecoder, which allows for nested decoding and seems to have a somewhat more involved implementation.
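
To check my understanding, here is a minimal sketch of how such a module could look, based only on the description above. The tags LoadTracks and SaveTracks, the capitalized strings and the exact function signatures are my assumptions and not Diffuse's actual code. Only the Event fields, the SecretKey tag and the function names broadcast, report, trigger and tagToString come from what I observed.

```elm
module AlienSketch exposing (Event, Tag(..), broadcast, report, tagFromString, tagToString, trigger)

import Json.Encode as Encode


type alias Event =
    { tag : String
    , data : Encode.Value
    , error : Maybe String
    }


-- Hypothetical tags; the real module lists many more.
type Tag
    = LoadTracks
    | SaveTracks
    | SecretKey


-- Bind every tag to a capitalized string (the strings are assumed).
tagToString : Tag -> String
tagToString tag =
    case tag of
        LoadTracks ->
            "LOAD_TRACKS"

        SaveTracks ->
            "SAVE_TRACKS"

        SecretKey ->
            "SECRET_KEY"


-- The reverse direction; unknown strings yield Nothing.
tagFromString : String -> Maybe Tag
tagFromString string =
    case string of
        "LOAD_TRACKS" ->
            Just LoadTracks

        "SAVE_TRACKS" ->
            Just SaveTracks

        "SECRET_KEY" ->
            Just SecretKey

        _ ->
            Nothing


-- Communicate a JSON value under a given tag.
broadcast : Tag -> Encode.Value -> Event
broadcast tag data =
    { tag = tagToString tag, data = data, error = Nothing }


-- Report an error under a given tag.
report : Tag -> String -> Event
report tag error =
    { tag = tagToString tag, data = Encode.null, error = Just error }


-- Send just the tag, without a payload.
trigger : Tag -> Event
trigger tag =
    broadcast tag Encode.null
```

Because tagToString matches on every constructor, the compiler complains as soon as a new Tag is added without a string, which is what makes the module a single source of truth.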

What is the effect?

Its main job seems to be establishing a fixed communication protocol: one central source of truth for how the "processes" exchange information, complemented by a protocol for triggering and reporting messages.

What else supports this effect?

A quick search for Alien across the source code reveals that it is used throughout the main Brain and UI modules, so it is very hard to break down the full context. But one clear pattern is that the possible Tag values are exported explicitly using (..) and are used for pattern matching inside the modules consuming them, defining a response for each. The function for this is called translateAlienData. In the code where Alien events are actually produced, it is used in the context of communicating change, as I assumed.

How can I use this?

I feel that centralizing the possible message types in one module is a good idea for having a uniform messaging protocol. As a matter of fact, I have been doing this in Electron, where I set up a custom queue running on an external utilityProcess, which in turn communicates with a GraphRunner inside of it. I enforce uniform message names with hardcoded string unions in TypeScript between the main Manager singleton on the main process, the queue in its secondary process and the runner. This is the closest to a sum type I can get.


Algorithm

match from Playlists.matching

match : Playlist -> List IdentifiedTrack -> ( List IdentifiedTrack, List IdentifiedPlaylistTrack )

match playlist =
    List.foldl
        (\( i, t ) ( identifiedTracks, remainingPlaylistTracks ) ->
            let
                imaginaryPlaylistTrack =
                    { album = t.tags.album
                    , artist = t.tags.artist
                    , title = t.tags.title
                    }
                ( matches, remainingPlaylistTracksWithoutMatches ) =
                    List.foldl
                        (\( pi, pt ) ->
                            if imaginaryPlaylistTrack == pt then
                                Tuple.mapBoth
                                    ((::) ( playlistTrackIdentifiers i pi, t ))
                                    identity
                            else
                                Tuple.mapBoth
                                    identity
                                    ((::) ( pi, pt ))
                        )
                        ( [], [] )
                        remainingPlaylistTracks
            in
            ( identifiedTracks ++ matches
            , remainingPlaylistTracksWithoutMatches
            )
        )
        ( []
        , List.indexedMap (\idx -> Tuple.pair { index = idx }) playlist.tracks
        )

What caught my attention?

I was browsing the Library with all its smaller sub-domains, and from all I could see this seemed to be one of the most involved implementations. Since it is part of the Playlist module/domain, I can assume it is an important algorithm for this app.

What is going on technically?

I am in a bit of a dilemma here. Since this is an unfamiliar language whose syntactical details I don't know, it is hard for me to break it down comprehensively. I think what I want to achieve during this session is to understand how you express an algorithm in Elm, more than to fully understand what every part does. Also, I know some Haskell, so that can help.
The signature tells me that the function takes a Playlist and a List of IdentifiedTrack and returns a tuple that pairs a List of IdentifiedTrack with a List of IdentifiedPlaylistTrack.
Then, from Haskell, I can see that this is a partially point-free definition: the second argument is omitted, because the function returns a foldl that has been partially applied using playlist.

foldl : (a -> b -> b) -> b -> List a -> b

This is the type signature of foldl, which tells us that it takes as its first argument a function that, given an a and a b, returns an accumulated b. If you couldn't tell yet, this is Elm's counterpart to Array.reduce in JavaScript. Then we give it the initial value of b and finally apply it to a List of a. In this definition of match we return an incomplete foldl by constructing the function and the initial b using the Playlist, so that it can then be applied to any List of IdentifiedTrack.

The b part is a tuple of an empty list and the list of playlist tracks, each paired with its index.
The index is probably used inside the reduce function to keep track of positions.

First of all, in the parameters of the anonymous function both a and b are destructured as tuples, with i and t being part of a, and identifiedTracks and remainingPlaylistTracks being part of b.
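
To isolate the pattern for myself, here is a much smaller sketch that uses the same three ingredients: foldl, a tuple as accumulator that is destructured in the lambda, and a definition that leaves out the list argument. The function splitBy and the module name are hypothetical and have nothing to do with Diffuse.

```elm
module FoldSketch exposing (splitBy)


-- Split a list into the items that satisfy the predicate and the rest.
-- The list argument is omitted: splitBy returns a partially applied foldl.
splitBy : (a -> Bool) -> List a -> ( List a, List a )
splitBy isMatch =
    List.foldl
        (\item ( matches, rest ) ->
            if isMatch item then
                ( item :: matches, rest )

            else
                ( matches, item :: rest )
        )
        ( [], [] )
```

Calling splitBy (\n -> modBy 2 n == 0) [ 1, 2, 3, 4 ] gives ( [ 4, 2 ], [ 3, 1 ] ). Both lists come out reversed because each item is prepended with (::) while foldl walks the list from the left.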

What is the effect?

Using the fold function, you can express iterative operations that accumulate some state in a very succinct way.

What else supports this effect?

The partial application is also a neat touch. Destructuring is used throughout to reduce the mental effort.

How can I use this?

Think of functions in Elm as composable operators. When implementing a solution, be aware of the possibility that an existing algorithm (such as folding) can do the heavy lifting for you.


Syntax

The use of the |> operator (?)

What caught my attention?

In Haskell I have seen similar operators like the infamous >>= or <$>, while this one seems to be the only operator of that kind. Also, whenever I see this operator, the code is structured in a way that resembles imperative instructions, which might be closer to method chaining in JavaScript.

What is going on technically?

|> and its counterpart <| are aliases for function application, borrowed from F# (not Haskell!). In a series of function calls they define the order in which values flow from output to input. <| is the "conventional" functional way, in which you read from right to left, as you would with nested functions. |> is very close to how you would use method chains: left to right.
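
A small sketch with concrete functions from the List module, writing the same computation three ways. The module and value names are made up for illustration.

```elm
module PipeSketch exposing (backward, forward, nested)


-- Plain nesting: read from the inside out.
nested : Int
nested =
    List.sum (List.map (\n -> n * 2) (List.filter (\n -> n > 1) [ 1, 2, 3 ]))


-- Backward application: same order as nesting, without the parentheses.
backward : Int
backward =
    List.sum <| List.map (\n -> n * 2) <| List.filter (\n -> n > 1) [ 1, 2, 3 ]


-- Forward application: read top to bottom, like a method chain.
forward : Int
forward =
    [ 1, 2, 3 ]
        |> List.filter (\n -> n > 1)
        |> List.map (\n -> n * 2)
        |> List.sum
```

All three evaluate to 10: the filter keeps [ 2, 3 ], the map doubles them to [ 4, 6 ], and the sum adds them up.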

What is the effect?

The effect is that you can express a series of transformations or actions as a list instead of a nested composition. Instead of having to write

fourthFunction (thirdFunction (secondFunction (firstFunction a)))

you can write

a
    |> firstFunction
    |> secondFunction
    |> thirdFunction
    |> fourthFunction

This increases readability and allows you to reason in a sequential yet side-effect-free way!

What else supports this effect?

Technically you can write the operator inline, like a |> firstFunction |> secondFunction. However, the way the language is tokenized allows additional whitespace, so these structures can be spread over separate lines. A similar layout is used for pattern matching, case expressions, and the definition of sum types and list members.

How can I use this?

Syntax does not exclusively present you with operational keywords or new concepts. Sometimes languages offer operators that allow you to reframe an expression. Use these to structure code in such a way that it reflects your thought process.