Haupz Blog

... still a totally disordered mix

Mini Lambda

2022-11-04 — Michael Haupt

Justine Tunney has built a plain lambda calculus interpreter that compiles down to just 383 (as of this writing) bytes of binary (on x86_64). That's Turing completeness in just under 400 bytes of machine code, and by default "instant awesome". The documentation is extensive, contains lots of examples, and such gems as a compiler from a symbolic representation of lambda calculus to the interpreter's binary input format using sed. If you have any interest in minimal abstraction, do yourself a favour and check this out please.

Tags: hacking, the-nerdy-bit

Mature Programming Language Ecosystems

2022-07-23 — Michael Haupt

I used to work a lot with, and on, Java, so I have a soft spot for that language and ecosystem. One specific point I've come to realise while dabbling with some tech and reading about log4j problems over the past months is that a rich standard library (like the one that's part of Java) can save you a lot of days. The following can easily be misunderstood as flamebait. Please don't read it that way.

The log4j misery could have been avoided: the Java standard library has had a built-in logging facility since JDK 1.4, and a capability for remote code execution simply isn't needed in a logging library.

Dependencies can be tricky. On Windows, there used to be DLL hell; today, we have npm dependencies that have a tendency to go really awry. Yes, Java has its issues too, when there are hard-to-resolve conflicts between dependencies managed by Maven, for instance. But back to npm. The JavaScript language is very small, and it and Node.js don't come with a very rich standard library. Consequently, many "standard" things end up being pulled in as dependencies through npm. Also, everybody (and then some) thinks it a good idea to release their particular solution to a recurring problem as an npm module.

The maze of dependencies, sometimes conflicting licences, and outdated or insecure code becomes ever harder to navigate, leading to yet more software being built to help developers and companies (who don't want to lose lots of money in licencing or software security lawsuits) to handle the complexity. That means there are businesses flourishing on the fallibilities of the ecosystem, rather than fixing those.

Sometimes modules are pulled "just like that" (because the developers can), and sometimes this happens for the worst reasons, e.g., because a developer cannot make a living from software they hand out for free after their apartment burned down. This points to a deeper problem with open-source software: it's taken for granted. And if a maintainer doesn't have a company behind them that helps with paying the bills, it's a precarious gratitude those developers are being shown.

Libraries and dependencies growing out of proportion is an issue that can be addressed by relying on an ecosystem that comes with a rich standard library to begin with. Java is at the heart of one such ecosystem, and it's being maintained and developed in a very sane and transparent process, by a very capable and mature community. Some big industry players are part of that community, and fund a lot of the work. I'm using Java as one example - there are others.

What's my point? There are several:

  • When choosing technology that's meant to run a business, erring on the side of true-and-tried ecosystems with rich standard libraries and robust buy-in is safer.

  • Where vivid open-source technology is used, consider funding it in addition to using it, to have a visible stake.

  • Technology should be chosen for the right reasons. Therefore, it doesn't need to be hip. It needs to work, reliably and sustainably.

  • Working with true-and-tried (some might say "old and boring") technology is no substitute for supporting research into new, innovative things that can be the true-and-tried ones ten years from now.

Tags: work, hacking


2022-04-01 — Michael Haupt

(Warning: maybe because it's 1 April, the post below contains a bit of irony. When you find it, feel free to keep it.)

You've seen them - congratulatory e-mails flooding your inbox, even if you're not on the receiving end of the celebration but merely a member of the cheering crowd. Thanks to "reply-all", they happen. I personally don't mind them much, but some people do take mild offence at being faced with the challenge of having to mark swathes of "congrats!" messages as read.

There's something to be said for both sides here. On the one hand, such a broadcast message, e.g., a promotion announcement, can be seen as primarily meant to notify the crowd of the news. Reply with heartfelt congrats, rejoice in the fact, cool, move on. On the other, the broadcast does have a social aspect to it in that it encourages the crowd to cheer, and the cheering gets amplified by itself.

Of course, there's an easy remedy. Instead of putting all recipients on CC, putting them on BCC and keeping just the intended recipient of the congratulations in the "To:" field will reduce the recipients of the "reply-all" flood to just the subject of the celebration and their manager (or whoever sends the message). The parties put on BCC can be mentioned in the message, for transparency.

I sense an actual research question in this: Will CC-reply-all flooding incentivise more people to congratulate the person, or will the ones who want to do this do it anyway? Is there an amplification effect in "reply-all"? If so, what does it amplify more, cheering, or grumbling? Does BCC-messaging have a contrary effect?

Who's in for doing an empirical study?

Tags: work, hacking, the-nerdy-bit

Contemporary CPU Architecture

2022-03-25 — Michael Haupt

Building a CPU (or other hardware device) emulator is a fun endeavour - I've built myself half a Z80 emulator in Smalltalk once, test-driven development and all.

However, writing comparatively low-level code in the 21st century is really old-school. You have to use contemporary technology for everything now. That includes building emulators.

Consequently, David Tyler has built an 8080 emulator complete with computer and CP/M operating system to run on the CPU. It being "today", he has of course applied what is en vogue. That means microservices (one for each opcode supported by the CPU, just in case you were wondering), Docker, and the like. It's super hilarious. In fact, it's a more than valid successor to this wonderful enterprise Java implementation of FizzBuzz.

Tags: hacking, the-nerdy-bit


2022-03-19 — Michael Haupt

By chance, I came across Hammerspoon a while ago. This is a pretty amazing tool for Mac automation. It hooks right into a (still growing) number of the macOS APIs and allows for fine-grained control. It also offers numerous event listeners that allow for reacting to things happening, e.g., signing on to a certain WiFi network, connecting a specific external monitor, battery charge dropping below a certain threshold, and so forth. It's also possible to define hotkeys for just about everything. Use Cmd-Option-Ctrl-B to open a URL copied to the clipboard in the tracking-quenching Brave Browser? You got it:

hs.hotkey.bind({"cmd", "alt", "ctrl"}, "B", function()
  -- open whatever URL is currently on the clipboard in Brave
  local url = hs.pasteboard.readString()
  hs.applescript('tell Application "Brave Browser" to open location "' .. url .. '"')
end)

The goodness is brought about by the Lua scripting language. Lua is very lightweight and yet powerful, and easy to learn. Oh, and Hammerspoon has some very good API documentation, too.

Tags: hacking

Writing Documentation

2022-02-19 — Michael Haupt

This article argues that there are mainly two reasons why developers don't (often, usually) (like to) write documentation. Firstly, writing is hard, and secondly, not documenting doesn't block shipping. There are also some remarks on the value of documentation, and advice on how to go about ensuring there is decent documentation after all. I don't disagree with any of that, but that's not the point. I would like to expand a bit on the "writing documentation is hard" topic.

Back when I was working on JEP 274, which eventually made it into Java 9 in the form of a chunk of public API in the OpenJDK standard library, I wrote a lot of API documentation. This was a necessity because other implementers of the Java API, such as IBM, were supposed to be 100 % compatible but were, for legal reasons, not allowed to look at the implementation behind the API. So, that API documentation had better be rather darn accurate.

The centerpiece of JEP 274, a method used to construct loops from method handles, has a complex but nifty abstraction of loops at its core. (I fondly recall the whiteboard session where several of the wonderful Java Platform Group colleagues at Oracle designed the "mother of all loops".) When I had completed the first version of this, I put it out there for review, and started collaborating on the project with an excellent colleague, named Anastasiya, from the TCK group, located in St Petersburg, Russia.

It was Anastasiya's job to ensure the API documentation and implementation were aligned. She took the API documentation - which I was quite fond of already - and wrote unit tests for literally every single comma in the documentation. Sure enough, things started breaking left and right. During our collaboration, I learned a lot about corner cases, off-by-one errors, and unclear language. In a nutshell, my implementation was full of such issues, and the API documentation I thought so apt was not exactly useless, but irritating in many ways.

Anastasiya and I worked very closely for several months, and eventually, the documentation was not only in line with what the implementation did, but also precise enough to be usable by our friends at IBM.

Point being? Writing good documentation is indeed very, very hard, and I couldn't have done this without Anastasiya's help. Without her, I'd have ended up shipping public Java API - code running on dozens if not hundreds of millions of devices worldwide - in a really bad state. I'm still very grateful.

Tags: hacking, work

Array Programming

2022-02-19 — Michael Haupt

There are multiple programming paradigms. Object-oriented and functional programming are the most popular ones these days, and logic programming has at least some niche popularity (ever used Prolog? it's interesting). Put simply, the paradigm denotes the primary means of abstraction in languages representing it. Classes and objects, (higher-order) functions, as well as facts and deduction are constituents of the three aforementioned paradigms, and no news to users thereof.

Also, there's array programming. Here, arrays are the key abstraction, and all operations apply to arrays in a transparent and elegant way. The paradigm is visible in many of the popular programming languages in the form of libraries, e.g., NumPy for Python.

Array programming also has languages representing it that put "the array" at the core and bake the array functionality right into the language semantics. "Baked in" here means that the loopy behaviour normally encountered when dealing with arrays isn't visible: there are operations that apply the loops transparently, and even support implicit parallelisation of operations where possible. One of the more widely known languages of this kind is R, which is rather popular in statistics.

All of array programming stems from a programming language coined in the 1960s named "A Programming Language" (not kidding), and abbreviated APL. This language is most wonderful.

APL's syntax is extremely terse - many complex operations are represented by just one single character. As a very simple example, consider two arrays a and b with the contents 1 2 3 and 4 5 6. Adding these pairwise is simple: a + b will yield 5 7 9. Look ma, no loops. As another example, taking the sum of the elements of an array is done by using the + operation and modifying it into a fold using /: +/a will yield 6.
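For readers without an APL interpreter at hand, the two examples above can be approximated in plain Python. This is only a sketch: the helper names `add` and `plus_reduce` are made up for illustration, and the loops that APL hides completely are merely tucked away inside `zip` and `reduce` here.

```python
from functools import reduce
from operator import add as plus


def add(a, b):
    """Pairwise addition, roughly what APL's a + b does for equal-length arrays."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return [x + y for x, y in zip(a, b)]


def plus_reduce(a):
    """Fold + over the array, roughly what APL's +/a does."""
    return reduce(plus, a, 0)


a = [1, 2, 3]
b = [4, 5, 6]
print(add(a, b))       # [5, 7, 9]
print(plus_reduce(a))  # 6
```

The point of the comparison is the amount of ceremony: two helper functions in Python versus one character each in APL.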

You see where this leads. APL code is very condensed and low on entropy. Many complex computations involving arrays and matrices can literally be expressed as one-liners. John Scholes produced a wonderful video developing a one-line implementation of Conway's Game of Life.

Indeed, there are numerous special characters in the APL syntax, and IBM used to produce dedicated keyboards and Selectric typeballs for the character set. To this day, it's still possible to buy APL keyboards. Here's mine. Look at all the squiggly characters!

After making my way through an APL book, I wanted to write a little something on my own, and ended up doing a BASE64 encoder/decoder pair of functions. They're both one-liners, of course. Admittedly, they're long lines, but still. I've even documented how they work. Looky here.

While you may wonder about the practicability of such a weird language, be assured that there's a small but strong job market. Insurance companies and banks are among the customers of companies like Dyalog, which build and maintain APL distributions. Dyalog APL is available for free for non-commercial use.

Tags: hacking

One-Letter Programming Languages

2021-12-14 — Michael Haupt

This page about one-letter programming languages is absolute gold. I hadn't realised what a lot of them there are out there.

I've been using some of them. C, of course. J, K, and Q are all APL descendants that I was using to improve my understanding of array programming when working on an implementation of R, which I've obviously also used.

Some of these languages are cranky, but "one letter" doesn't imply crankiness per se.

Tags: hacking

Code Pwnership

2021-11-08 — Michael Haupt

code pwnership: being in charge of a code base multiple clients depend on, having exclusive commit and release rights, and totally refusing to fulfil requests or consider pull requests.

Tags: the-nerdy-bit, work, hacking


2021-05-13 — Michael Haupt

Programmed in Forth ever you have?

If you chuckled at that, the answer is probably "yes".

Forth is the programming language that makes you code the way Yoda talks. Point being, you push values on a stack and eventually execute an operation, the result of which will replace the consumed elements on the stack. The arithmetic operation 3+4*5 could be expressed by saying 4 5 * 3 + (to really really honour evaluating multiplication before addition). It could equally well (and a bit more idiomatically) be expressed as 3 4 5 * +: 4 and 5 will be the topmost elements of the stack after 3 4 5, and * takes the two topmost elements. Either way, the result will be 23.
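To make the stack discipline concrete, here is a toy evaluator in Python. It is a sketch under heavy assumptions, not a Forth implementation: it only knows integers and three arithmetic words, and the names `WORDS` and `evaluate` are invented for this example.

```python
import operator

# Binary words supported by this toy evaluator (a tiny subset of Forth).
WORDS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
}


def evaluate(source):
    """Evaluate whitespace-separated integers and words; return the final stack."""
    stack = []
    for token in source.split():
        if token in WORDS:
            right = stack.pop()  # topmost element
            left = stack.pop()   # element below it
            # the result replaces the two consumed elements
            stack.append(WORDS[token](left, right))
        else:
            stack.append(int(token))
    return stack


print(evaluate("4 5 * 3 +"))  # [23]
print(evaluate("3 4 5 * +"))  # [23]
print(evaluate("3 4 5"))      # [3, 4, 5]
```

Both spellings of 3+4*5 leave a single 23 on the stack, and the last call shows the stack right before * gets to work on 4 and 5.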

This sounds appropriately nerdy, and you might wonder what the heck this is all about. The thing is, this simple convention makes for very efficient and compact implementations of the language, and Forth is still quite popular in embedded systems such as the Philae comet lander. If your computer runs on Open Firmware, you have a little Forth interpreter right at the bottom of your tech stack.

The Factor programming language is a direct descendant of Forth, and comes with a really powerful integrated development environment, as well as with a just-in-time compiler written in Factor itself. The IDE has executable documentation, making it similar in explorable power to Smalltalk and Mathematica.

Tags: hacking

Mostly Functions

2021-05-01 — Michael Haupt

"Write code. Not too much. Mostly functions."

I agree. Note that this can be easily misunderstood as one of those stupid pieces that pitch one particular programming paradigm against all the others. Don't take it that way.

Object-oriented abstractions have as much a justification as have functional ones (and don't get me started about all the other paradigms). Each has its place, and functions are a very good abstraction for, well, functionality, while objects are a good one for state.

Modifying state is a necessity, and those bits should be tucked away safely in very disciplined places to avoid the headaches that arise when parallelism comes into play. All the rest should be expressed in functions that are as pure as possible - same input, same output - to allow for clearer reasoning about what is going on. Programming languages usually allow working that way.
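As one possible illustration of that split, here is a small Python sketch. Everything in it is hypothetical (the `apply_discount` function, the `Till` class, the numbers): the calculation lives in a pure function, and the only mutable state sits in one small class behind a lock.

```python
import threading


def apply_discount(price_cents, percent):
    """Pure: same input, same output, no side effects."""
    return price_cents * (100 - percent) // 100


class Till:
    """The one disciplined place where state is modified, guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._total_cents = 0

    def add_sale(self, price_cents, percent=0):
        amount = apply_discount(price_cents, percent)  # computed outside the lock
        with self._lock:
            self._total_cents += amount
        return amount

    def total(self):
        with self._lock:
            return self._total_cents


till = Till()
threads = [threading.Thread(target=till.add_sale, args=(1000, 10)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(till.total())  # 7200
```

The pure function can be tested and reasoned about in isolation; only `Till` needs any thought about threads.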

Tags: hacking

Self-Documenting Code

2021-03-28 — Michael Haupt

Self-documenting code is a dream that rarely comes true. Most of the time, self-documentation goodness applies well at the level of individual functions or methods, but it gets more tricky when entire classes or modules and their interactions come into play. Frankly, it is vastly more relevant to document their intent than that of functions or methods (which can be read in their entirety quickly). Here is another case for putting meaningful comments in code, from a somewhat unexpected angle: they work better than PR comments to document code's purposes.

Tags: hacking

Laptop Purchase

2020-08-29 — Michael Haupt

A while ago, I backed the Crowd Supply campaign for Reform, an open-source laptop. There's a small Berlin-based company behind this, the founder of which, Lukas F. Hartmann, is a genius. He likes to build things from scratch in an open-source fashion. One example is an operating system. Another is the Reform laptop. Lukas and friends have designed the entire machine from the ground up, PCB layout, keyboard, case, and all that. The open-source philosophy goes as far as making 3D printing instructions for the case parts available. The laptop is a bit clunky, but completely maintainable (no hardwired battery!). It's also a bit pricey and comes with little memory, but the idea as such has my greatest sympathy.

Future editions of the laptop might go even further open source. The current processor is still based on (proprietary) ARM architecture, but Lukas is already thinking about a RISC-V based machine.

In backing the campaign, I chose the package that involves me assembling the laptop from its parts. This is going to be so much fun. I've also agreed with Lukas to save him the shipping cost, and will pick up the box in person in Berlin, hopefully in December.

Tags: hacking, the-nerdy-bit

Bashblog Support for "Finished" Markdown Files

2020-07-15 — Michael Haupt

So Mitch Wyle told me a while back that he likes to use the e-mail inbox feature of the Blogger platform: write your posting, send it to that e-mail address, and have it posted. That works well on an airplane - all postings will go live once the mobile phone has connectivity again.

Since I use bashblog, which isn't really a hosting service, adding e-mail will be a bit of a stretch, but there could be a different solution.

I host my blog in a private GitHub repository anyway. So how about a workflow that involves, towards the end, pulling in a Markdown file I store in some branch, putting it through bashblog, and posting it?

Now, bashblog is a bit averse to simply processing files without its edit loop. But that's fine. I've added a publish command to bashblog that allows for just that. It may be buggy (I'm not too well versed in shell programming), and I'll be grateful for suggestions for improvement.

While the processing (pulling from GitHub, applying bashblog, uploading to the webspace) can be done by another Raspberry Pi at home that runs a cron job, I'm also painfully aware that the first steps of the workflow are still unclear.

How do I get that Markdown file onto GitHub, say, when I'm sitting on an airplane? Can I send GitHub an e-mail with the contents (push-by-mail)? Can I use the GitHub app in offline mode? More things to try.

Tags: hacking

bashblog and MacVIM

2020-06-26 — Michael Haupt

My favourite text editor is MacVIM, so it's the logical choice to use this one for editing the Markdown sources for these bashblog blog postings. MacVIM also comes with a command line tool, mvim, that can be used to start editing in the GUI from the shell.

There's just one caveat when using mvim as the EDITOR for bashblog: mvim is a shell script that will start the MacVIM application using exec, so that it terminates when it hands over control to the editor. For bashblog, termination of the EDITOR command means to save the source file and generate HTML.

Since the source file is initially generated containing a template, the filename will be title-on-this-line.html, and the file will contain just that: the template. To actually generate the desired blog posting, one has to re-enter the editor and save again.

There's a simple hack that doesn't involve modifying either bb.sh or mvim: simply set EDITOR="mvim -f" in bashblog's .config file. The -f argument will keep the mvim script from terminating.

Tags: hacking