RECOMMENDER SYSTEMS

A recommender system is an algorithm that curates content to fit the preferences of users. Usually, user behaviour on a platform is registered and measured, and the data is then used to generate a mathematical representation of user preferences. That representation is then used to rank the available content to maximise the relevance, diversity and surprise for each user.
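
As a rough illustration of the pipeline described above, here is a minimal sketch, not the method of any real platform. It assumes items come with hand-made feature vectors (the recipe names and numbers in `ITEM_FEATURES` are invented), represents a user as the average of the vectors of the items they interacted with, and ranks unseen items by dot product. It only covers relevance; diversity and surprise are left out.

```python
# Toy content-based recommender. All item names and feature values are made up.

ITEM_FEATURES = {
    "lentil soup":    [1.0, 0.0, 0.2],
    "tomato soup":    [0.9, 0.1, 0.0],
    "chocolate cake": [0.0, 1.0, 0.1],
    "apple pie":      [0.1, 0.9, 0.0],
}


def user_vector(history):
    """Average the feature vectors of the items the user interacted with."""
    if not history:
        raise ValueError("history must contain at least one item")
    vectors = [ITEM_FEATURES[item] for item in history]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def recommend(history, k=2):
    """Rank items the user has not seen by dot product with the user vector."""
    user = user_vector(history)
    candidates = [item for item in ITEM_FEATURES if item not in history]

    def score(item):
        return sum(u * f for u, f in zip(user, ITEM_FEATURES[item]))

    return sorted(candidates, key=score, reverse=True)[:k]


if __name__ == "__main__":
    print(recommend(["lentil soup"]))
```

In practice the user and item representations would be learned from behaviour rather than written by hand, but the ranking step has the same shape.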

The most famous recommender systems are probably those of Netflix and Spotify for professional content, and social media feeds such as those of Instagram or Twitter for user-generated content. A variety of traditional and deep learning algorithms can be used to create recommender systems.

I worked on the recipe recommender of Cookidoo, a recipe website, for more than two years. We generated daily recommendations for millions of users, did online hyperparameter tuning and piloted a transformer-based recommender system.


TYPE SYSTEMS

From Wikipedia:

In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a type to every "term" (a word, phrase, or other set of symbols).

They are useful in checking the logical structure of a computer program, and make writing correct code easier. The main selling point to me of Scala and Rust, two programming languages I like, is their type systems. I am convinced that sophisticated type systems will really come into their own when paired with LLM code generation.

I studied philosophy of maths and logic at university, and first encountered the terminology of types in the historical attempt to resolve Russell's paradox with the help of ramified type theory. I was delighted to later discover that this field is also important in computer science. I would like to study this topic more formally and learn more about the connection between type theory and formal logic (and category theory).


DECISION THEORY

Decision theory is the mathematical study of decision problems, which are characterised by
  • choices that are available to the agent
  • the probabilities of outcomes associated with each choice
  • utility functions over outcomes
A further extension of this topic is collective decision making and preference aggregation, which is known as social choice theory.
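
For the single-agent case, the three ingredients listed above can be sketched in a few lines. This is a minimal example under the standard assumption that the agent maximises expected utility; the umbrella scenario, the probabilities and the utility values are all invented for illustration.

```python
# Hypothetical decision problem: should I take an umbrella?
# Each choice maps to a probability distribution over outcomes.
CHOICES = {
    "take umbrella":  {"dry, carrying umbrella": 1.0},
    "leave umbrella": {"dry, hands free": 0.7, "wet": 0.3},
}

# Utility function over outcomes (illustrative numbers).
UTILITY = {
    "dry, carrying umbrella": 8.0,
    "dry, hands free": 10.0,
    "wet": 0.0,
}


def expected_utility(outcome_probs, utility):
    """Probability-weighted sum of the utilities of the possible outcomes."""
    return sum(p * utility[outcome] for outcome, p in outcome_probs.items())


def best_choice(choices, utility):
    """Return the choice with the highest expected utility."""
    return max(choices, key=lambda c: expected_utility(choices[c], utility))


if __name__ == "__main__":
    for name, probs in CHOICES.items():
        print(name, expected_utility(probs, UTILITY))
    print("best:", best_choice(CHOICES, UTILITY))
```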

My approach to this field came through formal epistemology (epistemology is the study of knowledge acquisition) and rationality, together with an interest in public policy and economics. Due to its generality, decision theory is relevant to these usually disparate fields (and a large number of others!).

Decision theory is useful in practical contexts such as software development, and in a more formal way in financial modelling.


RUST

Rust is a relatively new programming language that has an expressive type system and memory safety. This makes it possible to write performant, low-level code in a safe way.

I find it especially promising that Rust modules can be made available in Python and JavaScript (through wasm), enabling performant and safe cross-platform packages for machine learning and frontend development.


BANDITS

Multi-armed bandit algorithms are a class of algorithms used in repeated decision-making problems with discrete choices and uncertainty about the reward associated with each choice. They balance maximising expected reward against reducing the uncertainty about expected rewards, which makes them solutions to the explore-exploit trade-off.
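
To make the trade-off concrete, here is a minimal sketch of epsilon-greedy, one of the simplest bandit algorithms. It assumes Bernoulli rewards (each arm pays 1 with a fixed probability, else 0); the function name, the default parameters and the reward probabilities in the usage example are all illustrative choices.

```python
import random


def epsilon_greedy(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms.

    reward_probs: true success probability of each arm (unknown to the agent).
    Returns (counts, estimates): pulls per arm and estimated mean reward per arm.
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated reward so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0

        # Incremental update of the running mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


if __name__ == "__main__":
    counts, estimates = epsilon_greedy([0.2, 0.5, 0.8])
    print(counts, estimates)
```

With a small `epsilon` the agent mostly exploits its current best guess, while the occasional random pull keeps reducing the uncertainty about the other arms.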

I wrote my psychology bachelor's thesis on human behaviour in a specific repeated decision-making problem, and designed my own set of experiments to test a particular hypothesis as to why human behaviour is reliably suboptimal in this setting. Finding later that there are algorithms for this specific problem, and proofs of their properties, was very cool.

During my year off I created a package in Scala that enables more exotic uses of bandits, for ML model selection, hierarchical stacking and assembling into larger structures.


GAME THEORY

Game theory is the extension of decision theory to situations where the outcomes depend not only on your own action, but also on the actions of other "players" or agents. The most famous game is the Prisoner's Dilemma, where two suspects are captured by police. Each can decide to confess the crime to get a more lenient sentence, at the cost of a longer sentence for the other suspect. However, if both confess, both get a much longer sentence than if both keep quiet. Nonetheless, the structure of the problem allows a proof that, if there are no repetitions or other outside consequences, each suspect does better by confessing whatever the other one does, so both should confess.
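
The argument can be checked mechanically. The sketch below uses invented sentence lengths that follow the structure described above and verifies that confessing is the best response to either action of the other suspect, which is what makes it a dominant strategy.

```python
# Years in prison for me, keyed by (my_action, other_action). Lower is better.
# The numbers are illustrative; only their ordering matters.
YEARS = {
    ("quiet", "quiet"): 1,
    ("quiet", "confess"): 10,
    ("confess", "quiet"): 0,
    ("confess", "confess"): 5,
}
ACTIONS = ("quiet", "confess")


def best_response(other_action):
    """My action that minimises my sentence, given the other suspect's action."""
    return min(ACTIONS, key=lambda mine: YEARS[(mine, other_action)])


def dominant_action():
    """Return the action that is a best response to everything, if one exists."""
    responses = {best_response(other) for other in ACTIONS}
    return responses.pop() if len(responses) == 1 else None


if __name__ == "__main__":
    print(dominant_action())
```

Both suspects reason the same way, so both confess and serve 5 years each, although both staying quiet would have cost them only 1 year each.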

My philosophy bachelor's thesis was on the spontaneous evolution of communication using evolutionary game theory and information theory, following Brian Skyrms. It's not that relevant in my day-to-day life or work, but I'd still like to get a more thorough mathematical understanding of this space.


DEEP LEARNING

Deep learning is the use of artificial neural networks (matrix multiplications in a trench coat) for any number of purposes, including object recognition, translation, language generation, image generation, representation learning, and text processing and classification.

I have looked into a few of the subfields of deep learning, and have worked professionally with autoencoders, deep CNN models for computer vision and transformers. I have created the library sequifier for training sequence models with transformers quickly and reproducibly. I also really like studies reconstructing stimuli from recordings of brain states. Being able to look "into people's heads" is like magic to me. You can find a recently published study that does this here.


PROGRAMMING LANGUAGE THEORY

Programming language theory concerns the formal definition and design of programming language syntax and semantics.

I have unfortunately never studied this properly, but I really like the fact that programming language semantics are formally defined and tractable, two things natural language semantics aren't. Whereas we speakers are 'inside' the natural languages we speak, and can never fully specify the semantics of any utterance, with formal languages, we are looking in from the outside, and can define or investigate their properties.

In a more applied sense, I like metaprogramming, templating and the definition of mini-"languages" for the purpose of configuration or for a particular use case. This page, for example, is composed from its elements using a homespun static page generator.




portrait of Leon Luithlen in black and white



Hi! I'm Leon Luithlen, the person this page is about.

You can get a sense of my interests from the little topic introductions above, and you can see some of my open source work on my GitHub.

My current focus is the development of sequifier, a library to easily configure, train and run inference with transformer models for tasks other than LLMs/NLP. The most complete public application is the development of a generative "language" model for whale click patterns, aka whale GPT.

Another project I spent considerable time on is 100gecs, a library that provides child classes for XGBoost and LightGBM that automatically do hyperparameter optimization.

If you are interested in my professional profile, you can have a look at my LinkedIn.

If you feel like I could be of help, or you want to discuss anything in particular, you can go through LinkedIn or write to (all lowercase):

[first name][middle name][last name] @ [google email service] .com

My middle name is Timna. Please reach out!




Blog

A technical introduction to the configuration parameters of sequifier
Preprocessing data, training a transformer, and inferring the model on new data
Introducing Sequifier
Sequifier enables fast, cheap and reproducible transformer model prototyping.
What is the Adaptive Ensemble?
A quick overview of what Ada Ensembles are and how they can be used.
A Technical Introduction to AdaEnsemble
How Ada Ensembles are implemented in Scala, and how to use them.
Three General Patterns for Adaptive Ensembles
Three abstract use cases for Ada Ensembles.
Mitigating Pipeline Underspecification with an Adaptive Ensemble
How AdaEnsembles can help select among underdetermined model weights in practice.