Using different versions of CUDA in Ubuntu

Recently I’ve been experimenting again with quite a few computer vision models (including Vision Mamba). The fact that researchers share their code makes experimenting with models a much more pleasant experience! Sometimes, though, the released code is very sensitive to CUDA versions. Luckily, switching between CUDA versions on Ubuntu has become relatively easy with update-alternatives. Install the first CUDA version (e.g. CUDA 11.8): this can be achieved by following NVIDIA’s documentation, found on the CUDA Toolkit Download page....

March 26, 2024 · 2 min · Me

Pretrained Vision Mamba: a minimal example

At the beginning of 2024, an article came out that made the whole computer vision community excited: “Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model”. It’s a very interesting approach that promises to mitigate the cost of self-attention in transformers using a state space model (SSM) based vision backbone. The authors of Vision Mamba (Vim) shared both the source code and some of the pre-trained model weights. It took me a while to figure out how to put everything together, so I thought I’d share this to help the community....

March 20, 2024 · 3 min · Me

Common Voice Community call: Offline voice transcription, Coqui STT and WebAssembly

On the 7th of July 2022 I had the privilege of presenting my work (about transcribing voice messages in Signal Desktop) at the Common Voice Community Call. We talked about how to modify Signal Desktop so that it transcribes voice messages fully offline, without sending the message to any third party. There’s a nice bonus at the end: a live demo of the Coqui STT WebAssembly bindings, transcribing an audio file completely offline, in the browser!...

September 1, 2022 · 1 min · Me

CoquiSTT + Signal = Love (death to voice messages)

Let’s face it: if you’re reading this, chances are that you are receiving the dreaded voice messages more often than you would want. I like the romantic feeling behind voice messages on Instant Messaging platforms: you can feel the nuances of the sender’s voice, quickly deciphering their mood. However, as you start receiving voice messages more often, that romantic feeling will collapse under the weight of a stack of these:...

May 29, 2022 · 6 min · Me

Mercurial and patch queues for dummies

Lately I’ve started contributing to Mozilla Firefox with some patches (yay! It’s an awesome project… consider contributing 😉 ). It was a good excuse to practice with Mercurial, the distributed revision control software used by Mozilla. I’m mainly a Git user and I have to admit I was a little puzzled at first and had to adapt to the new workflow. In this article I will try to answer some of the questions I had during my learning phase....

February 9, 2014 · 1 min · Alessio Placitelli