How Much Can One Shot Cost, Michael?

Oct 13, 2024 · 1 minute read
babies
the grandfather paradox
0x78 in hex

View this post on Instagram

A post shared by Ian Pointer (@carsondial)

My prize for successfully completing Thank Goodness You’re Here. They even splashed out on the fancy Yorkshire Tea!

In the past 48 hours, I have been out with Maeryn and twice asked if I’m the grandfather. Less of that, please.

Back from San Francisco. One of my ears is still a little dodgy, and I still feel sunburnt from my 20 minutes in the Castro last Sunday. It was a weird trip, with seemingly half the hotels having strike action in front of them, having to move across the city, not being able to actually get into the office on Monday because there was nobody there, and then leaving the place on the evening of actually having the meeting I was flying out for instead of spending multiple working days in the Bay. Still, flying across the continent for one day to talk about artificial intelligence and search has to be pretty close to Living the Dream, I think…

…and then the tumble dryer breaks down when you get back to keep you grounded…

60% Tea By Volume

Sep 30, 2024 · 2 minute read
family visit!

View this post on Instagram

A post shared by Ian Pointer (@carsondial)

Apologies for the lack of posting for most of September! Good excuses, though - firstly, celebrating my Dad’s 70th birthday the first week, and then as the family’s visit came to an end I naturally fell so sick I had to be taken to urgent care - don’t worry, I’m fine, but it was a few weird hours where I was very cold and very unsure on my feet.

Anyway, a great visit, I think. Maeryn certainly enjoyed playing to a devoted audience every night, and hopefully we’ll be seeing them again in a couple of months for Christmas adventures! By which time, Maeryn will probably also be solving differential equations, based on her current learning trajectory. I think we’re about 24 hours away from her learning how to set locks on all the doors. Which is why we bought emergency lock keys yesterday. Always stay one step ahead!

It is probably going to remain quiet here for a little bit longer; I’m off to San Francisco at the weekend to visit the Lucidworks offices again, so no updates until I get back, I would expect. It’s been a bit of a lean year for the blog; maybe I’ll make up for it in the Autumn (stares at the way the calendar just turns to darkness on November 6th)…

Clean All The Things

Sep 15, 2024 · 1 minute read
dusting the lego Galaxy Shuttle too

Running around the house this weekend trying to get everything ready for the family visit next week. Which was ages away…and then it really wasn’t. As part of this effort, I got over my issues and we used a coupon for a cleaning service. My issues immediately came back after they stayed about two hours longer than intended, scrubbed the place from top to bottom (including areas that we’d told them they didn’t have to do!) and made our glass cooking hob look brand-new with some sort of cleaning sorcery that makes us mystified — we’ve cleaned, we’ve used products, we’ve used razor blades, even, and yet nothing as good as what happened this Saturday.

They even dusted my Transformers (to my shame).

Anyway, the house is amazingly clean now. Almost as if we just moved in. But we may have to move away to another state due to our embarrassment…

Out: Baking

Sep 8, 2024 · 1 minute read
we don't talk about bruno or the macarons

View this post on Instagram

A post shared by Ian Pointer (@carsondial)

Multiple Knebworths Incoming

Sep 1, 2024 · 2 minute read
babies

Okay, you know what? My first reaction to the Oasis reunion was, of course, this:

via GIPHY

(in the great debate between Oasis and Blur, the correct answer is that Pulp’s Babies contains everything you could conceivably need)

But then I read this article from The Quietus, and well, if Gen-Z can move past the baggage of the era and love them, it’s not really my place to argue otherwise, I think. And Alex Niven’s article in The Guardian protests a little too much, but otherwise is a reminder that the high point of Oasis preceded New Labour’s ‘Cool Britannia’. So I can save my vitriol for the kind of fan around my age that complains about “21-year-old girls getting their ticket”. Not that it matters, but those girls are likely able to rattle off in-order setlists of their appearances at Tokyo’s Club Quattro. Which is more than you or I could do in our advancing years…

(my long-standing peeve is that Noel’s heel-turn against all things modern in the 21st century has just been so boring. I wanted to hear his rave tapes! More experimentation like using samples from NWA/the Amen Break on D’You Know What I Mean? Something else from working with The Chemical Brothers! Just anything except what we got from the rest of the third album onwards. And while I’m here, the dynamic pricing fiasco of the weekend reminds me that they’ve always been something of a…mean band. The “Cigarette Boxes” that contained the same interview CD, that it took almost 15 years for Whatever to appear on a compilation album, the editing of the Radio 1 Knebworth broadcast to skip over their new songs, Creation forcing Radio 1 DJs to talk over the first week or two of playing songs from Be Here Now in a bizarre effort to try and stop home taping. As if we didn’t all traipse down to the record shop to buy it on that day in August 1997…)

Driving Home In A Large Bomb

Aug 25, 2024 · 1 minute read
so much lithium

On the one hand, an outdoor grill basically means you’re 100% Americaon. On the other, there’s the drive home when you realize you’re in a metal cage containing gasoline and propane, all sitting on top of a sizeable bed of lithium. Past a school. That focuses your mind, I’ll say.

Last week’s medical shenanigans got resolved in the “non-expensive, but you’ve screwed up the one treatment for psoriasis that has worked for you” manner. Which is marginally better than “tens of thousands of dollars and you’ve lost the treatment”, but still a bitter pill to swallow as we head into the colder months (which is when it’s worst for me up here in Ohio). Hurrah for the perfect framework that is the US health insurance system! (and 3rd-party HR SaaS operations while I’m here)

It is, however, difficult to remain sad and upset when you have a agent of chaos running through your house, laughing in absolute delight at the toy icemaker she’s been playing with all afternoon, and desperately trying to make friends with Helvetica Black (who is, I think beginning to accept that this is her fate). Or when she finally does fall asleep to the gentle sounds of The Shamen’s Ebeneezer Goode…

Thank Goodness You're Here (@ Lucidworks)

Aug 18, 2024 · 1 minute read
healthcare disasters
david peace

A good first week back at Lucidworks, slightly marred by an ongoing health insurance disaster that still has no conclusion in sight (but a variety of potential outcomes from a shrug of shoulders to a very expensive mistake). I even managed to go from idea to demo for a small (but fun!) research project in a couple of days. Which is not something I’ll be able to do every week as a manager, but it’s good to still be doing active research!

(Don’t worry, there is a long list)

Also, how did I not know that David Peace has a new book coming out??? Another football book to go along with The Damned Utd and Red or Dead. I’m starting to think we might never actually see UKDK…

Thank Goodness You're Here!

Aug 10, 2024 · 1 minute read
calamity james
matt berry
yorkshire

You cannot go wrong with a game that uses the IBA advert cue dot as its save icon.

Announcements!

Aug 4, 2024 · 1 minute read
new job
lucidworks
again?
yes!

As promised, exciting employment announcements¹!

I’m going to be leaving Bookend.AI this Wednesday. Following that, on August 12th, I’m going to be heading to…checks notes

Lucidworks!

checks notes again

Lucidworks?? Yes, I’m heading back to the company that I left last year, but this time as Senior Manager of Data Science. Although this is, as you might expect from that title, more of a management role, I will still be doing research as part of the job. Things to hopefully look forward to in the near future:

Fun! And! Games! With! Sparse! Autoencoders!
Death! To! MS! MARCO!
The! Rudest! Paper! Title! Discovered! By! Accident!

And more besides. I’ve been missing search; it’s time to get back in…!

New to you unless you read it earlier this week on Bluesky… ↩︎

Building a website in 2024

Aug 3, 2024 · 5 minute read
website building
Johnny Boy
yeah! yeah!
Adam Curtis

As mentioned earlier, it’s the 20th anniversary of You Are The Generation That Bought More Shoes And You Get What You Deserve, and to celebrate, I rebuilt the website, running into all sorts of issues that weren’t really a problem in the context of webdev when I built the original version.

The first iteration of the website had a bunch of clips from The Mayfair Set and a bunch of quotes; the page simply cycled through the clips and quotes at random. Which is fine, but I wanted something a little more interesting to celebrate two decades.

Dance, you fuckers, dance¹

Firstly, I wanted more clips. That was relatively easy — all of Adam Curtis’ series went up on iPlayer a couple of years ago, so I have nice hi-res (and complete!) copies of those. Instead of cycling at random, though, I decided to return to my love of embeddings.

Using a SigLIP multimodal model, I encoded the lyrics of the song, all the quotes (with some new additions) and a random frame from every five seconds across every single episode of Curtis’s documentaries. Yay, a bunch of embeddings! You know I love them.

Once I had this pile of embeddings, I turned back to the original video for the song. I split it up into 38 different five-second fragments, and then every slot is randomly assigned to either:

A clip of the original video
A Curtis clip based on the nearest neighbour lookup from the lyrics at that point in the song compared to the video embeddings, picking a random element from top_k
A Curtis clip based on the nearest neighbour lookup from the current quote compared to the video embeddings, picking a random element from top_k

Obviously, the easiest thing to do at this point would be to bung all the embeddings in a vector database, but I didn’t want the hassle of having to deal with setting up FAISS or a more complicated vector store. Plus I also wanted it to be pretty fast…and given that the videos, the quotes, and the lyrics are fixed, I just precomputed all the embedding lookups for a top_k of 50. torch.topk for the win².

This gave me two very large arrays, which I could have stuck behind a Python API to generate the clips on-demand. But I was feeling like the website should be even more dumb than usual, so I just got Claude 3 Sonnet to generate a bunch of JavaScript and copied the arrays into the HTML page directly. It’s all there, go and peek (and don’t blame me for the terrible JS code. The computer wrote it!).

After that, it was just a matter of dealing with how browsers handle playing audio (when I built the original website, auto-playing was allowed, but that hasn’t been the case for quite some time now), and I also hard-coded the start and end of the song to play from the original video to provide a better ‘playlist’.

My feeling was it was going to be relatively simple to package up and get started on Google Cloud Run. I just had to upload the clips and make a small Docker container for hosting the page (with a tiny FastAPI server to do some mounting and serve the page itself). That was fine, but when I tried to actually get the application running under the proper domain name, everything broke. Mind you, it broke with an obscure Kubernetes error that I have seen a lot in my time, so even though the UI couldn’t tell me what was wrong, I knew instantly.

It seems that Google Cloud Run creates a pod on a random internal Kubernetes cluster and uses the domain name as the pod name. When dealing with sane domain names, that works fine. However, youarethegenerationthatboughtmoreshoesandyougetwhatyoudeserve.com is 65 characters long. And pod names can only be a maximum of 63 characters. Boo. With a bit more access to the system, I could probably have fixed that, but you don’t get that luxury with Google Cloud Run. I ended up having to use an ugly DNS redirect to the bare Cloud Run URL (which is why you see the domain name change when you visit the site).

And there was more to come. I blithely posted out an announcement on Bluesky, but forgot that bare domains links these days default to HTTPS connections. and for some reason³, I didn’t have an SSL cert. No problem! Let’s Encrypt! Except…imagine an entire day of bouncing around SSL providers before coming to the conclusion that no, none of them were going to accept the domain and its inordinate length. So my launch fizzled and spluttered. Oh well. In the meantime, the non-SSL site is fully operational.

See you back here for the 25th?

The first line of my live review of Johnny Boy from 2005, where I attempted to merge Chris Roberts and Paul Morley into one being. It was a short music journalism career, but if I had to pick out two pieces I quite like from that era, it would have to be my two Johnny Boy pieces, one of which even got quoted on advertisements for the album next to Kieron Gillen… ↩︎
In general, while I understand the logic of people like Karpathy and Howard saying “just use numpy/torch operations!” for similarity search, you quickly run into a wall as soon as you need to do real searches. Look at Karpathy’s “simple movie review” site for example — for a site like this, filters such as year ranges or categories are just things that you take for granted with a search engine…and while you could certainly build all that up with tensor ops…it is much simpler just to throw everything into QDrant or Fusion (hey, look, I’m back on the party line already!) ↩︎
I am wondering if I ever tried it ten years ago in light of my hassles. Probably not, given that I don’t think Let’s Encrypt was around when I originally created the site… ↩︎