I’m going to be leaving Bookend.AI this Wednesday. Following that, on August 12th, I’m going to be heading to…checks notes
Lucidworks!
checks notes again
Lucidworks?? Yes, I’m heading back to the company that I left last year, but this time as Senior Manager of Data Science. Although this is, as you might expect from that title, more of a management role, I will still be doing research as part of the job. Things to hopefully look forward to in the near future:
Aug 3, 2024 · 5 minute read
website building Johnny Boy yeah! yeah! Adam Curtis
As mentioned earlier, it’s the 20th anniversary of You Are The Generation That Bought More Shoes And You Get What You Deserve, and to celebrate, I rebuilt the website, running into all sorts of issues that weren’t really a problem in the context of webdev when I built the original version.
The first iteration of the website had a bunch of clips from The Mayfair Set and a bunch of quotes; the page simply cycled through the clips and quotes at random. Which is fine, but I wanted something a little more interesting to celebrate two decades.
Firstly, I wanted more clips. That was relatively easy — all of Adam Curtis’ series went up on iPlayer a couple of years ago, so I have nice hi-res (and complete!) copies of those. Instead of cycling at random, though, I decided to return to my love of embeddings.
Using a SigLIP multimodal model, I encoded the lyrics of the song, all the quotes (with some new additions) and a random frame from every five seconds across every single episode of Curtis’s documentaries. Yay, a bunch of embeddings! You know I love them.
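The "random frame from every five seconds" step can be sketched without any video tooling at all. This is a minimal illustration, not the code used for the site: `sample_frame_indices`, its parameters and the fixed frame rate are all assumptions, and the resulting indices would still need to be decoded and passed to the image encoder.

```python
import random


def sample_frame_indices(duration_s, fps, window_s=5.0, seed=None):
    """Pick one random frame index from each window_s-second window.

    Hypothetical helper: assumes a constant frame rate and that frames
    are addressed by a zero-based index.
    """
    rng = random.Random(seed)
    total_frames = int(duration_s * fps)
    frames_per_window = max(1, int(window_s * fps))
    indices = []
    for start in range(0, total_frames, frames_per_window):
        # The last window may be shorter than the others.
        end = min(start + frames_per_window, total_frames)
        indices.append(rng.randrange(start, end))
    return indices


# A 60-second clip at 25 fps yields one index per five-second window.
frames = sample_frame_indices(60, 25, seed=0)
print(len(frames))
```

Sampling at random inside each window, rather than always taking the first frame, avoids always landing on the same point of a cut or fade.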
Once I had this pile of embeddings, I turned back to the original video for the song. I split it up into 38 different five-second fragments, and then every slot is randomly assigned to either:
A clip of the original video
A Curtis clip based on the nearest neighbour lookup from the lyrics at that point in the song compared to the video embeddings, picking a random element from top_k
A Curtis clip based on the nearest neighbour lookup from the current quote compared to the video embeddings, picking a random element from top_k
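The slot assignment above can be sketched roughly as follows. It is an illustration under assumptions rather than the site's actual logic (which lives in JavaScript): the names `build_playlist`, `lyric_neighbours` and `quote_neighbours` are hypothetical, the three options are assumed to be equally likely, and the hard-coded start and end of the song are left out.

```python
import random

NUM_SLOTS = 38  # five-second fragments of the original video
SOURCES = ("original", "lyric_match", "quote_match")


def build_playlist(lyric_neighbours, quote_neighbours, current_quote, rng=None):
    """Assign each slot a clip from one of the three sources.

    lyric_neighbours[slot]  -> precomputed top_k clip ids for that slot's lyrics
    quote_neighbours[quote] -> precomputed top_k clip ids for that quote
    Both structures are assumptions about how the lookups might be stored.
    """
    rng = rng or random.Random()
    playlist = []
    for slot in range(NUM_SLOTS):
        source = rng.choice(SOURCES)  # assumed uniform over the three options
        if source == "original":
            clip = ("original", slot)
        elif source == "lyric_match":
            clip = ("curtis", rng.choice(lyric_neighbours[slot]))
        else:
            clip = ("curtis", rng.choice(quote_neighbours[current_quote]))
        playlist.append(clip)
    return playlist
```

Picking a random element from the top_k list, instead of always the single nearest neighbour, is what keeps each viewing different while staying close to the lyric or quote.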
Obviously, the easiest thing to do at this point would be to bung all the embeddings in a vector database, but I didn’t want the hassle of having to deal with setting up FAISS or a more complicated vector store. Plus I also wanted it to be pretty fast…and given that the videos, the quotes, and the lyrics are fixed, I just precomputed all the embedding lookups for a top_k of 50. torch.topk for the win2.
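For a sense of what precomputing the lookups involves, here is a dependency-free sketch. It swaps `torch.topk` for `heapq.nlargest` from the standard library and uses plain lists for the embeddings, so it shows the idea rather than the code that was run; `precompute_top_k` and its arguments are hypothetical names, and cosine similarity is an assumed scoring choice.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def precompute_top_k(query_vecs, frame_vecs, k=50):
    """For each query embedding, store the indices of its k most similar frames."""
    table = []
    for query in query_vecs:
        scored = ((cosine(query, frame), idx) for idx, frame in enumerate(frame_vecs))
        best = heapq.nlargest(k, scored)  # highest similarity first
        table.append([idx for _, idx in best])
    return table


# Toy 2-D example: frame 1 matches the query exactly, frame 2 partially.
table = precompute_top_k([[1.0, 0.0]], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], k=2)
print(table)
```

Because the queries and frames never change, the table only has to be computed once, and the page then needs nothing more than an index lookup.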
This gave me two very large arrays, which I could have stuck behind a Python API to generate the clips on-demand. But I was feeling like the website should be even more dumb than usual, so I just got Claude 3 Sonnet to generate a bunch of JavaScript and copied the arrays into the HTML page directly. It’s all there, go and peek (and don’t blame me for the terrible JS code. The computer wrote it!).
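Copying the arrays into the page could be automated along these lines. This is only a sketch of the general approach: the template, the JavaScript constant names and `render_page` are made up for illustration and do not reflect what is in the real page.

```python
import json

# Hypothetical page skeleton; the real page contains far more than this.
PAGE_TEMPLATE = """<!doctype html>
<html>
<body>
<script>
const LYRIC_NEIGHBOURS = {lyrics};
const QUOTE_NEIGHBOURS = {quotes};
</script>
</body>
</html>
"""


def render_page(lyric_neighbours, quote_neighbours):
    """Serialise the precomputed lookup tables straight into the HTML."""
    return PAGE_TEMPLATE.format(
        lyrics=json.dumps(lyric_neighbours),
        quotes=json.dumps(quote_neighbours),
    )


page = render_page([[1, 2]], [[3]])
```

Nested lists of integers serialise to valid JavaScript array literals, so no API call is needed at view time.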
After that, it was just a matter of dealing with how browsers handle playing audio (when I built the original website, auto-playing was allowed, but that hasn’t been the case for quite some time now), and I also hard-coded the start and end of the song to play from the original video to provide a better ‘playlist’.
My feeling was it was going to be relatively simple to package up and get started on Google Cloud Run. I just had to upload the clips and make a small Docker container for hosting the page (with a tiny FastAPI server to do some mounting and serve the page itself). That was fine, but when I tried to actually get the application running under the proper domain name, everything broke. Mind you, it broke with an obscure Kubernetes error that I have seen a lot in my time, so even though the UI couldn’t tell me what was wrong, I knew instantly.
It seems that Google Cloud Run creates a pod on a random internal Kubernetes cluster and uses the domain name as the pod name. When dealing with sane domain names, that works fine. However, youarethegenerationthatboughtmoreshoesandyougetwhatyoudeserve.com is 65 characters long. And pod names can only be a maximum of 63 characters. Boo. With a bit more access to the system, I could probably have fixed that, but you don’t get that luxury with Google Cloud Run. I ended up having to use an ugly DNS redirect to the bare Cloud Run URL (which is why you see the domain name change when you visit the site).
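The arithmetic is easy to check. The snippet below takes the 63-character limit from the paragraph above as given; `fits_name_limit` is a hypothetical helper, and whether Cloud Run really derives the name this way is the guess made above, not something the code confirms.

```python
MAX_NAME_LEN = 63  # limit quoted in the text above


def fits_name_limit(name, limit=MAX_NAME_LEN):
    """Return True if the name is short enough to be used as-is."""
    return len(name) <= limit


domain = "youarethegenerationthatboughtmoreshoesandyougetwhatyoudeserve.com"
print(len(domain), fits_name_limit(domain))
```

Two characters over is still over, and with no control over how the name is generated there is nothing to trim.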
And there was more to come. I blithely posted out an announcement on Bluesky, but forgot that bare domain links these days default to HTTPS connections, and for some reason3, I didn’t have an SSL cert. No problem! Let’s Encrypt! Except…imagine an entire day of bouncing around SSL providers before coming to the conclusion that no, none of them were going to accept the domain and its inordinate length. So my launch fizzled and spluttered. Oh well. In the meantime, the non-SSL site is fully operational.
See you back here for the 25th?
The first line of my live review of Johnny Boy from 2005, where I attempted to merge Chris Roberts and Paul Morley into one being. It was a short music journalism career, but if I had to pick out two pieces I quite like from that era, it would have to be my two Johnny Boy pieces, one of which even got quoted on advertisements for the album next to Kieron Gillen… ↩︎
In general, while I understand the logic of people like Karpathy and Howard saying “just use numpy/torch operations!” for similarity search, you quickly run into a wall as soon as you need to do real searches. Look at Karpathy’s “simple movie review” site for example — for a site like this, filters such as year ranges or categories are just things that you take for granted with a search engine…and while you could certainly build all that up with tensor ops…it is much simpler just to throw everything into QDrant or Fusion (hey, look, I’m back on the party line already!) ↩︎
I am wondering if I ever tried it ten years ago in light of my hassles. Probably not, given that I don’t think Let’s Encrypt was around when I originally created the site… ↩︎
Aug 1, 2024 · 1 minute read
And I just can't help believing Though believing sees me cursed Johnny Boy Adam Curtis
You Are The Generation That Bought More Shoes And You Get What You Deserve by Johnny Boy is twenty years old.
You may turn into dust now.
To celebrate, I’ve updated the youarethegenerationthatboughtmoreshoesandyougetwhatyoudeserve.com website. It’s a complete rebuild that builds a new ‘video’ for the song using clips from every Adam Curtis series1 and the original song’s video. I’ll be posting a deeper explanation of how it all works, using image and text embeddings, in a week or so, but for now, feel free to view and consider the passage of time.
YEAH! YEAH!
I’m not saying the song made an appearance in Can’t Get You Out Of My Head because of my original website, but I’m not not saying that either… ↩︎
Going to be a lean few weeks here, sadly. Illness and probably spending much of the day unblocking a sink means not much of a post today. And then I’m spending next weekend visiting friends and former1 co-workers, so don’t expect a lot then either.
I do have a couple of days off coming up, so maybe I’ll work through the backlog then laughs bitterly…
In the Once and Future King sense of the word ‘former’… ↩︎
Jul 14, 2024 · 2 minute read
teasing once again brown sauce emergency
Continuing to be a tease, but some exciting (if perhaps a little odd) news this week that I’ll likely be sharing properly in the first week of August. Not exactly all change, but some change.
Other things from the week:
on recent events: welp
The happiest baby is apparently a baby with her mouth stuffed full of croissant.
In my quest to hit Peak Dad, I have ordered a Blackstone Griddle. Maeryn will hopefully learn to love smashburgers.
I’m not saying I have a problem, but I actually have a schedule for buying Lego sets for the rest of the year…
This Frequency’s My Universe is now pretty much done. Two bugs remain, one of which seems to be a limitation of Google’s Cloud Run service and how it maps domain names to Kubernetes pod names (because I remember the cryptic error message from some earlier k8s adventures) — this I can’t fix and so there will be an awkward URL redirect. Things look terrible on mobile right now too, but I think a few CSS directives should fix that. Looking good for August!
Planning for Thanksgiving has begun! But…this year we’re going to try and scale down our ambitions a little. Check back in a few months to see how that’s worked out. Our record for keeping things contained is…not great.
It’s 25 years since The Mayfair Set was broadcast. I still want to know if there are more lyrics to the Starship rewrite of We Built This City that was performed for Milken:
Jul 7, 2024 · 2 minute read
election 2024 goodbye you victorian ghoul You've got that Britain-can-make-it look about you
A strangely muted election night, considering the circumstances and the absolute scale of the defeat. While seeing the back of Rees-Mogg (until he is regenerated into his inevitable final form as a Lord), the hapless Truss, Corbyn seeing off the Labour challenger, and watching the Paisley dynasty disappear from the night were all great fun, Farage winning handily in Clacton and Starmer’s Labour not exactly being inspiring meant it did not quite bring the 1997 feeling.
But maybe that’s a good thing, given what happened there? I can wince at Starmer’s language cancelling the Rwanda programme, pointing out its failure as a deterrent instead of the immorality of it. But on the other hand, he cancelled it within a few hours of becoming PM. Not even waiting until next week. Unlike Blair, they actually plan on nationalizing the railways - Ed finally gets to execute the plan that seems to have survived his and Corbyn’s stints as Leader of The Opposition.
And the idea of creating New Towns for the first time in decades, planning regulations and the green belt be damned? That’s actually pretty radical, and in the right hands could be amazing. Towns built for the 21st century - you can’t say “15 minutes” out loud, but imagine new areas for living and working where pedestrians, bikes and public transit could be given the same priority as personal cars! EV stations! Houses that aren’t just Barratt Homes extrapolated to their 21st century endpoint with cheap fixtures and fittings! New ways of living!
Of course, I’m likely to be disappointed. It’ll probably be “Poundbury, but we added two charging stations”. But still, the possibility of actually building something in the UK that isn’t in London?1
So, a little optimistic? Just maybe?
And Britain can do it - look at the Elizabeth Line! ↩︎
Jun 30, 2024 · 2 minute read
san francisco! courtney love in the sky feature preview
For those of you keeping track, and yes, I realize it’s only me, but this weekend marks the 25th anniversary of when I got mild sunstroke at Glastonbury and Courtney Love appeared to me in a vision across the sky. Telling me to enjoy myself and get some water. And, well, when Courtney Love tells you to do that, you go off and finish listening to the Super Furry Animals with a bottle of water, don’t you?
Anyway, we had a good trip to San Francisco! One of the nicest I’ve ever had, I think. Sunny skies, not too hot or too cold, adventures on all sorts of different public transport, Alcatraz, the piers, Chinatown, an interactive Speakeasy theatre, the Castro, and more besides. Plus despite an extended stay in SFO, which turned into an extended stay at DFW, Maeryn has spent the week getting more and more confident on her feet, leading up to hours of walking around in the SFO play area. Now that we’re home, she’s also trying to run me down with her walker. It won’t be too long before she’s everywhere. eyes the house nervously
I am hopeful that July will see the last few bits of Frequency’s My Universe completed (it all works, it’s just a matter of wrapping it into a bow and uploading it all at this point, but all the packaging bits are going to suck up some time). Which should allow me to get further on Rude Title (which at least has the beginnings of a dataset now!), and I have the training code for Chock-A-Block pretty much worked out. So who knows, maybe it’ll be a summer of tech posts and other surprises?
Jun 21, 2024 · 2 minute read
not on a sunday? my goodness!
A slightly earlier update this time as I’ll be in San Francisco without my computer until the middle of next week. I haven’t been doing a good job of getting ready, which means I have spent Thursday afternoon into the evening wandering around the house with enough nervous energy to make everybody else nervous. I think I’m packed now, though. Honest. Really. Look, I’ll be right back.
Been ploughing through Michael Palin’s diaries this month — I think I’m somewhere in 1983 at the moment. As part of that, I’ve also watched the first two Ripping Yarns, and…oof, I didn’t expect to bounce off them as hard as I did. They’re not terrible, but it was just a few smiles here and there rather than actual laughter (except for the icebreaker model joke — that was silly enough, and such a completely extravagant use of filming time and money, that you couldn’t help but be moved by it). The versions I saw had the audience laughter that Palin was adamantly against, and far be it from me to argue with a Python, but although the mix could have used a little fine-tuning here and there, no laughs at all would have surely sunk them on first broadcast. Anyway, as you’d expect, he comes across as Great Bunch of Lads Python, who refuses to cross picket lines, worries that he’s no good at what he’s doing, and slowly accumulates houses along his street.
(I’m also reading Owen Hatherley’s new book, which opens with the same complaints I always make about JFK and the subway, before actually making me want to go back to check out some of the places he talks about, damn him)
I know I’ve been teasing all sorts of tech posts and then not actually doing them. I’ll continue at least the first part of that - I hope to finish off the project that goes live in August next weekend, and I’m finally collating the datasets for “Rude Title For A Paper That I Can Never Use”, and I may reuse some of that work for an idea I have about embedding…so there are real things coming up, I promise. Oh, and after six years, I’ve finally updated my about page to reflect that I now live in Cincinnati. Oops.
Right, it’s time for ambien and bed, I think. If you hear of a British person next week being forcibly removed from Alcatraz shouting “Glass or plastic? GLASS OR PLASTIC?!?!”, then it’s probably me.
Jun 16, 2024 · 4 minute read
welcome home, old friend WILY OLD BUZZARD ALERT
Obviously, the answer to that question is no, of course not, but I may be closer to calling my collection complete. Since around 2006 (what other date could it be, yes?), the Transformers toyline has had a number of sub-lines on sale. There’s been toys focused on the movie of the day, toys for the current cartoon, and another line which, whilst it has had many names, has always been aimed at older fans1. Starting out as Classics, then Universe, and its most recent regeneration as Transformers: Legacy United2, it has been mainly re-workings of old characters with more modern toy-making technology. So yes, always a new Optimus Prime, and lo! there’s a new Bumblebee…and here’s Megatron and please stop complaining that he doesn’t turn into a gun any more.
As these lines have gone on, they’ve introduced a few new characters, revisited some of the more esoteric areas of G1 - combiners, Headmasters, and the like, but in the past few years, there’s definitely been a sense of “look, we’re now filling in the gaps…and in some cases just going back and redoing the things we redid a few years ago”. There’s been some wins with this; the current versions of the Dinobots are basically the toys you always wished the originals were, but even as somebody who owned the Stunticons back in 1987, I have no desire to buy the new set of them — I even skipped the “new” version of them back in 2016 too. Still, I bought quite a few of these remasters — I wasn’t going to miss the chance to have a Scorponok that is in scale with the original G1 Fortress Maximus, after all! But I’ve noticed of late that I have been just scrolling past “oh look, another version of Jazz”.
(the exception here is Optimus Prime, which I have bought a lot over the years, but if you’re after the proper answer to “which Prime should I buy?” then it is “find Earthrise Prime”. It’s not too expensive, comes with a reasonable trailer, and the robot mode is likely the best representation of G1 Prime from the cartoon/comics that you can get without spending over $200)
But oh, those gaps. As somebody that grew up with the Marvel UK version of Transformers, there’s always been that annoyance that a lot of everybody’s favourite characters were not real toys. You could never complete a collection of The Wreckers for example, because Rack’n’Ruin and Impactor were never toys!
And then things like this started to happen.
Impactor? There’s three different versions of him, including one with an IDWverse head, as well as the classic Marvel UK torso and head:
What about Straxus3? Okay, sure, Hasbro are yet to release Straxus-in-a-jar, but look at this set, coming out later this year:
They even released Tarn and Rung from the IDW comics!
And then, well, I think we were all surprised by this one. Jhiaxus, a Transformer created for the ill-fated G2 revival, suddenly appeared on shelves like he had stepped out of 1994 in all his “BIG GUNS!” 90s glory. Look at the snarl on this piece of plastic!
There was always one figure who seemed destined to remain elusive. After all, would there be an audience for a mostly-Marvel UK figure who never transformed?
Yes, so Hasbro this week answered that with “silly man, why don’t we throw in his nemesis from the zombie storyline too, eh?”4
That’s Emirate Xaaron, manipulative leader of the underground Autobot resistance, and short-arse. HE’S IN SCALE WITH IMPACTOR IN THEIR TARGET: 2006 SCENES. Reader, I could not smash the “Buy Now” button harder on Thursday.
I have wanted this toy since I was seven. And in a few months, I’ll have him, and Flame (!?!?!) too. I really am not sure I need anything else5…
There’s also the Masterpiece line, which is really expensive (>$100) toys for the rich lads who demand their die-cast metal, but aside from a few fun pieces, I’ve never really got on with them, as they’re fiddly, prone to breaking, and insanely priced. ↩︎