Chapter 9.5 — Text Generation With GPT-2 And (only) PyTorch

(note, you can find a Jupyter Notebook version of this post here)

While I’m mostly happy with how the book turned out, bar some silly errors that should not have made it to print and the feeling that I needed about another six months to do it properly (although work would have precluded that, so…anyway), I was a little disappointed with how I handled text generation. It worked, that’s for sure, but it was little more than ‘run this program on this text, then run this script to transform the TensorFlow model into a PyTorch-compatible format, and run this script to generate output’. And then, to top it all off, about a week after the book went to print, the repo that housed most of the code was renamed from pytorch-pretrained-BERT to its eventual name of transformers. A bit of a pain.

As a way of making that up to people, welcome to Chapter 9.5 - A Half-Chapter in Two Parts. In this part, we’ll take another look at text generation, but this time, we won’t leave PyTorch. Promise. In Part Two (or is that Chapter 9.75?), we’ll have a bit of a final look back at images. The common theme between both parts will be self-supervision and domain modelling. I don’t have an ETA for Part Two yet, but it’ll come, promise.

If you’re looking for a refresher on the Transformer architecture, there’s one in Chapter 9 of my book, but more usefully, you could go here to read The Illustrated Transformer, and here for The Illustrated GPT-2.

Adding New Generation Tricks To GPT-2

Right, so if you remember in the book, we went on a jolly side-jaunt with P.G. Wodehouse. And that was all very fine and whimsical, but maybe we want something that shows off the capabilities of GPT-2 a little better, even if it’s really just doing most of the same thing under the covers.

Instead of Jeeves and Wooster, we’re going to generate tweets. And we’re going to take things a step further by adding a new “control code” to our fine-tuned GPT-2 model, so we can instruct GPT-2 that we specifically want to generate a new tweet. If we don’t add the control code, then we should just get (mostly) standard GPT-2 output. And we can use this technique to add multiple control codes, so if you have different types of synthetic data that you wish to generate, you can use those codes to determine which type to create.

And first…let’s go back to the standard thing we always do.

“Gee Brain, what are we going to do tonight?” “The same thing we do every night Pinky. Write a new custom dataset and take over the world!”


Datasets

Don’t worry though, we won’t be doing anything too crazy with this Dataset.

Much.
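One bit of housekeeping first: the code in this chapter assumes a set of imports along these lines (AdamW and get_linear_schedule_with_warmup come from the transformers library as it stood at the time of writing; later versions may move things around):

import csv
import os

import torch
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from tqdm import tqdm, trange

from transformers import (
    AdamW,
    GPT2LMHeadModel,
    GPT2Tokenizer,
    get_linear_schedule_with_warmup,
)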

class CSVTwitter(Dataset):
    def __init__(self, control_code, truncate=False, gpt2_type="gpt2", max_length=768):

        self.tokenizer = GPT2Tokenizer.from_pretrained(gpt2_type)
        self.tweets = []

        # This uses the same CSV of Sentiment140 that we created in Chapter 5

        with open('train-processed.csv', newline='') as csvfile:
            tweet_csv = csv.reader(csvfile)
            for row in tweet_csv:
                # The tweet text lives in column 5; wrap it in our control code
                # and GPT-2's <|endoftext|> marker before encoding
                self.tweets.append(torch.tensor(
                    self.tokenizer.encode(f"<|{control_code}|>{row[5][:max_length]}<|endoftext|>")
                ))

        # truncate=True keeps just the first 20,000 tweets for quicker experiments
        if truncate:
            self.tweets = self.tweets[:20000]
        self.tweet_count = len(self.tweets)

    def __len__(self):
        return self.tweet_count

    def __getitem__(self, item):
        return self.tweets[item]
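Assuming the train-processed.csv from Chapter 5 is sitting in the working directory, creating the dataset is then a one-liner; “tweet” is our new control code, and truncate=True keeps things quick while experimenting:

twitter_dataset = CSVTwitter(control_code="tweet", truncate=True, gpt2_type="gpt2")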

Firstly, you might wonder why we’re chopping our strings at 768 characters. We’re going to be using gpt2-small in this chapter, and 768 matches its hidden dimensionality (the larger pre-trained models go up from there: gpt2-medium/1024, gpt2-large/1280, gpt2-xl/1600). Strictly speaking, every size of GPT-2 can attend over a context of up to 1,024 tokens, and we’re truncating characters rather than tokens here, but because this dataset is only tweets, we’re never going to bump up against either limit anyway. I thought I’d include the truncation so you know to be aware of the limitation, and 768 is also the length we’ll pack our training tensors to later.

You’ll also see that we’re injecting our <|tweet|> control code at the start of each entry, and the <|endoftext|> token at the end - this is a token that GPT-2 has already learnt during its initial training to signify the end of a piece of text. It’ll become useful later on in training when we pack our training tensors (and again in generation, when we need to know where a tweet ends).

The last part of the dataset is the encoding. This is similar to the encoding of text that we did back in Chapter 5, but with a small twist. Instead of a simple mapping of all words to a new dictionary, we are using a byte pair encoding (BPE) tokenizer. This works differently from what we have seen before: in its original, compression form, it builds a dictionary by keeping track of common pairs of bytes and replacing them with a byte that is not present in the data.

For example, take the nonsense string:

aabaabdeaa

The first pass of the byte pair encoder would replace our aa strings:

AbAbdeA
A = aa

But note that we now have new byte pairs and so we can replace again:

BBdeA
A = aa
B = Ab

For building up a vocabulary from our data, the byte pair encoding used in language models these days tends to work in the opposite direction: it starts out with the set of characters in the language, and through repeated passes over the data, builds up subwords by finding the most frequent pairs present in the dataset, merging them, and then repeating to find ever-larger pairs, and so on. In this way, the tokenizer learns a vocabulary directly from the dataset itself and not from any manual input from an external source (like us).
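If you want to see the merge step in code, here’s a tiny sketch run on the toy string above (the real tokenizer’s training counts pair frequencies over a whole corpus and records every merge so it can re-apply them, but the mechanic is the same):

from collections import Counter

def most_frequent_pair(symbols):
    # count adjacent pairs of symbols and return the most common one
    pairs = Counter(zip(symbols, symbols[1:]))
    return max(pairs, key=pairs.get)

def merge(symbols, pair, new_symbol):
    # replace every occurrence of `pair` with `new_symbol`
    out, i = [], 0
    while i < len(symbols):
        if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
            out.append(new_symbol)
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

symbols = list("aabaabdeaa")
pair = most_frequent_pair(symbols)     # ('a', 'a')
symbols = merge(symbols, pair, "A")    # ['A', 'b', 'A', 'b', 'd', 'e', 'A']
pair = most_frequent_pair(symbols)     # ('A', 'b')
symbols = merge(symbols, pair, "B")    # ['B', 'B', 'd', 'e', 'A']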

Happily, we can use the BPE tokenizer that has already been trained on the dataset of GPT-2 and not have to worry about training it ourselves here (though if you’re looking to train on a new language, Huggingface’s tutorial on learning Esperanto will tell you everything you need to get started). We create a pre-trained version using GPT2Tokenizer.from_pretrained(gpt2_type), which will download the appropriate files for the version of GPT-2 we’re working with. We then encode the dataset and create tensors, returning a particular tensor within __getitem__() as normal.
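A quick sanity check you can run with the pre-trained tokenizer: <|endoftext|> comes back as a single special token, while our made-up <|tweet|> control code is just split into ordinary BPE pieces that the model only learns to associate with tweets during fine-tuning:

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

print(tokenizer.encode("<|endoftext|>"))
# [50256] - the end-of-text marker is a single special token in GPT-2's vocabulary

print(tokenizer.tokenize("<|tweet|>"))
# our control code isn't special, so it comes back as several ordinary BPE pieces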

In addition to the CSV-based Dataset, I’ve also included a different implementation that uses PyArrow to load in named columns from a parquet file. I just had a bunch of parquet-based datasets lying around so it was useful to make a class that could handle them as well.

We’ll build a DataLoader in our usual way:

DataLoader(dataset, batch_size=1, shuffle=True)  

(the reason for batch_size being 1 is something we’ll come back to later)

Training

Okay, so how do we train this thing? Well, it turns out that it’s actually a lot simpler than you’d think. We already have a pre-trained model, so we’re just doing some fine-tuning (we won’t freeze any layers here, but you can certainly experiment with that). But…don’t we need labels?

Training GPT-2 involves passing our input text into the transformer model…and training the model to produce that same text as output, shifted along by one position, so it is always predicting the next token. In this way, the model learns something of how text is structured, and eventually builds up a language model that can be used for generating further text. So our labels are the input text!

To get the model to produce anything resembling English or whatever language you’re training it on requires a gargantuan amount of text (OpenAI trained GPT-2 on 8 million webpages). But as we’re using a pre-trained model, all that hard work has been done for us, so we can get away with a much smaller dataset. We can create a pre-trained GPT-2 transformer with one line of code:

model = GPT2LMHeadModel.from_pretrained(gpt2_type)

As for our training loop, given that our labels are our input, all we’re really doing is:

outputs = model(input)
loss = loss_function(outputs, input)
loss.backward()
optimizer.step()

But there’s a slight catch. You remember that GPT-2 is big, right? Very big. It’s quite possible that you can’t fit all the parameters and all the gradient updates inside your GPU. I know I can’t, and I have a 1080Ti. There are various approaches we can use to get around this problem, like distributed training, or maybe gradient checkpointing (covered in Chapter 7).

However, there’s a simpler option we can use. What we’re going to do is accumulate our gradients over a number of batches and only apply the update every batch_size batches instead of every batch. (If you want the accumulated update to behave like one big averaged batch, you can also divide each loss by the number of accumulated batches before calling backward().)
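As a minimal sketch of the idea (assuming a model, optimizer, dataloader, and device are already set up; accumulation_steps is an illustrative name, and the division is the averaging variant just mentioned, which the train() function below skips):

accumulation_steps = 16  # illustrative; train() below uses its batch_size argument

optimizer.zero_grad()
for step, batch in enumerate(dataloader):
    batch = batch.to(device)
    outputs = model(batch, labels=batch)
    loss = outputs[0] / accumulation_steps  # scale so the summed gradients average out
    loss.backward()                         # gradients accumulate in .grad between steps
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()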

We’re almost at the point of having the training loop sorted. But what’s that, Columbo?


You may have looked at the links to the illustrated Transformer articles and discovered that GPT-2 ‘sees’ all of its input at once, across a context window that can hold hundreds of tokens. And we’re sending in encoded tensors of 140-character tweets, one at a time, which leaves most of that window unused. Is that going to be great for training? Probably not, as each pass through the network would only be learning from a handful of tokens. Enter…pack_tensor()!

def pack_tensor(new_tensor, packed_tensor, max_seq_len):
    # Nothing packed yet, so start a new packed tensor with this entry
    if packed_tensor is None:
        return new_tensor, True, None
    # The new entry won't fit: hand back the full pack, plus the entry as a remainder
    if new_tensor.size()[1] + packed_tensor.size()[1] > max_seq_len:
        return packed_tensor, False, new_tensor
    # Otherwise prepend the new entry (dropping the first token of the existing pack)
    else:
        packed_tensor = torch.cat([new_tensor, packed_tensor[:, 1:]], dim=1)
        return packed_tensor, True, None

This is a very simple method that just tries to fit as many pieces of text into an input tensor as possible. This is why we created the DataLoader with a batch_size of 1, as in our training loop, we’ll simply loop over and over the data until we’ve stuffed a tensor, and then push it through our model. Of course, this breaks the relationship between batches that come from the Dataset and what we send to the model for the training, so we add accumulating_batch_count as a counter to work out when we need to train on our accumulated gradients.

You’ll also notice in the train() code below that instead of our normal pattern of:

outputs = model(input)
loss = loss_function(outputs, input)

We’re actually doing:

outputs = model(input, labels=input)
loss = outputs[0]

There’s nothing too nefarious going on here; the GPT-2 model simply has code inside it that calculates the loss to make things easier. It’s just a simple CrossEntropyLoss, as we’ve seen in previous chapters.
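If you’re curious, what the model does with labels=input is roughly the following sketch (a paraphrase, not the library’s actual code): the logits and labels are shifted by one position so that each token is trained to predict the next one.

# when labels are passed, the logits come back alongside the loss:
# shape (batch, sequence_length, vocab_size)
logits = outputs[1]

# each position is trained to predict the *next* token, so shift by one
shift_logits = logits[:, :-1, :]
shift_labels = input[:, 1:]

loss = F.cross_entropy(
    shift_logits.reshape(-1, shift_logits.size(-1)),
    shift_labels.reshape(-1),
)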

Our optimizer and learning rate scheduler also come from the transformers library: we’re using the AdamW (Adam + Weight Decay) optimizer with a linear warmup followed by a linear decay (you can see alternatives at Huggingface’s docs page). We also include the ability to save a set of weights at the end of each epoch.

def train(
    dataset,
    model,
    tokenizer,
    batch_size=16,
    epochs=4,
    lr=2e-5,
    max_seq_len=768,
    warmup_steps=5000,
    gpt2_type="gpt2",
    device="cuda",
    output_dir=".",
    output_prefix="wreckgar",
    test_mode=False,
    save_model_on_epoch=False,
):

    model = model.to(device)
    model.train()

    optimizer = AdamW(model.parameters(), lr=lr)
    scheduler = get_linear_schedule_with_warmup(
        optimizer, num_warmup_steps=warmup_steps, num_training_steps=-1
    )

    train_dataloader = DataLoader(dataset, batch_size=1, shuffle=True)

    accumulating_batch_count = 0
    input_tensor = None

    for epoch in range(epochs):

        print(f"Training epoch {epoch}")
        for idx, entry in tqdm(enumerate(train_dataloader)):
            # Keep stuffing tweets into the packed tensor until it's full (or we run out of data)
            (input_tensor, carry_on, remainder) = pack_tensor(entry, input_tensor, max_seq_len)

            if carry_on and idx != len(train_dataloader) - 1:
                continue

            input_tensor = input_tensor.to(device)
            # Passing labels=input_tensor makes the model compute the LM loss for us
            outputs = model(input_tensor, labels=input_tensor)
            loss = outputs[0]
            loss.backward()

            # Only update the weights every batch_size packed tensors (gradient accumulation)
            if (accumulating_batch_count % batch_size) == 0:
                optimizer.step()
                scheduler.step()
                optimizer.zero_grad()
                model.zero_grad()

            accumulating_batch_count += 1
            # Start the next pack with whatever didn't fit into this one
            input_tensor = remainder
        if save_model_on_epoch:
            torch.save(
                model.state_dict(),
                os.path.join(output_dir, f"{output_prefix}-{epoch}.pt"),
            )
    return model

Generating Text

For generating text from our fine-tuned model, there are multiple approaches we could use, including beam search, top_k filtering, and the one we’re going to use: nucleus sampling (or top_p filtering). We take our input, in this case our new control code <|tweet|>, and feed that into the model to generate a new sequence. But all we care about is the next word, and in particular, the probabilities of all the possible words that the model predicts should appear there.

Of course, lots of words that the model may predict will not make sense, and that’s where we can bring in nucleus sampling (or top_k, or any other approach). In nucleus sampling, we sort the predicted probabilities in descending order and add them up until the running total (the cumulative distribution function) climbs above an adjustable hyperparameter, p, which is normally set between 0.7 and 0.9. There’s another parameter, temperature, which scales the logits before the softmax, sharpening or flattening the distribution before the CDF is built.

Once the CDF is formed, we eliminate everything that falls outside our p by setting its logit to -Infinity. We’re not messing around here. Note that because we sum the highest-probability selections first, if there are a few high-probability choices, they may end up being the only ones present. And that makes sense if you think about sentences like:

The dog lifted up its ____

Possible options here could include paw, tail, tongue. You’d expect paw or tail much more than tongue. In this way, our sampling feels more natural, while still providing the possibility for surprise when probabilities are more spread out.
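To make that concrete, here’s a toy sketch of the top_p cutoff with invented probabilities for those words (plus a couple of unlikely stragglers); it mirrors the masking trick used in generate() below:

import torch

# made-up next-word probabilities: paw, tail, tongue, bone, cloud
probs = torch.tensor([0.55, 0.30, 0.08, 0.04, 0.03])
top_p = 0.8

sorted_probs, sorted_indices = torch.sort(probs, descending=True)
cumulative_probs = torch.cumsum(sorted_probs, dim=-1)  # 0.55, 0.85, 0.93, 0.97, 1.00

# mask everything past the cutoff, shifted by one so the word that
# pushes us over the threshold is still kept
to_remove = cumulative_probs > top_p
to_remove[1:] = to_remove[:-1].clone()
to_remove[0] = False

kept = sorted_indices[~to_remove]  # indices 0 and 1: only 'paw' and 'tail' survive
# in generate(), the discarded logits are set to -Inf and we sample
# from a softmax over whatever is left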

Most of the code here is taken from Huggingface’s run_generation.py script.

Once we have our next word, we loop back around to the start, but this time we feed in the sentence with the new word added and choose the following word in the same way. We continue until we either reach entry_length or the model generates an <|endoftext|> token. And then it’s back to the outer loop to generate the next entry until we’ve produced the requested number of entries.

def generate(
    model,
    tokenizer,
    prompt,
    entry_count=10,
    entry_length=100,
    top_p=0.8,
    temperature=1.,
):

    model.eval()

    generated_num = 0
    generated_list = []

    filter_value = -float("Inf")

    with torch.no_grad():

        for entry_idx in trange(entry_count):

            entry_finished = False

            # Start each entry from the prompt (our <|tweet|> control code)
            generated = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0)

            # Using top-p (nucleus sampling): https://github.com/huggingface/transformers/blob/master/examples/run_generation.py

            for i in range(entry_length):
                outputs = model(generated, labels=generated)
                loss, logits = outputs[:2]
                # We only care about the logits for the final position (the next token)
                logits = logits[:, -1, :] / (temperature if temperature > 0 else 1.0)

                sorted_logits, sorted_indices = torch.sort(logits, descending=True)
                cumulative_probs = torch.cumsum(
                    F.softmax(sorted_logits, dim=-1), dim=-1
                )

                # Mask everything past the top_p cumulative cutoff, shifted by one
                # so the first token over the threshold is still kept
                sorted_indices_to_remove = cumulative_probs > top_p
                sorted_indices_to_remove[..., 1:] = sorted_indices_to_remove[
                    ..., :-1
                ].clone()
                sorted_indices_to_remove[..., 0] = 0

                indices_to_remove = sorted_indices[sorted_indices_to_remove]
                logits[:, indices_to_remove] = filter_value

                # Sample the next token from the filtered distribution and append it
                next_token = torch.multinomial(F.softmax(logits, dim=-1), num_samples=1)
                generated = torch.cat((generated, next_token), dim=1)

                # <|endoftext|> is a single token in GPT-2's vocabulary, so this
                # checks whether the model has decided the tweet is finished
                if next_token in tokenizer.encode("<|endoftext|>"):
                    entry_finished = True

                if entry_finished:

                    generated_num = generated_num + 1

                    output_list = list(generated.squeeze().numpy())
                    output_text = tokenizer.decode(output_list)

                    generated_list.append(output_text)
                    break

            # If we hit entry_length without the model finishing the entry,
            # close it off with <|endoftext|> ourselves
            if not entry_finished:
                output_list = list(generated.squeeze().numpy())
                output_text = f"{tokenizer.decode(output_list)}<|endoftext|>"
                generated_list.append(output_text)

    return generated_list
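Putting the pieces together, an end-to-end run looks something like this (a sketch rather than a recipe: the argument values are choices, and the model comes back to the CPU at the end because generate() builds its tensors there):

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

twitter_dataset = CSVTwitter("tweet", truncate=True)
model = train(twitter_dataset, model, tokenizer)

# generate() works with CPU tensors, so bring the model back from the GPU first
generated_tweets = generate(model.to("cpu"), tokenizer, "<|tweet|>", entry_count=10)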

Example Output

And here’s some output of calling generate on our trained model.

"<|tweet|>Casa the fifth Monday afternoons in the summer. Stay for one more - you'll be much better at finding a workplace than you would at the    office.\n\nThe Hours\n\n14:00 - 15:00, Hot and Cold\n\n18:00 - 19:00, Cafe Oktoberfest\n\n19:00 - 21:00, More Information<|endoftext|>",
'<|tweet|>Tweet what you like.<|endoftext|>',
'<|tweet|>Sigh. Hope to see ya in there.<|endoftext|>',
'<|tweet|> | The Walking Dead ends, '10 hours after everybody gets killed! I'm sick of zombies. pic.twitter.com/tsxhXdGLuGx.<|endoftext|>'

Further Techniques & Reading

Huggingface
Better Language Models and Their Implications (GPT-2)
Applying BERT-based models in Search
How To Sample From Language Models


