AI Image Descriptions

I tend to take a lot of somewhat random photos of things that interest me. If it’s a good photo and the mood strikes, I’ll share it on various social media channels.

One image I recently shared featured a stack of “Wooden Bitcoin” with a joke about their value (there are certain scenarios where the wooden ones are quite valuable).

A small pile of wooden disks on a silver background.  An outer circle on each disk reads "Wooden Bitcoin.  Just as Good, Really."  Text on an inner circle reads "To Redeem: Insert into any Internet Coin Slot."  The Bitcoin logo appears at the center of each disk.

When I shared the image on social media (and when I embedded it above), I included an image description for accessibility purposes.

A small pile of wooden disks on a silver background. An outer circle on each disk reads “Wooden Bitcoin. Just as Good, Really.” Text on an inner circle reads “To Redeem: Insert into any Internet Coin Slot.” The Bitcoin logo appears at the center of each disk.

That description wasn’t very hard to come up with, but sometimes, it’s a bit more difficult to convey all the visual details. For example, consider this image of winterberries.

Bright red winterberries on bare branches.

In the embedded image’s alt text I’ve described it as “Bright red winterberries on bare branches” which is fairly accurate, though it doesn’t offer a lot of detail.

Much like everyone else, I’ve been getting my feet wet with AI. Lately, I’ve been using Ollama to run models locally for various toy projects, and more recently I noticed that Meta’s llama3.2-vision model had become available, which makes it possible to work with image files.

Uploading the winterberry image, I gave llama the prompt:

Please describe this image.

What I got back was quite a bit more detailed than what I’d come up with on my own.

This image presents a close-up view of a vibrant red berry bush, showcasing its intricate details.

The bush’s branches are bare and grayish-brown in color, with numerous small twigs extending from the main stems. The berries themselves are bright red, spherical in shape, and densely packed along the branches. They appear to be ripe, as they have lost their green hue and turned a deep crimson color.

In the background, other bushes or plants can be seen, although they are not clearly visible due to the dense foliage of the berry bush. The overall atmosphere suggests that this image was captured during the winter season, likely in December or January, given the bare branches and ripe berries.

That description pretty well knocks it out of the park! And although the image contains no Exif data, the AI correctly determined that the photo was taken “…in December or January” (December 31 in fact, right on the cusp of the change).

For comparison, I uploaded the same image to ChatGPT, using the current “GPT-4o mini” model. Entering the same prompt, I got a similarly detailed, albeit shorter, description.

This image shows branches covered with clusters of bright red berries. The berries are small, smooth, and shiny, growing densely along the thin, leafless twigs. These are likely the fruits of a holly plant, possibly winterberry (Ilex verticillata), which is known for its vibrant red berries that remain on the branches during winter, adding color to the landscape. The background is blurred, emphasizing the striking red berries in the foreground.

Again, that’s a great description. No guesses on when the photo was taken, but this time, ChatGPT correctly recognized that these were winterberries.

In the case of llama 3.2, I’d likely remove the guess about when the photo was taken since I know that information. For ChatGPT, I’d take out the guesswork about the type of berries since I already know it. But to be honest, either of these outputs would be a good starting point for describing the photo.

So, what about that first photo with the wooden bitcoins?

Here’s the llama output. It’s wordy and repeats some of the details multiple times. I also disagree with its description of the Bitcoin logo, since the curved lines form the letter “B” and not a circle.

The image shows a collection of wooden coins with the Bitcoin logo and text on them, likely used as tokens or rewards in an online game or platform.

The coins are made of wood and have a circular shape.

They feature the Bitcoin logo in the center, which is a stylized letter “B” made up of two curved lines that form a circle.

The text on the coins reads “WOODEN BITCOIN” at the top and “JUST AS GOOD, REALLY” at the bottom.

There are 15 coins visible in the image, arranged randomly on a gray surface.

Some of the coins overlap each other, creating a sense of depth and layering.

The background is a plain gray color, which helps to focus attention on the coins.

Overall, the image suggests that these wooden coins are being used as a novelty item or collectible, perhaps in conjunction with an online game or platform that rewards players with virtual currency.

ChatGPT did better on this one.

This image shows a pile of wooden tokens designed to resemble bitcoins. Each token is circular and engraved with text and imagery related to Bitcoin. The engraving includes the Bitcoin logo (a “B” with two vertical lines) and phrases such as “Wooden Bitcoin,” “Just As Good, Really,” and “To Redeem: Insert into any internet coin slot.” The tokens are humorously crafted to mimic physical representations of digital currency.

Both tools did a credible job of recognizing text presented in an unusual format, and in this case ChatGPT clearly did the better job overall. Still, I’m not sure I’d want to trust either one to generate a description without reviewing it before sending the text into the world.

This is a great step toward solving the “blank page problem”, most likely giving a solid starting point toward a description of the image (it’s also a huge step up from xkcd’s “Tasks” cartoon).

At the moment, my phone’s camera roll has nearly 10,000 photos on it (I really do take a lot of somewhat random photos). Being able to run them all through AI has the potential to save a lot of tedium with creating descriptions. But, along with the need to double-check the descriptions, I’m also still on the hook for deciding whether the image is even worth sharing.

Using an AI tool definitely has potential to save me some time. But it’s not a panacea.
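For anyone who wants to script this rather than upload images by hand, here is a minimal sketch of sending one image to a local Ollama server. It assumes Ollama is listening on its default address (http://localhost:11434), that the llama3.2-vision model has already been pulled, and that Node 18 or later is available for the built-in fetch. The function names are made up for illustration, and the output would still need the same human review described above.

```javascript
const fs = require('node:fs');

// Build the JSON body for Ollama's /api/generate endpoint.
// Images are passed as base64-encoded strings.
function buildDescribeRequest(imageBytes, prompt = 'Please describe this image.') {
  return {
    model: 'llama3.2-vision', // assumption: this model is installed locally
    prompt,
    images: [Buffer.from(imageBytes).toString('base64')],
    stream: false, // ask for a single JSON reply instead of a stream of chunks
  };
}

// Send one image file to the local server and return the generated description.
// baseUrl is an assumption: Ollama's default local address.
async function describeImage(path, baseUrl = 'http://localhost:11434') {
  const body = buildDescribeRequest(fs.readFileSync(path));
  const res = await fetch(`${baseUrl}/api/generate`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`Ollama returned HTTP ${res.status}`);
  }
  const data = await res.json();
  return data.response; // the generated text
}
```

Calling describeImage('winterberries.jpg') (a hypothetical file name) and logging the result would give a draft description to edit, not a finished one.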

Opt Out of LinkedIn AI Training

It’s not quite a week since I first saw the meme announcing that LinkedIn had just become the latest high-profile company to start using their users’ data to train an AI system, and (of course) it was opt-out. So if you did nothing (e.g. you didn’t see the notification) then they would simply assume you were OK with it.

Initially, you could just go to https://www.linkedin.com/mypreferences/d/settings/data-for-ai-improvement to turn off the setting, but clicking that link doesn’t seem to be reliable. (If I go directly to that link, it works, but if it goes through Facebook’s link tracker, it goes to a page not found error.)

In a desktop web browser, you can click on “View Settings” in the left navigation, click on “Data Privacy” and then, under the “How LinkedIn uses your data” heading, click on “Data for Generative AI Improvement.” At the moment, that page has a single toggle for “Use my data for training content creation AI models.”

I don’t have the app installed myself, but I’m told you can similarly go to Settings -> Data Privacy -> How LinkedIn Uses Your Data -> Data for Generative AI Improvement

You can read more about this in The Washington Post’s Tech Friend column (no account required, this is a gift link)

Tip of the hat to John Newmark who first made me aware of this.

Turn Off The Dictation Hotkey

I have a habit where, when I’m about to type a control-key combination (e.g. control-c), I’ll hit the control key twice. I don’t know why I do this, and I’m not sure why it’s only the control key, but on a Mac, this has the unwanted side effect of popping up a prompt to enable dictation. (If I had dictation enabled, I suspect it would start transcribing my speech, which might be even worse.)

Macintosh dialog box asking, "Do you want to enable Dictation."

To turn this off:

Go to system preferences and scroll down to “Keyboard.”

Macintosh preferences dialog with the "Keyboard" pane active.

At the bottom of the keyboard pane is the “Shortcut” label. This shows the current hotkey for activating dictation. I really don’t want this hotkey, but there doesn’t seem to be an option to not have one, so I’ll choose the microphone and hope it doesn’t start transcribing every random conversation near the MacBook.

List of dictation hotkey options: "Press Microphone," "Press Control Key Twice," "Press Globe Twice," "Press Right Command Key Twice," "Press Left Command Key Twice," "Press Either Command Key Twice," "Customize..."

(Cover image generated by AI via bing.com/create.)

Installing PHP modules

I’m running a WordPress instance on a Zimaboard on the home network. Normally, that’s the kind of thing I’d put on the paid hosting service so I can let someone else worry about patching the OS and such, but since family calendars and the like don’t need to be on the public internet, I decided to do this one in-house.

Once I got WordPress up and running, I checked the “Site Health” and along with some old themes that needed to be cleaned up, there was a notice of a critical issue, telling me that “One or more required modules are missing.”

Expanding the dropdown, I saw the list included curl, imagick, zip, and gd. I tried to be thorough when installing PHP, but evidently missed a few. No biggie. Here’s how to fix it.

Performance warning that "One or more required modules are missing"

PHP modules perform most of the tasks on the server that make your site run. Any changes to these must be made by your server administrator.

The WordPress Hosting Team maintains a list of those modules, both recommended and required, in the team handbook.

Warning: The optional module, curl, is not installed, or has been disabled.

Warning: The optional module, imagick, is not installed, or has been disabled.

Warning: The optional module, zip, is not installed, or has been disabled.

Error: The required module, gd, is not installed, or has been disabled.

This is all running on Ubuntu, so the first step is to update the list of available packages. Because you always do that first.

$ sudo apt update -y

Next, install the missing packages. These are PHP modules, so the package to install is named “php-” followed by the module name (e.g. php-curl).

$ sudo apt install php-gd php-imagick php-curl php-zip

My machine is running PHP 8.1, so apt determined that the correct packages to install were php8.1-gd, php8.1-imagick, php8.1-curl, and php8.1-zip.

Now, I know the php-imagick module depends on ImageMagick, and I wanted to make sure that was installed. After checking the apt command’s help text, I ran apt list -a ImageMagick.

$ apt list -a ImageMagick
Listing... Done
imagemagick/jammy-updates,jammy-security,now 8:6.9.11.60+dfsg-1.3ubuntu0.22.04.3 amd64 [installed]
imagemagick/jammy 8:6.9.11.60+dfsg-1.3build2 amd64

Excellent. Everything should be good now, right? That’s what I thought, but WordPress disagreed. Returning to the Site Health page, I saw the same message telling me that “One or more required modules are missing.”

I had to scratch my head for a bit on that one. Then I remembered, you don’t just install PHP modules, you also have to tell PHP you want to use them. (That’s useful because you may not want every installed module active all the time. For example, Xdebug notoriously causes programs to run more slowly.)

You use phpenmod to enable modules and phpdismod to disable them. (Arguably, I should have included “-s apache2” so they would only be enabled for Apache, but I wanted to make them available for any use of PHP.)

After enabling the modules, you also need to restart apache so the new modules are loaded, so the final set of commands is:

$ sudo phpenmod curl imagick zip gd
$ sudo systemctl restart apache2

And now the Site Health report is a little bit happier.
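If you’d rather check from a script than keep reloading Site Health, the sketch below compares a list of wanted modules against the text printed by php -m. The function name and the sample text are made up for illustration. One caveat: php -m reports what the command-line PHP has loaded, which can differ from what Apache has loaded, so treat it as a quick sanity check rather than the final word.

```javascript
// Given the text printed by `php -m`, return the wanted modules that are not listed.
// Section headers such as "[PHP Modules]" and blank lines are ignored.
function missingModules(phpModulesOutput, wanted) {
  const loaded = new Set(
    phpModulesOutput
      .split('\n')
      .map((line) => line.trim().toLowerCase())
      .filter((line) => line !== '' && !line.startsWith('['))
  );
  return wanted.filter((name) => !loaded.has(name.toLowerCase()));
}

// Hypothetical sample text, shaped like `php -m` output.
const sampleOutput = '[PHP Modules]\nCore\ncurl\ndate\nzip\n\n[Zend Modules]\n';

console.log(missingModules(sampleOutput, ['curl', 'imagick', 'zip', 'gd']));
// outputs: [ 'imagick', 'gd' ]
```

To run it against a real machine, the sample text could be replaced with the result of require('node:child_process').execFileSync('php', ['-m'], { encoding: 'utf8' }).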

Set up a MySQL Database for WordPress

I keep losing track of the file where I have these steps written down. It’s past time to put them someplace where I can find them, and perhaps help a few others as well.

CREATE DATABASE SOME_DATABASE;
CREATE USER 'SOME_USER'@'localhost' IDENTIFIED BY 'A_STRONG_PASSWORD';
GRANT ALL ON SOME_DATABASE.* TO 'SOME_USER'@'localhost';

The database permissions grant can be fine-tuned a bit, e.g. after installation, revoke the DROP and ALTER privileges. (GRANT ALL doesn’t include the GRANT OPTION privilege, so there’s nothing to remove there. How far you can trim things does, of course, depend on what your plugins are doing, and potentially the needs of a particular major version upgrade.)

Setting defaults for the dig command

Today I learned you can set default output options for the dig command by creating a .digrc file in your home directory.

Ordinarily, running the command dig www.chaosandpenguins.com produces this rather hefty block of text.

$ dig www.chaosandpenguins.com

; <<>> DiG 9.16.1-Ubuntu <<>> www.chaosandpenguins.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40732
;; flags: qr rd ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;www.chaosandpenguins.com.      IN      A

;; ANSWER SECTION:
www.chaosandpenguins.com. 0     IN      CNAME   chaosandpenguins.com.
chaosandpenguins.com.   0       IN      A       216.92.152.175

;; Query time: 0 msec
;; SERVER: 172.28.224.1#53(172.28.224.1)
;; WHEN: Wed Nov 16 23:13:00 EST 2022
;; MSG SIZE  rcvd: 136

That’s a whole lot of text. So let’s add a couple options. +noall turns off everything. Running dig www.chaosandpenguins.com +noall would literally return nothing at all. To bring back the answer section (which is what I’m interested in most of the time), you add the +answer option.

$ dig www.chaosandpenguins.com +noall +answer
www.chaosandpenguins.com. 0     IN      CNAME   chaosandpenguins.com.
chaosandpenguins.com.   0       IN      A       216.92.152.175

That’s much more compact, but getting it requires some extra typing. And I want that version of the output most of the time, so wouldn’t it be nice if there was a way to make that the default?

This is where the .digrc file comes in. You create it in your home directory and just put in a single line containing the options you want. So, to make +noall +answer the defaults, I just run this command:

$ echo +noall +answer > ~/.digrc

And now when I run dig www.chaosandpenguins.com without any options, here’s the default output:

$ dig www.chaosandpenguins.com
www.chaosandpenguins.com. 0 IN CNAME chaosandpenguins.com.
chaosandpenguins.com. 0 IN A 216.92.152.175
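A side benefit of the compact output is that it’s easy for a script to consume. As a rough sketch (the function name is made up, and it assumes every line follows the name, TTL, class, type, data layout shown above), each answer line can be split into fields like this:

```javascript
// Parse the answer-section lines printed by `dig +noall +answer`.
// Blank lines and comment lines (those starting with ";") are skipped.
function parseDigAnswer(output) {
  return output
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line !== '' && !line.startsWith(';'))
    .map((line) => {
      const [name, ttl, recordClass, type, ...data] = line.split(/\s+/);
      return { name, ttl: Number(ttl), class: recordClass, type, data: data.join(' ') };
    });
}

const answer = [
  'www.chaosandpenguins.com. 0     IN      CNAME   chaosandpenguins.com.',
  'chaosandpenguins.com.   0       IN      A       216.92.152.175',
].join('\n');

const records = parseDigAnswer(answer);
console.log(records[1].type, records[1].data);
// outputs: A 216.92.152.175
```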

Getting Verified on Mastodon

I’m not leaving Twitter, not yet anyhow, but over the weekend I decided to dip my toes in the water and check out Mastodon. I’m still collecting my thoughts on the topic (more on that later perhaps), but in the meantime, let’s talk about how to get verified.

Last week, Twitter announced you’d soon be able to get verified (i.e. a blue checkmark confirming that you’re who you say you are) by paying eight dollars a month. I think they’ve missed the boat.

For me, verification isn’t “Oh, this person’s name really is ‘Roy Kent.’” It’s about verifying the account belongs to the same Roy Kent who used to play for AFC Richmond and not someone who coincidentally has the same name and uploaded the other guy’s publicity photo. (Update November 13, 2022: Oh wow! I thought they’d at least check that the name matched what was on the credit card. Instead, the blue check mark has gone from “This person is who they claim to be” to something more akin to The Star-bellied Sneetches.)

Because Mastodon is a collection of independent web sites, verification doesn’t work quite the same way there. It’s quite possible (and arguably, desirable) that @thatblairguy@mstdn.social and @thatblairguy@mastodon.social (or the same name on any other server) are different people.

The “tricky” part is, you need your online identity to be more than “I’m this person on this particular social media site.” (If your identity really is, “I’m this person on this particular social media site,” I don’t understand the value of verification – if you’re only on one site, where else do people know you from?)

If you do have a web site (or really, any page you control), you can link to it from your Mastodon profile. Then, while on the “Edit your profile” page, copy the verification link and put that on the page you linked to. This causes Mastodon to display the link to that page in green, meaning “verified.” (Or, put another way, “The person controlling that other site is also the person who controls this profile.”)

Using my own profile as an example, here’s how to do it (some of the details may vary between servers, but I believe the general steps will be the same):

Step 1: Edit your profile.

Step 2: Scroll down to the “Profile metadata” section and put in both a label and a link to your website. (I labeled mine as “website” but you can use whatever text you want.)

Step 3: Click “Save Changes”. (This saves your work, but you’re not verified yet.)

Profile metadata block listing thatblairguy.com and chaosandpenguins.com as web sites I control.

At this point, your Mastodon profile will show a link to your web site, but it won’t display that you’ve verified the ownership. You might still be some rando trying to claim that you own someone else’s web site. Next, you have to prove that you own the page you’ve linked to.

Step 4: Scroll back down to the “Profile metadata” section and click the “Copy” button under the verification text. This puts a link into your paste buffer. For my account, it looks like this:

<a rel="me" href="https://mstdn.social/@ThatBlairGuy">Mastodon</a>

Place that link somewhere on the page you linked to from your profile. After a short while, your profile page will show the link to your web site with a highlight, verifying that you’re the same person who controls that page.
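Before waiting on the server, it can be handy to confirm the link really made it onto your page. The sketch below is not Mastodon’s actual verification code, just a rough regex-based check (the function name is made up) that a chunk of HTML contains a link with rel="me" pointing back at a given profile URL.

```javascript
// Return true if the HTML contains an <a> or <link> tag whose rel attribute
// includes "me" and whose href exactly matches the profile URL.
// This is a simple pattern match, not a full HTML parser.
function hasRelMeLink(html, profileUrl) {
  const tags = html.match(/<(?:a|link)\b[^>]*>/gi) || [];
  return tags.some((tag) => {
    const rel = /\brel\s*=\s*["']([^"']*)["']/i.exec(tag);
    const href = /\bhref\s*=\s*["']([^"']*)["']/i.exec(tag);
    if (!rel || !href) {
      return false;
    }
    const relTokens = rel[1].toLowerCase().split(/\s+/);
    return relTokens.includes('me') && href[1] === profileUrl;
  });
}

const page = '<p>Find me on <a rel="me" href="https://mstdn.social/@ThatBlairGuy">Mastodon</a></p>';

console.log(hasRelMeLink(page, 'https://mstdn.social/@ThatBlairGuy'));
// outputs: true
```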

Do bear in mind that it might take a few minutes for the highlight to show up. Most Mastodon servers are being run by volunteers and at the moment, they’re under a heavy load with people moving from Twitter.

Profile showing thatblairguy.com and chaosandpenguins.com as verified.

Oh, and if you really, really want a checkmark next to your name, you can add an emoji as part of your display name. ✅

Immutable Objects in JavaScript

Yesterday I learned you can have immutable objects in JavaScript.

Constant values, of course, are nothing new (e.g. const PI = 3.14), at least for simple values. But for objects, const only fixes which object the name refers to, not the values of the object’s members.

const a = { num: 5}
console.log(a.num);
// outputs: 5

a.num = 10;
console.log(a.num);
// outputs: 10

a = {}
// TypeError: Assignment to constant variable.

So it turns out you can make an object immutable by passing it to Object.freeze().

So if we take the previous example and freeze the object, changing the property’s value doesn’t work.

const a = { num: 5}
console.log(a.num);
// outputs: 5

Object.freeze(a);

a.num = 10;
console.log(a.num);
// outputs: 5

The thing that bothers me about this is the attempted assignment fails silently. That opens the possibility of some hard to find bugs.

Start the code with 'use strict', and now there’s an explicit runtime error.

'use strict'
const a = { num: 5}
console.log(a.num);
// outputs: 5

Object.freeze(a);

a.num = 10;
// TypeError: Cannot assign to read only property 'num' of object

The other thing to note is that, much like const does for variables, freeze() only affects the values of simple properties. If a property refers to an object, that object’s properties won’t be frozen.

The examples in the Mozilla documentation include a discussion of “shallow freezing” and a deepFreeze() example function demonstrating a technique for freezing object members recursively.
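To make the shallow-versus-deep difference concrete, here is a minimal sketch. This deepFreeze() is my own illustration along the same lines as the documented example, not a copy of it, and it’s only meant for plain objects and arrays.

```javascript
// Freeze an object, then recursively freeze any objects it refers to.
// Freezing before recursing (and skipping already-frozen values) keeps
// circular references from causing endless recursion.
function deepFreeze(obj) {
  Object.freeze(obj);
  for (const key of Reflect.ownKeys(obj)) {
    const value = obj[key];
    const isObject = value !== null &&
      (typeof value === 'object' || typeof value === 'function');
    if (isObject && !Object.isFrozen(value)) {
      deepFreeze(value);
    }
  }
  return obj;
}

// Shallow freeze: the nested object can still be changed.
const shallow = Object.freeze({ num: 5, inner: { count: 1 } });
shallow.inner.count = 2;
console.log(shallow.inner.count);
// outputs: 2

// Deep freeze: the nested object is frozen too.
const deep = deepFreeze({ num: 5, inner: { count: 1 } });
console.log(Object.isFrozen(deep.inner));
// outputs: true
```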

(Cover image via WikiMedia commons, used under Creative Commons Attribution-ShareAlike 1.0 Generic)

Turn off the “finish setting up your device” screen

One of my peeves with Windows 10 is occasionally I’ll get a screen saying “Let’s finish setting up your device.” (Uh, I finished setting it up a couple years ago, why do you keep suggesting this?)

This evening, I spotted a post on Twitter from @MrTurner asking how to get rid of that prompt. Great question! And there was an equally awesome reply from @Lucas_Trz with the answer.

So, just in case that isn’t clear enough.

  1. Go into settings (click the Start button, and then the “gear” icon)
  2. Click “System”
  3. Click “Notifications & actions”
  4. Uncheck the box next to the text that starts off with “Show me the Windows welcome experience after updates and occasionally” (I’d like to suggest you might want to uncheck a few other things as well.)

Going forward, this is on my list of things I’ll do any time I reinstall Windows. Right next to turning off the feedback surveys.

Cover image by twitter user @MrTurnerj, used in the context of a critique of Windows.

Why your organization needs to own its email address

Dear hypothetical reader: this is a coalescence of some thoughts that have been circulating through my head over the past several years. Hopefully if I write them down, I can make room for other, more interesting thoughts.

I’m opposed to the idea that an organization (I apply this equally to civic organizations, clubs, and businesses) might use email addresses which the people doing the organization’s business set up with their personal email provider of choice.

That is to say, the organization’s leadership shouldn’t send email to the general public (or even the organization’s own members) from @yahoo.com, @gmail.com, or @whatever email address that isn’t owned by the organization. (Ideally, this would be the same domain name as the organization’s web site, but I do recognize that for some small businesses, their primary web presence is Etsy or something similar.)

I have three main reasons for this:

  • It doesn’t look professional. People expect to get email from the organization they’re transacting with. (Would you do business with Amazon if the emails came from JeffyB64@yahoo.com?)
  • Having mailboxes the organization administers provides a fallback if someone forgets their password. (This was recently driven home for me when a friend lost access to 20+ years of business correspondence because of a lost AOL password.)
  • Having email addresses the organization administers protects both the organization and the individual if someone leaves under less than amicable circumstances. (Someone leaving under such circumstances is unlikely to be happy if asked to forward emails indefinitely.)

(Image via Pixabay user Deans_icons used under Pixabay License.)