Short Tale of Intel -> Apple Silicon Migration
2024-03-13
Note: I will be using the terms
amd64
,x86_64
, andIntel
interchangeably in this post. They all refer to the same architecture. Similarlyarm64
andApple Silicon
are used interchangeably.
I recently got handed a new MacBook at work. It’s a MacBook Pro with an M3 chip and my first encounter with Apple Silicon. I have been hearing many good things about Apple Silicon and having experienced it for myself for the past day I must say that my old Intel MacBook can’t compare. I know that comparing my 2020 Intel MacBook, which has been used and abused for 4 years, to a brand new, out-of-the-box 2023 M3 Mac is not an Apples to Apples comparision. So I will take the word of tech reviewers through the years as well as my own biased experience (I no longer have to wait several seconds for a compilation step) and say that Apple Silicon is a game changer.
The point of this post is not to talk up Apple’s chips but to talk about the challenges I faced getting up and running after moving my stuff from an Intel Mac to an Arm Mac.
Migration Assistant
While not specific to the Intel -> Arm transfer, I was reminded once again that I should probably be running Migration Assistant over a cable instead of WiFi. It took all night and the way the numbers kept going up instead of down, sometimes, gave me a ton of anxiety. The next time I do this I have to remind myself that taking 5 extra minutes to get that cable out and connected will save me hours later.
It Does Not Come with Rosetta
I’m not entirely sure if this is always the case or if it ended up being the case because I performed a Migration, but Rosetta was not already installed on my system. Instead, any time I tried to run a process that was Intel based, I got a pop-up that asked me if I want to install Rosetta.
I put it off for as long as I could. I wanted to have everything running as natively Apple Silicon as I could possibly manage. I finally did cave when I discovered a few tools did not have an Arm counterpart, but I think I managed to reinstall a lot of things before I hit that stage.
You Can See What’s Running on Rosetta
If you open the Activity Monitor, you can check which apps are running on Rosetta. In the CPU tab, look at the Kind column. It will either say Intel or Apple.
I managed to discover several things running on Rosetta and then went and found Apple Silicon versions of them. For example, I have some bash scripts that open tunnels to my company’s internal network. For managing these multiple tunnels the script uses tmux. I discovered in Activity Monitor that I had several bash
processes running as Kind Intel
. I suspected tmux
spawned these processes, so I did a quick check:
➜ which tmux
/usr/local/bin/tmux
➜ file /usr/local/bin/tmux
/usr/local/bin/tmux: Mach-O 64-bit executable x86_64
It looks like tmux
is still an x86 executable.
➜ which bash
/bin/bash
➜ file /bin/bash
/bin/bash: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit executable x86_64] [arm64e:Mach-O 64-bit executable arm64e]
/bin/bash (for architecture x86_64): Mach-O 64-bit executable x86_64
/bin/bash (for architecture arm64e): Mach-O 64-bit executable arm64e
bash
on the other hand, is a universal binary, an app format that encapsulates multiple architectures and acts as one of the bridges between Intel and Arm Macs.
But since these bash processes were spawned by an x86_64 tmux
, it was the x86_64 version that got spawned. Once I replaced tmux
with the arm version, the bash processes also spawned with their arm counterparts. So far so makes sense!
A Scriptological Journey
Warning: This section ended up being longer than anticipated and is filled with arcane unix tools and commands. It was also the most fun section to write!
I wanted to find other binaries that were still x86_64 and see if we could clean them up. So I went into the /usr/local/bin/
directory and ran the following command:
➜ find -L . -type f -exec file {} + | grep 'x86_64' | grep -v 'arm64'
Breaking it down:
find -L . -type f
searches in the current directory and subdirectory, limited to items that are files. The-L
option follows symbolic links as well.-exec file {} +
executes the file command on each file.grep 'x86_64'
finds us instances ofx86_64
appearing in the output (the output for x86_64 binaries isMach-O 64-bit executable x86_64
).grep -v 'arm64'
excludes instances where there is a universal binary (likebash
, as mentioned before), which has both x86_64 and arm variants available in one package.
That last grep is to explicitly ensure that if I have a universal binary with both x86_64 and arm64 architectures, we keep that universal binary.
Side Note: The
+
modifier is an interesting one. It instructsfind
to accumulate the found files and execute the command passed to exec (in this casefile
) command on them in a single invocation. So while without+
each of the found files would havefile
executed against them one by one, with+
the files are passed to the command in one go.
The result of running this command:
./vagrant-go (for architecture x86_64): Mach-O 64-bit executable x86_64
./dummy: Mach-O 64-bit executable x86_64
./taste: Mach-O 64-bit executable x86_64
./python3.10-intel64: Mach-O universal binary with 1 architecture: [x86_64:Mach-O 64-bit executable x86_64]
./python3.10-intel64 (for architecture x86_64): Mach-O 64-bit executable x86_64
If you look closely, you’ll notice that the command is not quite completely giving us the result we want. For ./python3.10-intel64
(which is a universal binary with only 1 architecture) we are getting two lines of output. And for ./vagrant-go
(which we can deduce to be a universal binary with both architectures), we are getting the x86_64 line of output when we actually want none at all. Recall that the ultimate goal is to delete the Intel-only binaries, for which we want a clean list of files.
awk
to the rescue!
For universal binaries with two architectures, file
gives us the following.
➜ file vagrant-go
vagrant-go: Mach-O universal binary with 2 architectures: [arm64:Mach-O 64-bit executable arm64] [x86_64]
vagrant-go (for architecture arm64): Mach-O 64-bit executable arm64
vagrant-go (for architecture x86_64): Mach-O 64-bit executable x86_64
The first line tells us that this is a universal binary with 2 architectures and the next two lines give more details on those architectures, one line being for each architecture.
We can’t treat this as a line-by-line problem, anymore. We need to process the whole 3-line block. We need to recognize that we’re processing a 2-architecture universal binary and then figure out a way to exclude the whole block.
➜ find -L . -type f -exec file {} + | \
awk '/Mach-O universal binary with 2 architectures/ && /arm64/ && /x86_64/ { getline; getline; next } { print }' | \
grep 'x86_64' | \
grep -v 'arm64'
We’ve added an awk
invocation after listing our files via find
(and file
) and before we start grepping them. Breaking down the command:
awk
is our invocation of the awk tool./Mach-O universal binary with 2 architectures/ && /arm64/ && /x86_64/
is the pattern that matches the first line output byfile
.{ getline; getline; next }
is the action block that executes on the lines matched by our pattern. We call getline twice to skip over the two architecture lines.next
stops processing the current line since we don’t want to do anything with the architecture lines.{ print }
the default action block to apply to anything that does not match our pattern. It lets them pass through and be printed.
So to put it all together, our awk
invocation looks for lines that indicate a universal binary with two architectures and skips over the architecture lines, essentially ignoring the whole block.
The result of this invocation is:
./dummy: Mach-O 64-bit executable x86_64
./taste: Mach-O 64-bit executable x86_64
./python3.10-intel64: Mach-O universal binary with 1 architecture: [x86_64:Mach-O 64-bit executable x86_64]
./python3.10-intel64 (for architecture x86_64): Mach-O 64-bit executable x86_64
Now we no longer get ./vagrant-go
as part of our results, but ./python3.10-intel64
still appears twice. This is easily remedied by tacking on another grep -v "universal binary"
to the end of our commands.
➜ find -L . -type f -exec file {} + | \
awk '/Mach-O universal binary with 2 architectures/ && /arm64/ && /x86_64/ { getline; getline; next } { print }' | \
grep 'x86_64' | \
grep -v 'arm64' | \
grep -v 'universal binary'
./dummy: Mach-O 64-bit executable x86_64
./taste: Mach-O 64-bit executable x86_64
./python3.10-intel64 (for architecture x86_64): Mach-O 64-bit executable x86_64
At this point our command chain is getting a bit too long. For the sake of sanity I will put it all in a file called find86_64.sh
.
Side Note: If you’re wondering what
dummy
is, let me show you the trick to creating a dummy Mach-O x86_64 executable file for testing purposes.➜ printf '\xcf\xfa\xed\xfe\x07\x00\x00\x01\x03\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' > dummy
This file won’t execute, but running
file
on it gives usdummy: Mach-O 64-bit executable x86_64
Which is good enough for testing.
The next step is to extract just the name of the file. We can leave in the leading ./
since ls
(which we will run next) will accept that just fine.
awk
to the rescue again!
➜ find86_64.sh | awk -F':| ' '{print $1}'
./dummy
./taste
./python3.10-intel64
Breaking it down:
awk -F':|
uses either:
or space as a field separator to split the line into multiple fields.{print $1}
prints the first field, which, in our case, is the file name.
Our updated script now looks like this.
#!/bin/sh
find -L . -type f -exec file {} + | \
awk '/Mach-O universal binary with 2 architectures/ && /arm64/ && /x86_64/ { getline; getline; next } { print }' | \
grep 'x86_64' | \
grep -v 'arm64' | \
grep -v 'universal binary' | \
awk -F':| ' '{print $1}'
Pipe that through xargs ls -l
.
➜ find86_64.sh | xargs ls -l
-rw-r--r-- 1 shak admin 48 9 Mar 12:47 ./dummy
lrwxr-xr-x 1 shak admin 81 9 Mar 12:57 ./python3.10-intel64 -> ../../../Library/Frameworks/Python.framework/Versions/3.10/bin/python3.10-intel64
-rw-r--r-- 1 shak admin 48 8 Mar 23:50 ./taste
Note that we have a mix of regular files and symlinks (the l
at the beginning of the permission string indicates a symlink, which we can also see by the arrow pointing from ./python3.10-intel64
to what it is linked to). This is not unexpected. Package managers and installation scripts often write their binaries to a location of choice and make them reachable for execution by symlinking from places like /usr/local/bin/
.
Side Note: If we were to pipe anything into
ls
by invoking| ls -l
, it would ignore the piped input and runls -l
in the working directory, which is clearly not the behavior we want. This is becausels
does not accept input piped into it. xargs converts the piped in input to arguments forls
to consume.
If we were to pipe the output of find86_64.sh
directly into xargs rm
, we would remove the files stored in /usr/local/bin
. Which works well for the binaries that are stored in this directory but for the ones that are symlinked, we will only have their symlinks removed and not the underlying files.
After a bit of research I found the readlink command which can be used to find the target of a symbolic link.
➜ readlink ./python3.10-intel64
../../../Library/Frameworks/Python.framework/Versions/3.10/bin/python3.10-intel64
➜ echo $?
0
On regular files it outputs nothing and exits with a 1
.
➜ readlink ./dummy
➜ echo $?
1
With this knowledge in mind, we can extend our script to read each line and do something different based on whether it’s a symlink or a regular file.
#! /bin/bash
find -L . -type f -exec file {} + | \
awk '/Mach-O universal binary with 2 architectures/ && /arm64/ && /x86_64/ { getline; getline; next } { print }' | \
grep 'x86_64' | \
grep -v 'arm64' | \
grep -v 'universal binary' | \
awk -F':| ' '{print $1}' | \
while read -r file; do
if [ -L "$file" ]; then
target=$(readlink "$file")
echo "$file is a symlink targeting $target"
else
echo "$file is a regular file"
fi
done
Breaking it down:
while read -r file
reads each line, the-r
prevents any backslashes from being processed as escape sequences, preserving the original content of the line.if [ -L "$file" ]
checks if the file is a symlink
And the rest is just echo
s. You can imagine how this could be replaced with a bunch of rm
s to clean up our unwanted binaries. Disinclined to take responsibility for shenanigans apart from my own, I will leave that final bit of scripting to the reader.
Easier Than I Imagined
I talked to a colleague who got his Apple Silicon Mac more than three years ago when these things were fresh out of the oven. His tales of trial and tribulations made me grateful that I am making the move now when so many projects have adopted Apple Silicon and have arm versions released. I code primarily in Golang, and it was almost trivial to clean up my old Golang versions and install Apple Silicon compatible ones.
Wrapping it Up
I started a draft for this post when I began my migration journey. I imagined I would have to go through a lot more trouble than I actually ended up going through. Almost every app had an Apple Silicon version that was a simple download+replace operation and the story held true for nearly everything I needed to install via brew. I was up and running in a matter of hours.
Having said that, the scripting journey was a lot of fun. I had to dust off some old Unix tools and learn a few new ones. I also got to play with some of the more arcane features of these tools. I hope you had as much fun reading about it as I did writing about it.
I thank everyone who have been experimenting with Apple Silicon these last few years. They walked with hobbling steps, so people like me could run.