If you own a Kindle, you're probably like me: you send personal documents, such as PDFs and books purchased elsewhere, to your Kindle library via email so you can read them on your other devices. I primarily use my Kindle Paperwhite, so most of my highlights end up in the My Clippings.txt file stored locally on the device. That file is easy to export: just plug in the Kindle, grab the text file, and email it to Readwise to sync all my new highlights. Simple.
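For anyone who would rather inspect the file before emailing it, here is a minimal sketch of reading My Clippings.txt in Python. It assumes the usual layout: a title line, a metadata line, a blank line, the highlighted text, and then a line of ten equals signs. The `Clipping` class and `parse_clippings` function are names I made up for illustration.

```python
from dataclasses import dataclass

# Assumption: each entry ends with a line of ten equals signs.
SEPARATOR = "=========="


@dataclass
class Clipping:
    title: str  # first line of the entry, e.g. "Book Title (Author)"
    meta: str   # second line: kind of clipping, location, date added
    text: str   # the highlighted passage itself


def parse_clippings(raw: str) -> list[Clipping]:
    """Split the raw file contents into entries that carry highlight text."""
    clippings = []
    for block in raw.split(SEPARATOR):
        lines = [ln.strip() for ln in block.strip().splitlines()]
        lines = [ln for ln in lines if ln]
        if len(lines) < 3:
            continue  # bookmarks and empty entries have no text body
        title, meta, *body = lines
        # The file may begin with a byte-order mark; drop it from the title.
        clippings.append(Clipping(title.lstrip("\ufeff"), meta, " ".join(body)))
    return clippings
```

To use it, read the file from wherever your Kindle mounts (that path depends on your operating system) and pass the contents to `parse_clippings`.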
But recently, I’ve been reading across multiple devices. For example, when I forgot my Kindle at home, I caught up on a doc using the Kindle app on my iPhone. Since I had emailed the document to my Kindle address, it appeared in my library and downloaded smoothly on the app.
To my surprise, the highlights I made on my iPhone synced perfectly to my Kindle Paperwhite. That part worked beautifully. But when I went to export my highlights to Readwise, there was a problem: those synced highlights weren’t included in My Clippings.txt. Since I didn’t make those highlights on the Paperwhite, they never made it into the export file.
Amazon offers a way to email highlights from the Kindle app, even for personal documents, but there’s a limit to how many highlights you can send. And honestly, restricting myself to highlighting on only one device just to get around this limitation felt wrong. So I decided to dig deeper.
If highlights made on the iPhone are syncing to the Paperwhite, then they’re being stored locally somehow. The question was: where?
Fortunately, the community at MobileRead forums has done a great job reverse-engineering Kindle file structures. In particular, jhowell has created two essential tools:
- KindleUnpack: Reads metadata from Kindle file formats
- KFX Input Plugin for Calibre: Allows decoding of .kfx files used in newer Kindle documents
Using those tools—and with some help from ChatGPT Codex—I built a Python script that can extract synced highlights from the Kindle’s local files. You’ll need two files for each personal document:
- The .kfx file (the main book or article)
- The corresponding .yjr file (which stores your annotations)
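If you have copied the Kindle's documents folder to your computer, a small helper can match each .kfx file with its .yjr file. This is only a sketch under one assumption: that the annotations sit in a sidecar folder with the same name as the book and a .sdr extension. Check that against your own device. The `find_pairs` function is a hypothetical helper, not part of my script.

```python
from pathlib import Path


def find_pairs(documents_dir: Path) -> list[tuple[Path, Path]]:
    """Return (kfx, yjr) pairs for every document that has annotations.

    Assumption: "Some Doc.kfx" keeps its annotations in a sibling folder
    named "Some Doc.sdr" that contains at least one .yjr file.
    """
    pairs = []
    for kfx in sorted(documents_dir.rglob("*.kfx")):
        sidecar = kfx.with_suffix(".sdr")
        if not sidecar.is_dir():
            continue  # no sidecar folder, so nothing to extract
        yjr_files = sorted(sidecar.glob("*.yjr"))
        if yjr_files:
            pairs.append((kfx, yjr_files[0]))
    return pairs
```

Calling `find_pairs(Path("documents"))` on a copied folder returns the document and annotation paths you would then hand to the extractor.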
Once you have those, the script will parse the highlights and generate an HTML file you can review or import into another tool like Readwise.
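To give a feel for the HTML step, here is a minimal sketch of turning a list of highlight strings into a page. It is not the code from the repo: `render_highlights` and the page layout are assumptions for illustration, and it presumes the highlights have already been extracted as plain text.

```python
from html import escape


def render_highlights(title: str, highlights: list[str]) -> str:
    """Build a small standalone HTML page listing each highlight."""
    # Escape the text so characters like < and & display literally.
    items = "\n".join(
        f"    <li><blockquote>{escape(h)}</blockquote></li>" for h in highlights
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        "<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )
```

Writing the returned string to a file with UTF-8 encoding gives you something you can open in a browser to review.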
I’ve posted everything here:
👉 GitHub Repo – KFX Highlights Extractor
I’ve tested this on a few documents, and it worked well, but your mileage may vary. If you run into issues or have improvements, feel free to open a pull request or file an issue.