When integrating with an undocumented API, the README is the worst place to find out what works. We learned this the hard way last week: the LinkedIn Voyager API endpoint we copied from a community-maintained reference repo had been dead for years, and we burned hours debugging "why is auth failing" when the real answer was "this endpoint returns 301 to everyone, you're chasing a ghost." The fix wasn't a smarter retry strategy — it was reading the source code of the latest shipping version on PyPI instead of the documentation pinned to the top of GitHub.
Reverse engineering undocumented APIs is half the work of any serious automation project, and the lesson generalises far beyond LinkedIn. Documentation rots. Shipping code doesn't.
Why we stopped trusting the README on undocumented APIs
We were building a LinkedIn outreach automation tool — research a contact, send a connection request, follow up via DM. For the connection-request piece, we did what every engineer does first: searched GitHub for "linkedin api invitation," found a popular Python wrapper with plenty of stars and recent commits, and copied the endpoint it documented:
POST /voyager/api/growth/normInvitations
Every request returned HTTP 301 Moved Permanently with no useful body, no Location header pointing anywhere meaningful, and no error message. We assumed it was a CSRF problem. Then a cookie problem. Then a header-ordering problem. We rebuilt the request from scratch three times.
The endpoint had been removed from LinkedIn's Voyager API years ago. The community fork we copied from was actively maintained, but "actively maintained" meant "people are fixing tooling around the endpoints" — not "the endpoints still work." Nobody had recently tried to actually send an invitation through it.
How we found the endpoint that actually ships
We stopped reading READMEs and went to the linkedin-api package on PyPI. The latest shipping version (v2.3.1, November 2024) called a completely different endpoint with a completely different request body:
POST /voyager/api/voyagerRelationshipsDashMemberRelationships?action=verifyQuotaAndCreateV2
The body shape was different too — wrapped in an inviteeUnion with a member URN, instead of the flat { "invitee": { "profileId": "..." } } the old endpoint expected. We swapped the call. It worked first try. The contact's profile showed "Pending" within seconds.
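For reference, the two shapes side by side. The new field names below are our paraphrase of what the package source sends, not an official schema:

// Old endpoint's flat body (dead):
//   { "invitee": { "profileId": "..." } }

// New endpoint's body: an inviteeUnion keyed by member URN
// (our reading of the linkedin-api source; treat names as approximate)
function buildInvitePayload(memberUrn) {
  return {
    invitee: {
      inviteeUnion: {
        memberProfile: memberUrn, // e.g. 'urn:li:fsd_profile:...'
      },
    },
  };
}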
The lesson is procedural: when integrating with an undocumented or reverse-engineered API, never trust the README — open the latest version's source on PyPI or npm and read what it actually calls today. GitHub stars measure popularity, not correctness. Recent commits measure activity, not endpoint health. The only signal that an endpoint works is "this is what the latest published package version sends." Everything else is folklore.
This is the same trap we hit two years ago when we wrote about finding a hidden API behind a healthcare provider portal — the public docs and the actual production API had diverged so far that the docs were actively misleading.
Why we briefly tried DOM automation and why it failed
Before we found the working endpoint, we tried to route around the problem entirely with DOM automation. The plan: open LinkedIn in a hidden Chrome tab via chrome.tabs.create({ active: false }), navigate to the profile, find the Connect button, click it, find the modal, click "Send without a note." No API calls. Just driving the real UI.
The modal opened (we confirmed via screenshots from the hidden tab), but our content script's polling logic could never find it. We tried aria-label selectors. We added multi-language text aliases for i18n. We added a MutationObserver to catch the modal at DOM-insertion time. Nothing.
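For concreteness, here is a condensed version of what the content script was running in the hidden tab. The selectors are illustrative, not LinkedIn's actual markup:

// Content script in the hidden tab (condensed; selectors illustrative)
const SEND_SELECTORS = [
  'button[aria-label="Send without a note"]',
  'button[aria-label="Senden ohne Notiz"]', // i18n aliases, one per locale
];

const findSendButton = () =>
  SEND_SELECTORS.map((sel) => document.querySelector(sel)).find(Boolean);

// Attempt 1: poll every 500ms. In a background tab this is clamped
// to 1000ms at best, and throttled far harder on modern Chrome.
const poll = setInterval(() => {
  const btn = findSendButton();
  if (btn) {
    clearInterval(poll);
    btn.click();
  }
}, 500);

// Attempt 2: catch the modal at DOM-insertion time. The callback fires,
// but the button it's looking for hasn't been rendered into the DOM yet.
new MutationObserver(() => findSendButton()?.click())
  .observe(document.body, { childList: true, subtree: true });

Neither path ever saw a clickable button.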
The root cause turned out to be documented Chrome behaviour we'd forgotten about. Background tabs have long had setTimeout clamped to a 1-second minimum, requestAnimationFrame stops firing in them entirely, and Chrome 88 added intensive throttling of chained JavaScript timers on top of that; background tabs broadly suspend periodic work to save battery. Our 500ms polling loop was running at 1000ms at best, and React's frame-callback-driven rendering of the portal-mounted modal contents was paused entirely. The modal renders, our observer's window to see it expires, the modal auto-dismisses, and we report "modal not found."
This isn't a bug we could engineer around. It's a deliberate Chrome design decision. Every shipping LinkedIn automation extension we looked at — the popular open-source ones, the commercial ones — runs as a content script on the user's foreground LinkedIn tab. They piggyback on real human browsing because hidden-tab DOM automation simply doesn't work reliably on modern Chrome.
Linked Helper — one of the largest LinkedIn automation products on the market — abandoned the Chrome extension model entirely in 2020 and rebuilt as a desktop Electron app, specifically because the extension sandbox couldn't deliver the reliability they needed. That migration cost them their distribution channel. They did it anyway, because the extension architecture had hit a wall.
The architectural pattern we landed on
Once we had the working endpoint and we'd ruled out hidden-tab DOM automation, the right answer collapsed to a tiny piece of code in the service worker:
// Service worker, MV3 extension. buildInvitePayload() is sketched
// above; getCsrfToken() is sketched in the gotchas section below.
const VOYAGER_INVITE_URL =
  'https://www.linkedin.com/voyager/api/voyagerRelationshipsDashMemberRelationships' +
  '?action=verifyQuotaAndCreateV2';

async function sendInvite(memberUrn) {
  const csrf = await getCsrfToken(); // from the JSESSIONID cookie
  const res = await fetch(VOYAGER_INVITE_URL, {
    method: 'POST',
    credentials: 'include', // session cookies attach automatically
    headers: {
      'csrf-token': csrf,
      'x-restli-protocol-version': '2.0.0',
      'content-type': 'application/json',
    },
    body: JSON.stringify(buildInvitePayload(memberUrn)),
  });
  if (!res.ok) throw new Error(`Invite failed: HTTP ${res.status}`);
  return res;
}
No DOM. No hidden tab. No selector polling. No MutationObserver. The user's real LinkedIn cookies attach via credentials: 'include', the service worker isn't subject to background-tab throttling, and the same architecture covers both reads (profile research) and writes (invitations). Our scheduler ticks every minute, fires the same fetch pattern, and the contact's profile updates server-side.
We chose this over DOM automation in the foreground tab because the foreground approach forces the user to keep LinkedIn open in a visible tab — which we can't guarantee, and which races with the user's actual browsing. We chose it over a desktop Electron app because we wanted browser-native session reuse without asking users to log in twice. The service-worker fetch pattern is the only one that gives you all three: reliability, no UX disruption, and zero new auth flow.
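The scheduler that drives recurring work is a few lines on top of chrome.alarms. A minimal sketch, assuming a hypothetical dequeueNextInvite() queue helper and the sendInvite() from above:

// Service worker: chrome.alarms wakes the worker even after Chrome
// has torn it down, which a plain setInterval would not survive
chrome.alarms.create('outreach-tick', { periodInMinutes: 1 });

chrome.alarms.onAlarm.addListener(async (alarm) => {
  if (alarm.name !== 'outreach-tick') return;
  const next = await dequeueNextInvite(); // hypothetical queue helper
  if (next) await sendInvite(next.memberUrn);
});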
The two non-obvious gotchas in service-worker fetch
Two things bit us during the rewrite that aren't in any tutorial.
Cookie domain matching. LinkedIn's session cookies live on .www.linkedin.com with a leading dot, not www.linkedin.com. chrome.cookies.get({ url: 'https://www.linkedin.com' }) silently returns null because the domain match is exact. The reliable lookup is chrome.cookies.getAll({ domain: 'linkedin.com', name: 'JSESSIONID' }) and then iterating. We've seen this trip up at least three other extension teams in our network.
CSRF token quote-stripping. LinkedIn stores the CSRF token in JSESSIONID wrapped in literal double quotes — "ajax:1234567890" — and the API rejects requests if you send the quotes through. The fix is one regex:
const csrf = jsessionid.replace(/^"|"$/g, '');
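Put together, here is a sketch of the getCsrfToken() used in the fetch above, assuming the extension holds the cookies permission and host permissions for linkedin.com:

// Resolve LinkedIn's CSRF token from the JSESSIONID cookie
async function getCsrfToken() {
  // getAll with a domain match; an exact-URL lookup misses the
  // cookie because it lives on .www.linkedin.com with a leading dot
  const cookies = await chrome.cookies.getAll({
    domain: 'linkedin.com',
    name: 'JSESSIONID',
  });
  const jsessionid = cookies[0]?.value;
  if (!jsessionid) throw new Error('No LinkedIn session cookie found');
  // strip the literal double quotes around "ajax:1234567890"
  return jsessionid.replace(/^"|"$/g, '');
}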
Both gotchas cost us 30+ minutes each. Both would have been zero-minute discoveries if we'd read the latest tomquirk/linkedin-api source first.
The pattern, generalised
The specific lesson is about LinkedIn — but reverse engineering undocumented APIs has the same shape across every client portal, internal SaaS API, mobile-app backend, or browser-extension surface we've ever integrated with. Three rules we now follow on every project:
One — verify endpoint health before architecting around it. Send one request with curl before you write any wrapper code. If it returns 301, 404, or a redirect to a login page on a session you know is valid, the endpoint is dead. Don't debug auth on a dead endpoint. We've burned days on this; the curl check takes 30 seconds.
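The check itself is one command. The cookie name and placeholder value here are illustrative (li_at is LinkedIn's session cookie, but any known-valid session works):

curl -s -o /dev/null -w '%{http_code}\n' \
  -b 'li_at=<your-session-cookie>' \
  'https://www.linkedin.com/voyager/api/growth/normInvitations'

A 301 on a session you know is valid means the route is gone, not that your auth is broken.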
Two — read the latest published package source, not the README. PyPI and npm both let you download a tarball of the latest version. The actual fetch() calls or requests.post() calls inside it are the only ground truth. Anything in the README, the wiki, or the GitHub issues might be from 2019.
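For PyPI that looks something like the following; grepping for the API path prefix surfaces every endpoint string in the package (the linkedin-api name is from this post, the rest is our assumption of a standard sdist layout):

pip download linkedin-api --no-deps --no-binary :all: -d ./vendor
tar -xzf ./vendor/*.tar.gz -C ./vendor
grep -rn 'voyager/api' ./vendor/

Whatever URLs that grep turns up are what the package actually calls today.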
Three — match the architecture to the platform's actual constraints, not your preferred mental model. Hidden-tab DOM automation feels clean — same code path as foreground, just invisible. It doesn't work on modern Chrome. Service-worker fetch with the user's session cookies is the pattern that actually ships. We learned this by running into a wall; you can learn it by reading this. For a deeper look at how we apply this thinking to other browser-driven workflows, see our writeup on cutting browser automation token costs by using accessibility trees — same principle, different surface.
What "trust shipping code" looks like in practice
The next time you're integrating with an undocumented or community-reverse-engineered API, the workflow is: pull the latest version of the most-used wrapper from its package registry, read the actual HTTP calls in the source, replicate them with curl until one returns 200, then build your wrapper around that. Skip the README. Skip the GitHub issues. Skip the Stack Overflow answers from 2021. The only artifact that tells you what works today is the code that was published yesterday.
Reverse engineering undocumented APIs is mostly archaeology — and the only layer that matters is the most recent one. Everything underneath is folklore.