AI · Security · April 2026 · 10 min read

Who owns your embeddings? Fine-tuning and PDPL reality.

Faisal Al-Anqoodi · Founder & CEO

Embeddings and fine-tuned weights are not ordinary files. They are processing outputs that can redefine what your data means — and contracts often discuss the base model while ignoring what was generated for you.

A telecom signed a contract to fine-tune a model on its support tickets. Six months later, it asked to move the weights in-house. The vendor's answer: "The base model is ours; you own only the outputs." That started the conversation the contract appendix should have settled.

At Nuqta, we split three artifacts: stored embeddings, fine-tuned weights or adapters, and training logs. Each has a compliance path under Oman PDPL and an IP clause that must be explicit — not implied [1][2].

Embeddings vs fine-tuned weights.

Embeddings are vector representations of text or documents; they can reflect sensitive content if sources are sensitive [3]. Fine-tuned weights adjust model parameters for behavior; they can carry training-data influence indirectly [4].
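To make the first point concrete, here is a minimal sketch of how a stored vector can still reveal what its source text was about. It assumes a toy word-count embedding built for illustration only (`build_vocab`, `toy_embed`, and the sample texts are hypothetical); real embedding models are learned and far higher-dimensional, but the similarity probe works the same way in principle.

```python
import math

def build_vocab(texts):
    """Map each distinct lowercase word to a fixed vector position."""
    words = sorted({w for t in texts for w in t.lower().split()})
    return {w: i for i, w in enumerate(words)}

def toy_embed(text, vocab):
    """Count words per position, then L2-normalize. A stand-in for a real model."""
    vec = [0.0] * len(vocab)
    for w in text.lower().split():
        if w in vocab:
            vec[vocab[w]] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def cosine(a, b):
    """Dot product of two unit vectors equals their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical support ticket and two probe queries.
ticket = "customer reports billing dispute over roaming charges"
probe_related = "billing dispute roaming"
probe_unrelated = "weather forecast tomorrow"

vocab = build_vocab([ticket, probe_related, probe_unrelated])
stored = toy_embed(ticket, vocab)  # what a vendor would keep in the index

related_score = cosine(stored, toy_embed(probe_related, vocab))
unrelated_score = cosine(stored, toy_embed(probe_unrelated, vocab))
print(related_score, unrelated_score)
```

Anyone holding only `stored` can test it against probe phrases and learn that the ticket concerned a billing dispute, without ever seeing the original text. That is the sense in which an embedding of a sensitive source stays sensitive.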

Conflating the two in a contract breeds disputes: who may reuse each artifact, and who may delete it on exit?

Owning the base model does not mean owning what your data produced. Write one contract sentence per artifact — or accept the unknown.
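One way to check a draft contract against that rule is a small gap finder over the three artifacts named earlier. This is a sketch under assumptions: the field names (`owner`, `deletion_rights`, `export_deadline_days`) and the sample values are hypothetical, chosen to mirror the questions in this article rather than any legal template.

```python
from dataclasses import dataclass
from typing import Optional

# The three artifacts the article separates.
ARTIFACTS = ("stored embeddings", "fine-tuned weights or adapters", "training logs")
REQUIRED = ("owner", "deletion_rights", "export_deadline_days")

@dataclass
class ArtifactClause:
    artifact: str
    owner: Optional[str] = None                 # who owns the file
    deletion_rights: Optional[str] = None       # who may order deletion
    export_deadline_days: Optional[int] = None  # numeric deadline after contract end

def missing_terms(clauses):
    """Return {artifact: [missing fields]} for each artifact not fully specified."""
    by_name = {c.artifact: c for c in clauses}
    gaps = {}
    for name in ARTIFACTS:
        clause = by_name.get(name)
        if clause is None:
            gaps[name] = list(REQUIRED)  # artifact never mentioned in the draft
            continue
        missing = [f for f in REQUIRED if getattr(clause, f) is None]
        if missing:
            gaps[name] = missing
    return gaps

# Hypothetical draft: embeddings are covered, weights are half covered, logs are absent.
draft = [
    ArtifactClause("stored embeddings", owner="customer",
                   deletion_rights="customer", export_deadline_days=30),
    ArtifactClause("fine-tuned weights or adapters", owner="vendor"),
]
gaps = missing_terms(draft)
for artifact, fields in gaps.items():
    print(f"{artifact}: missing {', '.join(fields)}")
```

An empty result means every artifact has an explicit answer for each question. Anything listed is a point still left to implication.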

Qualitative risk matrix.

FIG. 1 — DATA / EMBEDDINGS / WEIGHTS / LOGS RISK MATRIX (QUALITATIVE)

Negotiation playbook.

  • Ask for an artifact list: files, sizes, formats, storage location.
  • Define export rights at contract end with a numeric deadline, not "by mutual agreement" at a later date.
  • Tie deletion to verification: wipe certificate or key destruction.
  • Separate sandbox from production; never fine-tune on production data without explicit consent [1].
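The artifact list and the deletion check in the playbook above can be sketched with the standard library alone. The file names, the `location` label, and the helper names are hypothetical. A local existence check like this one supplements a vendor's wipe certificate or key-destruction evidence; it does not replace it.

```python
import hashlib
import json
import tempfile
from pathlib import Path

def build_manifest(root: Path, location: str) -> list:
    """List every file under root with size, format, checksum, and storage location."""
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        data = path.read_bytes()  # fine for a sketch; stream large files in practice
        entries.append({
            "file": path.relative_to(root).as_posix(),
            "size_bytes": len(data),
            "format": path.suffix.lstrip(".") or "unknown",
            "sha256": hashlib.sha256(data).hexdigest(),
            "location": location,
        })
    return entries

def verify_deleted(root: Path, manifest: list) -> list:
    """Return the manifest files that still exist under root."""
    return [e["file"] for e in manifest if (root / e["file"]).exists()]

# Demo on a throwaway directory with two placeholder artifacts.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "adapter.safetensors").write_bytes(b"\x00" * 16)
    (root / "index.faiss").write_bytes(b"\x01" * 8)

    manifest = build_manifest(root, location="placeholder-region")
    (root / "index.faiss").unlink()  # simulate a partial deletion
    remaining = verify_deleted(root, manifest)

print(json.dumps(manifest, indent=2))
print("still present:", remaining)
```

Agreeing on a manifest like this at signing gives both sides a fixed list to export against and to delete against at exit. The checksums let you confirm that the exported files are the ones that were listed.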

Closing.

Private AI starts with owning the data path — then owning processing outputs. If that stays fuzzy, you pay twice: once to the vendor, once to counsel later.

Before any embedding or fine-tuning project this month, send the vendor a two-row table: who owns the file, who owns deletion rights. If they cannot answer on one page, the answer is rarely in your favor.

Frequently asked questions.

  • Are embeddings personal data? They can be if re-identification is easy; treat them cautiously and involve counsel [1].
  • Can I port weights to another model? That depends on the base model's license and your contract; do not assume [4].
  • What about cloud? Demand region clauses for embeddings just as you do for raw data [2].
  • How does this tie to RAG? Embeddings are part of the index; read the RAG guide.
  • What is the first vendor question? Show me the embedding path from document to vector — on one page.

Sources.

[1] Sultanate of Oman — Personal Data Protection Law (Royal Decree 6/2022).

[2] Sultanate of Oman — Executive Regulation to the Personal Data Protection Law (Ministerial Decision 34/2024).

[3] NIST — Privacy Framework.

[4] Hugging Face — Model cards and licensing guidance.

[5] Nuqta — internal IP boundary notes for training outputs, April 2026.
